JXTA
Peer to peer
-
Peer to peer systems consist of a number of "nodes" which may
be the same (e.g. all PCs) or differ (some PCs, some workstations)
-
They all make the same services available
-
A node can act as a consumer of a service (client) or a producer
of a service (server)
-
Common examples: Napster, Gnutella
-
Peer to peer infrastructure: JXTA
Napster
-
Napster is a protocol for sharing MP3 files between users.
-
This is the most visible of the recent peer-to-peer (P2P)
systems.
-
Because it involves transfer of MP3 files it has attracted
attention for two reasons:
-
MP3 files are large, and transferring them has become a
significant part of internet traffic
-
The MP3 files are usually illegal copies of copyrighted music,
which has led to legal actions.
-
The Napster site has been effectively emasculated by
legal actions
Napster and P2P
-
P2P models differ from client-server models in that each node is
capable of acting either as a client or as a server or both.
-
For Napster,
this means that each node can act either as a consumer of MP3
files or as a source of MP3 files.
Napster protocol
-
Napster uses a protocol on top of TCP
-
A Napster server listens on ports 8888 and 7777
-
Each message to/from the server is in the form of
<length><type><data>
where <length> and <type> are 2 bytes each, and <length> gives the
size of <data> in bytes.
-
The <data> portion of the message is a plain ASCII string.
-
This is a binary format: messages are NOT terminated by \r\n
-
The complete protocol is at
http://opennap.sourceforge.net/napster.txt
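The framing above can be sketched in a few lines of Python. This is a minimal illustration, not the Napster client code. It assumes the two header fields are little-endian, which is what the opennap description reports. The names `pack_message` and `unpack_message` and the type value 200 are chosen only for this example.

```python
import struct

# Two unsigned 16-bit fields, <length> then <type>.
# Little-endian byte order is an assumption taken from the opennap notes.
NAPSTER_HEADER = struct.Struct("<HH")


def pack_message(msg_type, data):
    """Frame one message as <length><type><data>."""
    payload = data.encode("ascii")
    return NAPSTER_HEADER.pack(len(payload), msg_type) + payload


def unpack_message(buf):
    """Decode one message; return (type, data, remaining bytes)."""
    if len(buf) < NAPSTER_HEADER.size:
        raise ValueError("incomplete header")
    length, msg_type = NAPSTER_HEADER.unpack_from(buf)
    end = NAPSTER_HEADER.size + length
    if len(buf) < end:
        raise ValueError("incomplete payload")
    return msg_type, buf[NAPSTER_HEADER.size:end].decode("ascii"), buf[end:]


frame = pack_message(200, "some artist")   # 200 is an illustrative type value
print(frame.hex())
print(unpack_message(frame))
```

Because the length is carried in the header, a reader on a TCP stream knows exactly how many bytes to wait for, which is why no line terminator is needed.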
Napster protocol types
Napster directory discovery
-
Peers need to be able to find other peers, and find out what they have on them
-
The Napster protocol uses a central registry
-
All services
add their files to this registry.
-
All clients query this registry.
-
There is only one Napster, at a fixed address www.napster.com,
so directory discovery is trivial.
-
The central registry is also the means whereby Napster was closed:
it is a single point of failure
Napster service discovery and invocation
-
The Napster idea of a service is an MP3 file.
-
Clients can make queries by filename on the Napster registry, and are
returned the names of nodes holding these files.
-
The files stay on the client machines,
never passing through the server.
-
The server provides the ability to search for particular
files and initiate a direct transfer between the clients.
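A central registry of this kind can be sketched as a map from file names to the peers that hold them. The class below is a toy illustration of the idea, not Napster's implementation. `CentralRegistry`, its methods, and the peer addresses are all hypothetical.

```python
from collections import defaultdict


class CentralRegistry:
    """Toy central index: maps file names to the peers that hold them."""

    def __init__(self):
        self._holders = defaultdict(set)   # filename -> {peer address, ...}

    def register(self, peer, filenames):
        """Record that `peer` offers each of `filenames`."""
        for name in filenames:
            self._holders[name].add(peer)

    def unregister(self, peer):
        """Forget everything `peer` offered, e.g. when it disconnects."""
        for holders in self._holders.values():
            holders.discard(peer)

    def search(self, text):
        """Return {filename: sorted peers} for names containing `text`."""
        text = text.lower()
        return {
            name: sorted(peers)
            for name, peers in self._holders.items()
            if peers and text in name.lower()
        }


registry = CentralRegistry()
registry.register("10.0.0.5:6699", ["blue_song.mp3", "red_song.mp3"])
registry.register("10.0.0.9:6699", ["blue_song.mp3"])
print(registry.search("blue"))
```

The registry only ever stores names and addresses. The requester then contacts one of the returned peers directly to fetch the file.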
Napster network implications
-
The Napster server is a potential bottleneck for requests.
-
The size of the MP3 files traded is a more serious issue and
can consume a large part of network bandwidth.
Gnutella - scope
-
Gnutella is a fully-distributed information-sharing technology.
-
Gnutella client software is basically a mini search engine and
file serving system in one.
-
When you search for something on the
Gnutella Network, that search is transmitted to everyone in
your Gnutella Network "horizon".
-
If anyone has anything matching your search, they will respond to you.
Gnutella P2P
-
Gnutella is a peer-to-peer system for file sharing
-
Every client on the
GnutellaNet is also a server, so you not only can find stuff,
but you can also make things available for the benefit of others.
Gnutella protocol
-
This is a binary format
-
Each message is preceded by a descriptor header
-
bytes 0-15: message id, unique on the network
-
byte 16: message type (0x00 = ping, 0x01 = pong, 0x40 = push, 0x80 = query, 0x81 = query hit)
-
byte 17: TTL (time to live, hops allowed)
-
byte 18: hops (already travelled)
-
bytes 19-22: payload length
-
Messages contain routing information (TTL) as well as type and payload
-
Spec is at
http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf
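The 23-byte header layout above maps directly onto a `struct` format. The sketch below is illustrative only. It assumes the payload length is little-endian, as in the 0.4 specification, and the function names are invented for this example.

```python
import struct
import uuid

# 16-byte id, then type, TTL and hops (one byte each), then a 4-byte length.
# "<" selects little-endian with no padding, so the header is 23 bytes.
GNUTELLA_HEADER = struct.Struct("<16sBBBI")

PING, PONG = 0x00, 0x01


def pack_descriptor(msg_type, ttl, hops, payload=b"", msg_id=None):
    """Build header plus payload; a random 16-byte id is used if none is given."""
    msg_id = msg_id or uuid.uuid4().bytes
    return GNUTELLA_HEADER.pack(msg_id, msg_type, ttl, hops, len(payload)) + payload


def unpack_header(buf):
    """Decode the first 23 bytes of `buf` into a dict of header fields."""
    msg_id, msg_type, ttl, hops, length = GNUTELLA_HEADER.unpack_from(buf)
    return {"id": msg_id, "type": msg_type, "ttl": ttl, "hops": hops, "length": length}


ping = pack_descriptor(PING, ttl=7, hops=0)
print(len(ping), unpack_header(ping))
```

A Ping carries no payload, so the whole message is just the header.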
Gnutella peer discovery
-
Once a peer has connected successfully to the network, it communicates
with other peers by sending and receiving Gnutella protocol descriptors.
-
Pings and queries used to discover hosts and files, respectively, are
broadcast;
-
other message types, including responses, are routed.
Gnutella ping pong
-
A peer uses "Ping" to probe hosts on the network. Another peer will
forward incoming Ping and Query descriptors to all of its directly
connected peers, except the one that delivered the incoming Ping or
Query.
-
Pong is the response to a Ping. It includes the address of a connected
Gnutella servent and information regarding the files
it is sharing on the network.
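As an illustration of what a Pong carries, the sketch below packs and unpacks a 14-byte payload of port, IPv4 address, number of files shared and kilobytes shared. The field order and byte orders follow my reading of the 0.4 specification and should be checked against it. The helper names are made up for this example.

```python
import socket
import struct

# port (2 bytes), IPv4 address (4 bytes), file count (4), kilobytes shared (4).
# The integers are little-endian; the address bytes are kept in network order.
PONG_PAYLOAD = struct.Struct("<H4sII")


def pack_pong(ip, port, files, kilobytes):
    """Build the Pong payload for a servent at ip:port."""
    return PONG_PAYLOAD.pack(port, socket.inet_aton(ip), files, kilobytes)


def unpack_pong(payload):
    """Decode a Pong payload into a dict."""
    port, raw_ip, files, kilobytes = PONG_PAYLOAD.unpack(payload)
    return {
        "ip": socket.inet_ntoa(raw_ip),
        "port": port,
        "files": files,
        "kilobytes": kilobytes,
    }


payload = pack_pong("192.168.0.7", 6346, 12, 34567)
print(len(payload), unpack_pong(payload))
```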
Gnutella startup
-
While the protocol is designed for a user to set up connections with his
"friends", there is no infrastructure in place for
finding new friends.
-
Instead, the Gnutella site offers a "default" set
of friends with which users can start.
-
Most users will never change
this file if the service is functional.
-
This means that the actual
network tends to be a hierarchical system.
File discovery and retrieval
-
A query is the primary mechanism for searching the distributed network.
-
A servent receiving
a Query descriptor will respond with a QueryHit if a match is
found against its local data set.
-
Push lets a peer behind a firewall serve files: the requester asks the
firewalled peer to open the download connection itself.
-
The file download protocol is HTTP.
-
The Gnutella protocol provides support for the HTTP Range header, so
that interrupted downloads may be resumed.
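Resuming a download amounts to asking for the bytes from the current offset onwards. The sketch below builds such a request. The `/get/<index>/<name>/` path and the header set follow the request shape I recall from the 0.4 specification and should be treated as an assumption. `build_download_request` is a hypothetical helper.

```python
def build_download_request(file_index, file_name, resume_from=0):
    """Build an HTTP GET that asks for the file from byte `resume_from` on."""
    lines = [
        f"GET /get/{file_index}/{file_name}/ HTTP/1.0",
        "Connection: Keep-Alive",
        f"Range: bytes={resume_from}-",    # open-ended range: to end of file
        "User-Agent: Gnutella",
        "",                                # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Resume after 4096 bytes have already been received.
print(build_download_request(2468, "song.mp3", resume_from=4096).decode("ascii"))
```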
Gnutella network implications
-
Gnutella uses a UUID (Universally Unique Identifier) and
TTL (time to live) to control routing.
-
The UUID is used to avoid loops
-
TTL controls the number of hops that are made
-
The actual downloads are done by point-to-point connections,
meaning that the IP addresses of server and reader are both
revealed to each other.
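How the message id and the TTL bound a broadcast can be seen in a small simulation. This is a sketch of the idea on an in-memory graph, not servent code. The function `flood` and the five-peer topology are invented for the example.

```python
from collections import deque


def flood(graph, origin, ttl):
    """Return {peer: hops} for every peer a broadcast from `origin` reaches.

    `graph` maps each peer to the peers it is directly connected to.
    """
    seen = {origin: 0}                        # peers that have seen this id
    queue = deque([(origin, None, ttl, 0)])   # (peer, came_from, ttl, hops)
    while queue:
        peer, came_from, ttl_left, hops = queue.popleft()
        if ttl_left == 0:
            continue                          # TTL exhausted: do not forward
        for neighbour in graph[peer]:
            if neighbour == came_from:
                continue                      # never send back to the deliverer
            if neighbour in seen:
                continue                      # duplicate id is dropped: no loops
            seen[neighbour] = hops + 1
            queue.append((neighbour, peer, ttl_left - 1, hops + 1))
    return seen


# A ring A-B-C-D-A with one extra peer E attached to C.
graph = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
}
print(flood(graph, "A", ttl=2))
print(flood(graph, "A", ttl=3))
```

With a TTL of 2 the message stops at C and never reaches E. The ring would otherwise let the message circulate forever, and the seen-id check is what prevents that.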
Gnutella network implications
-
Gnutella traffic has passed the point where peers connected by modems
are able to keep up. This has caused a change in the characteristics
of network traffic. Prior to this, each peer would be connected
to about 1000 other peers. Afterwards, it dropped to a few hundred.
Peer to peer issues
-
Peers have a "horizon" of other peers they know about
-
Messages may be "broadcast" to all peers in the horizon
-
The horizon may include thousands of peers, so network
traffic is non-trivial
-
There may be centralised lookup services, or distributed ones
-
Centralised lookup services are a bottleneck and a point of
control
-
Distributed lookup services need to be boot-strapped
-
"Services" may be moved from one peer to another
(e.g. mobile files)
-
Service information/location may be moved from one lookup
service to another
-
Since each peer is a server, it will be listening on a server
socket. So two peers on the same machine cannot both use the
protocol's usual port
Tunnelling
-
Many current/new protocols would be blocked if they used arbitrary ports
-
Most network admins will only allow limited access through the firewall
-
Often port 80 is left open for "harmless" Web traffic
-
Sometimes access through port 80 is blocked, and requests have to go via
a proxy on e.g. port 8080
-
Many protocols use "tunnelling" to pass their traffic on port 80
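One common way to tunnel is to carry the protocol's own bytes as the body of an ordinary HTTP request. The sketch below shows the wrapping and unwrapping only, with no network I/O. The `/tunnel` path and both function names are hypothetical.

```python
def wrap_in_http(payload, host, path="/tunnel"):
    """Carry arbitrary bytes as the body of an HTTP POST request."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + payload


def unwrap_from_http(request):
    """Recover the tunnelled bytes from a request built by wrap_in_http."""
    head, _, body = request.partition(b"\r\n\r\n")
    for line in head.split(b"\r\n")[1:]:          # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return body[: int(value.strip().decode("ascii"))]
    raise ValueError("no Content-Length header")


message = b"\x00\x01binary protocol data"
request = wrap_in_http(message, "example.org")
print(request)
print(unwrap_from_http(request) == message)
```

To a firewall that only inspects ports and request lines, this looks like normal Web traffic, which is exactly why the technique gets through.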
Tunneling data
JXTA - scope
-
JXTA is a recently publicised P2P system
-
It is intended as a research tool at present
-
It is basically an infrastructure on which services
(such as file sharing) can be built.
-
JXTA is intended to be language and network independent.
-
The current
implementation (v1.0) is in Java on top of TCP/IP but this is not
inherent in the design.
JXTA layer
-
TCP/IP already supplies a transport layer and search services such as DNS
-
These are essentially designed for relatively static systems
-
In P2P systems, peers are much more dynamic
-
peers may only last for a short time
-
they may change their IP address through DHCP configuration
-
The existence of firewalls and routers means that peer-to-peer communication may have
to use a number of protocols and tunnelling techniques to get through
-
JXTA effectively builds another transport layer above TCP/IP, with its own
addressing, routing, lookup services (overkill?)
JXTA concepts
-
UUID identifiers for each entity, which may be bound to external
information such as a network address
-
Advertisements which are XML documents. These can be
used to describe any resource such as nodes or services
-
Peers. These are "nodes" of the JXTA network, and can speak the
JXTA peer protocols.
-
Peer groups. These allow grouping of peers in any useful fashion.
They are deliberately not clearly specified,
and could represent e.g. a collection
of services, a geographical group, etc
JXTA concepts
-
Messages. These are datagram style messages, so can be used on
unreliable, asynchronous and uni-directional transports such as IP.
The format of messages is not prescribed, so can be used to carry
any information
-
Pipes. These connect between peers and are used to send messages.
They can be one-to-one, many-to-many, etc.
-
Pipes are bound
at runtime, allowing the possibility of being rebound if errors
occur. Pipes are used as the single communication mechanism
JXTA peer discovery
Peer discovery can be done in a variety
of ways:
-
By multicast on (usually) a local network.
-
By unicast connection to a repository of peers.
This can be used to bootstrap a peer and inform it of
world-wide peers.
The current implementation has a hard-coded set of repositories,
but this will be fixed
-
By requesting peer lists that are known by another peer
-
By offering a peer list to another peer
JXTA Service discovery
-
The lowest level of searching for services is by the Peer Discovery
Protocol. Peers are distinguished by being "ordinary" peers or by
being "rendezvous" peers.
-
Ordinary peers keep information about the
services they offer.
-
Rendezvous peers cache service adverts so that
they act as proxies for service adverts
JXTA Service discovery
Searching involves
-
Ask all the peers one hop away if they have the service
-
Ask the known rendezvous servers if they know of the service
-
The rendezvous servers may ask other rendezvous servers if
they know of the service
-
The peer asking for the service must back off for a certain
time before making a repeat search (cf ARP requests)
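The search order above can be sketched with plain Python objects. This does not use the JXTA API. The classes `Peer` and `Rendezvous`, the function names, and the doubling of the back-off delay are all assumptions made for illustration.

```python
import time


class Peer:
    """An ordinary peer: knows only about the services it offers itself."""

    def __init__(self, name, services=()):
        self.name = name
        self.services = set(services)

    def knows(self, service):
        return service in self.services


class Rendezvous(Peer):
    """Caches adverts for other peers and may consult other rendezvous peers."""

    def __init__(self, name, cached=()):
        super().__init__(name, cached)
        self.others = []

    def lookup(self, service, asked=None):
        asked = set() if asked is None else asked
        asked.add(self.name)                # avoid asking the same peer twice
        if self.knows(service):
            return self.name
        for other in self.others:
            if other.name not in asked:
                found = other.lookup(service, asked)
                if found is not None:
                    return found
        return None


def search(service, neighbours, rendezvous):
    for peer in neighbours:                 # step 1: peers one hop away
        if peer.knows(service):
            return peer.name
    for rdv in rendezvous:                  # steps 2 and 3: rendezvous peers
        found = rdv.lookup(service)
        if found is not None:
            return found
    return None


def search_with_backoff(service, neighbours, rendezvous,
                        attempts=3, delay=1.0, sleep=time.sleep):
    for attempt in range(attempts):
        found = search(service, neighbours, rendezvous)
        if found is not None:
            return found
        if attempt < attempts - 1:
            sleep(delay)                    # step 4: wait before repeating
            delay *= 2                      # doubling is an assumption
    return None


r1, r2 = Rendezvous("rdv-1"), Rendezvous("rdv-2", cached={"printing"})
r1.others.append(r2)
r2.others.append(r1)
near = [Peer("peer-a", {"chat"})]
print(search("chat", near, [r1]))       # answered by a neighbour
print(search("printing", near, [r1]))   # answered after rdv-1 asks rdv-2
```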
Service invocation
-
Services are described using WSDL.
-
Services are invoked using JXTA pipes.
-
The invocation mechanism through a pipe could be e.g. SOAP
Network implications
-
JXTA does not currently prescribe the scope of messages.
-
To avoid message floods, message transmission is limited
in the current implementation to the peer groups
the peer belongs to
-
In effect, this is bounding multicast scope to
the peer group rather than to the local LAN.
-
Every peer belongs to the World Group, though
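Scoping by peer group can be pictured as a filter over group membership. The sketch below is a model of the idea only, not JXTA code. The `recipients` function, the peer names, and the "music" group are hypothetical.

```python
def recipients(sender, group, membership):
    """Peers other than `sender` that would receive a message sent to `group`."""
    if group not in membership[sender]:
        raise ValueError(f"{sender} is not a member of {group}")
    return sorted(
        peer for peer, groups in membership.items()
        if group in groups and peer != sender
    )


# Every peer is in the World Group; only some are in the "music" group.
membership = {
    "alice": {"World", "music"},
    "bob": {"World"},
    "carol": {"World", "music"},
}
print(recipients("alice", "music", membership))
print(recipients("alice", "World", membership))
```

A message sent to a small group stays small, but anything addressed to the World Group still reaches every peer.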
JXTA Server example
This uses the configuration file
MembershipAuthenticator=NullAuthenticator
MembershipIdentity=somebody
InitialNetPeerGroupAppCode=Server
InitialNetPeerGroupAppCodeURL=http:/www.jxta.org/download/jxta.jar
The server is
JXTA Client example
This uses the configuration file
MembershipAuthenticator=NullAuthenticator
MembershipIdentity=somebody
InitialNetPeerGroupAppCode=Client
InitialNetPeerGroupAppCodeURL=http:/www.jxta.org/download/jxta.jar
The client is
Jan Newmarch (http://jan.newmarch.name)
jan@newmarch.name
Last modified: Mon Oct 7 11:51:49 EST 2002
Copyright ©Jan Newmarch