P2P API

In this article I describe several sets of APIs for P2P. For a software engineer, designing a good API is very important, and a good API follows from a good abstraction of the underlying concept. For example, the data model of storing key-value pairs in a P2P network is usually abstracted as a hash table. Programmers like seeing existing semantics in an API because they are already familiar with those semantics. This article uses API ideas from OpenDHT, JXTA, Adobe Flash Player, and the IETF P2P-SIP draft.

A good API
  1. takes the form of use-cases
  2. is general and concise: it does one thing and does it well
  3. is self-explanatory and similar to existing concepts, models, and practices
  4. is independent of implementation details

At a high level, there are three abstractions for P2P APIs: data storage, peer connectivity, and group membership. There are several special cases among these abstractions.

Data Storage
A distributed hash table (DHT) is a form of structured P2P network in which data is stored using the hash table abstraction. In particular, it provides put, get, and remove API methods. Programmers are familiar with the container semantics of hash tables. In Python, this looks like:

a = DHT()
a['key1'] = 'value1'
print(a['key1'])
del a['key1']
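
The container semantics above map directly onto Python's item-access methods. The following is a minimal sketch, not a real DHT: `LocalDHT` is a hypothetical single-process stand-in that hashes each key with SHA-1 and places it in one of a few in-memory dictionaries that play the role of peers. The class name, the bucket count, and the choice of hash are assumptions made for illustration.

```python
import hashlib


class LocalDHT:
    """Single-process stand-in for a DHT (hypothetical, for illustration)."""

    def __init__(self, node_count=4):
        # Each dict plays the role of the storage held by one peer.
        self._nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key and map the digest onto one of the simulated peers.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self._nodes[int.from_bytes(digest, "big") % len(self._nodes)]

    def __setitem__(self, key, value):   # put
        self._node_for(key)[key] = value

    def __getitem__(self, key):          # get, raises KeyError if absent
        return self._node_for(key)[key]

    def __delitem__(self, key):          # remove
        del self._node_for(key)[key]

    def __contains__(self, key):
        return key in self._node_for(key)


a = LocalDHT()
a["key1"] = "value1"
value = a["key1"]
del a["key1"]
```

The caller never sees which peer holds a key, which is the implementation independence asked for at the start of the article.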

Let us apply this to the use-case of P2P-SIP user location storage. Here the key is a user identifier of the form 'kundan@example.net' and the value is a user location of the form 'kns10@192.1.2.3:5062'. This exposes several problems with the container semantics of the API listed above. First, a user can have several locations, in which case a call request is sent to all of them, as in SIP forking proxy behavior. The following modification to the API causes more confusion, because set takes a single value whereas get returns multiple values:
a['kundan@example.net'] = 'kns10@192.1.2.3:5062'
print(a['henning@example.net']) # prints all contacts of henning

To solve this, we can assume 'set' semantics for a[k]:
a['key1'] += 'value1'
a['key2'] += 'value2'
print(a['key1']) # prints the list of values
a['key1'] -= 'value1' # remove specific key1-value1
del a['key1'] # remove all values for key1
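
In Python, `a[k] += v` first reads `a[k]`, applies `+=` to the result, and then writes the result back to `a[k]`, so the 'set' semantics can be expressed with a small value container. The sketch below is one possible way to do it, with local storage only; `ValueSet` and `MultiValueDHT` are hypothetical names, and returning an empty set for a missing key is an assumption.

```python
class ValueSet:
    """The set of values stored under one key."""

    def __init__(self):
        self._values = set()

    def __iadd__(self, value):      # a[k] += v adds one value
        self._values.add(value)
        return self

    def __isub__(self, value):      # a[k] -= v removes one value
        self._values.discard(value)
        return self

    def __iter__(self):
        return iter(sorted(self._values))

    def __len__(self):
        return len(self._values)


class MultiValueDHT:
    """Local stand-in for a DHT that keeps several values per key."""

    def __init__(self):
        self._table = {}

    def __getitem__(self, key):
        # A missing key yields an empty set so that += works on first use.
        if key in self._table:
            return self._table[key]
        return ValueSet()

    def __setitem__(self, key, values):
        if not isinstance(values, ValueSet):
            raise TypeError("use += and -= to change the values of a key")
        if len(values):
            self._table[key] = values
        else:
            self._table.pop(key, None)   # drop keys with no values left

    def __delitem__(self, key):
        self._table.pop(key, None)


a = MultiValueDHT()
a["key1"] += "value1"
a["key1"] += "value2"
values = list(a["key1"])
a["key1"] -= "value1"
```

Plain assignment is rejected on purpose, which avoids the confusion of a set that takes one value and a get that returns many.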

Another problem is that this API is neither secure nor authenticated. A secure DHT API based on public-key infrastructure can sign the stored value on set and del:
a = DHT(privatekey=...)   # supply the private key of the owner
print(a['key1']) # print all values for given key
print(a(owner=...)['key1']) # print values only by given owner
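
The `a(owner=...)[key]` form can be expressed in Python by making the object callable and returning a filtered view. The sketch below shows only that API shape: it tags each value with an owner identifier derived from the key bytes and does no real signing, since the Python standard library has no public-key signatures. `OwnedDHT`, `OwnerView`, `owner_id`, and the explicit `put` and `remove` methods are all hypothetical names chosen to keep the example short.

```python
import hashlib


def owner_id(privatekey):
    # Assumption: a short hash of the key bytes names the owner. A real system
    # would sign each value with the private key and verify with the public key.
    return hashlib.sha256(privatekey).hexdigest()[:16]


class OwnerView:
    """Read access to the shared records, optionally limited to one owner."""

    def __init__(self, records, owner=None):
        self._records = records   # key -> list of (owner id, value)
        self._owner = owner

    def __getitem__(self, key):
        return [value for who, value in self._records.get(key, [])
                if self._owner is None or who == self._owner]


class OwnedDHT(OwnerView):
    def __init__(self, records, privatekey):
        super().__init__(records)
        self._me = owner_id(privatekey)

    def put(self, key, value):
        # Record who stored the value so that readers can filter by owner.
        self._records.setdefault(key, []).append((self._me, value))

    def remove(self, key):
        # Only this owner's values are removed; others are left untouched.
        kept = [(who, value) for who, value in self._records.get(key, [])
                if who != self._me]
        if kept:
            self._records[key] = kept
        else:
            self._records.pop(key, None)

    def __call__(self, owner):
        return OwnerView(self._records, owner)


records = {}   # shared storage standing in for the overlay
alice = OwnedDHT(records, privatekey=b"alice-secret")
bob = OwnedDHT(records, privatekey=b"bob-secret")
alice.put("key1", "from-alice")
bob.put("key1", "from-bob")
everyone = alice["key1"]
only_bob = alice(owner=owner_id(b"bob-secret"))["key1"]
```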
A third problem is that the API does not extend to purely event-based languages such as Flash ActionScript. ActionScript has no blocking operations, hence set and get must be defined asynchronously. ActionScript already has the SharedObject abstraction to deal with remote shared storage of objects. This can be reused for the API as follows:
so = DistributedSharedObject.getRemote(privatekey=...)
so.setProperty('key1', 'value1') // sets the key1-value1 pair
so.retrieve('key2') // initiates get
so.addEventListener('sync', syncHandler)
so.addEventListener('propertyChange', changeHandler)
function syncHandler(event:SyncEvent):void {
    ... // put is completed
}
function changeHandler(event:PropertyChangeEvent):void {
    if (event.property == 'key2') // get is completed
        trace(so.data['key2'])
}
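
The same non-blocking pattern can be sketched in Python to make the control flow explicit. This is an illustration under stated assumptions, not the Flash API: the snake_case method names are Python renderings of the calls above, the remote storage is a local dictionary, and `run_pending` is a hypothetical stand-in for one turn of the event loop.

```python
from collections import deque


class DistributedSharedObject:
    def __init__(self):
        self.data = {}            # locally cached properties
        self._remote = {}         # stands in for the P2P storage
        self._listeners = {}      # event type -> list of handlers
        self._pending = deque()   # operations not yet completed

    def add_event_listener(self, event_type, handler):
        self._listeners.setdefault(event_type, []).append(handler)

    def set_property(self, key, value):
        # Returns immediately; the put completes on a later loop turn.
        self._pending.append(("sync", key, value))

    def retrieve(self, key):
        # Returns immediately; the get completes on a later loop turn.
        self._pending.append(("propertyChange", key, None))

    def run_pending(self):
        """Complete queued operations and notify the listeners."""
        while self._pending:
            event_type, key, value = self._pending.popleft()
            if event_type == "sync":
                self._remote[key] = value
            else:
                self.data[key] = self._remote.get(key)
            for handler in self._listeners.get(event_type, []):
                handler(key)


events = []
so = DistributedSharedObject()
so.add_event_listener("sync", lambda key: events.append(("sync", key)))
so.add_event_listener(
    "propertyChange",
    lambda key: events.append(("propertyChange", key, so.data[key])))
so.set_property("key1", "value1")
so.retrieve("key1")
before = list(events)   # nothing has completed yet
so.run_pending()
```

Neither `set_property` nor `retrieve` returns a result; the results arrive only through the handlers.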

Another problem is that the API doesn't take into account the time-to-live of a key-value pair. This can be solved by supplying a default timeout in the constructor:
d = DHT(privatekey=..., timeout=3600)   # default TTL of one hour
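
A default time-to-live can be sketched with a local table that stores an expiry time next to each value. `ExpiringDHT`, the per-call `timeout` override on `put`, and the injectable `clock` parameter are assumptions made for illustration; the clock is injected only so that expiry can be demonstrated without waiting.

```python
import time


class ExpiringDHT:
    def __init__(self, timeout=3600, clock=time.monotonic):
        self._timeout = timeout   # default TTL in seconds
        self._clock = clock
        self._table = {}          # key -> (value, expiry time)

    def put(self, key, value, timeout=None):
        ttl = self._timeout if timeout is None else timeout
        self._table[key] = (value, self._clock() + ttl)

    def __setitem__(self, key, value):
        self.put(key, value)      # uses the default TTL

    def __getitem__(self, key):
        value, expires_at = self._table[key]
        if self._clock() >= expires_at:
            del self._table[key]  # expired entries behave as missing
            raise KeyError(key)
        return value

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True


now = [0.0]                                  # fake clock for the example
d = ExpiringDHT(timeout=3600, clock=lambda: now[0])
d["key1"] = "value1"                         # default TTL of one hour
d.put("key2", "value2", timeout=60)          # shorter TTL for this pair
now[0] = 120.0                               # two minutes later
```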

Note that a data storage API can be built on top of a routing and connectivity API. Since we would like to separate the abstractions (data storage and peer connectivity/routing), we will define separate sets of APIs for these.
d = DHT(net=..., privatekey=..., timeout=...) # use the given connectivity (net) object

Peer connectivity and routing
The connectivity and routing layer deals with maintenance of the P2P network. Programmers are familiar with the socket abstraction, and there has been an effort to map P2P to the socket abstraction [2].
s0 = ServerSocket(...)  # a P2P node is created
s0.bind(identity=..., credentials=...) # node joins the P2P network
The bind method is similar to the JXTA semantics, and similar to the attach method as proposed in the IETF P2P-SIP work. Actual communication with a specific peer can happen over a connected socket. A connected socket is returned by the connect or accept API methods of the server-socket.
s1 = s0.connect(remote=...) # connect to the given remote identity
s2, remote = s0.accept() # receive an incoming connection from remote
The connection procedure takes care of negotiating connectivity checks using ICE (or a similar algorithm) to allow NAT and firewall traversal.

Once connected, the socket can be used to send any message to the peer or receive any message from it.
s1.send(data=..., timeout=...)
data = s1.recv()
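
The bind, connect, accept, send, and recv calls can be exercised against an in-memory stand-in for the overlay. The sketch below is only that: `Overlay` and `Connection` are hypothetical helper classes, joining the network is reduced to registering an identity, and credentials, timeouts, and ICE negotiation are left out.

```python
from collections import deque


class Overlay:
    """In-memory registry standing in for the P2P network (hypothetical)."""

    def __init__(self):
        self.nodes = {}   # identity -> ServerSocket


class Connection:
    """One end of a connected socket; `peer` is the other end."""

    def __init__(self, remote):
        self.remote = remote
        self.peer = None
        self._inbox = deque()

    def send(self, data):
        self.peer._inbox.append(data)

    def recv(self):
        # Non-blocking for simplicity: None when nothing has arrived.
        return self._inbox.popleft() if self._inbox else None


class ServerSocket:
    def __init__(self, overlay):
        self._overlay = overlay
        self.sockname = None
        self._backlog = deque()   # connections waiting for accept()

    def bind(self, identity):
        # Joining is reduced to registering the identity in the overlay.
        self.sockname = identity
        self._overlay.nodes[identity] = self

    def connect(self, remote):
        target = self._overlay.nodes[remote]   # KeyError if unknown
        mine, theirs = Connection(remote), Connection(self.sockname)
        mine.peer, theirs.peer = theirs, mine
        target._backlog.append(theirs)
        return mine

    def accept(self):
        conn = self._backlog.popleft()   # IndexError if nothing is pending
        return conn, conn.remote


overlay = Overlay()
s0, peer = ServerSocket(overlay), ServerSocket(overlay)
s0.bind(identity="node-a")
peer.bind(identity="node-b")
s1 = s0.connect(remote="node-b")   # outgoing side
s2, remote = peer.accept()         # incoming side
s1.send(data="hello")
data = s2.recv()
```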
The original server-socket can also be used to send messages to, or receive messages from, specific peers without an explicit connection.
s0.sendto(data=..., remote=...)  # send to specific peer
data, remote = s0.recvfrom(timeout=...)
Sometimes, a node needs to send a message to any available node close to a given identifier. This can be implemented using an overloaded sendto method that takes either a remote address or a key identifier. The latter performs P2P routing to the given destination key identifier.
s0.sendto(data=..., key=...)
data, remote, local = s0.recvfrom(...)
if local == s0.sockname: # received for this node identifier
    ...
else: # received for the given key, presumably close to this node
    key = local
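
Key-based delivery needs a notion of which node is "close" to a key. The sketch below assumes SHA-1 identifiers and an XOR distance, as in Kademlia; other overlays use a ring distance instead, and the article does not prescribe either. `Node`, `node_id`, and `closest_node` are hypothetical names, and routing is reduced to a lookup over a shared dictionary of all nodes.

```python
import hashlib
from collections import deque


def node_id(name):
    # Hash a node identity or a key into the same identifier space.
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def closest_node(key, identities):
    """Return the identity whose hashed id is nearest to the hashed key."""
    target = node_id(key)
    return min(identities, key=lambda name: node_id(name) ^ target)


class Node:
    def __init__(self, overlay, identity):
        self.sockname = identity
        self._overlay = overlay   # shared dict: identity -> Node
        self._inbox = deque()
        overlay[identity] = self

    def sendto(self, data, remote=None, key=None):
        if (remote is None) == (key is None):
            raise ValueError("give exactly one of remote or key")
        if key is not None:
            # Route to whichever node is closest to the key.
            remote, local = closest_node(key, self._overlay), key
        else:
            local = remote
        self._overlay[remote]._inbox.append((data, self.sockname, local))

    def recvfrom(self):
        return self._inbox.popleft()   # IndexError if nothing has arrived


overlay = {}
nodes = [Node(overlay, name) for name in ("node-a", "node-b", "node-c")]
s0 = nodes[0]
s0.sendto(data="direct", remote="node-b")
s0.sendto(data="routed", key="kundan@example.net")
```

The receiver can tell the two cases apart by comparing the third element of the tuple with its own `sockname`, as in the snippet above.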
A connected socket can be disconnected, or a node removed from the P2P network, using the close method.
s1.close()  # close the connection
s0.close() # remove the node
These abstractions allow building distributed object location and routing APIs [1] using a locality-aware decentralized directory service.

Group membership
Group membership APIs are similar to the multicast and anycast socket APIs. A node can join or leave a group, and a message can be sent via multicast or anycast to one or more nodes in a group. Let us define a group identifier similar to that in JXTA. We extend the previous socket abstraction to create a new group.
s3 = s0.join(group=...)  # join a given group
The join method returns a semi-connected socket object which can be used to send or receive packets on the given group.
s3.send(data=...)        # send to every node in the group
s3.sendany(data=...) # send to at most one node in the group
data, remote, key = s3.recv() # receive multicast or anycast data
The socket API gives an intuitive abstraction for understanding P2P concepts. The actual implementation of group membership may be more complex than simple routing and connectivity. Closing a group socket leaves the group.
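
The group calls can be sketched against a shared in-memory membership table. `GroupSocket` is a hypothetical class, the group and node names are made up, and two behaviors are assumptions: multicast does not loop back to the sender, and anycast picks a random other member.

```python
import random
from collections import deque


class GroupSocket:
    def __init__(self, members, group, identity):
        self._members = members   # shared dict: group -> {identity: socket}
        self._group = group
        self.sockname = identity
        self._inbox = deque()
        members.setdefault(group, {})[identity] = self   # join the group

    def _others(self):
        return [sock for name, sock in self._members[self._group].items()
                if name != self.sockname]

    def send(self, data):
        # Multicast: deliver to every other member of the group.
        for sock in self._others():
            sock._inbox.append((data, self.sockname, self._group))

    def sendany(self, data):
        # Anycast: deliver to at most one other member.
        others = self._others()
        if others:
            target = random.choice(others)
            target._inbox.append((data, self.sockname, self._group))

    def recv(self):
        # Returns (data, remote, key), or None when nothing has arrived.
        return self._inbox.popleft() if self._inbox else None

    def close(self):
        # Closing the group socket leaves the group.
        self._members[self._group].pop(self.sockname, None)


members = {}
a, b, c = (GroupSocket(members, "group-1", name)
           for name in ("node-a", "node-b", "node-c"))
a.send(data="to-all")
a.sendany(data="to-one")
```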

We have seen how a P2P API can evolve to accommodate various concepts within existing well-known abstractions such as the hash table and the socket.

References:
1. Towards a common API for structured peer-to-peer overlays (2003) http://oceanstore.cs.berkeley.edu/publications/papers/pdf/iptps03-api.pdf
2. The Socket API in JXTA 2.0 http://java.sun.com/developer/technicalArticles/Networking/jxta2.0/
