Graduate Networks, UCSD

CSE222 – Spring 2009

Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications May 29, 2009

This paper introduces Chord, whose main functionality is to efficiently locate a node in a peer to peer (P2P) network. Given a key, Chord maps it onto a node. It adapts efficiently with nodes leaving and joining the network frequently. Main contributions are:

  1. A modified version of consistent hashing: Chord uses consistent hashing to map keys to nodes. Consistent hashing provides load balancing by evenly distributing keys to nodes (each node is assigned roughly K/N keys), and it also has the nice property that keys don’t change too much as nodes join or leave the network (only O(1/N) fraction of the keys are moved to a new location as a result).  One modification that the authors made is to remove the need to be aware of all other nodes in the network. Based on their implementation, a node needs to know about only O(log N) other nodes in a network of N nodes. Therefore, lookups require O(log N) messages to other nodes. Chord can still function with awareness of fewer nodes, but its performance would suffer (each node only needs to know about one other node, its successor node).
  2. Notion of successor nodes, and finger tables: Chord is able to achieve its scalability and good properties through the use of finger tables. First, the keys are arranged in order on an identifier circle. A function such as SHA1 could be used to hash node IP addresses to keys on the identifier circle. The successor of a key is then the first node on the identifier circle with key equal or greater than the search key. Finger tables are maintained on each node n. The table is a list of successors of n in distances exponential in 2: the first key on the the table is the successor of n. The i-th entry on the table is the id of the first node that succceeds n by at least 2^i-1 on the identifier circle.  To lookup a key, the current node compares the key with its own id and if it’s not a match, it uses the finger table to find the node with id closest to the search key, and forwards the search to that node. In this manner, the distance to the search key halves each time a node is contacted, and thus at most O(log N) nodes are contacted before reaching the destination node.
  3. Chord simplifies P2P systems by providing: load balancing, decentralization, logarithmic scalability, high availability, and flexible naming. Chord avoids a single point of failure, since all nodes in the network have equal responsibility. To join a P2P network, a node only needs to know about one other Chord node in the network; there is no servers that are required. This, along with the fact that keys are distributed evenly on the nodes, provides availability and decentralization. Further, nodes keep track of several successors, so in case of failure they can update the finger tables easily. Scalability is achieved without requiring hierarchy, because nodes only need to keep track of O(log N) other nodes; lookups run in predictable time. Chord requires no naming structure, as opposed to services like DNS. So it offers a very flexible naming scheme. Another nice thing about the design is that it is simple and handles concurrent join and failures very well.

Chord has a very clever design. While it’s hard to critique it, I feel like frequent node join/leaves could hinder its performance. The keys of one or more nodes need to be redistributed once a node joins or leaves the network. If the joins/leaves are too frequent, then the nodes have to redistribute keys too frequently, thus hindering their performance.

Future research: it would be good to take into accoun the distance/latency between the nodes, and rather than selecting a successor based on closest hash, take latency into account as well