06 The Chord P2P Network

本篇文档主要介绍了Chord 和Tapestry的定义、Chord的基本特点、一致性哈希、一致性哈希的性质、Chord中结点的查找、插入新的结点的过程与算法、一种更加高效的插入算法及其例子。

1. The Chord P2P Network Some slides have been borrowed from the original presentation by the authors

2. Chord vs.Tapestry • The topology of the Chord network, as defined by the successor pointers, must satisfy a well-defined structure. • Tapestry (uses Plaxton routing) requires the root of the object to be placed in a designated node, but the object can be placed locally. In contrast, Chord requires the object to be placed at a designated node.

3. Main features of Chord  Load balancing via Consistent Hashing • Small routing tables: log n • Small routing delay: log n hops • Fast join/leave protocol (polylog time)

4. Consistent Hashing Assigns both nodes and objects from an m-bit key. Order these nodes around an identifier circle (what does a circle mean here?) according to the order of their keys (0 .. 2m-1). This ring is known as the Chord Ring. Object with key k is assigned to the first node whose key is ≥ k (called the successor node of key k)

5. Consistent Hashing D120 (0) N105 D20 N=128 Circular 7-bit N32 ID space N90 D80 Example: Node 90 is the “successor” of document 80.

6.Consistent Hashing [Karger 97] Property 1 If there are N nodes and K keys, then with high probability, each node is responsible for (1+epsilon )K/N keys. Property 2 When a node joins or leaves the network, the responsibility of at most O(K/N) keys changes hand (only to or from the node that is joining or leaving. When K is large, the impact on individual nodes is quite small.

7. Consistent hashing Object with Key 5 K5 Node 105 N105 K20 Circular 7-bit ID space N32 N90 K80 object with key k is stored at its successor (node with key ≥

8. The log N Fingers (0) 1/4 1/2 Distance of N80’s neighbors from N80 1/8 Circular (log N)-bit ID space 1/16 1/32 1/64 1/128 N80 Each node knows of only log N other nodes.

9.Finger i points to successor of n+2i N120 112 ¼ ½ 1/8 1/16 1/32 1/64 1/128 N80

10. Chord Finger Table (0) N32’s N113 Finger Table 33..33 N40 N102 34..35 N40 36..39 N40 N=128 N32 40..47 N40 N85 48..63 N52 N80 N40 64..95 N70 96..31 N102 N79 N52 N70 Finger table actually contains N60 ID and IP address Node n’s i-th entry: first node ≥ n + 2i-1

11. Lookup N70’s Finger Table 71..71 N79 Greedy routing 72..73 N79 (0) 74..77 N79 78..85 N80 N113 86..101 N102 N32’s N102 Finger Table 102..5 N102 6..69 N32 33..33 N40 N3234..35 N40 N80’s 36..39 N40 Finger Table N85 40..47 N40 81..81 N85 N40 N80 48..63 N52 82..83 N85 64..95 N70 84..87 N85 N79 N52 96..31 N102 88..95 N102 N70 N60 96..111 N102 112..15 N113 16..79 N32 Node 32, lookup(82): 32  70  80  85.

12. New Node Joins (0) N20’s N113 Finger Table N20 1 21..21 N102 2 22..23 3 24..27 N32 4 28..35 5 36..51 N80 N40 6 52..83 7 84..19 N52 N70 N60 Assume N20 knows one of the existing nodes.

13. New Node Joins (2) (0) N20’s N113 Finger Table N20 21..21 N32 N102 22..23 N32 24..27 N32 N32 28..35 N32 36..51 N40 N80 N40 52..83 N52 84..19 N102 N52 N70 N60 Node 20 asks that node for successor to 21, 22, …, 52, 84.

14. The Join procedure The new node id asks a gateway node n to find the successor of id n.(find_successor(id) if id = (n, successor) *n<id<successor of n* then return successor else forward the query around the circle fi Needs O(n) messages. This is slow.

15. Steps in join Linked list insert n n id id Successor(n) Finally But the transition does not happen immediately

16. A More Efficient Join // ask n to find the successor of id if id = (n, successor] then return successor else n’= closest_ preceding_node (id) return n’.find_successor(id) fi // search for the highest predecessor of id n. closest_preceding_node(id) for i = log N downto 1 if (finger[i] is between (n,id) return finger[i]

17. Example N20 wants to (0) find out the N113 successor of N20 key 65 N102 N32 N80 N40 N52 N70 N60 K65

18. After join move objects (0) N20’s N113 Finger Table N20 21..21 N32 N102 22..23 N32 D114..20 24..27 N32 N32 28..35 N32 36..51 N40 N80 N40 52..83 N52 84..19 N102 N52 Notify nodes that must include N70 N60 N20 in their table. N113[1]=N20, not N32. Node 20 moves documents from node 32.

19. Three steps in join Step 1. Initialize predecessor and fingers of the new node. (Knowledge of predecessor is useful in stabilization ) Step 2. Update the predecessor and the fingers of the existing nodes. (Thus notify nodes that must include N20 in their table. N113[1] = N20, not N32. Step 3. Transfer objects to the new node as appropriate.

20. Concurrent Join n1 n1 New node n’ New node n’ New node n New node n n2 n2 [Before] [After]

21. Stabilization Periodic stabilization is needed to integrate the new node into the network and restore the invariant. n1 n1 New node n New node n n2 n2 Predecessor.successor(n1) ≠ n1, so n1 adopts predecessor.successor(n1) = n as its new successor

22. The complexity of join With high probability, any node joining or leaving an N-node Chord network will use O(log 2N) messages to re-establish the Chord routing invariants and finger tables.

23. Chord Summary • Log(n) lookup messages and table space. • Well-defined location for each ID. • No search required. • Natural load balance. • No name structure imposed. • Minimal join/leave disruption.