DistOS 2014W Lecture 18
Distributed Hash Tables (March 18)
- Wikipedia's article on Distributed Hash Tables: http://en.wikipedia.org/wiki/Distributed_hash_table
- Zhao et al., "Tapestry: A Resilient Global-Scale Overlay for Service Deployment" (JSAC 2003): http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf
Distributed Hash Table Overview
A Distributed Hash Table (DHT) is a fast lookup structure of <key, value> pairs distributed across many nodes in a network. Each key is hashed to generate the index at which its value can be found. Depending on the nature of the hash function, typically only exact-match queries can be answered.
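As a rough illustration (the lecture does not name a specific hash function; SHA-1 and a 160-bit identifier space are assumptions here), hashing a key yields a fixed-size index into the shared keyspace, and similar keys map to unrelated indices, which is why only exact-match lookups work:

```python
import hashlib

# Hedged sketch: map a string key into a fixed-size identifier space.
# SHA-1 and 160 bits are assumptions; any uniformly distributed hash would do.
def key_to_id(key: str, bits: int = 160) -> int:
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

# Exact-match only: nearly identical keys land far apart in the keyspace.
print(hex(key_to_id("alice.txt")))
print(hex(key_to_id("alice.txt2")))
```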
Usually, each node has a partial view of the hash table, as opposed to a full replica, so a node does not know exactly which other node is responsible for a given key. This has given rise to a number of different routing techniques:
- A centralized server may maintain a list of all keys and the nodes at which each value can be found. This method introduces a single point of failure.
  - e.g., Napster
- Flooding: each node queries all connected nodes. This method has performance and scalability shortcomings but has the benefit of being decentralized.
  - e.g., Gnutella
- Consistent hashing (http://en.wikipedia.org/wiki/Consistent_hashing): the keyspace is partitioned so that each node maintains the values for keys that hash to similar indices (e.g., within a certain Hamming distance). Given a query, a node does not know exactly which node holds the key, but it does know a few nodes (a proper subset of the network) that are "closer" to the key, and it forwards the query to the closest of them. This appears to be the most popular technique for DHTs; its biggest benefit is that nodes can be added and removed without notifying every other node in the network (see the sketch after this list).
  - e.g., Tapestry
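A minimal sketch of the consistent-hashing idea (this is the generic ring construction, not Tapestry's exact scheme; the node names and the use of SHA-1 are illustrative assumptions): nodes and keys are hashed into the same identifier space, and a key is stored on the first node at or after its ID, so adding a node only moves the keys in one arc of the ring.

```python
import bisect
import hashlib

def h(name: str) -> int:
    """Hash a name (node or key) onto the identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")

class Ring:
    def __init__(self, nodes):
        # A sorted list of (id, node) pairs represents the ring.
        self.ring = sorted((h(n), n) for n in nodes)

    def lookup(self, key: str) -> str:
        """The node responsible for key is the first one clockwise from h(key)."""
        ids = [node_id for node_id, _ in self.ring]
        i = bisect.bisect_left(ids, h(key)) % len(self.ring)
        return self.ring[i][1]

    def add(self, node: str):
        # No global notification: only keys between the new node and its
        # predecessor change owner.
        bisect.insort(self.ring, (h(node), node))

ring = Ring(["node-A", "node-B", "node-C"])
print(ring.lookup("alice.txt"))
ring.add("node-D")
print(ring.lookup("alice.txt"))
```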
Tapestry
Tapestry is an overlay network that uses a DHT to provide routing for distributed applications. As in IP routing, not all nodes need to be directly connected to each other: a node can query a subset of its neighbours for information about which nodes are responsible for which parts of the keyspace. Routing is performed in such a way that nodes are aware of their distance to the object being queried, so objects can be located with low latency without the need to migrate actual object data between nodes.
Tapestry was built for OceanStore, and OceanStore was built for the open Internet: nodes would constantly join and leave, and the network topology would change over time, which is why a dynamic routing system is needed.
In Tapestry, every node and every object is identified by a globally unique ID drawn from the same identifier space. The system is entirely distributed, decentralized, and peer-to-peer. Each node stores a routing table of its neighbours, organized by the prefixes of their IDs. This lets routing proceed digit by digit, effectively turning lookup into the traversal of a distributed radix tree (https://en.wikipedia.org/wiki/Radix_tree).
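To make the digit-by-digit idea concrete, here is a hedged sketch of prefix routing over hexadecimal IDs (the routing-table layout is a simplification of Tapestry's neighbour table, and the helper names are invented for illustration): at every hop the next node shares at least one more leading digit with the destination, so a lookup needs at most one hop per digit of the ID.

```python
# Hedged sketch of prefix (digit-by-digit) routing, not Tapestry's full algorithm.
# IDs are fixed-length hex strings; routing_table maps (level, digit) -> neighbour ID,
# i.e. "a neighbour that matches my ID for `level` digits and has `digit` next".
def next_hop(current_id: str, dest_id: str, routing_table: dict) -> str:
    # Length of the prefix shared with the destination.
    level = 0
    while level < len(dest_id) and current_id[level] == dest_id[level]:
        level += 1
    if level == len(dest_id):
        return current_id                        # we are the destination's root
    # Forward to a neighbour that matches one more digit of the destination.
    return routing_table[(level, dest_id[level])]

# Example: node 4227 routing toward 42AD needs a neighbour matching "42A...".
table_4227 = {(2, "A"): "42A2", (2, "B"): "42B0"}   # fragment of a neighbour table
print(next_hop("4227", "42AD", table_4227))          # -> "42A2"
```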
- DNS is organized as a single tree with fixed roots; Tapestry, by contrast, has no fixed hierarchy, and any node can act as the root for a given object's ID.
- How does information flow? Each node has a neighbour table that contains the IDs of its neighbours.
  - From initialization onward, each node maintains a locally optimal routing table.
  - Routing happens digit by digit.
- Tapestry API:
  - The API has four operations: PublishObject, UnpublishObject, RouteToObject, and RouteToNode (a minimal interface sketch follows this list).
  - Each node has an ID, and each endpoint object has a GUID (globally unique identifier).
- Tapestry looks a bit like an operating system: it provides common infrastructure services to the applications built on top of it.
  - It has two models: one built on the UDP protocol and the other on TCP.
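The four operations listed above are the object location and routing API described in the Tapestry paper; the interface below is only a sketch of that surface (the Python types, parameter names, and the treatment of IDs as opaque byte strings are assumptions, not the actual implementation):

```python
from abc import ABC, abstractmethod

# Hedged sketch of the four-operation API named above; signatures are
# illustrative, not Tapestry's actual ones.
class TapestryLikeNode(ABC):
    @abstractmethod
    def publish_object(self, guid: bytes) -> None:
        """Advertise that this node stores (or serves) the object `guid`."""

    @abstractmethod
    def unpublish_object(self, guid: bytes) -> None:
        """Withdraw the advertisement for `guid`."""

    @abstractmethod
    def route_to_object(self, guid: bytes, message: bytes) -> None:
        """Deliver `message` to some node that has published `guid`."""

    @abstractmethod
    def route_to_node(self, node_id: bytes, message: bytes, exact: bool = True) -> None:
        """Deliver `message` to `node_id`, or to its closest surrogate if not exact."""
```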
Fun fact: it is now called Chimera (http://current.cs.ucsb.edu/projects/chimera/).