Difference between revisions of "DistOS 2014W Lecture 18"

From Soma-notes
Jump to navigation Jump to search
Line 1: Line 1:
== Distributed Hash Table Overview ==
A Distributed Hash Tables (DHT) is a fast lookup structure of <key,value> pairs,
distributed across many nodes in a network.  Keys are hashed to generate the
index at which the value can be found.  Depending on the nature of the hash
function, only exact queries may be returned.  Each node has a partial view of
the hash table, as opposed to a full replica.  This has given rise to a number
of different search techniques:
*  A centralized server may maintain a list of all keys and associated nodes
at which the value can be found.  This method involves a single point of failure.
* Each node may query all connected nodes.  This method has performance and
scalability shortcomings
* The keyspace can be partitioned such that nodes will maintain the values
for keys that hash to similar indices (e.g., within a certain hamming distance).
This method is not suitable for all applications, as it involves
bandwidth-intensive migration of data stored in the DHT.
===Tapestry:===
Tapestry is an overlay network which makes use of a DHT to provide routing for
distributed applications.  Similar to IP routing, not all nodes need to be
directly connected to each other: they can query a subset of neighbours for
information about which nodes are responsible for certain parts of the keyspace.
Routing is performed in such a way that nodes are aware of their ''distance''
to the object being queried.  Hence objects can be located with low latency
without the need to migrate actual object data between nodes.  It would be interesting
to see if Tapestry could be adapted for anonymity similar to ant-colony routing
schemes.
**Tapestry
**Tapestry
* Distributed .
* Distributed .

Revision as of 11:09, 18 March 2014

Distributed Hash Table Overview

A Distributed Hash Tables (DHT) is a fast lookup structure of <key,value> pairs, distributed across many nodes in a network. Keys are hashed to generate the index at which the value can be found. Depending on the nature of the hash function, only exact queries may be returned. Each node has a partial view of the hash table, as opposed to a full replica. This has given rise to a number of different search techniques:

  • A centralized server may maintain a list of all keys and associated nodes

at which the value can be found. This method involves a single point of failure.

  • Each node may query all connected nodes. This method has performance and

scalability shortcomings

  • The keyspace can be partitioned such that nodes will maintain the values

for keys that hash to similar indices (e.g., within a certain hamming distance). This method is not suitable for all applications, as it involves bandwidth-intensive migration of data stored in the DHT.

Tapestry:

Tapestry is an overlay network which makes use of a DHT to provide routing for distributed applications. Similar to IP routing, not all nodes need to be directly connected to each other: they can query a subset of neighbours for information about which nodes are responsible for certain parts of the keyspace. Routing is performed in such a way that nodes are aware of their distance to the object being queried. Hence objects can be located with low latency without the need to migrate actual object data between nodes. It would be interesting to see if Tapestry could be adapted for anonymity similar to ant-colony routing schemes.

    • Tapestry
  • Distributed .
  • Simple key-value store.
  • using DHT ( distributed hash table).
  • look up table contains : key and value
  • DNS as tree but Tapestry as hercically structure.
    • More dtails about Tapestry:
    • how the information flow?
  • each nod has neighbour table which that contains the node neighbour number.
    • Tapestry API:
  • have four operations called
  • each node has ID and each endpoint has GUID (Globally unique identifier).
    • Tapestry look like operating system.