DistOS 2015W Session 10: Difference between revisions
Mohamedahmed (talk | contribs) Created page with "Group 2 - Mohamed Ahmed, Apoorv Sangal, Ambalica Sharma == Comet ==" |
No edit summary |
||
(19 intermediate revisions by 6 users not shown) | |||
Line 1: | Line 1: | ||
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia's article on Distributed Hash Tables] | |||
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia's article on Kademlia] | |||
* [http://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf Maymounkov and Mazieres, "Kademlia: A Peer-to-peer information system based on the XOR Metric" (2002)] | |||
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia's article on Tapestry] | |||
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, "Tapestry: A Resilient Global-Scale Overlay for Service Deployment" (JSAC 2003)] | |||
* [https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Geambasu.pdf Geambasu et al., "Comet: An active distributed key-value store" (OSDI 2010)] | |||
== Comet == | Session 10 is about Distributed Hash Tables. How they work, various algorithmic options (keyspace partitioning being a major example) and some of the earliest implementations. | ||
(Feel free to tweak the questions!) | |||
==Kademlia== | |||
Members: Kirill, Deep, Jason, Hassan | |||
* '''Why are DHTs relevant to distributed OSs?''' | |||
** Using many system, having repetition | |||
** DHT to distributed content over multiple nodes | |||
** Decentralized therefore peer-to-peer | |||
* '''How is content divided?''' | |||
** File hashes | |||
** Node ID to locate the value | |||
** 160 bit key space, binary tree for partition and searching down the tree | |||
* '''How is the network traversed?''' | |||
** Highest prefix and increase the number of digit it used to get close to the target nodes | |||
* '''What trust assumptions does the system make?''' | |||
** DHT by itself is insecure | |||
** The academic and practitioner communities have realized that all current DHT designs suffer from a security weakness, known as the Sybil attack | |||
** K-buckets | |||
*** Binary tree with each node having k-buckets as leaf | |||
*** Any given k-node is very unlikely to fail within an hour of each other | |||
*** New nodes are only inserted when there is room in the bucket or the oldest node doesn't respond | |||
** Uses UDP therefore packets are often lost | |||
* '''Performance constraints?''' | |||
** Binary tree traversing, therefore traversing is maximum O(log n) | |||
* '''What non-DHT internet infrastructure would you replace with a DHT? How suitable is Kademlia for this purpose?''' | |||
** DNS | |||
** Any kind of meta-data service | |||
==Comet== | |||
Members: Mohamed Ahmed, Apoorv Sangal, Ambalica Sharma | |||
* Why are DHTs relevant to distributed OSs? | |||
DHT is an infrastructure than enables many clients to share information, and scale to handle node arrival, departure and failure. DHT's serve many of the design goals of disbtributed operating systems. The paper states that "DHTs are increasingly used to support a variety of distributed applications, such as file-sharing, distributed resource tracking, end-system multicast, publish-subscribe systems, distributed search engines" | |||
* How is content divided? | |||
One of the three main components of the comet system is a routing substrate which | |||
implements the value/node mapping. This allows a client to find the node htat stores | |||
a specific data item. Since Comet uses a DHT implementation, routing occurs by applying | |||
a hash function to the key to compute node ID's that store the associated value. | |||
* How is the network traversed? | |||
* What trust assumptions does the system make? | |||
Assumes clients re untrusted autonomous nodes. | |||
A client node running Comet should be protected from the execution of handlers | |||
e.g. an executing handler cannot corrupt the node or use unlimited resources. | |||
Handlers should not be able to mount messaging attacks on other nodes. | |||
Users downloading Comet must trust it and have guarantees about its behavior. For this reason, Comet enforces four important restrictions: | |||
1. Limited knowledge: an ASO is not aware of other objects | |||
or resources stored on the same node and has no | |||
direct way to learn about them. | |||
2. Limited access: an object handler can manipulate only its own value and cannot modify the values of other objects on its storage node. | |||
3. Limited communication: an active storage object cannot | |||
send arbitrary messages over the network. | |||
4. Limited resource consumption: an ASO’s resource usage is strictly bounded, e.g., the system limits the amount of computation and memory it can consume. | |||
* Performance constraints? | |||
* What non-DHT internet infrastructure would you replace with a DHT? How suitable is Comet for this purpose? | |||
==Tapestry== | |||
Members: Ashley, Dany, Alexis, Khaled | |||
* Why are DHTs relevant to distributed OSs? | |||
Because they provide a way to distribute information over large networks (distributed key/value store). | |||
* How is content divided? | |||
Uses consistent hashing (SHA-1), upon node creation (join) creates optimal routing table. | |||
* How is the network traversed? | |||
You look at your neighbours, you see which neighbour is closest to your destination, and recurse. | |||
* What trust assumptions does the system make? | |||
It assumes the system is entirely trustworthy from adversaries. While network failures may happen and nodes may go down, no node will explicitly try to mess with the network. | |||
* Performance constraints? | |||
O(log n) access times to any given node. Best effort publishing/unpublishing via decentralized object location routing. | |||
* What non-DHT internet infrastructure would you replace with a DHT? How suitable is Tapestry for this purpose? | |||
== DHT Discussion == | |||
* DHT is weak against mutation | |||
* performance - high variance | |||
** hop vs direct connection | |||
*DHT as DNS server | |||
**DHT have no ownership/authority | |||
*DHT as web hosting server | |||
** ok for static content but not good for dynamic content which might have private data | |||
== Other Resources == | |||
*[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1610546&tag=1 A survey and comparison of peer-to-peer overlay network schemes] | |||
*[http://link.springer.com/article/10.1007/s12083-012-0157-3 Collaborative Applications over Peer-to-Peer Systems – Challenges and Solutions] |
Latest revision as of 14:13, 20 April 2015
- Wikipedia's article on Distributed Hash Tables
- Wikipedia's article on Kademlia
- Maymounkov and Mazieres, "Kademlia: A Peer-to-peer information system based on the XOR Metric" (2002)
- Wikipedia's article on Tapestry
- Zhao et al, "Tapestry: A Resilient Global-Scale Overlay for Service Deployment" (JSAC 2003)
- Geambasu et al., "Comet: An active distributed key-value store" (OSDI 2010)
Session 10 is about Distributed Hash Tables. How they work, various algorithmic options (keyspace partitioning being a major example) and some of the earliest implementations.
(Feel free to tweak the questions!)
Kademlia
Members: Kirill, Deep, Jason, Hassan
- Why are DHTs relevant to distributed OSs?
- Using many system, having repetition
- DHT to distributed content over multiple nodes
- Decentralized therefore peer-to-peer
- How is content divided?
- File hashes
- Node ID to locate the value
- 160 bit key space, binary tree for partition and searching down the tree
- How is the network traversed?
- Highest prefix and increase the number of digit it used to get close to the target nodes
- What trust assumptions does the system make?
- DHT by itself is insecure
- The academic and practitioner communities have realized that all current DHT designs suffer from a security weakness, known as the Sybil attack
- K-buckets
- Binary tree with each node having k-buckets as leaf
- Any given k-node is very unlikely to fail within an hour of each other
- New nodes are only inserted when there is room in the bucket or the oldest node doesn't respond
- Uses UDP therefore packets are often lost
- Performance constraints?
- Binary tree traversing, therefore traversing is maximum O(log n)
- What non-DHT internet infrastructure would you replace with a DHT? How suitable is Kademlia for this purpose?
- DNS
- Any kind of meta-data service
Comet
Members: Mohamed Ahmed, Apoorv Sangal, Ambalica Sharma
- Why are DHTs relevant to distributed OSs?
DHT is an infrastructure than enables many clients to share information, and scale to handle node arrival, departure and failure. DHT's serve many of the design goals of disbtributed operating systems. The paper states that "DHTs are increasingly used to support a variety of distributed applications, such as file-sharing, distributed resource tracking, end-system multicast, publish-subscribe systems, distributed search engines"
- How is content divided?
One of the three main components of the comet system is a routing substrate which implements the value/node mapping. This allows a client to find the node htat stores a specific data item. Since Comet uses a DHT implementation, routing occurs by applying a hash function to the key to compute node ID's that store the associated value.
- How is the network traversed?
- What trust assumptions does the system make?
Assumes clients re untrusted autonomous nodes.
A client node running Comet should be protected from the execution of handlers e.g. an executing handler cannot corrupt the node or use unlimited resources. Handlers should not be able to mount messaging attacks on other nodes.
Users downloading Comet must trust it and have guarantees about its behavior. For this reason, Comet enforces four important restrictions: 1. Limited knowledge: an ASO is not aware of other objects or resources stored on the same node and has no direct way to learn about them. 2. Limited access: an object handler can manipulate only its own value and cannot modify the values of other objects on its storage node. 3. Limited communication: an active storage object cannot send arbitrary messages over the network. 4. Limited resource consumption: an ASO’s resource usage is strictly bounded, e.g., the system limits the amount of computation and memory it can consume.
- Performance constraints?
- What non-DHT internet infrastructure would you replace with a DHT? How suitable is Comet for this purpose?
Tapestry
Members: Ashley, Dany, Alexis, Khaled
- Why are DHTs relevant to distributed OSs?
Because they provide a way to distribute information over large networks (distributed key/value store).
- How is content divided?
Uses consistent hashing (SHA-1), upon node creation (join) creates optimal routing table.
- How is the network traversed?
You look at your neighbours, you see which neighbour is closest to your destination, and recurse.
- What trust assumptions does the system make?
It assumes the system is entirely trustworthy from adversaries. While network failures may happen and nodes may go down, no node will explicitly try to mess with the network.
- Performance constraints?
O(log n) access times to any given node. Best effort publishing/unpublishing via decentralized object location routing.
- What non-DHT internet infrastructure would you replace with a DHT? How suitable is Tapestry for this purpose?
DHT Discussion
- DHT is weak against mutation
- performance - high variance
- hop vs direct connection
- DHT as DNS server
- DHT have no ownership/authority
- DHT as web hosting server
- ok for static content but not good for dynamic content which might have private data