DistOS-2011W Distributed Data Structures: a survey

Please note that this page shows only the abstract of the paper. A PDF version of the full document can be found here.

A. M. AbdelRahman
PhD Student
Dept. of Systems and Computer Engineering
Carleton University
Ottawa, ON
Canada K1S 5B6
[mailto:amoham16@connect.carleton.ca amoham16@connect.carleton.ca]

=Abstract=
Access to data stored in main memory is much faster than access to local disks. Well-maintained data structures can exploit this faster access. In addition, the performance of memory fetch, dispatch and store operations in distributed operating systems would improve if they exploited highly scalable distributed data structures. This paper describes the concept of Distributed Data Structures (DDS), which multicomputers have made viable. Scalability aspects and their implementation are reviewed for a chosen subset of classic data structures. Open research issues for Scalable Distributed Data Structures (SDDS) are also discussed.


=Introduction=
Multicomputers are collections of autonomous workstations or PCs on a network (network multicomputers), or of shared-nothing processors with local storage linked through a high-speed network or bus (switched multicomputers)<ref>A. S. Tanenbaum, Distributed Operating Systems. Prentice Hall, 1995.</ref>. Recent advances in multicomputers have created a need for new data structures that scale well with the number of components, so as to make effective use of their aggregate performance<ref>W. Litwin, R. Moussa, and T. Schwarz, “LH*<sub>RS</sub>: a highly-available scalable distributed data structure,” ACM Trans. Database Syst., vol. 30, pp. 769–811, September 2005. [Online]. Available: http://doi.acm.org/10.1145/1093382.1093386</ref>. In addition, low-latency interconnected multicomputers give a data structure significant flexibility in size without incurring space utilization or access time penalties<ref>V. Gupta, M. Modi, and A. D. Pimentel, “Performance evaluation of the LH*LH scalable, distributed data structure for a cluster of workstations,” in Proceedings of the 2001 ACM Symposium on Applied Computing, ser. SAC ’01. New York, NY, USA: ACM, 2001, pp. 544–548. [Online]. Available: http://doi.acm.org/10.1145/372202.372458</ref>.


The rest of this paper is organized as follows. Section 2 presents work done to implement distributed versions of classical data structures, including trees, sets, lists, graphs, queues, binary indexed trees (BITs) and distributed hash tables (DHTs). The described work hides the distinction between local and distributed data structure manipulation, simplifying the programming model at some performance cost. Section 3 focuses on scalability issues in distributed data structure models that are based on linear hashing. Section 4 presents performance parameters from SDDS research outcomes. Finally, Section 5 draws a conclusion summarizing the presented work.


=Distributed Versions of Classic Data Structures=
==Hash Tables==
A distributed hash table can provide greater throughput and data capacity than a locally deployed hash table. An architecture for a distributed hash table was proposed in <ref>S. D. Gribble, E. A. Brewer, J. M. Hellerstein, and D. Culler, “Scalable, distributed data structures for internet service construction,” in Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4, ser. OSDI’00. Berkeley, CA, USA: USENIX Association, 2000, pp. 22–22.</ref>, consisting of the following components (a sketch of the service-facing interface follows the list):
* Client (a web browser, for example): a component that is totally unaware of the underlying DDS infrastructure; that is, no DDS code runs on a client.
* Service: cooperating software processes, each called a <i>service instance</i>.
* Hash table API: the boundary between DDS libraries and their service instances.
* DDS library: a library (in Java) that presents the hash table API to services.
* Brick: the component that manages durable data. It consists of a lock manager, a buffer cache, a persistent chained hash table implementation, and network stubs and skeletons for remote communication. Each CPU in the cluster runs one brick.
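

Since the paper notes that the DDS library is written in Java, a minimal sketch of what the service-facing hash table interface might look like is given below. The interface and exception names and the method signatures are illustrative assumptions, not the actual API of Gribble et al.; the original similarly exposes put/get/remove-style operations on keys and values while hiding all placement details.

<pre>
// Hypothetical sketch of the hash table API a service instance sees.
// All names and signatures here are assumptions for illustration.

// Checked exception for DDS failures (illustrative).
class DDSException extends Exception {
    DDSException(String msg) { super(msg); }
}

public interface DistributedHashTable {
    // Store a value under a key; the DDS library locates the
    // responsible partition and forwards the write to its replicas.
    void put(byte[] key, byte[] value) throws DDSException;

    // Fetch the value stored under a key, or null if absent.
    byte[] get(byte[] key) throws DDSException;

    // Remove a key and its value from the table.
    void remove(byte[] key) throws DDSException;
}
</pre>

Because such an interface hides partition and replica placement entirely, a service instance can treat the distributed table as if it were local, which is the simplification at some performance cost mentioned in the introduction.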


(Image: architecture of the distributed hash table)


The partitioning and replication operations work by horizontally partitioning tables to spread data and operations across bricks. Each brick stores a number of partitions of each table in the system, and when new nodes are added to the cluster, this partitioning is altered so that data spreads onto the new node. To avoid portions of the hash table becoming unavailable after node failures, each partition is replicated on more than one cluster node. Two problems arise: finding the partition that manages a specific key, and determining the list of replicas in a partition's replica group. To solve these, the DDS libraries consult two metadata maps, a data partitioning map <i>DP</i> and a replica group membership map <i>RG</i>, that are replicated on each node of the cluster. A pair of these metadata maps exists for each hash table in the cluster. The <i>DP</i> map controls the horizontal partitioning of data across the bricks, whereas the <i>RG</i> map returns the list of bricks that are currently serving as replicas. All components of the distributed hash table are built in an asynchronous, event-driven programming style, and each layer in the hash table is designed so that only a single thread executes at a time. In this manner, distributed hash tables succeed in simplifying the construction of services that exploit the data consistency and scalability of the hash table <ref>S. D. Gribble, E. A. Brewer, J. M. Hellerstein, and D. Culler, “Scalable, distributed data structures for internet service construction,” in Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4, ser. OSDI’00. Berkeley, CA, USA: USENIX Association, 2000, pp. 22–22.</ref>.
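
As a rough sketch of this two-map lookup, the fragment below shows how a DDS library might consult a <i>DP</i> map to find the partition that manages a key and an <i>RG</i> map to enumerate the bricks replicating it. The hash-based partitioning scheme, the types, and the method names are assumptions for illustration; the paper specifies the roles of the two maps but not this exact representation.

<pre>
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Illustrative two-map lookup; names and structures are assumed.
class DDSLookup {
    // Stand-in for the DP map: here, a fixed number of partitions
    // selected by hashing the key.
    private final int numPartitions;
    // Stand-in for the RG map: partition id -> bricks currently
    // serving as replicas of that partition.
    private final Map<Integer, List<String>> replicaGroups;

    DDSLookup(int numPartitions, Map<Integer, List<String>> replicaGroups) {
        this.numPartitions = numPartitions;
        this.replicaGroups = replicaGroups;
    }

    // DP lookup: map a key to the partition that manages it.
    int partitionFor(byte[] key) {
        return Math.floorMod(Arrays.hashCode(key), numPartitions);
    }

    // RG lookup: the bricks holding replicas of the key's partition.
    // A write must reach every brick returned; a read may be served
    // by any single one, which keeps the partition available when
    // individual nodes fail.
    List<String> replicasFor(byte[] key) {
        return replicaGroups.get(partitionFor(key));
    }
}
</pre>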
 
=References=
<references/>
