<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Apoorv</id>
	<title>Soma-notes - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Apoorv"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php/Special:Contributions/Apoorv"/>
	<updated>2026-04-05T20:02:34Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20110</id>
		<title>DistOS 2015W Session 9</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20110"/>
		<updated>2015-04-04T12:08:12Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Naiad */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== BOINC ==&lt;br /&gt;
&lt;br /&gt;
*Public Resource Computing Platform&lt;br /&gt;
*Gives scientists the ability to use large amounts of computational resources.&lt;br /&gt;
*Clients do not connect directly to each other; instead they talk to a central server located at Berkeley&lt;br /&gt;
*The goals of BOINC are: &lt;br /&gt;
:*1) Reduce the barriers to entry&lt;br /&gt;
:*2) Share resources among autonomous projects&lt;br /&gt;
:*3) Support diverse applications&lt;br /&gt;
:*4) Reward participants.&lt;br /&gt;
&lt;br /&gt;
*Applications written in common languages can run on it with no modification&lt;br /&gt;
 A BOINC project is identified by a single master URL, which serves as its home page as well as the directory of its servers.&lt;br /&gt;
&lt;br /&gt;
== SETI@Home ==&lt;br /&gt;
&lt;br /&gt;
*Uses public resource computing to analyze radio signals to find extraterrestrial intelligence&lt;br /&gt;
*Needed a good-quality telescope to search for radio signals, plus lots of computational power, which was unavailable locally&lt;br /&gt;
*It has not yet found extraterrestrial intelligence, but it has established the credibility of public-resource computing projects powered by resources donated by the public&lt;br /&gt;
*Uses BOINC as a backbone for the project&lt;br /&gt;
*Uses a relational database to store information at large scale; further, it uses a multi-threaded server to distribute work to clients&lt;br /&gt;
*The quality of data in this architecture is untrustworthy; the main incentive to use it, however, is that it is a cheap and easy way of scaling the work&lt;br /&gt;
*Provided social incentives to encourage users to join the system.&lt;br /&gt;
*This computation model still exists, though largely outside the legitimate world.&lt;br /&gt;
*Formed a good foundation for public-resource and distributed computing by providing a platform-independent framework&lt;br /&gt;
&lt;br /&gt;
== MapReduce ==&lt;br /&gt;
&lt;br /&gt;
*A programming model introduced by Google for large-scale parallel computation&lt;br /&gt;
*Uses the &amp;lt;code&amp;gt;Map()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Reduce()&amp;lt;/code&amp;gt; functions from functional style programming languages&lt;br /&gt;
:*Map (Filtering)&lt;br /&gt;
::*Takes a function and applies it to all elements of the given data set&lt;br /&gt;
:*Reduce (Summary)&lt;br /&gt;
::*Accumulates results from the data set using a given function&lt;br /&gt;
*Hides parallelization, fault tolerance, locality optimization and load balancing&lt;br /&gt;
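The two primitives can be sketched in a few lines of Python. This is a hypothetical word-count example, not the implementation from the Google paper; the shuffle step that groups intermediate pairs by key is shown explicitly:

```python
from functools import reduce

# Map (filtering/transformation): apply a function to every element.
words = ["cat", "dog", "cat", "bird", "cat"]
pairs = list(map(lambda w: (w, 1), words))  # [("cat", 1), ("dog", 1), ...]

# Shuffle: group the intermediate pairs by key.
groups = {}
for key, value in pairs:
    groups.setdefault(key, []).append(value)

# Reduce (summary): accumulate each group with a given function.
counts = {key: reduce(lambda a, b: a + b, values)
          for key, values in groups.items()}
print(counts)  # {'cat': 3, 'dog': 1, 'bird': 1}
```

The framework itself would run many map and reduce workers in parallel; this sketch only shows the data flow.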
&lt;br /&gt;
== Naiad ==&lt;br /&gt;
&lt;br /&gt;
*A programming model similar to &amp;lt;code&amp;gt;MapReduce&amp;lt;/code&amp;gt; but with streaming capabilities, so results are available almost instantaneously&lt;br /&gt;
*A distributed system for executing data-parallel cyclic dataflow programs, offering high throughput and low latency&lt;br /&gt;
*Aims to provide a general-purpose system that fulfills these requirements while also supporting a wide variety of high-level programming models.&lt;br /&gt;
*Widely used for data-parallel execution&lt;br /&gt;
*Provides checkpoint and restore functionality&lt;br /&gt;
*Real-world applications:&lt;br /&gt;
:*Batch iterative machine learning: &lt;br /&gt;
VW (Vowpal Wabbit), an open-source distributed machine-learning system, performs each iteration in three phases: each process updates its local state; the processes train independently on their local data; and the processes jointly compute a global average via All-Reduce.&lt;br /&gt;
:*Streaming acyclic computation&lt;br /&gt;
A workload comparable to [http://research.microsoft.com/apps/pubs/default.aspx?id=163832 Kineograph] (also from Microsoft), which processes Twitter feeds and counts occurrences of hashtags as well as links between popular tags, was written in Naiad in 26 lines of code and ran close to 2x faster.&lt;br /&gt;
*The Naiad paper won the best-paper award at SOSP 2013; see the Microsoft Research project page http://research.microsoft.com/en-us/projects/naiad/ , which includes videos explaining Naiad, including Derek Murray&#039;s presentation at SOSP 2013.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20109</id>
		<title>DistOS 2015W Session 9</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20109"/>
		<updated>2015-04-04T12:07:51Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Naiad */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== BOINC ==&lt;br /&gt;
&lt;br /&gt;
*Public Resource Computing Platform&lt;br /&gt;
*Gives scientists the ability to use large amounts of computational resources.&lt;br /&gt;
*Clients do not connect directly to each other; instead they talk to a central server located at Berkeley&lt;br /&gt;
*The goals of BOINC are: &lt;br /&gt;
:*1) Reduce the barriers to entry&lt;br /&gt;
:*2) Share resources among autonomous projects&lt;br /&gt;
:*3) Support diverse applications&lt;br /&gt;
:*4) Reward participants.&lt;br /&gt;
&lt;br /&gt;
*Applications written in common languages can run on it with no modification&lt;br /&gt;
 A BOINC project is identified by a single master URL, which serves as its home page as well as the directory of its servers.&lt;br /&gt;
&lt;br /&gt;
== SETI@Home ==&lt;br /&gt;
&lt;br /&gt;
*Uses public resource computing to analyze radio signals to find extraterrestrial intelligence&lt;br /&gt;
*Needed a good-quality telescope to search for radio signals, plus lots of computational power, which was unavailable locally&lt;br /&gt;
*It has not yet found extraterrestrial intelligence, but it has established the credibility of public-resource computing projects powered by resources donated by the public&lt;br /&gt;
*Uses BOINC as a backbone for the project&lt;br /&gt;
*Uses a relational database to store information at large scale; further, it uses a multi-threaded server to distribute work to clients&lt;br /&gt;
*The quality of data in this architecture is untrustworthy; the main incentive to use it, however, is that it is a cheap and easy way of scaling the work&lt;br /&gt;
*Provided social incentives to encourage users to join the system.&lt;br /&gt;
*This computation model still exists, though largely outside the legitimate world.&lt;br /&gt;
*Formed a good foundation for public-resource and distributed computing by providing a platform-independent framework&lt;br /&gt;
&lt;br /&gt;
== MapReduce ==&lt;br /&gt;
&lt;br /&gt;
*A programming model introduced by Google for large-scale parallel computation&lt;br /&gt;
*Uses the &amp;lt;code&amp;gt;Map()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Reduce()&amp;lt;/code&amp;gt; functions from functional style programming languages&lt;br /&gt;
:*Map (Filtering)&lt;br /&gt;
::*Takes a function and applies it to all elements of the given data set&lt;br /&gt;
:*Reduce (Summary)&lt;br /&gt;
::*Accumulates results from the data set using a given function&lt;br /&gt;
*Hides parallelization, fault tolerance, locality optimization and load balancing&lt;br /&gt;
&lt;br /&gt;
== Naiad ==&lt;br /&gt;
&lt;br /&gt;
*A programming model similar to &amp;lt;code&amp;gt;MapReduce&amp;lt;/code&amp;gt; but with streaming capabilities, so results are available almost instantaneously&lt;br /&gt;
*A distributed system for executing data-parallel cyclic dataflow programs, offering high throughput and low latency&lt;br /&gt;
*Aims to provide a general-purpose system that fulfills these requirements while also supporting a wide variety of high-level programming models.&lt;br /&gt;
*Widely used for data-parallel execution&lt;br /&gt;
*Real-world applications:&lt;br /&gt;
:*Batch iterative machine learning: &lt;br /&gt;
VW (Vowpal Wabbit), an open-source distributed machine-learning system, performs each iteration in three phases: each process updates its local state; the processes train independently on their local data; and the processes jointly compute a global average via All-Reduce.&lt;br /&gt;
:*Streaming acyclic computation&lt;br /&gt;
A workload comparable to [http://research.microsoft.com/apps/pubs/default.aspx?id=163832 Kineograph] (also from Microsoft), which processes Twitter feeds and counts occurrences of hashtags as well as links between popular tags, was written in Naiad in 26 lines of code and ran close to 2x faster.&lt;br /&gt;
*The Naiad paper won the best-paper award at SOSP 2013; see the Microsoft Research project page http://research.microsoft.com/en-us/projects/naiad/ , which includes videos explaining Naiad, including Derek Murray&#039;s presentation at SOSP 2013.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20108</id>
		<title>DistOS 2015W Session 9</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20108"/>
		<updated>2015-04-04T12:06:29Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* MapReduce */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== BOINC ==&lt;br /&gt;
&lt;br /&gt;
*Public Resource Computing Platform&lt;br /&gt;
*Gives scientists the ability to use large amounts of computational resources.&lt;br /&gt;
*Clients do not connect directly to each other; instead they talk to a central server located at Berkeley&lt;br /&gt;
*The goals of BOINC are: &lt;br /&gt;
:*1) Reduce the barriers to entry&lt;br /&gt;
:*2) Share resources among autonomous projects&lt;br /&gt;
:*3) Support diverse applications&lt;br /&gt;
:*4) Reward participants.&lt;br /&gt;
&lt;br /&gt;
*Applications written in common languages can run on it with no modification&lt;br /&gt;
 A BOINC project is identified by a single master URL, which serves as its home page as well as the directory of its servers.&lt;br /&gt;
&lt;br /&gt;
== SETI@Home ==&lt;br /&gt;
&lt;br /&gt;
*Uses public resource computing to analyze radio signals to find extraterrestrial intelligence&lt;br /&gt;
*Needed a good-quality telescope to search for radio signals, plus lots of computational power, which was unavailable locally&lt;br /&gt;
*It has not yet found extraterrestrial intelligence, but it has established the credibility of public-resource computing projects powered by resources donated by the public&lt;br /&gt;
*Uses BOINC as a backbone for the project&lt;br /&gt;
*Uses a relational database to store information at large scale; further, it uses a multi-threaded server to distribute work to clients&lt;br /&gt;
*The quality of data in this architecture is untrustworthy; the main incentive to use it, however, is that it is a cheap and easy way of scaling the work&lt;br /&gt;
*Provided social incentives to encourage users to join the system.&lt;br /&gt;
*This computation model still exists, though largely outside the legitimate world.&lt;br /&gt;
*Formed a good foundation for public-resource and distributed computing by providing a platform-independent framework&lt;br /&gt;
&lt;br /&gt;
== MapReduce ==&lt;br /&gt;
&lt;br /&gt;
*A programming model introduced by Google for large-scale parallel computation&lt;br /&gt;
*Uses the &amp;lt;code&amp;gt;Map()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Reduce()&amp;lt;/code&amp;gt; functions from functional style programming languages&lt;br /&gt;
:*Map (Filtering)&lt;br /&gt;
::*Takes a function and applies it to all elements of the given data set&lt;br /&gt;
:*Reduce (Summary)&lt;br /&gt;
::*Accumulates results from the data set using a given function&lt;br /&gt;
*Hides parallelization, fault tolerance, locality optimization and load balancing&lt;br /&gt;
&lt;br /&gt;
== Naiad ==&lt;br /&gt;
&lt;br /&gt;
*A programming model similar to &amp;lt;code&amp;gt;MapReduce&amp;lt;/code&amp;gt; but with streaming capabilities, so results are available almost instantaneously&lt;br /&gt;
*A distributed system for executing data-parallel cyclic dataflow programs, offering high throughput and low latency&lt;br /&gt;
*Aims to provide a general-purpose system that fulfills these requirements while also supporting a wide variety of high-level programming models.&lt;br /&gt;
*Real-world applications:&lt;br /&gt;
:*Batch iterative machine learning: &lt;br /&gt;
VW (Vowpal Wabbit), an open-source distributed machine-learning system, performs each iteration in three phases: each process updates its local state; the processes train independently on their local data; and the processes jointly compute a global average via All-Reduce.&lt;br /&gt;
:*Streaming acyclic computation&lt;br /&gt;
A workload comparable to [http://research.microsoft.com/apps/pubs/default.aspx?id=163832 Kineograph] (also from Microsoft), which processes Twitter feeds and counts occurrences of hashtags as well as links between popular tags, was written in Naiad in 26 lines of code and ran close to 2x faster.&lt;br /&gt;
*The Naiad paper won the best-paper award at SOSP 2013; see the Microsoft Research project page http://research.microsoft.com/en-us/projects/naiad/ , which includes videos explaining Naiad, including Derek Murray&#039;s presentation at SOSP 2013.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20107</id>
		<title>DistOS 2015W Session 9</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20107"/>
		<updated>2015-04-04T12:01:38Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* BOINC */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== BOINC ==&lt;br /&gt;
&lt;br /&gt;
*Public Resource Computing Platform&lt;br /&gt;
*Gives scientists the ability to use large amounts of computational resources.&lt;br /&gt;
*Clients do not connect directly to each other; instead they talk to a central server located at Berkeley&lt;br /&gt;
*The goals of BOINC are: &lt;br /&gt;
:*1) Reduce the barriers to entry&lt;br /&gt;
:*2) Share resources among autonomous projects&lt;br /&gt;
:*3) Support diverse applications&lt;br /&gt;
:*4) Reward participants.&lt;br /&gt;
&lt;br /&gt;
*Applications written in common languages can run on it with no modification&lt;br /&gt;
 A BOINC project is identified by a single master URL, which serves as its home page as well as the directory of its servers.&lt;br /&gt;
&lt;br /&gt;
== SETI@Home ==&lt;br /&gt;
&lt;br /&gt;
*Uses public resource computing to analyze radio signals to find extraterrestrial intelligence&lt;br /&gt;
*Needed a good-quality telescope to search for radio signals, plus lots of computational power, which was unavailable locally&lt;br /&gt;
*It has not yet found extraterrestrial intelligence, but it has established the credibility of public-resource computing projects powered by resources donated by the public&lt;br /&gt;
*Uses BOINC as a backbone for the project&lt;br /&gt;
*Uses a relational database to store information at large scale; further, it uses a multi-threaded server to distribute work to clients&lt;br /&gt;
*The quality of data in this architecture is untrustworthy; the main incentive to use it, however, is that it is a cheap and easy way of scaling the work&lt;br /&gt;
*Provided social incentives to encourage users to join the system.&lt;br /&gt;
*This computation model still exists, though largely outside the legitimate world.&lt;br /&gt;
*Formed a good foundation for public-resource and distributed computing by providing a platform-independent framework&lt;br /&gt;
&lt;br /&gt;
== MapReduce ==&lt;br /&gt;
&lt;br /&gt;
*A programming model introduced by Google for large-scale parallel computation&lt;br /&gt;
*Uses the &amp;lt;code&amp;gt;Map()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Reduce()&amp;lt;/code&amp;gt; functions from functional style programming languages&lt;br /&gt;
:*Map (Filtering)&lt;br /&gt;
::*Takes a function and applies it to all elements of the given data set&lt;br /&gt;
:*Reduce (Summary)&lt;br /&gt;
::*Accumulates results from the data set using a given function&lt;br /&gt;
&lt;br /&gt;
== Naiad ==&lt;br /&gt;
&lt;br /&gt;
*A programming model similar to &amp;lt;code&amp;gt;MapReduce&amp;lt;/code&amp;gt; but with streaming capabilities, so results are available almost instantaneously&lt;br /&gt;
*A distributed system for executing data-parallel cyclic dataflow programs, offering high throughput and low latency&lt;br /&gt;
*Aims to provide a general-purpose system that fulfills these requirements while also supporting a wide variety of high-level programming models.&lt;br /&gt;
*Real-world applications:&lt;br /&gt;
:*Batch iterative machine learning: &lt;br /&gt;
VW (Vowpal Wabbit), an open-source distributed machine-learning system, performs each iteration in three phases: each process updates its local state; the processes train independently on their local data; and the processes jointly compute a global average via All-Reduce.&lt;br /&gt;
:*Streaming acyclic computation&lt;br /&gt;
A workload comparable to [http://research.microsoft.com/apps/pubs/default.aspx?id=163832 Kineograph] (also from Microsoft), which processes Twitter feeds and counts occurrences of hashtags as well as links between popular tags, was written in Naiad in 26 lines of code and ran close to 2x faster.&lt;br /&gt;
*The Naiad paper won the best-paper award at SOSP 2013; see the Microsoft Research project page http://research.microsoft.com/en-us/projects/naiad/ , which includes videos explaining Naiad, including Derek Murray&#039;s presentation at SOSP 2013.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20106</id>
		<title>DistOS 2015W Session 9</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_9&amp;diff=20106"/>
		<updated>2015-04-04T11:58:17Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* SETI@Home */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== BOINC ==&lt;br /&gt;
&lt;br /&gt;
*Public Resource Computing Platform&lt;br /&gt;
*Gives scientists the ability to use large amounts of computational resources.&lt;br /&gt;
*Clients do not connect directly to each other; instead they talk to a central server located at Berkeley&lt;br /&gt;
*The goals of BOINC are: &lt;br /&gt;
:*1) Reduce the barriers to entry&lt;br /&gt;
:*2) Share resources among autonomous projects&lt;br /&gt;
:*3) Support diverse applications&lt;br /&gt;
:*4) Reward participants.&lt;br /&gt;
 A BOINC project is identified by a single master URL, which serves as its home page as well as the directory of its servers.&lt;br /&gt;
&lt;br /&gt;
== SETI@Home ==&lt;br /&gt;
&lt;br /&gt;
*Uses public resource computing to analyze radio signals to find extraterrestrial intelligence&lt;br /&gt;
*Needed a good-quality telescope to search for radio signals, plus lots of computational power, which was unavailable locally&lt;br /&gt;
*It has not yet found extraterrestrial intelligence, but it has established the credibility of public-resource computing projects powered by resources donated by the public&lt;br /&gt;
*Uses BOINC as a backbone for the project&lt;br /&gt;
*Uses a relational database to store information at large scale; further, it uses a multi-threaded server to distribute work to clients&lt;br /&gt;
*The quality of data in this architecture is untrustworthy; the main incentive to use it, however, is that it is a cheap and easy way of scaling the work&lt;br /&gt;
*Provided social incentives to encourage users to join the system.&lt;br /&gt;
*This computation model still exists, though largely outside the legitimate world.&lt;br /&gt;
*Formed a good foundation for public-resource and distributed computing by providing a platform-independent framework&lt;br /&gt;
&lt;br /&gt;
== MapReduce ==&lt;br /&gt;
&lt;br /&gt;
*A programming model introduced by Google for large-scale parallel computation&lt;br /&gt;
*Uses the &amp;lt;code&amp;gt;Map()&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;Reduce()&amp;lt;/code&amp;gt; functions from functional style programming languages&lt;br /&gt;
:*Map (Filtering)&lt;br /&gt;
::*Takes a function and applies it to all elements of the given data set&lt;br /&gt;
:*Reduce (Summary)&lt;br /&gt;
::*Accumulates results from the data set using a given function&lt;br /&gt;
&lt;br /&gt;
== Naiad ==&lt;br /&gt;
&lt;br /&gt;
*A programming model similar to &amp;lt;code&amp;gt;MapReduce&amp;lt;/code&amp;gt; but with streaming capabilities, so results are available almost instantaneously&lt;br /&gt;
*A distributed system for executing data-parallel cyclic dataflow programs, offering high throughput and low latency&lt;br /&gt;
*Aims to provide a general-purpose system that fulfills these requirements while also supporting a wide variety of high-level programming models.&lt;br /&gt;
*Real-world applications:&lt;br /&gt;
:*Batch iterative machine learning: &lt;br /&gt;
VW (Vowpal Wabbit), an open-source distributed machine-learning system, performs each iteration in three phases: each process updates its local state; the processes train independently on their local data; and the processes jointly compute a global average via All-Reduce.&lt;br /&gt;
:*Streaming acyclic computation&lt;br /&gt;
A workload comparable to [http://research.microsoft.com/apps/pubs/default.aspx?id=163832 Kineograph] (also from Microsoft), which processes Twitter feeds and counts occurrences of hashtags as well as links between popular tags, was written in Naiad in 26 lines of code and ran close to 2x faster.&lt;br /&gt;
*The Naiad paper won the best-paper award at SOSP 2013; see the Microsoft Research project page http://research.microsoft.com/en-us/projects/naiad/ , which includes videos explaining Naiad, including Derek Murray&#039;s presentation at SOSP 2013.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20105</id>
		<title>DistOS 2015W Session 11</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20105"/>
		<updated>2015-04-04T11:38:25Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Spanner */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==BigTable==&lt;br /&gt;
* Google&#039;s system for storing data for various Google products, for instance Google Analytics, Google Finance, Orkut, Personalized Search, Writely, Google Earth and many more&lt;br /&gt;
* BigTable is a &lt;br /&gt;
** Sparse&lt;br /&gt;
** Persistent&lt;br /&gt;
** Multi-dimensional sorted map&lt;br /&gt;
*It is indexed by&lt;br /&gt;
** Row key: Every read or write of data under a single row key is atomic. Each row range is called a tablet. Row keys should be chosen to give good locality for data access.&lt;br /&gt;
** Column key: Grouped into sets called column families, which form the basic unit of access control. All data stored in a column family is of the same type. Syntax used: &#039;&#039;family:qualifier&#039;&#039;&lt;br /&gt;
** Timestamp: Each cell holds multiple versions of the same data, indexed by timestamp. To avoid collisions, timestamps need to be generated by applications.&lt;br /&gt;
* BigTable &#039;&#039;&#039;API&#039;&#039;&#039;: Provides functions for&lt;br /&gt;
** Creating and Deleting&lt;br /&gt;
*** Tables&lt;br /&gt;
*** Column Families&lt;br /&gt;
**Changing cluster, table and column-family metadata, such as access-control rights&lt;br /&gt;
** A set of wrappers which allow BigTable to be used both as&lt;br /&gt;
*** Input source&lt;br /&gt;
***Output Target&lt;br /&gt;
*The timestamp mechanism in BigTable lets clients access recent versions of data while still addressing it simply by row and column.&lt;br /&gt;
*Support for parallel computation and cluster management makes BigTable flexible and highly scalable.&lt;br /&gt;
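The data model above can be illustrated with a toy dictionary keyed by (row, column, timestamp). This is a hypothetical sketch of the model only, not the real BigTable API; the row key and values are the illustrative ones from the paper discussion:

```python
# Sparse map from (row key, column key, timestamp) to an uninterpreted value.
table = {}

def put(row, column, timestamp, value):
    # Each cell keeps multiple timestamped versions of the same data.
    table.setdefault((row, column), {})[timestamp] = value

def get_latest(row, column):
    # Clients typically read the most recent version of a cell.
    versions = table.get((row, column), {})
    return versions[max(versions)] if versions else None

put("com.cnn.www", "contents:", 1, "old html")
put("com.cnn.www", "contents:", 2, "new html")
print(get_latest("com.cnn.www", "contents:"))  # new html
```

A real tablet server also sorts rows lexicographically by key, which is what gives range scans their locality.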
&lt;br /&gt;
== Dynamo==&lt;br /&gt;
* Amazon&#039;s Key Value Store&lt;br /&gt;
*Availability is the watchword for Dynamo: Dynamo = availability&lt;br /&gt;
*Shifted the design emphasis from caring about consistency to caring about availability.&lt;br /&gt;
*Sacrifices consistency under certain failure scenarios.&lt;br /&gt;
*Treats failure handling as the normal case, without impact on availability and performance.&lt;br /&gt;
*Data is partitioned and replicated using consistent hashing, and consistency is facilitated by object versioning.&lt;br /&gt;
* This system has certain requirements, such as: &lt;br /&gt;
** Query Model: Simple read and write operations on data items that are uniquely identified by a key.&lt;br /&gt;
**ACID properties: Atomicity, Consistency, Isolation, Durability.&lt;br /&gt;
**Efficiency: System needs to function on a commodity hardware infrastructure.&lt;br /&gt;
*  Service Level Agreements (SLAs): A negotiated contract between a client and a service regarding system characteristics, used to guarantee that an application can deliver its functionality within a bounded time period.&lt;br /&gt;
* System Architecture: Consists of a &#039;&#039;System Interface&#039;&#039;, &#039;&#039;Partitioning Algorithm&#039;&#039;, &#039;&#039;Replication&#039;&#039;, and &#039;&#039;Data Versioning&#039;&#039;.&lt;br /&gt;
* Successfully handles&lt;br /&gt;
** Server Failure&lt;br /&gt;
** Data Centre Failure&lt;br /&gt;
** Network Partitions&lt;br /&gt;
* Allows service owners to customize their storage systems to meet the desired performance, durability and consistency SLAs.&lt;br /&gt;
* Building block for highly available applications.&lt;br /&gt;
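The consistent-hashing scheme Dynamo uses for partitioning and replication can be sketched as follows. Node names, the hash function, and the replica count here are illustrative, not from the paper: each node hashes to a position on a ring, and an item is stored on the first few distinct nodes found walking clockwise from its key position.

```python
import bisect
import hashlib

def ring_position(key):
    # Place a key on the hash ring (0 .. 2 ** 32 - 1).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    def __init__(self, nodes):
        self.positions = sorted(ring_position(n) for n in nodes)
        self.node_at = {ring_position(n): n for n in nodes}

    def preference_list(self, key, replicas=3):
        # Walk clockwise from the key position; the first `replicas`
        # nodes encountered store copies of the item.
        start = bisect.bisect(self.positions, ring_position(key))
        chosen = []
        for i in range(len(self.positions)):
            pos = self.positions[(start + i) % len(self.positions)]
            chosen.append(self.node_at[pos])
            if len(chosen) == replicas:
                break
        return chosen

ring = HashRing(["node-a", "node-b", "node-c", "node-d", "node-e"])
print(ring.preference_list("user:42"))
```

Adding or removing a node only moves the keys adjacent to its ring position, which is why Dynamo can rebalance incrementally.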
&lt;br /&gt;
==Cassandra==&lt;br /&gt;
* Facebook&#039;s storage system, built to fulfill the needs of the Inbox Search problem&lt;br /&gt;
*Partitions data across the cluster using consistent hashing.&lt;br /&gt;
*Distributed multi-dimensional map indexed by a key&lt;br /&gt;
* In its data model:&lt;br /&gt;
** Columns are grouped into sets called column families, which are further of two types:&lt;br /&gt;
***Simple column families&lt;br /&gt;
***Super column families&lt;br /&gt;
* API consists of :&lt;br /&gt;
** Insert&lt;br /&gt;
**Get&lt;br /&gt;
** Delete&lt;br /&gt;
* System Architecture consists of :&lt;br /&gt;
** Partitioning: Takes place using consistent hashing&lt;br /&gt;
**Replication: Each item is replicated at n hosts, where &amp;quot;n&amp;quot; is the replication factor configured per instance. &lt;br /&gt;
** Membership: Cluster membership is based on Scuttlebutt, a highly efficient anti-entropy gossip-based mechanism. Membership further has subparts such as:&lt;br /&gt;
***Failure Detection&lt;br /&gt;
**Bootstrapping&lt;br /&gt;
** Scaling the cluster&lt;br /&gt;
*It can run on cheap commodity hardware and handle high throughput &lt;br /&gt;
*Its flexible structure makes it highly scalable&lt;br /&gt;
&lt;br /&gt;
==Spanner==&lt;br /&gt;
* Google&#039;s scalable, multi-version, globally distributed database.&lt;br /&gt;
* Built on top of Google&#039;s BigTable.&lt;br /&gt;
*Provides data consistency and supports an SQL-like interface.&lt;br /&gt;
* Uses TrueTime to guarantee correctness properties around concurrency control.&lt;br /&gt;
** Timestamps derived from TrueTime are used to order transactions.&lt;br /&gt;
*It shards data across machines and migrates data automatically between machines&lt;br /&gt;
*Data control functions in Spanner control latency and performance&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20104</id>
		<title>DistOS 2015W Session 11</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20104"/>
		<updated>2015-04-04T11:34:05Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Spanner */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==BigTable==&lt;br /&gt;
* Google&#039;s system for storing data for various Google products, for instance Google Analytics, Google Finance, Orkut, Personalized Search, Writely, Google Earth and many more&lt;br /&gt;
* BigTable is a &lt;br /&gt;
** Sparse&lt;br /&gt;
** Persistent&lt;br /&gt;
** Multi-dimensional sorted map&lt;br /&gt;
*It is indexed by&lt;br /&gt;
** Row key: Every read or write of data under a single row key is atomic. Each row range is called a tablet. Row keys should be chosen to give good locality for data access.&lt;br /&gt;
** Column key: Grouped into sets called column families, which form the basic unit of access control. All data stored in a column family is of the same type. Syntax used: &#039;&#039;family:qualifier&#039;&#039;&lt;br /&gt;
** Timestamp: Each cell holds multiple versions of the same data, indexed by timestamp. To avoid collisions, timestamps need to be generated by applications.&lt;br /&gt;
* BigTable &#039;&#039;&#039;API&#039;&#039;&#039;: Provides functions for&lt;br /&gt;
** Creating and Deleting&lt;br /&gt;
*** Tables&lt;br /&gt;
*** Column Families&lt;br /&gt;
**Changing cluster, table and column-family metadata, such as access-control rights&lt;br /&gt;
** A set of wrappers which allow BigTable to be used both as&lt;br /&gt;
*** Input source&lt;br /&gt;
***Output Target&lt;br /&gt;
*The timestamp mechanism in BigTable lets clients access recent versions of data while still addressing it simply by row and column.&lt;br /&gt;
*Support for parallel computation and cluster management makes BigTable flexible and highly scalable.&lt;br /&gt;
&lt;br /&gt;
== Dynamo==&lt;br /&gt;
* Amazon&#039;s Key Value Store&lt;br /&gt;
*Availability is the watchword for Dynamo: Dynamo = availability&lt;br /&gt;
*Shifted the design emphasis from caring about consistency to caring about availability.&lt;br /&gt;
*Sacrifices consistency under certain failure scenarios.&lt;br /&gt;
*Treats failure handling as the normal case, without impact on availability and performance.&lt;br /&gt;
*Data is partitioned and replicated using consistent hashing, and consistency is facilitated by object versioning.&lt;br /&gt;
* This system has certain requirements, such as: &lt;br /&gt;
** Query Model: Simple read and write operations on data items that are uniquely identified by a key.&lt;br /&gt;
**ACID properties: Atomicity, Consistency, Isolation, Durability.&lt;br /&gt;
**Efficiency: System needs to function on a commodity hardware infrastructure.&lt;br /&gt;
*  Service Level Agreements(SLA): They are a negotiated contract between a client and a service regarding characteristics related to systems. They are used in order to guarantee that in a bounded time period, an application can deliver it&#039;s functionality.&lt;br /&gt;
* System Architecture: It consists of &#039;&#039;System Interface&#039;&#039;, &#039;&#039;Partitioning Algorithm&#039;&#039;, &#039;&#039;Replication&#039;&#039;,&#039;&#039;Data Versioning&#039;&#039;.&lt;br /&gt;
* Successfully handles&lt;br /&gt;
** Server Failure&lt;br /&gt;
** Data Centre Failure&lt;br /&gt;
** Network Partitions&lt;br /&gt;
* Allows service owners to customize their storage systems to meet the desired performance, durability, and consistency SLAs.&lt;br /&gt;
* Building block for highly available applications.&lt;br /&gt;
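Dynamo&#039;s object versioning mentioned above is implemented with vector clocks; a minimal sketch of the idea (illustrative only, not Amazon&#039;s code):

```python
def vc_merge(a, b):
    """Merge two vector clocks (dict: node -> counter), max per node."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

def vc_descends(a, b):
    """True if clock a is a descendant of (or equal to) clock b."""
    return all(a.get(n, 0) >= c for n, c in b.items())

# Two writes coordinated by different nodes diverge:
v1 = {"nodeA": 2, "nodeB": 1}
v2 = {"nodeA": 1, "nodeB": 2}
conflict = not vc_descends(v1, v2) and not vc_descends(v2, v1)
print(conflict)  # True: neither version subsumes the other, so the
                 # client must reconcile; the merged clock records both
merged = vc_merge(v1, v2)
```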
&lt;br /&gt;
==Cassandra==&lt;br /&gt;
* Facebook&#039;s storage system, built to meet the needs of the Inbox Search problem&lt;br /&gt;
*Partitions data across the cluster using consistent hashing.&lt;br /&gt;
*Distributed multi dimensional map indexed by a key&lt;br /&gt;
* In its data model:&lt;br /&gt;
** Columns are grouped into sets called column families. Column families are of two types:&lt;br /&gt;
***Simple column families&lt;br /&gt;
***Super column families&lt;br /&gt;
* API consists of :&lt;br /&gt;
** Insert&lt;br /&gt;
**Get&lt;br /&gt;
** Delete&lt;br /&gt;
* System Architecture consists of :&lt;br /&gt;
** Partitioning: Takes place using consistent hashing&lt;br /&gt;
** Replication: Each item is replicated at n hosts, where &amp;quot;n&amp;quot; is the replication factor configured per instance.&lt;br /&gt;
** Membership: Cluster membership is based on Scuttlebutt, a highly efficient anti-entropy gossip-based mechanism. Membership further includes:&lt;br /&gt;
***Failure Detection&lt;br /&gt;
**Bootstrapping&lt;br /&gt;
** Scaling the cluster&lt;br /&gt;
* It runs on cheap commodity hardware and handles high write throughput.&lt;br /&gt;
* Its modular structure makes it very scalable.&lt;br /&gt;
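Both Dynamo and Cassandra place data using consistent hashing; a minimal hash-ring sketch with replication at the next n distinct nodes (illustrative only; node names and the hash choice are arbitrary):

```python
import bisect
import hashlib

def h(key):
    """Hash a string key to a point on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, replication=3):
        self.replication = replication
        self.ring = sorted((h(n), n) for n in nodes)

    def preference_list(self, key):
        """Walk clockwise from the key's position; the first node is
        the coordinator, the rest hold replicas."""
        start = bisect.bisect(self.ring, (h(key), ""))
        nodes = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in nodes:
                nodes.append(node)
            if len(nodes) == self.replication:
                break
        return nodes

ring = Ring(["node1", "node2", "node3", "node4"], replication=3)
print(ring.preference_list("user:42"))  # 3 distinct nodes for this key
```

Adding or removing a node only remaps the keys between it and its ring neighbours, which is what makes scaling the cluster incremental.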
&lt;br /&gt;
=Spanner=&lt;br /&gt;
* Google&#039;s scalable, multi version, globally distributed database.&lt;br /&gt;
* Built on top of Google&#039;s BigTable.&lt;br /&gt;
* Provides strong data consistency and supports a SQL-like interface.&lt;br /&gt;
* Uses TrueTime to guarantee correctness properties around concurrency control.&lt;br /&gt;
** TrueTime&#039;s bounded-uncertainty timestamps are used to order transactions globally.&lt;br /&gt;
* It shards data across machines and migrates data automatically between machines&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20103</id>
		<title>DistOS 2015W Session 11</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20103"/>
		<updated>2015-04-04T11:31:20Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Cassandra */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==BigTable==&lt;br /&gt;
* Google System used for storing data of various Google Products, for instance Google Analytics, Google Finance, Orkut, Personalized Search, Writely, Google Earth and many more&lt;br /&gt;
* Big table is &lt;br /&gt;
** Sparse&lt;br /&gt;
** Persistent&lt;br /&gt;
** Multi-dimensional sorted map&lt;br /&gt;
*It is indexed by&lt;br /&gt;
** Row Key: Every read or write of data under a single row key is atomic. Each row range is called a tablet. Row keys should be chosen to give good locality for data access.&lt;br /&gt;
** Column Key: Column keys are grouped into sets called column families, which form the basic unit of access control. All data stored in a column family is usually of the same type. Syntax: &#039;&#039;family:qualifier&#039;&#039;&lt;br /&gt;
** Time Stamp: Each cell can hold multiple versions of the same data, indexed by timestamp. To avoid collisions, timestamps may need to be generated by the application.&lt;br /&gt;
* BigTable &#039;&#039;&#039;API&#039;&#039;&#039;: Provides functions for&lt;br /&gt;
** Creating and deleting&lt;br /&gt;
*** Tables&lt;br /&gt;
*** Column families&lt;br /&gt;
** Changing cluster, table, and column family metadata, such as access control rights&lt;br /&gt;
** A set of wrappers that allow BigTable to be used with MapReduce, both as an&lt;br /&gt;
*** Input source&lt;br /&gt;
*** Output target&lt;br /&gt;
* The timestamp mechanism in BigTable lets clients access recent versions of data through simple row and column addressing.&lt;br /&gt;
* Its parallel computation and cluster management systems make BigTable flexible and highly scalable.&lt;br /&gt;
&lt;br /&gt;
== Dynamo==&lt;br /&gt;
* Amazon&#039;s Key Value Store&lt;br /&gt;
* Availability is the watchword for Dynamo: Dynamo = availability.&lt;br /&gt;
* Shifted the design emphasis from strong consistency to availability.&lt;br /&gt;
*Sacrifices consistency under certain failure scenarios.&lt;br /&gt;
*Treats failure handling as normal case without impact on availability and performance.&lt;br /&gt;
*Data is partitioned and replicated using consistent hashing and consistency is facilitated by use of object versioning.&lt;br /&gt;
* The system has certain requirements:&lt;br /&gt;
** Query Model: Simple read and write operations on data items that are uniquely identified by a key.&lt;br /&gt;
** ACID properties: Dynamo trades the strong consistency of ACID (Atomicity, Consistency, Isolation, Durability) for higher availability.&lt;br /&gt;
** Efficiency: The system needs to function on commodity hardware infrastructure.&lt;br /&gt;
* Service Level Agreements (SLAs): A negotiated contract between a client and a service regarding system characteristics, used to guarantee that an application can deliver its functionality within a bounded time.&lt;br /&gt;
* System Architecture: It consists of &#039;&#039;System Interface&#039;&#039;, &#039;&#039;Partitioning Algorithm&#039;&#039;, &#039;&#039;Replication&#039;&#039;,&#039;&#039;Data Versioning&#039;&#039;.&lt;br /&gt;
* Successfully handles&lt;br /&gt;
** Server Failure&lt;br /&gt;
** Data Centre Failure&lt;br /&gt;
** Network Partitions&lt;br /&gt;
* Allows service owners to customize their storage systems to meet the desired performance, durability, and consistency SLAs.&lt;br /&gt;
* Building block for highly available applications.&lt;br /&gt;
&lt;br /&gt;
==Cassandra==&lt;br /&gt;
* Facebook&#039;s storage system, built to meet the needs of the Inbox Search problem&lt;br /&gt;
*Partitions data across the cluster using consistent hashing.&lt;br /&gt;
*Distributed multi dimensional map indexed by a key&lt;br /&gt;
* In its data model:&lt;br /&gt;
** Columns are grouped into sets called column families. Column families are of two types:&lt;br /&gt;
***Simple column families&lt;br /&gt;
***Super column families&lt;br /&gt;
* API consists of :&lt;br /&gt;
** Insert&lt;br /&gt;
**Get&lt;br /&gt;
** Delete&lt;br /&gt;
* System Architecture consists of :&lt;br /&gt;
** Partitioning: Takes place using consistent hashing&lt;br /&gt;
** Replication: Each item is replicated at n hosts, where &amp;quot;n&amp;quot; is the replication factor configured per instance.&lt;br /&gt;
** Membership: Cluster membership is based on Scuttlebutt, a highly efficient anti-entropy gossip-based mechanism. Membership further includes:&lt;br /&gt;
***Failure Detection&lt;br /&gt;
**Bootstrapping&lt;br /&gt;
** Scaling the cluster&lt;br /&gt;
* It runs on cheap commodity hardware and handles high write throughput.&lt;br /&gt;
* Its modular structure makes it very scalable.&lt;br /&gt;
&lt;br /&gt;
=Spanner=&lt;br /&gt;
* Google&#039;s scalable, multi version, globally distributed database.&lt;br /&gt;
* Built on top of Google&#039;s BigTable.&lt;br /&gt;
* Provides strong data consistency and supports a SQL-like interface.&lt;br /&gt;
* Main focus is managing cross-datacentre replicated data.&lt;br /&gt;
* Uses TrueTime to guarantee correctness properties around concurrency control.&lt;br /&gt;
** TrueTime&#039;s bounded-uncertainty timestamps are used to order transactions globally.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20102</id>
		<title>DistOS 2015W Session 11</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_11&amp;diff=20102"/>
		<updated>2015-04-04T11:27:52Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* BigTable */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==BigTable==&lt;br /&gt;
* Google System used for storing data of various Google Products, for instance Google Analytics, Google Finance, Orkut, Personalized Search, Writely, Google Earth and many more&lt;br /&gt;
* Big table is &lt;br /&gt;
** Sparse&lt;br /&gt;
** Persistent&lt;br /&gt;
** Multi-dimensional sorted map&lt;br /&gt;
*It is indexed by&lt;br /&gt;
** Row Key: Every read or write of data under a single row key is atomic. Each row range is called a tablet. Row keys should be chosen to give good locality for data access.&lt;br /&gt;
** Column Key: Column keys are grouped into sets called column families, which form the basic unit of access control. All data stored in a column family is usually of the same type. Syntax: &#039;&#039;family:qualifier&#039;&#039;&lt;br /&gt;
** Time Stamp: Each cell can hold multiple versions of the same data, indexed by timestamp. To avoid collisions, timestamps may need to be generated by the application.&lt;br /&gt;
* BigTable &#039;&#039;&#039;API&#039;&#039;&#039;: Provides functions for&lt;br /&gt;
** Creating and deleting&lt;br /&gt;
*** Tables&lt;br /&gt;
*** Column families&lt;br /&gt;
** Changing cluster, table, and column family metadata, such as access control rights&lt;br /&gt;
** A set of wrappers that allow BigTable to be used with MapReduce, both as an&lt;br /&gt;
*** Input source&lt;br /&gt;
*** Output target&lt;br /&gt;
* The timestamp mechanism in BigTable lets clients access recent versions of data through simple row and column addressing.&lt;br /&gt;
* Its parallel computation and cluster management systems make BigTable flexible and highly scalable.&lt;br /&gt;
&lt;br /&gt;
== Dynamo==&lt;br /&gt;
* Amazon&#039;s Key Value Store&lt;br /&gt;
* Availability is the watchword for Dynamo: Dynamo = availability.&lt;br /&gt;
* Shifted the design emphasis from strong consistency to availability.&lt;br /&gt;
*Sacrifices consistency under certain failure scenarios.&lt;br /&gt;
*Treats failure handling as normal case without impact on availability and performance.&lt;br /&gt;
*Data is partitioned and replicated using consistent hashing and consistency is facilitated by use of object versioning.&lt;br /&gt;
* The system has certain requirements:&lt;br /&gt;
** Query Model: Simple read and write operations on data items that are uniquely identified by a key.&lt;br /&gt;
** ACID properties: Dynamo trades the strong consistency of ACID (Atomicity, Consistency, Isolation, Durability) for higher availability.&lt;br /&gt;
** Efficiency: The system needs to function on commodity hardware infrastructure.&lt;br /&gt;
* Service Level Agreements (SLAs): A negotiated contract between a client and a service regarding system characteristics, used to guarantee that an application can deliver its functionality within a bounded time.&lt;br /&gt;
* System Architecture: It consists of &#039;&#039;System Interface&#039;&#039;, &#039;&#039;Partitioning Algorithm&#039;&#039;, &#039;&#039;Replication&#039;&#039;,&#039;&#039;Data Versioning&#039;&#039;.&lt;br /&gt;
* Successfully handles&lt;br /&gt;
** Server Failure&lt;br /&gt;
** Data Centre Failure&lt;br /&gt;
** Network Partitions&lt;br /&gt;
* Allows service owners to customize their storage systems to meet the desired performance, durability, and consistency SLAs.&lt;br /&gt;
* Building block for highly available applications.&lt;br /&gt;
&lt;br /&gt;
==Cassandra==&lt;br /&gt;
* Facebook&#039;s storage system, built to meet the needs of the Inbox Search problem&lt;br /&gt;
*Partitions data across the cluster using consistent hashing.&lt;br /&gt;
*Distributed multi dimensional map indexed by a key&lt;br /&gt;
* In its data model:&lt;br /&gt;
** Columns are grouped into sets called column families. Column families are of two types:&lt;br /&gt;
***Simple column families&lt;br /&gt;
***Super column families&lt;br /&gt;
* API consists of :&lt;br /&gt;
** Insert&lt;br /&gt;
**Get&lt;br /&gt;
** Delete&lt;br /&gt;
* System Architecture consists of :&lt;br /&gt;
** Partitioning: Takes place using consistent hashing&lt;br /&gt;
** Replication: Each item is replicated at n hosts, where &amp;quot;n&amp;quot; is the replication factor configured per instance.&lt;br /&gt;
** Membership: Cluster membership is based on Scuttlebutt, a highly efficient anti-entropy gossip-based mechanism. Membership further includes:&lt;br /&gt;
***Failure Detection&lt;br /&gt;
**Bootstrapping&lt;br /&gt;
** Scaling the cluster&lt;br /&gt;
&lt;br /&gt;
=Spanner=&lt;br /&gt;
* Google&#039;s scalable, multi version, globally distributed database.&lt;br /&gt;
* Built on top of Google&#039;s BigTable.&lt;br /&gt;
* Provides strong data consistency and supports a SQL-like interface.&lt;br /&gt;
* Main focus is managing cross-datacentre replicated data.&lt;br /&gt;
* Uses TrueTime to guarantee correctness properties around concurrency control.&lt;br /&gt;
** TrueTime&#039;s bounded-uncertainty timestamps are used to order transactions globally.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20101</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20101"/>
		<updated>2015-04-04T11:21:26Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Sapphire */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was an NFS-based design. NFS did not work well because it required about three disk reads per photo, whereas the goal was one read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency. It uses one disk operation to provide these.&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
** Simple&lt;br /&gt;
* Facebook uses a CDN to serve popular images and uses Haystack to serve the long tail of photo requests efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
* Pitchfork (background failure detection) and bulk sync are used to tolerate faults; this fault tolerance is key to making Haystack feasible and reliable.&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, RAMs from multiple servers would appear as if they are all belonging to one server, allowing better scalability for caching.&lt;br /&gt;
* The client and server maintain consistency using the DSM.&lt;br /&gt;
* Comet works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
* Offloading passes the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and control are returned to the mobile device. In other words, the process is not physically moved; it runs on the server while suspended on the mobile device.&lt;br /&gt;
&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm BLOB storage system.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools rapidly after creation.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1, using Reed-Solomon coding and XOR coding respectively, while still providing fault tolerance.&lt;br /&gt;
* Reed-Solomon coding uses a (10,4) scheme: 10 data and 4 parity blocks per stripe, which can tolerate losing up to 4 blocks (e.g., 4 rack failures) at an expansion factor of 1.4. Two copies of this give an effective replication factor of 2 * 1.4 = 2.8.&lt;br /&gt;
* XOR coding uses a (2,1) scheme across three data centers at an expansion factor of 1.5, giving an effective replication factor of 1.5 * 1.4 = 2.1.&lt;br /&gt;
* The caching mechanism reduces load on the storage system and makes BLOB storage scalable.&lt;br /&gt;
* The split between hot and warm storage keeps the design simple and modular.&lt;br /&gt;
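The effective replication factors above follow from simple arithmetic; a quick check of the (10,4) and (2,1) numbers:

```python
def expansion(data_blocks, parity_blocks):
    """Storage expansion factor of an erasure-coding stripe."""
    return (data_blocks + parity_blocks) / data_blocks

rs = expansion(10, 4)        # Reed-Solomon(10,4) stripe
print(rs)                    # 1.4
print(round(2 * rs, 1))      # two full copies: 2.8 effective replication
xor = expansion(2, 1)        # XOR(2,1) across data centers
print(round(xor * rs, 1))    # 1.5 * 1.4 = 2.1 effective replication
```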
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
* Represents a building block toward a global distributed system. The main critique is that the paper does not present a specific use case that its design is built around.&lt;br /&gt;
* Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most probably break in some large-scale distributed application.&lt;br /&gt;
* Reaching a global distributed system that addresses all distributed OS use cases will be the cumulative work of many groups, building it block by block; the system will then evolve by putting these building blocks together. In other words, a global distributed system will come from a “bottom up not top down approach” [Somayaji, 2015].&lt;br /&gt;
* Separating application logic from deployment logic helps programmers build a flexible system. Sapphire is also object-based and can be integrated with any object-oriented language, which helps make it scalable.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20100</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20100"/>
		<updated>2015-04-04T11:18:31Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* F4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was an NFS-based design. NFS did not work well because it required about three disk reads per photo, whereas the goal was one read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency. It uses one disk operation to provide these.&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
** Simple&lt;br /&gt;
* Facebook uses a CDN to serve popular images and uses Haystack to serve the long tail of photo requests efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
* Pitchfork (background failure detection) and bulk sync are used to tolerate faults; this fault tolerance is key to making Haystack feasible and reliable.&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, RAMs from multiple servers would appear as if they are all belonging to one server, allowing better scalability for caching.&lt;br /&gt;
* The client and server maintain consistency using the DSM.&lt;br /&gt;
* Comet works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
* Offloading passes the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and control are returned to the mobile device. In other words, the process is not physically moved; it runs on the server while suspended on the mobile device.&lt;br /&gt;
&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm BLOB storage system.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools rapidly after creation.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1, using Reed-Solomon coding and XOR coding respectively, while still providing fault tolerance.&lt;br /&gt;
* Reed-Solomon coding uses a (10,4) scheme: 10 data and 4 parity blocks per stripe, which can tolerate losing up to 4 blocks (e.g., 4 rack failures) at an expansion factor of 1.4. Two copies of this give an effective replication factor of 2 * 1.4 = 2.8.&lt;br /&gt;
* XOR coding uses a (2,1) scheme across three data centers at an expansion factor of 1.5, giving an effective replication factor of 1.5 * 1.4 = 2.1.&lt;br /&gt;
* The caching mechanism reduces load on the storage system and makes BLOB storage scalable.&lt;br /&gt;
* The split between hot and warm storage keeps the design simple and modular.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
* Represents a building block toward a global distributed system. The main critique is that the paper does not present a specific use case that its design is built around.&lt;br /&gt;
* Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most probably break in some large-scale distributed application.&lt;br /&gt;
* Reaching a global distributed system that addresses all distributed OS use cases will be the cumulative work of many groups, building it block by block; the system will then evolve by putting these building blocks together. In other words, a global distributed system will come from a “bottom up not top down approach” [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20099</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20099"/>
		<updated>2015-04-04T11:15:59Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Comet */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was an NFS-based design. NFS did not work well because it required about three disk reads per photo, whereas the goal was one read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency. It uses one disk operation to provide these.&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
** Simple&lt;br /&gt;
* Facebook uses a CDN to serve popular images and uses Haystack to serve the long tail of photo requests efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
* Pitchfork (background failure detection) and bulk sync are used to tolerate faults; this fault tolerance is key to making Haystack feasible and reliable.&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, RAMs from multiple servers would appear as if they are all belonging to one server, allowing better scalability for caching.&lt;br /&gt;
* The client and server maintain consistency using the DSM.&lt;br /&gt;
* Comet works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
* Offloading passes the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and control are returned to the mobile device. In other words, the process is not physically moved; it runs on the server while suspended on the mobile device.&lt;br /&gt;
&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm BLOB storage system.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools rapidly after creation.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1, using Reed-Solomon coding and XOR coding respectively, while still providing fault tolerance.&lt;br /&gt;
* Reed-Solomon coding uses a (10,4) scheme: 10 data and 4 parity blocks per stripe, which can tolerate losing up to 4 blocks (e.g., 4 rack failures) at an expansion factor of 1.4. Two copies of this give an effective replication factor of 2 * 1.4 = 2.8.&lt;br /&gt;
* XOR coding uses a (2,1) scheme across three data centers at an expansion factor of 1.5, giving an effective replication factor of 1.5 * 1.4 = 2.1.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
* Represents a building block toward a global distributed system. The main critique is that the paper does not present a specific use case that its design is built around.&lt;br /&gt;
* Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most probably break in some large-scale distributed application.&lt;br /&gt;
* Reaching a global distributed system that addresses all distributed OS use cases will be the cumulative work of many groups, building it block by block; the system will then evolve by putting these building blocks together. In other words, a global distributed system will come from a “bottom up not top down approach” [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20098</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20098"/>
		<updated>2015-04-04T11:13:17Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Haystack */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was an NFS-based design. NFS did not work well because it required about three disk reads per photo, whereas the goal was one read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency. It uses one disk operation to provide these.&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
** Simple&lt;br /&gt;
* Facebook uses a CDN to serve popular images and uses Haystack to serve the long tail of photo requests efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
*Pitchfork and bulk sync were used to tolerate faults. the fault tolerance works in a very profound way to make haystack feasible and reliable&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, RAMs from multiple servers would appear as if they are all belonging to one server, allowing better scalability for caching.&lt;br /&gt;
* Comet works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
* Offloading passes the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and control are returned to the mobile device. In other words, the process is not physically moved; it runs on the server while suspended on the mobile device.&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm BLOB storage system.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools rapidly after creation.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1, using Reed-Solomon coding and XOR coding respectively, while still providing fault tolerance.&lt;br /&gt;
* Reed-Solomon coding uses a (10,4) scheme: 10 data and 4 parity blocks per stripe, which can tolerate losing up to 4 blocks (e.g., 4 rack failures) at an expansion factor of 1.4. Two copies of this give an effective replication factor of 2 * 1.4 = 2.8.&lt;br /&gt;
* XOR coding uses a (2,1) scheme across three data centers at an expansion factor of 1.5, giving an effective replication factor of 1.5 * 1.4 = 2.1.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
* Represents a building block toward a global distributed system. The main critique is that the paper does not present a specific use case that its design is built around.&lt;br /&gt;
* Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most probably break in some large-scale distributed application.&lt;br /&gt;
* Reaching a global distributed system that addresses all distributed OS use cases will be the cumulative work of many groups, building it block by block; the system will then evolve by putting these building blocks together. In other words, a global distributed system will come from a “bottom up not top down approach” [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20095</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20095"/>
		<updated>2015-04-03T04:18:08Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* F4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was an NFS-based design. NFS did not work well because it required about three disk reads per photo, whereas the goal was one read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
** Simple&lt;br /&gt;
* Facebook uses a CDN to serve popular images and uses Haystack to serve the long tail of photo requests efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, the RAM of multiple servers appears to belong to a single server, allowing better scalability for caching.&lt;br /&gt;
*The Comet model works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
*Offloading works by passing the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and the handle are returned to the mobile device. In other words, the process is not physically moved to the server; it runs on the server while remaining stopped on the mobile device.&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm Blob Storage System.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools very rapidly.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1 using Reed-Solomon coding and XOR coding respectively, while still providing the same fault tolerance.&lt;br /&gt;
*Reed-Solomon coding uses (10,4), which means 10 data and 4 parity blocks in a stripe; it can thus tolerate losing up to 4 blocks, which means it can tolerate 4 rack failures, at a 1.4 expansion factor. Two copies of this give a 2 * 1.4 = 2.8 effective replication factor.&lt;br /&gt;
*XOR coding uses (2,1) across three data centers at a 1.5 expansion factor, which gives a 1.5 * 1.4 = 2.1 effective replication factor.&lt;br /&gt;
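The replication-factor arithmetic above can be sketched in a few lines of Python. This is an illustrative calculation, not code from the f4 paper; the helper name is made up here.

```python
# Sketch of the effective-replication-factor arithmetic described above.
# An (n, k) erasure-coded stripe stores n data + k parity blocks, so it
# uses (n + k) / n bytes of raw storage per logical byte.

def expansion_factor(data_blocks, parity_blocks):
    """Raw storage used per logical byte for one coded stripe."""
    return (data_blocks + parity_blocks) / data_blocks

# Reed-Solomon (10, 4): 10 data + 4 parity blocks per stripe.
rs = expansion_factor(10, 4)            # 14/10 = 1.4
two_copies = 2 * rs                     # two full copies: 2.8

# XOR (2, 1) across three data centers, applied on top of RS volumes:
xor_scheme = expansion_factor(2, 1) * rs  # 1.5 * 1.4, approx 2.1

print(rs, two_copies, round(xor_scheme, 1))
```

These match the 2.8 and 2.1 effective replication factors quoted in the notes, versus 3.6 for triple replication.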
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
*Represents a building block towards a global distributed system. The main critique is that the paper does not present a specific use case upon which the design is built.&lt;br /&gt;
*Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most likely break in some large-scale distributed application.&lt;br /&gt;
*Reaching a global distributed system that addresses all the distributed OS use cases will be the cumulative work of many large organizations, built block by block; the system will evolve by putting these different building blocks together. In other words, a global distributed system will come from a “bottom up, not top down” approach [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20094</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20094"/>
		<updated>2015-04-03T04:15:27Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* F4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was based on an NFS design. NFS did not work well because it required 3 disk reads for every photo, while the goal was 1 read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
**Simple&lt;br /&gt;
*Facebook uses a CDN to serve popular images and uses Haystack to serve photo requests in the long tail efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, the RAM of multiple servers appears to belong to a single server, allowing better scalability for caching.&lt;br /&gt;
*The Comet model works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
*Offloading works by passing the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and the handle are returned to the mobile device. In other words, the process is not physically moved to the server; it runs on the server while remaining stopped on the mobile device.&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm Blob Storage System.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools very rapidly.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1 using Reed-Solomon coding and XOR coding respectively, while still providing the same fault tolerance.&lt;br /&gt;
*Reed-Solomon coding uses (10,4), which means 10 data and 4 parity blocks in a stripe; it can thus tolerate losing up to 4 blocks before losing the stripe, at a 1.4 expansion factor. Two copies of this give a 2 * 1.4 = 2.8 effective replication factor.&lt;br /&gt;
*XOR coding uses (2,1) across three data centers at a 1.5 expansion factor, which gives a 1.5 * 1.4 = 2.1 effective replication factor.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
*Represents a building block towards a global distributed system. The main critique is that the paper does not present a specific use case upon which the design is built.&lt;br /&gt;
*Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most likely break in some large-scale distributed application.&lt;br /&gt;
*Reaching a global distributed system that addresses all the distributed OS use cases will be the cumulative work of many large organizations, built block by block; the system will evolve by putting these different building blocks together. In other words, a global distributed system will come from a “bottom up, not top down” approach [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20093</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20093"/>
		<updated>2015-04-03T04:14:45Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* F4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was based on an NFS design. NFS did not work well because it required 3 disk reads for every photo, while the goal was 1 read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
**Simple&lt;br /&gt;
*Facebook uses a CDN to serve popular images and uses Haystack to serve photo requests in the long tail efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, the RAM of multiple servers appears to belong to a single server, allowing better scalability for caching.&lt;br /&gt;
*The Comet model works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
*Offloading works by passing the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and the handle are returned to the mobile device. In other words, the process is not physically moved to the server; it runs on the server while remaining stopped on the mobile device.&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm Blob Storage System.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools very rapidly.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1 using Reed-Solomon coding and XOR coding respectively, while still providing the same fault tolerance.&lt;br /&gt;
*Reed-Solomon coding uses (10,4), which means 10 data and 4 parity blocks in a stripe; it can thus tolerate losing up to 4 blocks before losing the entire stripe, at a 1.4 expansion factor. Two copies of this give a 2 * 1.4 = 2.8 effective replication factor.&lt;br /&gt;
*XOR coding uses (2,1) across three data centers at a 1.5 expansion factor, which gives a 1.5 * 1.4 = 2.1 effective replication factor.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
*Represents a building block towards a global distributed system. The main critique is that the paper does not present a specific use case upon which the design is built.&lt;br /&gt;
*Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most likely break in some large-scale distributed application.&lt;br /&gt;
*Reaching a global distributed system that addresses all the distributed OS use cases will be the cumulative work of many large organizations, built block by block; the system will evolve by putting these different building blocks together. In other words, a global distributed system will come from a “bottom up, not top down” approach [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20092</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20092"/>
		<updated>2015-04-03T04:04:11Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* F4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was based on an NFS design. NFS did not work well because it required 3 disk reads for every photo, while the goal was 1 read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
**Simple&lt;br /&gt;
*Facebook uses a CDN to serve popular images and uses Haystack to serve photo requests in the long tail efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, the RAM of multiple servers appears to belong to a single server, allowing better scalability for caching.&lt;br /&gt;
*The Comet model works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
*Offloading works by passing the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and the handle are returned to the mobile device. In other words, the process is not physically moved to the server; it runs on the server while remaining stopped on the mobile device.&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm Blob Storage System.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools very rapidly.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1 using Reed-Solomon coding and XOR coding respectively, while still providing the same fault tolerance.&lt;br /&gt;
*Reed-Solomon coding uses (10,4), which means 10 data and 4 parity blocks in a stripe; it can thus tolerate losing up to 4 blocks before losing the entire stripe, at a 1.4 expansion factor. Two copies of this would give a 2.8 effective replication factor.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
*Represents a building block towards a global distributed system. The main critique is that the paper does not present a specific use case upon which the design is built.&lt;br /&gt;
*Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most likely break in some large-scale distributed application.&lt;br /&gt;
*Reaching a global distributed system that addresses all the distributed OS use cases will be the cumulative work of many large organizations, built block by block; the system will evolve by putting these different building blocks together. In other words, a global distributed system will come from a “bottom up, not top down” approach [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20091</id>
		<title>DistOS 2015W Session 12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_12&amp;diff=20091"/>
		<updated>2015-04-03T03:57:14Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* F4 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=Haystack=&lt;br /&gt;
* Facebook&#039;s Photo Application Storage System. &lt;br /&gt;
* Facebook&#039;s previous photo storage was based on an NFS design. NFS did not work well because it required 3 disk reads for every photo, while the goal was 1 read per photo.&lt;br /&gt;
*Main goals of Haystack:&lt;br /&gt;
** High throughput with low latency&lt;br /&gt;
**Fault tolerance&lt;br /&gt;
**Cost effective&lt;br /&gt;
**Simple&lt;br /&gt;
*Facebook uses a CDN to serve popular images and uses Haystack to serve photo requests in the long tail efficiently.&lt;br /&gt;
*Haystack reduces the memory used for &#039;&#039;filesystem metadata&#039;&#039; &lt;br /&gt;
*It has 2 types of metadata:&lt;br /&gt;
**&#039;&#039;Application metadata&#039;&#039;&lt;br /&gt;
**&#039;&#039;File System metadata&#039;&#039;&lt;br /&gt;
* The architecture consists of 3 components:&lt;br /&gt;
**Haystack Store&lt;br /&gt;
**Haystack Directory&lt;br /&gt;
**Haystack Cache&lt;br /&gt;
&lt;br /&gt;
=Comet=&lt;br /&gt;
*Introduced the concept of distributed shared memory (DSM). In a DSM, the RAM of multiple servers appears to belong to a single server, allowing better scalability for caching.&lt;br /&gt;
*The Comet model works by offloading a computation-intensive process from the mobile device to a single server.&lt;br /&gt;
*Offloading works by passing the computation-intensive process to the server while holding it on the mobile device. Once the process completes on the server, the results and the handle are returned to the mobile device. In other words, the process is not physically moved to the server; it runs on the server while remaining stopped on the mobile device.&lt;br /&gt;
=F4=&lt;br /&gt;
* Warm Blob Storage System.&lt;br /&gt;
** A warm BLOB is immutable data whose access rate cools very rapidly.&lt;br /&gt;
** F4 reduces the effective replication factor from 3.6 to 2.8 or 2.1 using Reed-Solomon coding and XOR coding respectively, while still providing the same fault tolerance.&lt;br /&gt;
*Reed-Solomon coding uses (10,4) and lays blocks out on different racks so that it can survive failures within a single datacenter, at a 1.4 expansion factor. Two copies of this would give a 2.8 effective replication factor.&lt;br /&gt;
&lt;br /&gt;
=Sapphire=&lt;br /&gt;
*Represents a building block towards a global distributed system. The main critique is that the paper does not present a specific use case upon which the design is built.&lt;br /&gt;
*Sapphire does not show its scalability boundaries. No distributed system model can be “one size fits all”; it will most likely break in some large-scale distributed application.&lt;br /&gt;
*Reaching a global distributed system that addresses all the distributed OS use cases will be the cumulative work of many large organizations, built block by block; the system will evolve by putting these different building blocks together. In other words, a global distributed system will come from a “bottom up, not top down” approach [Somayaji, 2015].&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_7&amp;diff=19890</id>
		<title>DistOS 2015W Session 7</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2015W_Session_7&amp;diff=19890"/>
		<updated>2015-02-24T05:07:04Z</updated>

		<summary type="html">&lt;p&gt;Apoorv: /* Ceph */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Ceph =&lt;br /&gt;
* Key advantage is that it is a general purpose distributed file system.  &lt;br /&gt;
* System is composed of three units:&lt;br /&gt;
	*Client&lt;br /&gt;
	*Cluster of Object Storage Devices (OSDs): stores data and metadata; clients communicate directly with OSDs to perform I/O operations.&lt;br /&gt;
	*Metadata Server (MDS): manages files and directories. Clients interact with it to perform metadata operations such as open and rename. It also manages client capabilities.&lt;br /&gt;
* The system has three key features:&lt;br /&gt;
     * Decoupled data and metadata: metadata operations are handled by the MDS, while clients perform file I/O directly against the OSDs.&lt;br /&gt;
     * Dynamic distributed metadata management: metadata is distributed among multiple metadata servers using dynamic subtree partitioning to increase performance and avoid metadata access hot spots.&lt;br /&gt;
     * Object-based storage: a cluster of OSDs forms a Reliable Autonomic Distributed Object Store (RADOS) that handles failure detection and recovery.&lt;br /&gt;
&lt;br /&gt;
*CRUSH (Controlled Replication Under Scalable Hashing) is the hashing algorithm used to calculate the location of objects instead of looking them up. The CRUSH paper on Ceph’s website can be downloaded from here http://ceph.com/papers/weil-crush-sc06.pdf.&lt;br /&gt;
* RADOS (Reliable Autonomic Distributed Object-Store) is the object store for Ceph.&lt;br /&gt;
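The key idea behind CRUSH is that any client can compute an object's placement deterministically instead of consulting a lookup table. The toy sketch below illustrates that idea with a plain hash over a known OSD list; the names are invented for illustration, and the real CRUSH algorithm uses a weighted hierarchical cluster map and is far more sophisticated.

```python
# Toy sketch: deterministic, lookup-free object placement.
# NOT the real CRUSH algorithm; OSDS and place() are illustrative only.
import hashlib

OSDS = ["osd0", "osd1", "osd2", "osd3"]   # hypothetical cluster map
REPLICAS = 2

def place(object_name, osds=OSDS, replicas=REPLICAS):
    """Pick `replicas` distinct OSDs for an object, purely by hashing."""
    digest = hashlib.sha256(object_name.encode()).digest()
    start = digest[0] % len(osds)
    return [osds[(start + i) % len(osds)] for i in range(replicas)]

# Every client with the same cluster map computes the same placement,
# with no central directory lookup:
print(place("photo_123.jpg"))
```

This captures why Ceph avoids a metadata bottleneck for data access: placement is a function of the object name and the cluster map, so reads and writes go straight to the right OSDs.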
&lt;br /&gt;
= Chubby =&lt;br /&gt;
* Uses a consensus algorithm among a set of servers to agree on which server is the master in charge of the metadata.&lt;br /&gt;
* Can be considered a distributed file system for small files only (256 KB) with very low scalability (5 servers).&lt;br /&gt;
* Is defined in the paper as “A lock service used within a loosely-coupled distributed system consisting of moderately large number of small machines connected by a high speed network”.&lt;/div&gt;</summary>
		<author><name>Apoorv</name></author>
	</entry>
</feed>