DistOS 2015W Session 11

From Soma-notes
Jump to navigation Jump to search

Session 11's theme is about Distributed Hash Tables as they are used and implemented by large commercial companies, for what purposes, and with what specializations or design constraints. Two papers focus on Google, while Facebook and Amazon get one each.


  • Google System used for storing data of various Google Products, for instance Google Analytics, Google Finance, Orkut, Personalized Search, Writely, Google Earth and many more
  • Big table is
    • Sparse
    • Persistant
    • Muti dimensional Sorted Map
  • It is indexed by
    • Row Key: Every read or write of data under single row key is atomic. Each row range is called Tablet. Select Row key to get good locality for data access.
    • Column Key: Grouped into sets called Column Families. Forms basic unit of Access Control.All data stored is of same type.Syntax used: family:qualifier
    • Time Stamp:Each cell consists of multiple versions of same data which are indexed by Timestamps.In order to avoid collisions, Timestamps need to be generated by applications.
  • Big Table API: Provides functions for
    • Creating and Deleting
      • Tables
      • Column Families
    • Changing Cluster
    • Changing Table
    • Column Family metadata like Access Control Rights.
    • Set of wrappers which allow Big Data to be used both as
      • Input source
      • Output Target
  • The timestamp mechanism in BIG table helps clients to access recent versions of data with simple accessing aspects of using row and column.
  • Parallel computation and cluster management system makes BIG table flexible and highly scalable.


  • Amazon's Key Value Store
  • Availability is the buzz word for Dynamo. Dynamo=Availability
  • Shifted Computer Science paradigm from caring about the consistency to availability.
  • Sacrifices consistency under certain failure scenarios.
  • Treats failure handling as normal case without impact on availability and performance.
  • Data is partitioned and replicated using consistent hashing and consistency is facilitated by use of object versioning.
  • This system has certain requirements such as:
    • Query Model: Simple read and write operations to data item that are uniquely identified by a key.
    • ACID properties: Atomicity, Consistency, Isolation, Durability.
    • Efficiency: System needs to function on a commodity hardware infrastructure.
  • Service Level Agreements(SLA): They are a negotiated contract between a client and a service regarding characteristics related to systems. They are used in order to guarantee that in a bounded time period, an application can deliver it's functionality.
  • System Architecture: It consists of System Interface, Partitioning Algorithm, Replication,Data Versioning.
  • Successfully handles
    • Server Failure
    • Data Centre Failure
    • Network Partitions
  • Allows service owners to customize their own storage systems according to their storage systems to meet the desired performance, durability and consistency SLAs.
  • Building block for highly available applications.


  • Facebook's storage system to fulfil needs of the Inbox Search Problem
  • Partitions data across the cluster using consistent hashing.
  • Distributed multi dimensional map indexed by a key
  • In it's data model:
    • Columns grouped together into sets called column families. Column Families further of 2 types:
      • Simple column families
      • Super column families
  • API consists of :
    • Insert
    • Get
    • Delete
  • System Architecture consists of :
    • Partitioning: Takes place using consistent hashing
    • Replication: Each item replicated at n hosts where "n" is the replication factor configured per system.
    • Membership: Cluster membership is based on Scuttle butt which is a highly efficient anti-entropy Gossip based mechanism.The Membership further has sub part such as:
      • Failure Detection
    • Bootstrapping
    • Scaling the cluster
  • It can run cheap commodity hardware and handle high throughput
  • Its multiple usable structure makes it very scalable


  • Google's scalable, multi version, globally distributed database.
  • Has been built on top of the Google's Big table.
  • Provided data consistency and Supports SQL like Interface.
  • Uses a separate high-reliability time service to guarantee the correctness properties around concurrency control.
    • The timestamps are utilized.
  • It shares data across machines and migrates data automatically across machines
  • Data Control Functions in spanner controls latency and performance