DistOS 2014W Lecture 20

Cassandra

Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure. BigTable uses GFS' built-in replication and Chubby for locking. Cassandra uses gossip algorithms (similar to Dynamo): Scuttlebutt.

A brief look at Open Source

Initialy Anil talked about Google versus Facebook approach to technologies.

Google has been good at publishing papers on their distributed system structure.
Google developed its technology internally and used it for competitive advantage.
People try to implement Google’s methods in the open source arena.
Google had proprietary technology but Facebook wanted open design.
Facebook developed its technology in open source manner. They needed to create an open source community to keep up.
Facebook managed to be very good with contributing to open source projects that they were a part of, and stay alive in the open source projects that they start. Google works internally and pitches their code "over the fence."
He talked little bit about licences. With GPL3 you have to provide code with binary. In AGPL additional service also be given with source code.
GFS optimized for data-analyst cluster.

While discussing Hbase versus Cassandra discussed why two projects with same notion are supported? Apache as a community. For any tool in computer science, particularly software tools, its important to have more than one good implementation. Only time it doesn't happen is because of market interference. An example of this was Microsoft Word, which took out most of the competition. Different implementations are commonly due to programmers disagreeing on concepts and designs.

Hadoop is a set of technologies that represent the open source equivalent of Google's infrastructure

Cassandra -> ???
HBase -> BigTable
HDFS -> GFS
Zookeeper -> Chubby

Back to Cassandra

Cassandra is basically you take a key value store system like Dynamo and then you extend to look like BigTable.
Not just a key value store. It is a multi dimensional map. You can look up different columns, etc. The data is more structured than a Key-Value store.
In a key value store, you can only look up the key. Cassandra is much richer than this.
A fundamental difference in Cassandra is that adding columns is trivial.

Bigtable vs. Cassandra:

Bigtable and Cassandra exposes similar APIs.
Cassandra seems to be lighter weight.
Bigtable depends on GFS. Cassandra depends on server's file system. Anil feels cassandra cluster is easy to setup.
Bigtable is designed for stream oriented batch processing . Cassandra is for handling online/realtime/highspeed stuff.

Schema design is explained in inbox example. It does not give clarity about how table will look like. Anil thinks they store lot data with messages which makes table crappy.

Apache Zookeeper is used for distributed configuration. It will also bootstrap and configure a new node. It is similar to Chubby. Zookeeper is for node level information. The Gossip protocol is more about key partitioning information and distributing that information amongst nodes.

Cassandra uses a modified version of the Accrual Failure Detector. The idea of an Accrual Failure Detection is that failure detection module emits a value which represents a suspicion level for each of monitored nodes. The idea is to express the value of phi� on a scale that is dynamically adjusted to react network and load conditions at the monitored nodes.

Files are written to disk in an sequential way and are never mutated. This way, reading a file does not require locks. Garbage collection takes care of deletion.

Cassandra writes in an immutable way like functional programming. There is no assignment in functional programming. It tries to eliminate side effects. Data is just binded you associate a name with a value.

Mutation is difficult to parallel due to the coordination of the current state. To escape abstraction: create code, give/add it to the operating system, and change the semantics of the operating system to better suit the application. In general, extensible systems are bad.

Cassandra -

Uses consistent hashing (like most DHTs)
Lighter weight
All most of the readings are part of Apache
More designed for low latency online updates and more optimized for reads.
Once they write to disk they only read back
Scalable multi master database with no single point of failure
Reason for not giving out the complete detail on the table schema
Probably not just inbox search
All data in one row of a table
Its not a key-value store. Big blob of data.
Gossip based protocol - Scuttlebutt. Every node is aware of overy other.
Fixed circular ring
Consistency issue not addressed at all. Does writes in an immutable way. Never change them.

Older style network protocol - token rings What sort of computational systems avoid changing data? Systems talking about implementing functional like semantics.

Comet

The major idea behind Comet is triggers/callbacks. There is an extensive literature in extensible operating systems, basically adding code to the operating system to better suit my application. "Generally, extensible systems suck." -User:Soma This was popular before operating systems were open source.

The presentation video of Comet

Comet seeks to greatly expand the application space for key-value storage systems through application-specific customization.Comet storage object is a <key,value> pair.Each Comet node stores a collection of active storage objects (ASOs) that consist of a key, a value, and a set of handlers. Comet handlers run as a result of timers or storage operations, such as get or put, allowing an ASO to take dynamic, application-specific actions to customize its behaviour. Handlers are written in a simple sandboxed extension language, providing properties of safety and isolation.ASO can modify its environment, monitor its execution,and make dynamic decisions about its state.

Researchers try to provide the ability to extend a DHT without requiring a substantial investment of effort to modify its implementation.They try to implement to isolation and safety using restricting system access,restricting resource consumption and restricting within-Comet communication.

Provids callbacks (aka. Database triggers)
Provides DHT platform that is extensible at the application level
Uses Lua
Provided extensibility in an untrusted environment. Dynamo, by contrast, was extensible but only in a trusted environment.
Why do we care? We don't really. Why would you want this extensibility? You wouldn't. It isn't worth the cost. Current systems currently have an allowance for tuneability. This extensibility only has use under undesirable circumstances.

Other

if someone wants to understand the consistent hashing in detail, here is a blog which explains it really well, this blog has other great posts in the field of distributed system as well -

http://loveforprogramming.quora.com/Distributed-Systems-Part-1-A-peek-into-consistent-hashing *