<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ronak</id>
	<title>Soma-notes - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ronak"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php/Special:Contributions/Ronak"/>
	<updated>2026-04-24T16:01:44Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_19&amp;diff=19096</id>
		<title>DistOS 2014W Lecture 19</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_19&amp;diff=19096"/>
		<updated>2014-04-26T01:26:34Z</updated>

		<summary type="html">&lt;p&gt;Ronak: /* Bigtable */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Dynamo ==&lt;br /&gt;
&lt;br /&gt;
* Key-value store.&lt;br /&gt;
* Query model: key-value only&lt;br /&gt;
* Highly available, always writable.&lt;br /&gt;
* Guarantees Service Level Agreements (SLAs).&lt;br /&gt;
* 0-hop DHT: each node has a direct link to the destination and a complete view of the system locally, so there is no dynamic routing.&lt;br /&gt;
* Dynamo sacrifices consistency under certain failure scenarios.&lt;br /&gt;
* Consistent hashing to partition key-space: the output range of a hash function is treated as a fixed circular space or “ring”.&lt;br /&gt;
* Key-space is linear and the nodes partition it.&lt;br /&gt;
* “Virtual nodes”: each server can be responsible for more than one virtual node.&lt;br /&gt;
* Each data item is replicated at N hosts.&lt;br /&gt;
* “preference list”: The list of nodes that is responsible for storing a particular key.&lt;br /&gt;
* Sacrifice strong consistency for availability.&lt;br /&gt;
** Eventual consistency.&lt;br /&gt;
* Decentralized, P2P, limited administration.&lt;br /&gt;
* Works at the scale of about a hundred servers, not thousands.&lt;br /&gt;
* Application/client specific conflict resolution.&lt;br /&gt;
* Designed to be flexible&lt;br /&gt;
** &amp;quot;Tuneable consistency&amp;quot;&lt;br /&gt;
** Pluggable local persistence: BDB (Berkeley DB), MySQL.&lt;br /&gt;
&lt;br /&gt;
Amazon&#039;s motivating use case is that at no point, in a customer&#039;s shopping cart, should any newly added item be dropped. Dynamo should be highly available and always writeable.&lt;br /&gt;
&lt;br /&gt;
Amazon has a service-oriented architecture. A response to a client is a composite of many services, so SLAs were a HUGE consideration when designing Dynamo. Amazon needed low latency and high availability to ensure a good user experience when aggregating all the services together.&lt;br /&gt;
&lt;br /&gt;
Traditional RDBMSs emphasise ACID compliance. Amazon found that ACID compliance led to systems with far less availability; it is hard to have both consistency and availability at the same time. See [http://en.wikipedia.org/wiki/CAP_theorem CAP Theorem]. Dynamo can, and usually does, sacrifice consistency for availability. They use the terms &amp;quot;eventual consistency&amp;quot; and &amp;quot;tunable consistency&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The key range is partitioned according to a consistent hashing algorithm, which treats the output range of the hash function as a fixed circular space or “ring”. Any time a new node joins, it takes a token which decides its position on the ring. Every node becomes the owner of the key range between itself and the previous node on the ring, so a node joining or leaving only affects its neighbour nodes. Dynamo also has the notion of virtual nodes, where one machine can host more than one node on the ring; this allows the load to be adjusted to each machine&#039;s capability.&lt;br /&gt;
&lt;br /&gt;
Dynamo uses replication to provide availability: each key-value pair is stored at N nodes, the coordinator plus its N-1 successors on the ring (N can be configured by the application that uses Dynamo).&lt;br /&gt;
&lt;br /&gt;
Each node has a complete view of the network: a node knows the key range that every node supports. Any time a node joins, gossip-based protocols are used to inform every node about the key-range changes. This allows Dynamo to be a 0-hop network. 0-hop means it is logically a 0-hop network; IP routing is still required to actually, physically get to the node. This 0-hop approach is different from typical distributed hash tables, where routing and hops are used to find the node responsible for a key (eg. Tapestry). Dynamo can do this because the system is deployed on trusted, fully known networks.&lt;br /&gt;
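&lt;br /&gt;
As a rough sketch of the ring, virtual nodes, and preference list described above (hypothetical names; MD5 stands in for whatever hash function a real deployment would pick, and we assume at least N physical nodes):&lt;br /&gt;

```python
import bisect
import hashlib

def ring_pos(key):
    # position on the ring: an integer derived from a hash of the key
    # (MD5 here is only illustrative, not Dynamo's actual choice)
    return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

class Ring:
    def __init__(self, nodes, vnodes=4):
        # each physical node takes several tokens ("virtual nodes"),
        # so load can be tuned to each machine's capability
        self.tokens = sorted(
            (ring_pos("%s-vn%d" % (node, i)), node)
            for node in nodes for i in range(vnodes))

    def preference_list(self, key, n=3):
        # walk clockwise from the position of the key and collect the
        # first n distinct physical nodes; the first is the coordinator
        positions = [pos for pos, _ in self.tokens]
        i = bisect.bisect_right(positions, ring_pos(key))
        chosen = []
        while len(chosen) != n:
            node = self.tokens[i % len(self.tokens)][1]
            if node not in chosen:
                chosen.append(node)
            i += 1
        return chosen
```

A node joining or leaving only moves the tokens adjacent to its own, which is exactly the property the ring buys us.&lt;br /&gt;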
&lt;br /&gt;
Dynamo is deployed on trusted networks (ie. for Amazon&#039;s internal applications), so it doesn&#039;t have to worry about making the system secure. Compare this to OceanStore.&lt;br /&gt;
&lt;br /&gt;
When compared to BigTable, Dynamo typically scales to hundreds of servers, not thousands. That is not to say that Dynamo cannot scale; we need to understand the difference between the use cases for BigTable and Dynamo.&lt;br /&gt;
&lt;br /&gt;
A &amp;quot;write&amp;quot; on any replica is never held off to serialize updates for consistency. Dynamo will eventually try to reconcile the differences between divergent versions (based on their logs). If it cannot do so, conflict resolution is left to the client application that reads the data: when more than one version of a value exists, all the versions are passed to the client along with their logs, and the client must reconcile the changes.&lt;br /&gt;
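&lt;br /&gt;
The shopping-cart case can be sketched as a client-side merge (hypothetical function name; a real client would also use the version logs to drop writes that are already superseded):&lt;br /&gt;

```python
def reconcile(versions):
    # client-side semantic reconciliation in the shopping-cart style:
    # when Dynamo returns several diverged versions of the cart, the
    # application merges them (here: a union, so no added item is lost)
    merged = set()
    for cart in versions:
        merged.update(cart)
    return sorted(merged)
```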
&lt;br /&gt;
== Bigtable ==&lt;br /&gt;
&lt;br /&gt;
* Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.&lt;br /&gt;
* Designed to scale to a very large size&lt;br /&gt;
* More focused on consistency than Dynamo. &lt;br /&gt;
* Bigtable has achieved several goals: wide applicability, scalability, high performance, and high availability.&lt;br /&gt;
&lt;br /&gt;
* A BigTable is a sparse, distributed, persistent multi-dimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.&lt;br /&gt;
* Column oriented DB.&lt;br /&gt;
** Streaming chunks of columns is easier than streaming entire rows.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Data model: rows made up of column families. Bigtable also treats data as uninterpreted strings, although clients often serialize various forms of structured and semi-structured data into these strings.&lt;br /&gt;
* The row range for a table is dynamically partitioned. Each row range is called a tablet, which is the unit of distribution and load balancing.&lt;br /&gt;
** Eg. Row: the page URL. Column families would either be the content, or the set of inbound links.&lt;br /&gt;
** Each cell in a column family can hold multiple versions of its value, indexed by timestamp.&lt;br /&gt;
* Bigtable schema parameters let clients dynamically control whether to serve data out of memory or from disk.&lt;br /&gt;
* Bigtable uses the distributed Google File System (GFS) to store log and data files.&lt;br /&gt;
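&lt;br /&gt;
The data model above can be pictured as a toy in-memory map (hypothetical names; real Bigtable keeps this sorted and split into tablets across many servers):&lt;br /&gt;

```python
# toy illustration of the Bigtable data model: a sparse map from
# (row key, column key, timestamp) to an uninterpreted byte string
table = {}

def put(row, column, timestamp, value):
    table[(row, column, timestamp)] = value

def read_latest(row, column):
    # return the value with the highest timestamp for this row/column
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if r == row and c == column]
    return max(versions)[1] if versions else None
```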
&lt;br /&gt;
* Tablets: large tables are broken into tablets at row boundaries; each tablet holds a contiguous range of sorted rows.&lt;br /&gt;
** Immutable b/c of GFS. Deletion happens via garbage collection.&lt;br /&gt;
&lt;br /&gt;
* An SSTable provides a persistent, ordered, immutable map from keys to values, where both keys and values are arbitrary byte strings.&lt;br /&gt;
* Metadata operations: Create/delete tables, column families, change metadata.&lt;br /&gt;
&lt;br /&gt;
=== Implementation ===&lt;br /&gt;
&lt;br /&gt;
* Centralized hierarchy.&lt;br /&gt;
* Three major components: client library, one master server, many tablet servers.&lt;br /&gt;
&lt;br /&gt;
* Master server&lt;br /&gt;
** Assigns tablets to tablet servers.&lt;br /&gt;
** Detects the addition and removal of tablet servers.&lt;br /&gt;
** Garbage collection on GFS.&lt;br /&gt;
** Balances tablet-server load.&lt;br /&gt;
** Handles schema changes such as table and column family creations.&lt;br /&gt;
&lt;br /&gt;
* Tablet Servers&lt;br /&gt;
** Serves the tablets assigned to it.&lt;br /&gt;
** Manages multiple tablets (thousands per tablet server)&lt;br /&gt;
** Handles I/O.&lt;br /&gt;
&lt;br /&gt;
* Client Library&lt;br /&gt;
** What devs use.&lt;br /&gt;
** Caches tablet locations&lt;br /&gt;
&lt;br /&gt;
=== Tablet Serving ===&lt;br /&gt;
&lt;br /&gt;
* The persistent state of a tablet is stored in GFS&lt;br /&gt;
* Updates are committed to a commit log that stores redo records. &lt;br /&gt;
* The most recently committed updates are stored in memory in a sorted buffer called a memtable.&lt;br /&gt;
* The older updates are stored in a sequence of SSTables.&lt;br /&gt;
* To recover a tablet, a tablet server reads its metadata from the METADATA table. &lt;br /&gt;
* This metadata contains the list of SSTables that comprise a tablet and a set of redo points, which are pointers into any commit logs that may contain data for the tablet.&lt;br /&gt;
* The server reads the indices of the SSTables into memory and reconstructs the memtable by applying all of the updates that have committed since the redo points.&lt;br /&gt;
* When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS.&lt;br /&gt;
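&lt;br /&gt;
A minimal sketch of that write path (made-up names and a tiny threshold; a real tablet server writes the commit log and SSTables to GFS rather than keeping Python lists):&lt;br /&gt;

```python
# sketch of the tablet write path: redo record to the commit log
# first, then the sorted in-memory memtable; when the memtable is
# full it is frozen, converted to an immutable SSTable, and a fresh
# memtable takes its place
MEMTABLE_LIMIT = 3

commit_log = []
memtable = {}
sstables = []     # each flushed SSTable is a sorted list of pairs

def write(key, value):
    commit_log.append((key, value))      # durable redo record first
    memtable[key] = value                # then the in-memory buffer
    if len(memtable) == MEMTABLE_LIMIT:  # freeze and flush when full
        sstables.append(sorted(memtable.items()))
        memtable.clear()
```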
&lt;br /&gt;
=== Caching ===&lt;br /&gt;
&lt;br /&gt;
To improve read performance, tablet servers use two levels of caching. &lt;br /&gt;
&lt;br /&gt;
* The Scan Cache is a higher-level cache that caches the key-value pairs returned by the SSTable interface to the tablet server code. The Scan Cache is most useful for applications that tend to read the same data repeatedly.&lt;br /&gt;
* The Block Cache is a lower-level cache that caches SSTables blocks that were read from GFS. The Block Cache is useful for applications that tend to read data that is close to the data they recently read&lt;br /&gt;
&lt;br /&gt;
Apart from that, Bigtable relies on a highly available and persistent distributed lock service called Chubby. Bigtable uses Chubby for a variety of tasks: to ensure that there is at most one active master at any time; to store the bootstrap location of Bigtable data; to discover tablet servers and finalize tablet-server deaths; to store Bigtable schema information; and to store access control lists.&lt;br /&gt;
&lt;br /&gt;
To reduce the number of disk accesses, Bigtable allows clients to specify that Bloom filters be created for SSTables. A Bloom filter allows us to ask whether an SSTable might contain any data for a specified row/column pair. For certain applications, a small amount of tablet-server memory used for storing Bloom filters drastically reduces the number of disk seeks required for read operations.&lt;br /&gt;
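&lt;br /&gt;
A toy Bloom filter along those lines (made-up parameters; a real filter sizes the bit array and hash count from the expected number of keys and the target false-positive rate):&lt;br /&gt;

```python
import hashlib

# toy Bloom filter over row/column pairs: a "no" answer is definite,
# a "maybe" can be a false positive, and it never gives a false
# negative, so a read may safely skip an SSTable whose filter says no
M_BITS = 256
K_HASHES = 3
bits = [False] * M_BITS

def _positions(key):
    # derive K_HASHES bit positions from one SHA-256 digest
    digest = hashlib.sha256(key.encode()).hexdigest()
    return [int(digest[8 * i:8 * i + 8], 16) % M_BITS
            for i in range(K_HASHES)]

def add(key):
    for p in _positions(key):
        bits[p] = True

def might_contain(key):
    return all(bits[p] for p in _positions(key))
```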
&lt;br /&gt;
=== Consider the following ===&lt;br /&gt;
&lt;br /&gt;
Can BigTable be used in a shopping-cart type of scenario, where low latency and availability are the main focus? Can it be used like Dynamo? Yes it can, but not as well. BigTable would have more latency, because it was designed for data processing and not for such a scenario; Dynamo was designed for different use cases. There is no one solution that can solve all the problems in the world of distributed file systems: no silver bullet, no one-size-fits-all. Systems are usually designed for specific use cases and work best for them. If the need arises they can later be molded to work in other scenarios too, and may provide good-enough performance for those later goals, but they will work best for the use cases that were the targets in the beginning.&lt;br /&gt;
&lt;br /&gt;
* BigTable -&amp;gt; Highly consistent, Data Processing, Map Reduce, semi structured store&lt;br /&gt;
* Dynamo -&amp;gt; High availability, low latency, key-value store&lt;br /&gt;
&lt;br /&gt;
== General talk ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Read the introduction and conclusion of each paper, and think about the use cases in the paper more than the details of how the authors solve the problem.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_21&amp;diff=19095</id>
		<title>DistOS 2014W Lecture 21</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_21&amp;diff=19095"/>
		<updated>2014-04-25T09:21:15Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Presentation ==&lt;br /&gt;
&lt;br /&gt;
=== Marking ===&lt;br /&gt;
&lt;br /&gt;
* marked mostly on presentation, not content&lt;br /&gt;
* basically we want to communicate the basic structure of the paper, and do so in a way that isn&#039;t boring&lt;br /&gt;
&lt;br /&gt;
=== Content ===&lt;br /&gt;
&lt;br /&gt;
* concrete, not &amp;quot;head in the clouds&amp;quot;&lt;br /&gt;
* present the area&lt;br /&gt;
* compare and contrast the papers&lt;br /&gt;
* 10 minutes talk, 5 minutes feedback&lt;br /&gt;
* basic argument&lt;br /&gt;
* basic references&lt;br /&gt;
&lt;br /&gt;
=== Form ===&lt;br /&gt;
&lt;br /&gt;
* show the work we&#039;ve done on paper&lt;br /&gt;
* try to get feedback&lt;br /&gt;
* think of it as a rough draft&lt;br /&gt;
* try to get people to read the paper&lt;br /&gt;
* enthusiasm&lt;br /&gt;
* powerpoints are easier&lt;br /&gt;
* don&#039;t read slides&lt;br /&gt;
* no whole sentences on slides&lt;br /&gt;
* look at talks by Mark Shuttleworth&lt;br /&gt;
&lt;br /&gt;
== MapReduce ==&lt;br /&gt;
&lt;br /&gt;
MapReduce is a programming model and an associated implementation for processing and generating large data sets. A clever observation: a simple solution could solve most distributed problems. It&#039;s all about programming to an abstraction that is efficiently parallelizable. Note that it&#039;s not actually a simple solution, because it sits atop a mountain of code: it requires something like BigTable, which requires something like GFS, which requires something like Chubby. Despite this, it allows programmers to easily do distributed computation using a simple framework that hides the messy details of parallelization.&lt;br /&gt;
&lt;br /&gt;
* Restricted programming model&lt;br /&gt;
* Interestingly, large-scale problems can be implemented with this.&lt;br /&gt;
* Easy to program, powerful for certain classes of problems, it scales well.&lt;br /&gt;
* The MapReduce job model is VERY limited though. You can&#039;t do things like simulations.&lt;br /&gt;
* MapReduce is problem specific. &lt;br /&gt;
** Naiad is less problem specific and allows you to do more.&lt;br /&gt;
&lt;br /&gt;
Programming to an abstraction that is efficiently parallel. We have learnt all about infrastructure until now.&lt;br /&gt;
Classic OS abstractions were about files; now we use programming abstractions.&lt;br /&gt;
&lt;br /&gt;
Example: word frequency in a document.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== How does it work? ===&lt;br /&gt;
&lt;br /&gt;
* Two steps, Map and Reduce. The user writes these.&lt;br /&gt;
** Map takes a single input key-value pair (eg. a named document) and converts it to an intermediate (k,v) representation. A list of new key-values.&lt;br /&gt;
** Reduce: Take the intermediate representation and merge the values.&lt;br /&gt;
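&lt;br /&gt;
The word-frequency example from earlier can be written against this two-step model (the sequential driver below is only a stand-in for the real parallel framework):&lt;br /&gt;

```python
from collections import defaultdict

# the classic word-count example in the MapReduce style: the user
# writes only map_fn and reduce_fn, and this tiny driver plays the
# role of the framework (splitting, shuffling, running the reduces)
def map_fn(doc_name, text):
    # emit an intermediate (word, 1) pair for every word in the doc
    return [(word, 1) for word in text.split()]

def reduce_fn(word, values):
    # merge all intermediate values for one key
    return word, sum(values)

def run_mapreduce(docs):
    intermediate = defaultdict(list)
    for name, text in docs.items():
        for key, value in map_fn(name, text):
            intermediate[key].append(value)   # shuffle: group by key
    return dict(reduce_fn(k, vs) for k, vs in sorted(intermediate.items()))
```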
&lt;br /&gt;
Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program&#039;s execution across a set of machines, handling machine&lt;br /&gt;
failures, and managing the required inter-machine communication.&lt;br /&gt;
&lt;br /&gt;
=== Execution ===&lt;br /&gt;
&lt;br /&gt;
When the user program calls the MapReduce function, the following sequence of actions occurs.&lt;br /&gt;
&lt;br /&gt;
* Splits the input files into M pieces (typically 16-64 MB per piece).&lt;br /&gt;
* starts up many copies of the program on a cluster of machines&lt;br /&gt;
* One of the copies of the program is special: the master. The rest are workers that are assigned work by the master.&lt;br /&gt;
* A worker who is assigned a map task reads the contents of the corresponding input split. It parses key/value pairs out of the input data and passes each pair to the user-defined Map function.&lt;br /&gt;
* Periodically, the buffered pairs are written to local disk, partitioned into R regions by the partitioning function. The locations of these buffered pairs on the local disk are passed back to the master, who is responsible for forwarding these locations to the reduce workers.&lt;br /&gt;
* The reduce worker iterates over the sorted intermediate data and for each unique intermediate key encountered, it passes the key and the corresponding set of intermediate values to the user&#039;s Reduce function.&lt;br /&gt;
* When all map tasks and reduce tasks have been completed, the master wakes up the user program.&lt;br /&gt;
* After successful completion, the output of the mapreduce execution is available in the R output files.&lt;br /&gt;
&lt;br /&gt;
The master keeps several data structures. For each map task and reduce task, it stores the state (idle, in-progress, or completed), and the identity of the worker machine (for non-idle tasks).&lt;br /&gt;
&lt;br /&gt;
=== Fault Tolerance ===&lt;br /&gt;
&lt;br /&gt;
The master pings every worker periodically. If no response is received from a worker in a certain amount of time, the master marks the worker as failed. Any map tasks completed by the worker are reset back to their initial idle state, and therefore become eligible for scheduling&lt;br /&gt;
on other workers.&lt;br /&gt;
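&lt;br /&gt;
A sketch of that rule (hypothetical data layout; a real master also distinguishes map from reduce tasks, since completed map output lives on the failed worker&#039;s local disk):&lt;br /&gt;

```python
# ping-based failure handling: tasks owned by a worker whose last
# ping is older than TIMEOUT are reset to idle, making them eligible
# for rescheduling on other workers (names here are made up)
TIMEOUT = 10.0

def detect_failures(tasks, last_ping, now):
    failed = set()
    for task in tasks.values():
        worker = task["worker"]
        if worker is not None and now - last_ping[worker] > TIMEOUT:
            failed.add(worker)
            task["state"] = "idle"    # back to initial idle state
            task["worker"] = None
    return failed
```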
&lt;br /&gt;
In case of master failure, it is easy to make the master write periodic checkpoints of the master data structures described above. If the master task dies, a new copy can be started from the last checkpointed state.&lt;br /&gt;
&lt;br /&gt;
=== Implementation ===&lt;br /&gt;
&lt;br /&gt;
* Uses commodity HW and GFS.&lt;br /&gt;
* Master/Slave relationship amongst machines. Master delegates tasks to slaves.&lt;br /&gt;
* Intermediate representation saved as files.&lt;br /&gt;
* Many MapReduce jobs can happen in sequence.&lt;br /&gt;
&lt;br /&gt;
== Naiad ==&lt;br /&gt;
&lt;br /&gt;
Where MapReduce was suited for a specific family of solutions, Naiad tries to generalize the solution to apply parallelization to a much wider family.  Naiad supports MapReduce style solutions, but also many other solutions.  However, the tradeoff was simplicity.  It&#039;s like we took MapReduce and took away its low barrier to entry.  The idea is to create a constrained graph that can easily be parallelized.&lt;br /&gt;
&lt;br /&gt;
* More complicated than Map Reduce&lt;br /&gt;
* Talks about Timely dataflow graphs &lt;br /&gt;
* It&#039;s all about graph algorithms - a graph abstraction.&lt;br /&gt;
* Restrictions on graphs so that they can be mapped to parallel computation.&lt;br /&gt;
* How to fit anything to this model is a big question. &lt;br /&gt;
* More general than map reduce.&lt;br /&gt;
&lt;br /&gt;
* After reading the MapReduce paper, you could easily write a MapReduce job. After reading the Naiad paper, you can&#039;t. Naiad is super complicated.&lt;br /&gt;
* Their model is super complicated. It doesn&#039;t minimize our cognitive load.&lt;br /&gt;
* Doesn&#039;t scale at all: after about 40 nodes there is no improvement in performance, whereas MapReduce can scale to thousands of nodes and keeps scaling.&lt;br /&gt;
* Nobody wants to use it because the abstraction is complicated.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19094</id>
		<title>DistOS 2014W Lecture 15</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19094"/>
		<updated>2014-04-25T08:57:52Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Designing Exercise&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Can we do any kind of distributed system without crypto? We can&#039;t trust crypto...&lt;br /&gt;
&lt;br /&gt;
What are the main features we need to consider for such a system?&lt;br /&gt;
*Limited Sharing&lt;br /&gt;
*Integrity&lt;br /&gt;
*Availability&lt;br /&gt;
&lt;br /&gt;
Perhaps probabilistically...&lt;br /&gt;
&lt;br /&gt;
We want to be able to put data in, have it distributed, and be able to get it out on some other machine. This kind of sharing would need an identification or authentication process.&lt;br /&gt;
&lt;br /&gt;
Availability: &amp;quot;distribute the crap out of it&amp;quot;, doesn&#039;t need crypto. No corruption of data. &lt;br /&gt;
&lt;br /&gt;
Integrity: hashing, but we assume hashes can be forged. If we want to know that we got the same file, then simply send each other the file and compare.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Big Takeaway&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Everything you do with crypto is a refinement of what you can already do in&lt;br /&gt;
weaker forms with weaker assumptions.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note on Project Proposal&#039;&#039;&#039; &lt;br /&gt;
* The date has been extended until next week; as the prof said, some of the proposals are not completely up to the mark.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Farsite&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
This paper describes Farsite, a serverless distributed file system that logically functions as a centralized file server but whose physical realization is dispersed among a network of untrusted desktop workstations. Farsite is intended to provide both the benefits of a central file server (a shared namespace, location-transparent access, and reliable data storage) and the benefits of local desktop file systems (low cost, privacy from nosy sysadmins, and resistance to geographically localized faults). Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol. It achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. It requires no central administration to maintain.&lt;br /&gt;
&lt;br /&gt;
The goal in designing Farsite is to harness the collective resources of loosely coupled, insecure, and unreliable machines to provide a logically centralized, secure, and reliable file-storage service. The Farsite system protects and preserves file data and directory metadata primarily through the techniques of cryptography and replication.&lt;br /&gt;
&lt;br /&gt;
Farsite is not a high-speed parallel I/O system.&lt;br /&gt;
Farsite manages trust using public-key-cryptographic certificates.&lt;br /&gt;
&lt;br /&gt;
An important assumption they mention is that few files are both read by many users and frequently updated by at least one user; such files are a disadvantage for Farsite. Two technology trends are fundamental in rendering Farsite&#039;s design practical: the large amount of unused disk capacity enables the use of replication for reliability, and the relatively low cost of strong cryptography enables distributed security. Every machine in Farsite may perform three roles: it is a client, a member of a directory group, and a file host. A client is a machine that directly interacts with a user. A directory group is a set of machines that collectively manage file information using a Byzantine-fault-tolerant protocol. Every member of the group stores a replica of the information, and as the group receives client requests, each member processes these requests deterministically, updates its replica, and sends replies to the client. When a client wishes to read a file, it sends a message to the directory group, which replies with the contents of the requested file. If the client updates the file, it sends the update to the directory group.&lt;br /&gt;
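&lt;br /&gt;
One way to picture the client side of such a protocol (a hypothetical sketch, not Farsite&#039;s actual message format): with a directory group of 3f+1 members tolerating up to f Byzantine faults, a client can trust any answer that f+1 members agree on, since at least one of them must be correct:&lt;br /&gt;

```python
from collections import Counter

# toy acceptance rule for replies from a Byzantine-fault-tolerant
# group: a value reported by more than f members must have been sent
# by at least one correct member, so the client may accept it
def accept_reply(replies, f):
    value, count = Counter(replies).most_common(1)[0]
    return value if count > f else None
```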
&lt;br /&gt;
Advantages of Farsite are: (1) it adds local caching of file content on the client to improve read performance; (2) Farsite delays pushing updates to the directory group, because most file writes are deleted or overwritten shortly after they occur; (3) performing encryption on a block level enables a client to write an individual block without having to rewrite the entire file, and to read individual blocks without having to wait for the download of an entire file from a file host.&lt;br /&gt;
&lt;br /&gt;
Farsite achieves reliability (long-term data persistence) and availability (immediate accessibility of file data when requested) mainly through replication.&lt;br /&gt;
&lt;br /&gt;
The Farsite design uses two main mechanisms to keep a node&#039;s computation, communication, and storage from growing with the system size: hint-based pathname translation and delayed directory-change notification.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19093</id>
		<title>DistOS 2014W Lecture 15</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19093"/>
		<updated>2014-04-25T07:46:59Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Designing Exercise&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Can we do any kind of distributed system without crypto? We can&#039;t trust crypto...&lt;br /&gt;
&lt;br /&gt;
What are the main features we need to consider for such a system?&lt;br /&gt;
*Limited Sharing&lt;br /&gt;
*Integrity&lt;br /&gt;
*Availability&lt;br /&gt;
&lt;br /&gt;
Perhaps probabilistically...&lt;br /&gt;
&lt;br /&gt;
We want to be able to put data in, have it distributed, and be able to get it out on some other machine. This kind of sharing would need an identification or authentication process.&lt;br /&gt;
&lt;br /&gt;
Availability: &amp;quot;distribute the crap out of it&amp;quot;, doesn&#039;t need crypto. No corruption of data. &lt;br /&gt;
&lt;br /&gt;
Integrity: hashing, but we assume hashes can be forged. If we want to know that we got the same file, then simply send each other the file and compare.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Big Takeaway&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Everything you do with crypto is a refinement of what you can already do in&lt;br /&gt;
weaker forms with weaker assumptions.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note on Project Proposal&#039;&#039;&#039; &lt;br /&gt;
* The date has been extended until next week; as the prof said, some of the proposals are not completely up to the mark.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Farsite&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
This paper describes Farsite, a serverless distributed file system that logically functions as a centralized file server but whose physical realization is dispersed among a network of untrusted desktop workstations. Farsite is intended to provide both the benefits of a central file server (a shared namespace, location-transparent access, and reliable data storage) and the benefits of local desktop file systems (low cost, privacy from nosy sysadmins, and resistance to geographically localized faults). Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol. It achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. It requires no central administration to maintain.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
An important assumption they mention is that few files are both read by many users and frequently updated by at least one user; such files are a disadvantage for Farsite. Two technology trends are fundamental in rendering Farsite&#039;s design practical: the large amount of unused disk capacity enables the use of replication for reliability, and the relatively low cost of strong cryptography enables distributed security.&lt;br /&gt;
Every machine in Farsite may perform three roles: it is a client, a member of a directory group, and a file host. A client is a machine that directly interacts with a user. A directory group is a set of machines that collectively manage file information using a Byzantine-fault-tolerant protocol. Every member of the group stores a replica of the information, and as the group receives client requests, each member processes these requests deterministically, updates its replica, and sends replies to the client.&lt;br /&gt;
&lt;br /&gt;
When a client wishes to read a file, it sends a message to the directory group, which replies with the&lt;br /&gt;
contents of the requested file.&lt;br /&gt;
Advantages of Farsite are: (1) it adds local caching of file content on the client to improve read performance; (2) Farsite delays pushing updates to the directory group, because most file writes are deleted or overwritten shortly after they occur; (3) performing encryption on a block level enables a client to write an individual block without having to rewrite the entire file, and to read individual blocks without having to wait for the download of an entire file from a file host.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19092</id>
		<title>DistOS 2014W Lecture 15</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19092"/>
		<updated>2014-04-25T03:55:04Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Designing Exercise&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Can we do any kind of distributed system without crypto? We can&#039;t trust crypto...&lt;br /&gt;
&lt;br /&gt;
What are the main features we need to consider for such a system?&lt;br /&gt;
*Limited Sharing&lt;br /&gt;
*Integrity&lt;br /&gt;
*Availability&lt;br /&gt;
&lt;br /&gt;
Perhaps probabilistically...&lt;br /&gt;
&lt;br /&gt;
We want to be able to put data in, have it distributed, and be able to get it out on some other machine. This kind of sharing would need an identification or authentication process.&lt;br /&gt;
&lt;br /&gt;
Availability: &amp;quot;distribute the crap out of it&amp;quot;, doesn&#039;t need crypto. No corruption of data. &lt;br /&gt;
&lt;br /&gt;
Integrity: hashing, but we assume hashes can be forged. If we want to know that we got the same file, then simply send each other the file and compare.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Big Takeaway&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Everything you do with crypto is a refinement of what you can already do in&lt;br /&gt;
weaker forms with weaker assumptions.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note on Project Proposal&#039;&#039;&#039; &lt;br /&gt;
* The date has been extended until next week. As the professor said, some of the proposals are not completely up to the mark.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Farsite&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
This paper describes Farsite, a serverless distributed file system that logically functions as a centralized file server but whose physical realization is dispersed among a network of untrusted desktop workstations.&lt;br /&gt;
An important assumption they mention is that some files are both read by many users and frequently updated by at least one user, which is a disadvantage for Farsite.&lt;br /&gt;
Two technology trends are fundamental in rendering Farsite&#039;s design practical: the large amount of unused disk capacity enables the use of replication for reliability, and the relatively low cost of strong cryptography enables distributed security.&lt;br /&gt;
Every machine in Farsite may perform three roles: it is a client, a member of a directory group, and a file host. A client is a machine that directly interacts with a user. A directory group is a set of machines that collectively manage file information using a Byzantine-fault-tolerant protocol. Every member of the group stores a replica of the information, and as the group receives client requests,&lt;br /&gt;
each member processes these requests deterministically, updates its replica, and sends replies to the client.&lt;br /&gt;
&lt;br /&gt;
When a client wishes to read a file, it sends a message to the directory group, which replies with the&lt;br /&gt;
contents of the requested file.&lt;br /&gt;
The advantages of Farsite are: (1) It adds local caching of file content on the client to improve read performance. (2) Farsite delays pushing updates to the directory group, because most file writes are deleted or overwritten shortly after they occur. (3) Performing encryption at the block level enables a client to write an individual block without having to rewrite the entire file. It also enables the client to read individual blocks without having to wait for the download of an entire file from a file host.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19091</id>
		<title>DistOS 2014W Lecture 15</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19091"/>
		<updated>2014-04-25T03:54:19Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Designing Exercise&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Can we do any kind of distributed system without crypto? We can&#039;t trust crypto...&lt;br /&gt;
&lt;br /&gt;
What are the main features we need to consider for such a system?&lt;br /&gt;
*Limited Sharing&lt;br /&gt;
*Integrity&lt;br /&gt;
*Availability&lt;br /&gt;
&lt;br /&gt;
Perhaps probabilistically...&lt;br /&gt;
&lt;br /&gt;
Want to be able to put data in, have it distributed, and be able to get it out on some other machine. This kind of sharing would need an identification or authentication process.&lt;br /&gt;
&lt;br /&gt;
Availability: &amp;quot;distribute the crap out of it&amp;quot;, doesn&#039;t need crypto. No corruption of data. &lt;br /&gt;
&lt;br /&gt;
Integrity: hashing, but we assume hashes can be forged. If we want to know that we got the same file, then simply send each other the file and compare.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Big Takeaway&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Everything you do with crypto is a refinement of what you can already do in&lt;br /&gt;
weaker forms with weaker assumptions.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note on Project Proposal&#039;&#039;&#039; &lt;br /&gt;
* The date has been extended until next week. As the professor said, some of the proposals are not completely up to the mark.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Farsite&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
This paper describes Farsite, a serverless distributed file system that logically functions as a centralized file server but whose physical realization is dispersed among a network of untrusted desktop workstations.&lt;br /&gt;
An important assumption they mention is that some files are both read by many users and frequently updated by at least one user, which is a disadvantage for Farsite.&lt;br /&gt;
Two technology trends are fundamental in rendering Farsite&#039;s design practical: the large amount of unused disk capacity enables the use of replication for reliability, and the relatively low cost of strong cryptography enables distributed security.&lt;br /&gt;
Every machine in Farsite may perform three roles: it is a client, a member of a directory group, and a file host. A client is a machine that directly interacts with a user. A directory group is a set of machines that collectively manage file information using a Byzantine-fault-tolerant protocol. Every member of the group stores a replica of the information, and as the group receives client requests,&lt;br /&gt;
each member processes these requests deterministically, updates its replica, and sends replies to the client.&lt;br /&gt;
&lt;br /&gt;
When a client wishes to read a file, it sends a message to the directory group, which replies with the&lt;br /&gt;
contents of the requested file.&lt;br /&gt;
The advantages of Farsite are: (1) It adds local caching of file content on the client to improve read performance. (2) Farsite delays pushing updates to the directory group, because most file writes are deleted or overwritten shortly after they occur. (3) Performing encryption at the block level enables a client to write an individual block without having to rewrite the entire file. It also enables the client to read individual blocks without having to wait for the download of an entire file from a file host.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19090</id>
		<title>DistOS 2014W Lecture 15</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_15&amp;diff=19090"/>
		<updated>2014-04-25T03:53:40Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;Designing Exercise&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Can we do any kind of distributed system without crypto? We can&#039;t trust crypto...&lt;br /&gt;
&lt;br /&gt;
What are the main features we need to consider for such a system?&lt;br /&gt;
*Limited Sharing&lt;br /&gt;
*Integrity&lt;br /&gt;
*Availability&lt;br /&gt;
&lt;br /&gt;
Perhaps probabilistically...&lt;br /&gt;
&lt;br /&gt;
Want to be able to put data in, have it distributed, and be able to get it out on some other machine. This kind of sharing would need an identification or authentication process.&lt;br /&gt;
&lt;br /&gt;
Availability: &amp;quot;distribute the crap out of it&amp;quot;, doesn&#039;t need crypto. No corruption of data. &lt;br /&gt;
&lt;br /&gt;
Integrity: hashing, but we assume hashes can be forged. If we want to know that we got the same file, then simply send each other the file and compare.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Big Takeaway&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Everything you do with crypto is a refinement of what you can already do in&lt;br /&gt;
weaker forms with weaker assumptions.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note on Project Proposal&#039;&#039;&#039; &lt;br /&gt;
&lt;br /&gt;
This paper describes Farsite, a serverless distributed file system that logically functions as a centralized file server but whose physical realization is dispersed among a network of untrusted desktop workstations.&lt;br /&gt;
An important assumption they mention is that some files are both read by many users and frequently updated by at least one user, which is a disadvantage for Farsite.&lt;br /&gt;
Two technology trends are fundamental in rendering Farsite&#039;s design practical: the large amount of unused disk capacity enables the use of replication for reliability, and the relatively low cost of strong cryptography enables distributed security.&lt;br /&gt;
Every machine in Farsite may perform three roles: it is a client, a member of a directory group, and a file host. A client is a machine that directly interacts with a user. A directory group is a set of machines that collectively manage file information using a Byzantine-fault-tolerant protocol. Every member of the group stores a replica of the information, and as the group receives client requests,&lt;br /&gt;
each member processes these requests deterministically, updates its replica, and sends replies to the client.&lt;br /&gt;
&lt;br /&gt;
When a client wishes to read a file, it sends a message to the directory group, which replies with the&lt;br /&gt;
contents of the requested file.&lt;br /&gt;
The advantages of Farsite are: (1) It adds local caching of file content on the client to improve read performance. (2) Farsite delays pushing updates to the directory group, because most file writes are deleted or overwritten shortly after they occur. (3) Performing encryption at the block level enables a client to write an individual block without having to rewrite the entire file. It also enables the client to read individual blocks without having to wait for the download of an entire file from a file host.&lt;br /&gt;
&lt;br /&gt;
* The date has been extended until next week. As the professor said, some of the proposals are not completely up to the mark.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Winter_2014&amp;diff=18931</id>
		<title>Distributed OS: Winter 2014</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Winter_2014&amp;diff=18931"/>
		<updated>2014-03-26T21:32:57Z</updated>

		<summary type="html">&lt;p&gt;Ronak: /* Presentations 2 (April 3) */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Information==&lt;br /&gt;
&lt;br /&gt;
*&#039;&#039;&#039;Course Number:&#039;&#039;&#039; COMP 4000/5102&lt;br /&gt;
*&#039;&#039;&#039;Term:&#039;&#039;&#039; Winter 2014&lt;br /&gt;
*&#039;&#039;&#039;Title:&#039;&#039;&#039; Distributed Operating Systems&lt;br /&gt;
*&#039;&#039;&#039;Institution:&#039;&#039;&#039; Carleton University, School of Computer Science&lt;br /&gt;
*&#039;&#039;&#039;Instructor:&#039;&#039;&#039; [http://people.scs.carleton.ca/~soma Anil Somayaji] (anilsomayaji at connect.carleton.ca): Wed. 3-4 and Thurs. 12-1 in HP 5137&lt;br /&gt;
*&#039;&#039;&#039;Teaching Assistant:&#039;&#039;&#039; Andrew Schoenrock (aschoenr at scs.carleton.ca)&lt;br /&gt;
*&#039;&#039;&#039;Lectures:&#039;&#039;&#039; Tuesday and Thursday 10:05-11:25 AM, ME 3356 &lt;br /&gt;
*&#039;&#039;&#039;Course Website&#039;&#039;&#039;: http://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Winter_2014&lt;br /&gt;
&lt;br /&gt;
==Official Course Descriptions==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;COMP 4000:&#039;&#039;&#039; An advanced course emphasizing the principles of distributed operating systems including networking protocols, distributed file systems, remote IPC mechanisms, graphical user interfaces, load balancing, and process migration. Case studies include current &amp;quot;standards&amp;quot; as well as novel systems under development.  Prerequisite(s): one of COMP 3203 or SYSC 4602, and one of COMP 3000, SYSC 3001, SYSC 4001.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;COMP 5102:&#039;&#039;&#039; Design issues of advanced multiprocessor distributed operating systems: multiprocessor system architectures; process and object models; synchronization and message passing primitives; memory architectures and management; distributed file systems; protection and security; distributed concurrency control; deadlock; recovery; remote tasking; dynamic reconfiguration; performance measurement, modeling, and system tuning.  Prerequisite(s): COMP 3000 and COMP 3203 or equivalent.&lt;br /&gt;
&lt;br /&gt;
==Communication==&lt;br /&gt;
&lt;br /&gt;
This wiki page is the canonical source of information on this course.  Please refer to it for updates.  When significant changes are made to this document it will be either announced in lecture and/or posted in the course discussion forum.&lt;br /&gt;
&lt;br /&gt;
Online course discussions will be on [http://culearn.carleton.ca cuLearn].&lt;br /&gt;
&lt;br /&gt;
You should get an account on this wiki so you can edit content here.  Email Prof. Somayaji with your preferred username and the email address to which a password should be sent.&lt;br /&gt;
&lt;br /&gt;
==Required Textbooks/Software==&lt;br /&gt;
&lt;br /&gt;
There are no required textbooks or software for this course.  Instead we will be reading research papers which will be linked to from this webpage.  While many of these papers will be available directly via web search, some will be behind paywalls.  In this case there will be alternate links to those pages that go through the Carleton Library&#039;s proxy.&lt;br /&gt;
&lt;br /&gt;
==Grading==&lt;br /&gt;
&lt;br /&gt;
Students enrolled in COMP 4000 (undergraduates) have the following grading scheme:&lt;br /&gt;
&lt;br /&gt;
* 15% Class Participation&lt;br /&gt;
* 15% Reading Responses&lt;br /&gt;
* 10% Lecture Notes/Wiki contributions&lt;br /&gt;
* 25% Midterm (Feb. 27th)&lt;br /&gt;
* 35% Final Exam&lt;br /&gt;
&lt;br /&gt;
Students enrolled in COMP 5102 (mostly graduate students) instead have this grading scheme:&lt;br /&gt;
&lt;br /&gt;
* 15% Class Participation&lt;br /&gt;
* 15% Reading Responses&lt;br /&gt;
* 10% Lecture Notes/Wiki contributions&lt;br /&gt;
* 10% Project Proposal (Feb. 13th - extended!)&lt;br /&gt;
* 15% Project Presentation (April 1st, 3rd)&lt;br /&gt;
* 35% Final Project&lt;br /&gt;
&lt;br /&gt;
Optionally, students enrolled in COMP 4000 may choose to be graded under the COMP 5102 grading scheme.&lt;br /&gt;
&lt;br /&gt;
Each of these elements is explained below.&lt;br /&gt;
&lt;br /&gt;
==Class Participation==&lt;br /&gt;
&lt;br /&gt;
You are expected to attend every class for this course.  Moreover, you are expected to participate in each class.  This participation part of your grade will be based in part upon attendance; however, it will also be based upon the degree to which you were an active participant.  Students who attend every class but who do nothing while in class will get a worse participation grade than those who miss some classes but who fully participate in those they do attend.&lt;br /&gt;
&lt;br /&gt;
==Reading Responses==&lt;br /&gt;
&lt;br /&gt;
* Before the start of class on each Tuesday you should submit an all-text reading response on cuLearn that is approximately 500 words in length.  Responses longer than 600 words may be marked off for verbosity.  Your responses should say what you found interesting and what you have questions about or were confused by.  Where appropriate, they should also discuss the relationship between the papers of the week and other work that you know about (including those covered earlier in class).&lt;br /&gt;
* &#039;&#039;&#039;Do not summarize the readings.&#039;&#039;&#039; I&#039;ve already read them!  Instead you should be telling me what you got out of these papers, good and bad.  Please also tell me what issues you&#039;d like to learn more about, either in class or potentially through later readings.&lt;br /&gt;
* Responses will be graded on a scale of 0 to 4, with a 4 being given for a response that has clear evidence that you made an effort to read and understand all of the assigned readings.&lt;br /&gt;
* Please submit your responses in plain text or PDF format.  (No MS Word files!  Please convert to PDF or text before submitting.)  You may want to consider writing your response as a text file formatted in [http://en.wikipedia.org/wiki/Markdown Markdown] and then convert the output to PDF using [http://johnmacfarlane.net/pandoc/ pandoc].&lt;br /&gt;
* Responses will be accepted until Thursday, 6 AM.  Responses submitted after Tuesday morning, though, will have a maximum grade of 3, not 4.&lt;br /&gt;
&lt;br /&gt;
==Class Notes/Wiki contributions==&lt;br /&gt;
&lt;br /&gt;
Everyone will be expected to co-author the notes for at least one class and upload them to the class wiki.  Class notes should capture the content of lectures and discussions in a way that will aid study for the midterm and final, especially for those who have missed that particular class.  Draft class notes are due on the wiki a week after the class; however, it is recommended that preliminary notes be uploaded within 48 hours of class.  Notes will need to be revised until they are acceptable; however, students other than the original authors may elect to make modifications and updates.&lt;br /&gt;
&lt;br /&gt;
Wiki contribution grades will be determined based on the overall quality of wiki contributions over the course of the term.&lt;br /&gt;
&lt;br /&gt;
==Midterm and Final Exam==&lt;br /&gt;
&lt;br /&gt;
Undergraduates by default will be required to complete an in-class midterm exam and a formally scheduled final exam.  These will be essay tests based on the material covered in class.  Sample questions will be made available during study sessions prior to the exams.&lt;br /&gt;
&lt;br /&gt;
==Project==&lt;br /&gt;
&lt;br /&gt;
The project may be a literature review of a specialized area of computer science related to distributed operating systems, or it may be a research proposal on a problem related to distributed operating systems.  A research proposal should be thought of as an abbreviated literature review paper combined with a description of potential future work that would fill a gap in the covered literature.&lt;br /&gt;
&lt;br /&gt;
You may choose to follow up on your proposal and actually implement what you propose; given the implementation complexity of most research problems in distributed operating systems, though, such an implementation is strictly optional (but may be advisable if you wish to make your project publishable).&lt;br /&gt;
&lt;br /&gt;
Your project outline should consist of an abstract, an argument outline, and at least ten references that you plan to cite in your final project.&lt;br /&gt;
&lt;br /&gt;
You should run ideas for your project by Prof. Somayaji before you spend time writing your proposal and outline.&lt;br /&gt;
&lt;br /&gt;
==Collaboration==&lt;br /&gt;
&lt;br /&gt;
Collaboration on all work is allowed except for the midterm and final exams. Collaboration, however, should be clearly acknowledged.  Specifically, co-authored works should be marked as such.  When co-authored, all authors of reading responses and projects will get the same grade, unless there is reason to believe that some co-authors did not in fact contribute significantly to the submitted work.  Co-authored contributions may get different grades depending upon the relative contribution of the different authors; however, the default here will also be to give all authors the same grade.&lt;br /&gt;
&lt;br /&gt;
It is &#039;&#039;&#039;essential&#039;&#039;&#039; that outside references be cited appropriately.  Proper citation format should be followed except where more relaxed forms are specifically allowed.&lt;br /&gt;
&lt;br /&gt;
Plagiarism or intellectual dishonesty of any kind is strictly forbidden.  In other words, it should always be clear what is your work and what is the work of others.  If anything you submit is, in part or whole, very similar in content or structure to that of work produced by someone else, you are plagiarizing.  This includes figures.&lt;br /&gt;
&lt;br /&gt;
Think of plagiarism as a kind of unauthorized collaboration.  Don&#039;t do it.  Plagiarism and other instructional offenses will be reported to the Dean of Science for disciplinary action, as per university guidelines.&lt;br /&gt;
&lt;br /&gt;
==Topics==&lt;br /&gt;
&lt;br /&gt;
===The Early Internet (Jan. 14)===&lt;br /&gt;
&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===The Alto (Jan. 16)===&lt;br /&gt;
&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===The Mother of all Demos (Jan. 21)===&lt;br /&gt;
&lt;br /&gt;
If you can, watch the whole demo.  The Stanford version with annotated clips is good if you are short on time.&lt;br /&gt;
&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===The Early Web (Jan. 23)===&lt;br /&gt;
&lt;br /&gt;
* [https://archive.org/details/02Kahle000673 Berners-Lee et al., &amp;quot;World-Wide Web: The Information Universe&amp;quot; (1992)], pp. 52-58&lt;br /&gt;
* [http://www.youtube.com/watch?v=72nfrhXroo8 Alex Wright, &amp;quot;The Web That Wasn&#039;t&amp;quot; (2007)], Google Tech Talk&lt;br /&gt;
&lt;br /&gt;
===UNIX and Plan 9 (Jan. 28)===&lt;br /&gt;
&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
&lt;br /&gt;
===NFS and AFS (Jan 30)===&lt;br /&gt;
&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
===GFS and Ceph (Feb. 4)===&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
&lt;br /&gt;
===Chubby (Feb. 13)===&lt;br /&gt;
Note: no response this week, as proposals are due on this day.&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===Oceanstore (Mar. 4)===&lt;br /&gt;
 &lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===Farsite (Mar. 6)===&lt;br /&gt;
&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al.,&amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
&lt;br /&gt;
===Public Resource Computing (March 11)===&lt;br /&gt;
&lt;br /&gt;
* Anderson et al., &amp;quot;SETI@home: An Experiment in Public-Resource Computing&amp;quot; (CACM 2002) [http://dx.doi.org/10.1145/581571.581573 (DOI)] [http://dl.acm.org.proxy.library.carleton.ca/citation.cfm?id=581573 (Proxy)]&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===Project Help Session (March 13)===&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===Distributed Hash Tables (March 18)===&lt;br /&gt;
&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
===Structured Data 1 (March 20)===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
&lt;br /&gt;
===Structured Data 2 (March 25)===&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
* [https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Geambasu.pdf Geambasu et al., &amp;quot;Comet: An active distributed key-value store&amp;quot; (OSDI 2010)]&lt;br /&gt;
&lt;br /&gt;
===Computational Models (March 27)===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [http://dl.acm.org/citation.cfm?doid=2517349.2522738 Murray et al., &amp;quot;Naiad: a timely dataflow system&amp;quot; (SOSP 2013)]&lt;br /&gt;
&lt;br /&gt;
===Presentations 1 (April 1)===&lt;br /&gt;
&lt;br /&gt;
* Gehana&lt;br /&gt;
* Keerthi&lt;br /&gt;
* Simon &amp;amp; Peter&lt;br /&gt;
* Adam&lt;br /&gt;
*&lt;br /&gt;
&lt;br /&gt;
===Presentations 2 (April 3)===&lt;br /&gt;
&lt;br /&gt;
* Mojgan&lt;br /&gt;
* Mohammed&lt;br /&gt;
* Sijo&lt;br /&gt;
* Sandarbh&lt;br /&gt;
* Ronak Chaudhari&lt;br /&gt;
&lt;br /&gt;
===The Future (April 8)===&lt;br /&gt;
&lt;br /&gt;
* [http://dl.acm.org/citation.cfm?id=1562783 Berkeley CS Dept., &amp;quot;A view of the parallel computing landscape&amp;quot; (CACM 2009)]&lt;br /&gt;
* [http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf Berkeley CS Dept., &amp;quot;The Landscape of Parallel Computing Research: A View from Berkeley&amp;quot; (2006 Tech Report)]  Note Section 6.2, &amp;quot;Deconstructing operating system support&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
===Grid Computing (Optional)===&lt;br /&gt;
&lt;br /&gt;
* Foster et al., &amp;quot;The Anatomy of the Grid: Enabling Scalable Virtual Organizations&amp;quot; (IJHPCA 2001) [http://dx.doi.org/10.1177/109434200101500302 (DOI)] [http://hpc.sagepub.com.proxy.library.carleton.ca/content/15/3/200 (Library Proxy)]&lt;br /&gt;
* Foster et al., &amp;quot;Cloud Computing and Grid Computing 360-Degree Compared&amp;quot; (GCE 2008) [http://dx.doi.org/10.1109/GCE.2008.4738445 (DOI)] [http://arxiv.org/abs/0901.0131 (arXiv.org)]&lt;br /&gt;
&lt;br /&gt;
==Schedule==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;table style=&amp;quot;width: 100%;&amp;quot; border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;4&amp;quot; cellspacing=&amp;quot;0&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;tr valign=&amp;quot;top&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;th&amp;gt;&lt;br /&gt;
    &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt;Date&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;/th&amp;gt;&lt;br /&gt;
    &amp;lt;th&amp;gt;&lt;br /&gt;
    &amp;lt;p align=&amp;quot;left&amp;quot;&amp;gt;Topic&amp;lt;/p&amp;gt;&lt;br /&gt;
    &amp;lt;/th&amp;gt;&lt;br /&gt;
  &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 7&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 1|Lecture 1]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 9&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 2|Lecture 2]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 14&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 3|Lecture 3]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 16&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 4|Lecture 4]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 21&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 5|Lecture 5]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 23&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 6|Lecture 6]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 28&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 7|Lecture 7]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Jan. 30&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 8|Lecture 8]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Feb. 4&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 9|Lecture 9]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Feb. 6&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 10|Lecture 10]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Feb. 11&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 11|Lecture 11]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Feb. 13&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 12|Lecture 12]]&amp;lt;br&amp;gt;&#039;&#039;&#039;Project Proposals Due&#039;&#039;&#039;&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Feb. 25&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Midterm Review|Midterm Review]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Feb. 27&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Midterm Exam&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 4&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 14|Lecture 14]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 6&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 15|Lecture 15]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 11&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 16|Lecture 16]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 13&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 17|Lecture 17]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 18&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 18|Lecture 18]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 20&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 19|Lecture 19]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 25&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 20|Lecture 20]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Mar. 27&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 21|Lecture 21]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Apr. 1&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 22|Lecture 22]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Apr. 3&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 23|Lecture 23]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;Apr. 8&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;[[DistOS 2014W Lecture 24|Lecture 24]]&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
    &amp;lt;tr&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;April 24, 2 PM&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
      &amp;lt;td&amp;gt;&lt;br /&gt;
      &amp;lt;p&amp;gt;&#039;&#039;&#039;Final Exam&#039;&#039;&#039;&lt;br /&gt;
      &amp;lt;/p&amp;gt;&lt;br /&gt;
      &amp;lt;/td&amp;gt;&lt;br /&gt;
    &amp;lt;/tr&amp;gt;&lt;br /&gt;
&amp;lt;/table&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==University Policies==&lt;br /&gt;
&lt;br /&gt;
===Student Academic Integrity Policy===&lt;br /&gt;
&lt;br /&gt;
Every student should be familiar with the Carleton University student academic integrity policy. A student found in violation of academic integrity standards may be awarded penalties which range from a reprimand to receiving a grade of F in the course or even being expelled from the program or University. Some examples of offences are: plagiarism and unauthorized co-operation or collaboration. Information on this policy may be found in the Undergraduate Calendar.&lt;br /&gt;
&lt;br /&gt;
===Plagiarism===&lt;br /&gt;
&lt;br /&gt;
As defined by Senate, &amp;quot;plagiarism is presenting, whether intentional or not, the ideas, expression of ideas or work of others as one&#039;s own&amp;quot;. Such reported offences will be reviewed by the office of the Dean of Science.&lt;br /&gt;
&lt;br /&gt;
===Unauthorized Co-operation or Collaboration===&lt;br /&gt;
&lt;br /&gt;
Senate policy states that &amp;quot;to ensure fairness and equity in assessment of term work, students shall not co-operate or collaborate in the completion of an academic assignment, in whole or in part, when the instructor has indicated that the assignment is to be completed on an individual basis&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Please see above for the specific collaboration policy for this course.&lt;br /&gt;
&lt;br /&gt;
===Academic Accommodations for Students with Disabilities===&lt;br /&gt;
&lt;br /&gt;
The Paul Menton Centre for Students with Disabilities (PMC) provides services to students with Learning Disabilities (LD), psychiatric/mental health disabilities, Attention Deficit Hyperactivity Disorder (ADHD), Autism Spectrum Disorders (ASD), chronic medical conditions, and impairments in mobility, hearing, and vision. If you have a disability requiring academic accommodations in this course, please contact PMC at 613-520-6608 or pmc@carleton.ca for a formal evaluation. If you are already registered with the PMC, contact your PMC coordinator to send me your Letter of Accommodation at the beginning of the term, and no later than two weeks before the first in-class scheduled test or exam requiring accommodation (if applicable). After requesting accommodation from PMC, meet with me to ensure accommodation arrangements are made. Please consult the PMC website for the deadline to request accommodations for the formally-scheduled exam (if applicable) at http://www2.carleton.ca/pmc/new-and-current-students/dates-and-deadlines&lt;br /&gt;
&lt;br /&gt;
===Religious Obligation===&lt;br /&gt;
&lt;br /&gt;
Write to the instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details visit the Equity Services website: http://www2.carleton.ca/equity/&lt;br /&gt;
&lt;br /&gt;
===Pregnancy Obligation===&lt;br /&gt;
&lt;br /&gt;
Write to the instructor with any requests for academic accommodation during the first two weeks of class, or as soon as possible after the need for accommodation is known to exist. For more details visit the Equity Services website: http://www2.carleton.ca/equity/&lt;br /&gt;
&lt;br /&gt;
===Medical Certificate===&lt;br /&gt;
&lt;br /&gt;
The following is a link to the official medical certificate accepted by Carleton University for the deferral of final examinations or assignments in undergraduate courses. To access the form, please go to http://www.carleton.ca/registrar/forms&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18927</id>
		<title>DistOS 2014W Lecture 20</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18927"/>
		<updated>2014-03-25T18:51:54Z</updated>

		<summary type="html">&lt;p&gt;Ronak: /* Comet */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Cassandra ==&lt;br /&gt;
&lt;br /&gt;
Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure.  BigTable uses GFS&#039; built-in replication and Chubby for locking.  Cassandra uses gossip algorithms: [http://dl.acm.org/citation.cfm?id=1529983 Scuttlebutt].  &lt;br /&gt;
&lt;br /&gt;
Initially, Anil talked about Google&#039;s versus Facebook&#039;s approaches to technology. Google developed its technology internally and used it for competitive advantage; Facebook developed its technology in an open-source manner. He also talked a bit about licences: under GPL 3 you have to provide source code with the binary, and under the AGPL source code must also be provided when the software is offered as a service.&lt;br /&gt;
&lt;br /&gt;
While discussing HBase versus Cassandra, we discussed why two projects with the same goal are both supported: Apache works as a community. For any tool in CS, particularly software tools, it&#039;s actually important to have more than one good implementation; the only time that doesn&#039;t happen is because of market realities. &lt;br /&gt;
&lt;br /&gt;
BigTable and Cassandra expose similar APIs. BigTable needs GFS, while Cassandra depends on each server&#039;s local file system. Anil feels a Cassandra cluster is easier to set up. BigTable is designed for batch updates; Cassandra is designed to handle real-time workloads.&lt;br /&gt;
	&lt;br /&gt;
Schema design is explained with the inbox-search example, but it does not make clear what the table will actually look like. Anil thinks they store a lot of data along with the messages, which makes the table messy.&lt;br /&gt;
	&lt;br /&gt;
Cassandra is designed for high-speed access and online operation.&lt;br /&gt;
	&lt;br /&gt;
Apache ZooKeeper is used for distributed configuration; it is similar to Chubby. ZooKeeper handles node-level information and the configuration of new nodes, while gossip is more about key partitioning.&lt;br /&gt;
&lt;br /&gt;
Cassandra uses a modified version of the Accrual Failure Detector. The idea of accrual failure detection is that the failure-detection module emits a value, phi, which represents a suspicion level for each monitored node. This value is expressed on a scale that is dynamically adjusted to reflect network and load conditions at the monitored nodes.&lt;br /&gt;
&lt;br /&gt;
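The adaptive suspicion value phi can be illustrated with a minimal sketch (our own toy model, not Cassandra&#039;s actual implementation; it assumes heartbeat inter-arrival times follow an exponential distribution with the observed mean):

```python
import math

def phi(silence, mean_interval):
    """Suspicion level for a monitored node: minus log10 of the
    probability that a heartbeat still arrives after this much
    silence. Higher phi means stronger suspicion the node is down.
    Because mean_interval is the observed mean, the scale adapts
    automatically to network and load conditions."""
    p_arrives_later = math.exp(-silence / mean_interval)
    return -math.log10(p_arrives_later)

# Heartbeats normally arrive every 1s: suspicion grows with silence.
mild = phi(1.0, 1.0)      # about 0.43
strong = phi(10.0, 1.0)   # about 4.34
```

The point of the model is that there is no fixed timeout; the same ten seconds of silence yields a much lower phi on a congested network where the observed mean interval is larger.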
Cassandra writes data in an immutable way, much like functional programming. In functional programming there is no assignment; it tries to eliminate side effects. Data is simply bound: you associate a name with a value, and garbage collection reclaims what is no longer referenced.&lt;br /&gt;
&lt;br /&gt;
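As a toy illustration of immutable writes (our own sketch, not Cassandra&#039;s storage engine): updates append new versions rather than modifying data in place, and a compaction pass plays the garbage-collection role of discarding superseded versions.

```python
class AppendOnlyStore:
    """Toy sketch of immutable writes: a put never overwrites,
    it appends a new (key, value) version to a log; reads return
    the newest version; compaction drops superseded versions."""
    def __init__(self):
        self.log = []                      # append-only version log

    def put(self, key, value):
        self.log.append((key, value))      # never modify in place

    def get(self, key):
        for k, v in reversed(self.log):    # newest version wins
            if k == key:
                return v
        return None

    def compact(self):
        latest = dict(self.log)            # later pairs shadow earlier
        self.log = list(latest.items())

s = AppendOnlyStore()
s.put("a", 1)
s.put("a", 2)                              # old version kept until compaction
```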
Cassandra: &lt;br /&gt;
* Lighter weight than the GFS-type cluster that BigTable depends on&lt;br /&gt;
* Almost all of the readings are part of Apache&lt;br /&gt;
* Designed more for online updates: interactive, lower-latency use&lt;br /&gt;
* Once they write data to disk, they only read it back&lt;br /&gt;
* Scalable multi-master database with no single point of failure&lt;br /&gt;
* There is a reason for not giving out complete detail on the table schema: it is probably not just inbox search&lt;br /&gt;
* All data in one row of a table; it&#039;s not a plain key-value store but a big blob of data&lt;br /&gt;
* Gossip-based protocol: Scuttlebutt&lt;br /&gt;
* Fixed circular ring&lt;br /&gt;
* The consistency issue is not addressed at all; writes are done in an immutable way and never changed&lt;br /&gt;
&lt;br /&gt;
* Older-style network protocol: token rings&lt;br /&gt;
* What sort of computational systems avoid changing data? Systems implementing functional-like semantics.&lt;br /&gt;
&lt;br /&gt;
== Comet ==&lt;br /&gt;
&lt;br /&gt;
The major idea behind Comet is triggers/callbacks. There is an extensive literature on extensible operating systems: basically, adding code to the operating system to better suit a particular application. &amp;quot;Generally, extensible systems suck.&amp;quot; -[[User:Soma]]&lt;br /&gt;
&lt;br /&gt;
[https://www.usenix.org/conference/osdi10/comet-active-distributed-key-value-store The presentation video of Comet]&lt;br /&gt;
&lt;br /&gt;
Comet seeks to greatly expand the application space for key-value storage systems through application-specific customization. A Comet storage object is a &amp;lt;key,value&amp;gt; pair. Each Comet node stores a collection of active storage objects (ASOs), each consisting of a key, a value, and a set of handlers. Comet handlers run as a result of timers or storage operations such as get or put, allowing an ASO to take dynamic, application-specific actions to customize its behaviour. Handlers are written in a simple sandboxed extension language, providing safety and isolation. An ASO can modify its environment, monitor its execution, and make dynamic decisions about its state.&lt;br /&gt;
&lt;br /&gt;
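The ASO idea can be sketched as follows (a toy Python model of our own; Comet handlers actually run in a sandboxed extension language, and these names are illustrative, not Comet&#039;s real API):

```python
class ActiveStorageObject:
    """Toy model of a Comet-style ASO: a key, a value, and handlers
    that fire on storage operations, letting the object take
    application-specific actions and track its own state."""
    def __init__(self, key, value, on_get=None, on_put=None):
        self.key = key
        self.value = value
        self.on_get = on_get
        self.on_put = on_put
        self.get_count = 0          # state the ASO keeps about itself

    def get(self):
        self.get_count += 1         # the object monitors its own use
        if self.on_get is not None:
            self.on_get(self)       # application-specific action
        return self.value

    def put(self, new_value):
        if self.on_put is not None:
            new_value = self.on_put(self, new_value)  # transform/validate
        self.value = new_value

# Example: a put handler that keeps only the last three versions, a
# dynamic, application-specific policy a plain DHT cannot express.
def keep_last_three(aso, new_value):
    return (aso.value + [new_value])[-3:]

aso = ActiveStorageObject("photos", [], on_put=keep_last_three)
for v in ["v1", "v2", "v3", "v4"]:
    aso.put(v)
```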
The researchers try to provide the ability to extend a DHT without requiring a substantial investment of effort to modify its implementation. They implement isolation and safety by restricting system access, restricting resource consumption, and restricting within-Comet communication.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18920</id>
		<title>DistOS 2014W Lecture 20</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18920"/>
		<updated>2014-03-25T16:38:19Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Cassandra ==&lt;br /&gt;
&lt;br /&gt;
Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure.  BigTable uses GFS&#039; built-in replication and Chubby for locking.  Cassandra uses gossip algorithms: [http://dl.acm.org/citation.cfm?id=1529983 Scuttlebutt].  &lt;br /&gt;
&lt;br /&gt;
Initially, Anil talked about Google&#039;s versus Facebook&#039;s approaches to technology. Google developed its technology internally and used it for competitive advantage; Facebook developed its technology in an open-source manner. He also talked a bit about licences: under GPL 3 you have to provide source code with the binary, and under the AGPL source code must also be provided when the software is offered as a service.&lt;br /&gt;
&lt;br /&gt;
While discussing HBase versus Cassandra, we discussed why two projects with the same goal are both supported: Apache works as a community. For any tool in CS, particularly software tools, it&#039;s actually important to have more than one good implementation; the only time that doesn&#039;t happen is because of market realities. &lt;br /&gt;
&lt;br /&gt;
BigTable needs GFS, while Cassandra depends on each server&#039;s local file system. Anil feels a Cassandra cluster is easier to set up. BigTable is designed for batch updates; Cassandra is designed to handle real-time workloads.&lt;br /&gt;
	&lt;br /&gt;
Schema design is explained with the inbox-search example, but it does not make clear how the table will actually look. Anil thinks they store a lot of data with the messages, which makes the table unwieldy.&lt;br /&gt;
	&lt;br /&gt;
Cassandra is designed for high-speed access and online operation.&lt;br /&gt;
	&lt;br /&gt;
Apache ZooKeeper is used for distributed configuration. ZooKeeper is similar to Chubby: it holds node-level information and handles the configuration of new nodes, while gossip is more about key partitioning.&lt;br /&gt;
&lt;br /&gt;
Cassandra writes data in an immutable way, much like functional programming. Functional programming has no assignment and tries to eliminate side effects: data is just bound (you associate a name with a value), and garbage collection reclaims what is no longer referenced.&lt;br /&gt;
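The immutable-write idea can be sketched as an append-only store: updates never modify old records, they append new ones, and a compaction pass plays the role of garbage collection. This is an illustration only, not Cassandra's actual storage engine.

```python
# Minimal sketch of append-only, immutable writes (illustrative only,
# not Cassandra's storage code): an update appends a new record rather
# than mutating the old one; a read returns the newest binding; and
# compaction discards shadowed records, like garbage collection.

class AppendOnlyStore:
    def __init__(self):
        self.log = []  # append-only sequence of (key, value) records

    def put(self, key, value):
        self.log.append((key, value))  # never overwrite in place

    def get(self, key):
        for k, v in reversed(self.log):  # newest binding wins
            if k == key:
                return v
        return None

    def compact(self):
        # Keep only the latest binding per key; older records are garbage.
        latest = {}
        for k, v in self.log:
            latest[k] = v
        self.log = list(latest.items())

s = AppendOnlyStore()
s.put("x", 1)
s.put("x", 2)   # shadows, but does not mutate, the earlier record
```

After the two writes the log holds both records; `compact()` reduces it to the single live binding without ever having modified data in place.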
&lt;br /&gt;
Cassandra:&lt;br /&gt;
* Lighter weight than the GFS-type cluster that BigTable depends on.&lt;br /&gt;
* Almost all of the readings are part of Apache.&lt;br /&gt;
* Designed more for online updates: interactive, lower-latency use.&lt;br /&gt;
* Once files are written to disk they are only read back, never modified.&lt;br /&gt;
* A scalable multi-master database with no single point of failure.&lt;br /&gt;
* There is a reason for not giving out the complete details of the table schema; it is probably used for more than inbox search. All of a user&#039;s data sits in one row of a table.&lt;br /&gt;
* It is not a plain key-value store: each row holds a big blob of data.&lt;br /&gt;
* Gossip-based protocol: Scuttlebutt.&lt;br /&gt;
* Fixed circular ring for partitioning.&lt;br /&gt;
* The consistency issue is not addressed at all; writes are immutable and never changed afterwards.&lt;br /&gt;
&lt;br /&gt;
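The fixed circular ring can be sketched with consistent hashing: node positions and keys hash onto a fixed circular space, and a key belongs to the first node clockwise from its hash. This is a minimal illustration, not Cassandra's actual partitioner; MD5 is used here only as a convenient stable hash.

```python
# Illustrative consistent-hashing ring (not Cassandra's partitioner).
# Nodes and keys hash onto a fixed circular space of 2**32 positions;
# a key is owned by the first node at or after its hash, wrapping around.
import bisect
import hashlib

def ring_hash(s):
    # Stable position on the ring, derived from MD5 for illustration.
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2 ** 32)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((ring_hash(n), n) for n in nodes)
        self.hashes = [h for h, _ in self.points]

    def owner(self, key):
        i = bisect.bisect_right(self.hashes, ring_hash(key))
        return self.points[i % len(self.points)][1]  # wrap past the end

ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.owner("user:42")  # same key always maps to the same node
```

Because only hash positions matter, adding or removing a node moves only the keys adjacent to it on the ring, which is the property that makes the fixed circular space attractive for partitioning.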
The fixed ring is reminiscent of an older style of network protocol: token ring.&lt;br /&gt;
What sort of computational systems avoid changing data? Functional programming languages.&lt;br /&gt;
Many systems now talk about implementing functional-like semantics.&lt;br /&gt;
&lt;br /&gt;
== Comet ==&lt;br /&gt;
&lt;br /&gt;
The major idea behind Comet is triggers/callbacks.  There is an extensive literature on extensible operating systems: basically, adding code to the operating system to better suit a particular application.  &amp;quot;Generally, extensible systems suck.&amp;quot; -[[User:Soma]]&lt;br /&gt;
&lt;br /&gt;
[https://www.usenix.org/conference/osdi10/comet-active-distributed-key-value-store The presentation video of Comet]&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18919</id>
		<title>DistOS 2014W Lecture 20</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18919"/>
		<updated>2014-03-25T16:37:48Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Cassandra ==&lt;br /&gt;
&lt;br /&gt;
Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure.  BigTable uses GFS&#039; built-in replication and Chubby for locking.  Cassandra uses gossip algorithms: [http://dl.acm.org/citation.cfm?id=1529983 Scuttlebutt].  Apache Zookeeper is used for distributed configuration.&lt;br /&gt;
&lt;br /&gt;
Initially, Anil talked about Google&#039;s versus Facebook&#039;s approaches to technology. Google developed its technology internally and used it for competitive advantage; Facebook developed its technology in an open-source manner. He also talked a little about licences: under GPLv3 you have to provide source code with the binary, and under the AGPL source code must also be provided to users of a network service.&lt;br /&gt;
&lt;br /&gt;
While discussing HBase versus Cassandra, we discussed why two projects with the same notion are both supported by Apache as a community. For any tool in CS, particularly software tools, it is actually important to have more than one good implementation; the only time that does not happen is because of market realities.&lt;br /&gt;
&lt;br /&gt;
BigTable needs GFS, while Cassandra depends only on each server&#039;s local file system; Anil feels a Cassandra cluster is easier to set up. BigTable is designed for batch updates; Cassandra is for handling real-time workloads.&lt;br /&gt;
	&lt;br /&gt;
Schema design is explained with the inbox-search example, but it does not make clear how the table will actually look. Anil thinks they store a lot of data with the messages, which makes the table unwieldy.&lt;br /&gt;
	&lt;br /&gt;
Cassandra is designed for high-speed access and online operation.&lt;br /&gt;
	&lt;br /&gt;
ZooKeeper is similar to Chubby: it holds node-level information and handles the configuration of new nodes, while gossip is more about key partitioning.&lt;br /&gt;
&lt;br /&gt;
Cassandra writes data in an immutable way, much like functional programming. Functional programming has no assignment and tries to eliminate side effects: data is just bound (you associate a name with a value), and garbage collection reclaims what is no longer referenced.&lt;br /&gt;
&lt;br /&gt;
Cassandra:&lt;br /&gt;
* Lighter weight than the GFS-type cluster that BigTable depends on.&lt;br /&gt;
* Almost all of the readings are part of Apache.&lt;br /&gt;
* Designed more for online updates: interactive, lower-latency use.&lt;br /&gt;
* Once files are written to disk they are only read back, never modified.&lt;br /&gt;
* A scalable multi-master database with no single point of failure.&lt;br /&gt;
* There is a reason for not giving out the complete details of the table schema; it is probably used for more than inbox search. All of a user&#039;s data sits in one row of a table.&lt;br /&gt;
* It is not a plain key-value store: each row holds a big blob of data.&lt;br /&gt;
* Gossip-based protocol: Scuttlebutt.&lt;br /&gt;
* Fixed circular ring for partitioning.&lt;br /&gt;
* The consistency issue is not addressed at all; writes are immutable and never changed afterwards.&lt;br /&gt;
&lt;br /&gt;
The fixed ring is reminiscent of an older style of network protocol: token ring.&lt;br /&gt;
What sort of computational systems avoid changing data? Functional programming languages.&lt;br /&gt;
Many systems now talk about implementing functional-like semantics.&lt;br /&gt;
&lt;br /&gt;
== Comet ==&lt;br /&gt;
&lt;br /&gt;
The major idea behind Comet is triggers/callbacks.  There is an extensive literature on extensible operating systems: basically, adding code to the operating system to better suit a particular application.  &amp;quot;Generally, extensible systems suck.&amp;quot; -[[User:Soma]]&lt;br /&gt;
&lt;br /&gt;
[https://www.usenix.org/conference/osdi10/comet-active-distributed-key-value-store The presentation video of Comet]&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18918</id>
		<title>DistOS 2014W Lecture 20</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18918"/>
		<updated>2014-03-25T16:29:52Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Cassandra ==&lt;br /&gt;
&lt;br /&gt;
Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure.  BigTable uses GFS&#039; built-in replication and Chubby for locking.  Cassandra uses gossip algorithms: [http://dl.acm.org/citation.cfm?id=1529983 Scuttlebutt].  Apache Zookeeper is used for distributed configuration.&lt;br /&gt;
&lt;br /&gt;
Initially, Anil talked about Google&#039;s versus Facebook&#039;s approaches to technology. Google developed its technology internally and used it for competitive advantage; Facebook developed its technology in an open-source manner. He also talked a little about licences: under GPLv3 you have to provide source code with the binary, and under the AGPL source code must also be provided to users of a network service.&lt;br /&gt;
&lt;br /&gt;
BigTable needs GFS, while Cassandra depends only on each server&#039;s local file system; Anil feels a Cassandra cluster is easier to set up. BigTable is designed for batch updates; Cassandra is for handling real-time workloads.&lt;br /&gt;
	&lt;br /&gt;
Schema design is explained with the inbox-search example, but it does not make clear how the table will actually look. Anil thinks they store a lot of data with the messages, which makes the table unwieldy.&lt;br /&gt;
	&lt;br /&gt;
Cassandra is designed for high-speed access and online operation.&lt;br /&gt;
	&lt;br /&gt;
ZooKeeper is similar to Chubby: it holds node-level information and handles the configuration of new nodes, while gossip is more about key partitioning.&lt;br /&gt;
&lt;br /&gt;
Cassandra writes data in an immutable way, like functional programming; functional programming has no assignment.&lt;br /&gt;
&lt;br /&gt;
Cassandra:&lt;br /&gt;
* Lighter weight than the GFS-type cluster that BigTable depends on.&lt;br /&gt;
* Almost all of the readings are part of Apache.&lt;br /&gt;
* Designed more for online updates: interactive, lower-latency use.&lt;br /&gt;
* Once files are written to disk they are only read back, never modified.&lt;br /&gt;
* A scalable multi-master database with no single point of failure.&lt;br /&gt;
* There is a reason for not giving out the complete details of the table schema; it is probably used for more than inbox search. All of a user&#039;s data sits in one row of a table.&lt;br /&gt;
* It is not a plain key-value store: each row holds a big blob of data.&lt;br /&gt;
* Gossip-based protocol: Scuttlebutt.&lt;br /&gt;
* Fixed circular ring for partitioning.&lt;br /&gt;
* The consistency issue is not addressed at all; writes are immutable and never changed afterwards.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Discussions:&lt;br /&gt;
&lt;br /&gt;
* Atheros and the GPL.&lt;br /&gt;
* Cassandra vs. BigTable. BigTable is not part of Hadoop; it is the &amp;quot;funny thing on top of GFS&amp;quot;.&lt;br /&gt;
* History of HDFS; HBase.&lt;br /&gt;
* Why are two projects with the same notion supported? Apache as a community. For any tool in CS, particularly software tools, it is actually important to have more than one good implementation; the only time that does not happen is because of market realities.&lt;br /&gt;
The fixed ring is reminiscent of an older style of network protocol: token ring.&lt;br /&gt;
What sort of computational systems avoid changing data? Functional programming languages.&lt;br /&gt;
What is different from classic C and C++? Functional languages try to eliminate side effects: there is no assignment, data is just bound (you associate a name with a value), there is garbage collection, and no mutation of data.&lt;br /&gt;
Many systems now talk about implementing functional-like semantics.&lt;br /&gt;
&lt;br /&gt;
== Comet ==&lt;br /&gt;
&lt;br /&gt;
The major idea behind Comet is triggers/callbacks.  There is an extensive literature on extensible operating systems: basically, adding code to the operating system to better suit a particular application.  &amp;quot;Generally, extensible systems suck.&amp;quot; -[[User:Soma]]&lt;br /&gt;
&lt;br /&gt;
[https://www.usenix.org/conference/osdi10/comet-active-distributed-key-value-store The presentation video of Comet]&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18917</id>
		<title>DistOS 2014W Lecture 20</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18917"/>
		<updated>2014-03-25T15:46:01Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Cassandra ==&lt;br /&gt;
&lt;br /&gt;
Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure.  BigTable uses GFS&#039; built-in replication and Chubby for locking.  Cassandra uses gossip algorithms: [http://dl.acm.org/citation.cfm?id=1529983 Scuttlebutt].  Apache Zookeeper is used for distributed configuration.&lt;br /&gt;
&lt;br /&gt;
Cassandra:&lt;br /&gt;
* Lighter weight than the GFS-type cluster that BigTable depends on.&lt;br /&gt;
* Almost all of the readings are part of Apache.&lt;br /&gt;
* Designed more for online updates: interactive, lower-latency use.&lt;br /&gt;
* Once files are written to disk they are only read back, never modified.&lt;br /&gt;
* A scalable multi-master database with no single point of failure.&lt;br /&gt;
* There is a reason for not giving out the complete details of the table schema; it is probably used for more than inbox search. All of a user&#039;s data sits in one row of a table.&lt;br /&gt;
* It is not a plain key-value store: each row holds a big blob of data.&lt;br /&gt;
* Gossip-based protocol: Scuttlebutt.&lt;br /&gt;
* Fixed circular ring for partitioning.&lt;br /&gt;
* The consistency issue is not addressed at all; writes are immutable and never changed afterwards.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
	&lt;br /&gt;
Google developed its technology internally and used for competitive advantage.&lt;br /&gt;
Facebook developed its technology in open source manner.&lt;br /&gt;
Gpl 3 you have to provide code with binary&lt;br /&gt;
In AGPL addtional service also be given with source code.&lt;br /&gt;
	&lt;br /&gt;
BigTable needs GFS, while Cassandra depends only on each server&#039;s local file system; Anil feels a Cassandra cluster is easier to set up.&lt;br /&gt;
	&lt;br /&gt;
BigTable is designed for batch updates; Cassandra is for handling real-time workloads.&lt;br /&gt;
	&lt;br /&gt;
Schema design is explained with the inbox-search example, but it does not make clear how the table will actually look. Anil thinks they store a lot of data with the messages, which makes the table unwieldy.&lt;br /&gt;
	&lt;br /&gt;
Cassandra is designed for high-speed access and online operation.&lt;br /&gt;
	&lt;br /&gt;
ZooKeeper is similar to Chubby.&lt;br /&gt;
	&lt;br /&gt;
ZooKeeper holds node-level information.&lt;br /&gt;
	&lt;br /&gt;
Gossip is more about key partitioning.&lt;br /&gt;
	&lt;br /&gt;
ZooKeeper handles the configuration of new nodes.&lt;br /&gt;
&lt;br /&gt;
Cassandra writes data in an immutable way, like functional programming; functional programming has no assignment.&lt;br /&gt;
&lt;br /&gt;
Discussions:&lt;br /&gt;
&lt;br /&gt;
* Atheros and the GPL.&lt;br /&gt;
* Cassandra vs. BigTable. BigTable is not part of Hadoop; it is the &amp;quot;funny thing on top of GFS&amp;quot;.&lt;br /&gt;
* History of HDFS; HBase.&lt;br /&gt;
* Why are two projects with the same notion supported? Apache as a community. For any tool in CS, particularly software tools, it is actually important to have more than one good implementation; the only time that does not happen is because of market realities.&lt;br /&gt;
&lt;br /&gt;
The fixed ring is reminiscent of an older style of network protocol: token ring.&lt;br /&gt;
What sort of computational systems avoid changing data? Functional programming languages.&lt;br /&gt;
What is different from classic C and C++? Functional languages try to eliminate side effects: there is no assignment, data is just bound (you associate a name with a value), there is garbage collection, and no mutation of data.&lt;br /&gt;
Many systems now talk about implementing functional-like semantics.&lt;br /&gt;
&lt;br /&gt;
== Comet ==&lt;br /&gt;
&lt;br /&gt;
The major idea behind Comet is triggers/callbacks.  There is an extensive literature on extensible operating systems: basically, adding code to the operating system to better suit a particular application.  &amp;quot;Generally, extensible systems suck.&amp;quot; -[[User:Soma]]&lt;br /&gt;
&lt;br /&gt;
[https://www.usenix.org/conference/osdi10/comet-active-distributed-key-value-store The presentation video of Comet]&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18914</id>
		<title>DistOS 2014W Lecture 20</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2014W_Lecture_20&amp;diff=18914"/>
		<updated>2014-03-25T15:24:35Z</updated>

		<summary type="html">&lt;p&gt;Ronak: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Cassandra ==&lt;br /&gt;
&lt;br /&gt;
Cassandra is essentially running a BigTable interface on top of a Dynamo infrastructure.  BigTable uses GFS&#039; built-in replication and Chubby for locking.  Cassandra uses gossip algorithms: [http://dl.acm.org/citation.cfm?id=1529983 Scuttlebutt].  Apache Zookeeper is used for distributed configuration.&lt;br /&gt;
&lt;br /&gt;
== Comet ==&lt;br /&gt;
&lt;br /&gt;
The major idea behind Comet is triggers/callbacks.  There is an extensive literature on extensible operating systems: basically, adding code to the operating system to better suit a particular application.&lt;br /&gt;
&lt;br /&gt;
[https://www.usenix.org/conference/osdi10/comet-active-distributed-key-value-store The presentation video of Comet]&lt;br /&gt;
&lt;br /&gt;
Google developed its technology internally and used it for competitive advantage.&lt;br /&gt;
Facebook developed its technology in an open-source manner.&lt;br /&gt;
Under GPLv3 you have to provide source code with the binary.&lt;br /&gt;
Under the AGPL, source code must also be provided to users of a network service.&lt;br /&gt;
&lt;br /&gt;
BigTable needs GFS, while Cassandra depends only on each server&#039;s local file system; Anil feels a Cassandra cluster is easier to set up.&lt;br /&gt;
&lt;br /&gt;
BigTable is designed for batch updates; Cassandra is for handling real-time workloads.&lt;br /&gt;
&lt;br /&gt;
Schema design is explained with the inbox-search example, but it does not make clear how the table will actually look. Anil thinks they store a lot of data with the messages, which makes the table unwieldy.&lt;br /&gt;
&lt;br /&gt;
Cassandra is designed for high-speed access and online operation.&lt;br /&gt;
&lt;br /&gt;
ZooKeeper is similar to Chubby: it holds node-level information and handles the configuration of new nodes, while gossip is more about key partitioning.&lt;br /&gt;
&lt;br /&gt;
Cassandra writes data in an immutable way, like functional programming; functional programming has no assignment.&lt;/div&gt;</summary>
		<author><name>Ronak</name></author>
	</entry>
</feed>