DistOS 2021F 2021-11-02

Notes

Lecture 13: Cassandra & Dynamo
------------------------------
 - haven't graded midterm, will do so by next week
    - I had to submit a grant app on Monday

 - will have grades on midterm before proposal is due
    - you have until Nov 9th

Algorithm for making a proposal (for a lit review)
 - find a paper you like/find interesting that is related to the course
   - make sure it is a good one, published somewhere reputable
     (ACM/IEEE isn't sufficient)
   - preferably with a reasonable number of citations

From that paper, find a related set of papers
 - related to *one aspect* of the paper
 - see citations (who the paper cites, who cites the paper)
 - follow graph and search keywords to expand out

I prefer using a CS-standard citation format (like the papers we read use)

I prefer individual projects.  If you want to do pairs it is possible -
  but there has to be a clear division of responsibilities
  (so assume no unless it makes sense for the topic and you ask me)

(Partners make more sense if you're building something.)

related to this class: should relate to both "distributed" and "operating system"
 - can't be just distributed or just OS related

Try not to start with papers we've covered in class
 - branch out, search!
 - use ones related to your other interests

You should look for patterns amongst the papers you identify as being related.  Your paper's argument should show that the pattern exists and how it connects the papers you found.
 - don't just list summaries of papers, that's not a lit review


On to the papers

relational databases were inspired by the needs of airline reservation systems

dynamo was inspired by the needs of e-commerce shopping carts
 - so writes (users selecting something to buy) always get saved

other systems will refuse writes when they'll make the system inconsistent
 - so client has to retry

dynamo always accepts writes, even when they'll make things inconsistent
 - so how do they deal with inconsistency?
   - when data is read
   - use application specific semantics to reconcile conflicts
     in the client
     (like with source code version control - the programmer
      has to manually figure out how to reconcile conflicts)
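
A minimal sketch of what client-side reconciliation on read might look like for a shopping cart. The function name `reconcile_carts` and the merge rule (union of items, keeping the largest quantity) are assumptions for illustration, not Dynamo's actual API; the point is only that the application, not the data store, decides how divergent versions are merged.

```python
def reconcile_carts(versions):
    """Merge divergent cart versions returned by a read.

    Hypothetical merge rule: take the union of all items and, for each
    item, keep the largest quantity seen in any version.
    """
    merged = {}
    for cart in versions:
        for item, qty in cart.items():
            merged[item] = max(merged.get(item, 0), qty)
    return merged


# Two replicas each accepted a write while they could not coordinate,
# so a later read returns both versions to the client.
replica_a = {"book": 1, "pen": 2}
replica_b = {"book": 1, "lamp": 1}

cart = reconcile_carts([replica_a, replica_b])
print(cart)  # {'book': 1, 'pen': 2, 'lamp': 1}
```

A union-style merge never loses an added item, which matches the "writes always get saved" goal, but it has a cost: an item removed on one replica can reappear after the merge.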

A hard problem in any distributed system is determining a canonical order of events
 - in principle this cannot exist, but we may need an ordering anyway
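
One way to see why no canonical order exists is with vector clocks, which the Dynamo paper uses to track versions. The sketch below is an assumption-laden illustration (clocks as plain dicts from node name to counter, a hypothetical `compare` function), not code from either paper. Some pairs of events come out as "concurrent": neither happened before the other, so any total order imposed on them is a choice, not a fact.

```python
def compare(a, b):
    """Compare two vector clocks (dicts of node name -> counter).

    Returns "equal", "before", "after", or "concurrent".
    Missing entries are treated as 0.
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v1 = {"A": 1}           # first write, handled by node A
v2 = {"A": 2}           # second write, also via node A
v3 = {"A": 1, "B": 1}   # write via node B that only saw v1

print(compare(v1, v2))  # before
print(compare(v2, v3))  # concurrent
```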

A ring is the simplest topology that remains connected when you lose a node
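
The ring claim can be checked with a small sketch. The helper names (`ring`, `line`, `connected_without`, `survives_any_single_failure`) are made up for this example, and the graphs are plain adjacency dicts rather than anything from Dynamo or Cassandra. A line of n nodes uses n-1 links and splits when a middle node dies; a ring adds just one more link and survives the loss of any single node.

```python
from collections import deque


def ring(n):
    """Each node i is linked to its two neighbours, wrapping around."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def line(n):
    """Like a ring, but without the link between the first and last node."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def connected_without(graph, dead):
    """True if the surviving nodes can still all reach each other."""
    alive = [v for v in graph if v != dead]
    if not alive:
        return True
    seen = {alive[0]}
    queue = deque([alive[0]])
    while queue:  # breadth-first search that skips the dead node
        v = queue.popleft()
        for w in graph[v]:
            if w != dead and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(alive)


def survives_any_single_failure(graph):
    return all(connected_without(graph, v) for v in graph)


print(survives_any_single_failure(ring(5)))  # True
print(survives_any_single_failure(line(5)))  # False
```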