DistOS 2021F 2021-09-16

From Soma-notes

Discussion Questions

Discussion questions for LOCUS:

  • What type of workloads was it designed for? How would the system appear to regular users?
  • To what extent did it scale?
  • What was their distribution strategy?
  • Did any of their design choices clearly limit their scalability?
  • How did their design choices affect semantics versus a single system (e.g., standard UNIX semantics)?
  • Could files be bigger than one computer could handle?
  • Could processes be bigger than one computer could handle?


Lecture 3

Comments on responses
 - Some of you summarized the readings.  Don't do that; you'll be heavily penalized.  We know what the readings say.  What did you get out of them?  What questions did you have?
 - Also, please write in complete sentences.  Bullet points aren't so good for expressing complete thoughts.

 - Note that quizzes are the lower-effort but higher-risk path.  You may not get a perfect score on a quiz, but you should get most of the questions right if you are diligent about doing the readings.
   - we may give full marks for less than 100%, but we'll see
   - I'll decide that later

 - the quizzes are a great start for discussing the readings

What questions do you have about LOCUS?
 - an entire file has to exist on a machine
   - no provision for files spanning machines
 - so a file can't be bigger than a single storage server
   - and probably has to be a good bit smaller
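
The whole-file constraint above can be illustrated with a toy placement function.  This is only a sketch under stated assumptions, not LOCUS code: the site names, the capacities, the `place_file` helper, and the "most free space" policy are all hypothetical.

```python
def place_file(size, free_space):
    """Return the name of a site that can hold the whole file, or None.

    free_space maps a (hypothetical) site name to its free space.
    A file must fit entirely on one site; it is never split across sites.
    """
    candidates = [site for site, free in free_space.items() if free >= size]
    if not candidates:
        return None
    # Picking the site with the most free space is an assumed policy.
    return max(candidates, key=lambda site: free_space[site])


free = {"siteA": 40, "siteB": 60, "siteC": 50}
fits = place_file(55, free)       # only siteB has room for the whole file
too_big = place_file(100, free)   # no single site has 100 units free
print(fits, too_big, sum(free.values()))
```

The second call fails even though the sites have 150 units free in total, which is the sense in which a file can't be bigger than a single storage server.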

Why does adding a network between computers change the semantics of UNIX?
 - slow
 - unreliable
 - no shared state (shared RAM)
   - yes we can simulate, but at a cost of latency and reliability

Creating shared state manually, using messages sent over a network, is
inherently problematic
 - this is the heart of the tension in this course
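
A toy model may make the problem concrete.  The sketch below is an illustration, not anything from the LOCUS paper: the `Replica` and `Network` classes are hypothetical, and the "network" is just an in-memory queue whose messages can be delayed or dropped.

```python
from collections import deque


class Replica:
    """One machine's private copy of a 'shared' value."""

    def __init__(self):
        self.value = 0


class Network:
    """Toy network: messages wait in a queue and may be dropped."""

    def __init__(self):
        self.in_flight = deque()

    def send(self, dest, new_value):
        self.in_flight.append((dest, new_value))

    def deliver_one(self, drop=False):
        dest, new_value = self.in_flight.popleft()
        if not drop:
            dest.value = new_value


a, b = Replica(), Replica()
net = Network()

a.value = 1                  # local write on a
net.send(b, 1)               # tell b about it
stale = b.value              # b has not received the message yet
net.deliver_one()
fresh = b.value              # now b has caught up

a.value = 2
net.send(b, 2)
net.deliver_one(drop=True)   # the update is lost in transit
diverged = (a.value, b.value)
print(stale, fresh, diverged)
```

With real shared RAM there is one copy of the value, so neither the stale read nor the divergence can happen; with messages, both have to be handled explicitly.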

A distributed system is much more complex than a single system

When you access a file, now performance and even semantics depend
on where the file is relative to the process
 - and you're now open to entire classes of error states that weren't an issue before
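
As a rough sketch of that point, the toy function below makes both the cost and the possible outcomes of a read depend on where the file lives.  Everything here is assumed for illustration: the cost numbers are arbitrary, and `read_cost`, `SiteUnreachable`, and the site names are hypothetical rather than LOCUS interfaces.

```python
class SiteUnreachable(Exception):
    """An error that cannot occur when the file is on the local disk."""


LOCAL_COST = 1       # arbitrary units (assumption)
NETWORK_COST = 100   # arbitrary units (assumption)


def read_cost(process_site, file_site, working_links):
    """Return the cost of a read, or raise if the file's site is cut off.

    working_links is a set of frozenset({a, b}) pairs that can talk now.
    """
    if process_site == file_site:
        return LOCAL_COST
    if frozenset({process_site, file_site}) not in working_links:
        raise SiteUnreachable(f"{file_site} unreachable from {process_site}")
    return LOCAL_COST + NETWORK_COST


links = {frozenset({"A", "B"})}
local = read_cost("A", "A", links)     # same site: cheap, cannot fail this way
remote = read_cost("A", "B", links)    # other site: pays for the network
try:
    read_cost("A", "C", links)         # no working link between A and C
    failed = False
except SiteUnreachable:
    failed = True
print(local, remote, failed)
```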

LOCUS isn't so interesting for its solutions.  It is interesting because its designers identified so many of the key problems we face and came up with solutions that were reasonable for the time but mostly don't make sense today (at the scales we work at).

A partition is when the system gets split
 - some of the computers can no longer talk to one another
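
One way to picture a partition is as the connected groups left over once some links fail.  The sketch below is a minimal illustration under that assumption; the `partitions` helper and the four-site topology are hypothetical, and links are treated as bidirectional.

```python
def partitions(sites, links):
    """Group sites into sets whose members can still reach each other.

    links is an iterable of (a, b) pairs that are currently working.
    """
    neighbours = {s: set() for s in sites}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for start in sites:
        if start in seen:
            continue
        # Depth-first walk over working links from this site.
        group = set()
        stack = [start]
        while stack:
            s = stack.pop()
            if s in group:
                continue
            group.add(s)
            stack.extend(neighbours[s] - group)
        seen |= group
        groups.append(group)
    return groups


sites = ["A", "B", "C", "D"]
healthy = partitions(sites, [("A", "B"), ("B", "C"), ("C", "D")])
split = partitions(sites, [("A", "B"), ("C", "D")])   # the B-C link has failed
print(healthy)
print(split)
```

When the B-C link fails, A and B can still talk to each other, and so can C and D, but neither pair can reach the other.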