Class Review, Future Directions

Exam Prep

Topics Covered

- DSM

- Distributed File Systems

  • GFS Major feature? FAULT TOLERANCE!

- RPC

- Process Migration

  • PlanetLab?
    • Central admin control - this is very restrictive model
      • Likely why only researchers are using this kind of system

- Fault tolerance

  • Some dealt with this extensively, others not much at all

- Security

  • Again, some systems treated security as a tenet of design, while others simply said it would be developed in the future and did nothing for it.

Possible Study Questions

1. Example paper, then quiz on the paper?

2. What were the systems that implemented DSM, and what were the problems they faced?

  • Cons
    • Must remember all the facets of all the systems
    • Mostly regurgitation - not really synthesizing new opinion or information

3. What kind of problems could you use DSM to help solve? What environments is DSM most suitable for?

4. Scenario-based question - Suppose that you are a designer for a software project. You are charged with designing a system that implements some kind of DSM solution. Which system would you implement, and what would the possible advantages/disadvantages be?

  • A circumstance where you would need enormous amounts of memory
  • Mostly reading - caching would be the win
  • DNS as a big DSM program - one gigantic table
    • Good idea? Need something like signature based submission to control and secure the system
    • Access control and security would be big problem

5. Scenario based question - Suggest a problem and request various solutions, requesting pros/cons of each - (RPC based, DSM based, etc)

6. Which was a more successful distributed operating system?

  • How do you define successful?
    • Was this solely for research? Or real implementation?
    • In terms of championing ideas? Or deployed implementations?

7. Evaluate past work - 'make you turn your head sideways' - evaluate from different perspective

8. Which system best captured "UNIX" in a distributed operating system?

  • Which best captured the 'flavour' of UNIX, and which one least captured it?
  • Out of the following... which one is most 'UNIX-like' (Plan 9, LOCUS, Mach, etc)... which is least?
  • Out of the following... which distributed file system is most 'UNIX-like' (GFS, LOCUS, etc)...

9. Build something to solve X - or - Build X using Y

10. Opinion question

  • What was your favourite system that we covered?
    • What were the key characteristics of this system that you liked?
    • What criteria are you using to evaluate it?
    • Why is your liking it not justified? I.e., how did this system fail in some way?
    • Talk about at least two other systems that don't meet these same criteria
      • Criteria being the reasons that you prefer THIS operating system
  • What was your least favourite?
  • Take what you have chosen as your favourite, and then explain why it is the worst! (Will not do this, but great for debating)

11. What were the key problems addressed by most of these systems?

  • Which of these problems are most important to solve in today's computing environment?
    • What is today's computing environment? Should we only be optimizing for clusters, given that we are not building for systems that cross administrative boundaries? What technology would make these clusters better?

Answers

What do we know about building these systems? What can we do well?

  • Message passing
  • RPC
  • Local files
  • Distributed files? depending on scenario, depending on what your file IS
    • A normal POSIX file in a distributed environment? No, not really
      • Which semantics do you let slip?
    • Append-only files? Sure
  • Single domain authentication
  • Distributed read-only anything (files, memory)
  • Concurrent writing? No, that's the hard part
    • When you try to update the same piece of data from multiple locations, possibly at the same time
    • We know that the less communication, the better
      • Reduces latency problem and minimizes multiple writes
  • Backwards compatibility
    • Completely duplicating the specification of non-distributed systems is HARD (synchronicity = SLOW)
    • Slip the standards enough, minimize changes required and most problems can be alleviated enough to make the system usable
    • Metadata often a big problem
      • Needs higher visibility
      • Typically has higher contention than other data
    • Some systems have abandoned backwards compatibility altogether
  • General purpose solutions are generally bad, in distinct contrast to the local case
    • Specific solutions to solve specific problems
  • Security - not easily implemented in distributed OSes
    • Crypto "ain't enough"
    • Typically added on after the fact
      • Without security as a design tenet, design choices are often made along the way that make securing the system very difficult
    • Very hard to test accurately: how do you plan to secure "any hole"?
    • Often developed before security was a real concern
    • Often adding security makes the system very slow
    • We have yet to develop a good model for multiple administrative domains of control
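The contrast drawn above between append-only files ("Sure") and concurrent writes to the same piece of data ("the hard part") can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the `(unique_id, payload)` update format, and the sort-by-id merge rule are hypothetical and are not taken from GFS or any other system covered in class.

```python
# Toy sketch (hypothetical names): two replicas receive the same two
# updates in different orders, as can happen over a network.

def merge_overwrites(updates):
    """In-place update of one cell: whichever update arrives last wins."""
    value = None
    for _unique_id, payload in updates:
        value = payload
    return value

def merge_appends(updates):
    """Append-only log: every record is kept, ordered by its unique id."""
    return [payload for _unique_id, payload in sorted(updates)]

# Each update is (unique_id, payload); unique_id is (site, sequence number).
a = (("site-a", 1), "x=1")
b = (("site-b", 1), "x=2")

order_seen_by_r1 = [a, b]
order_seen_by_r2 = [b, a]

# Overwrites: the replicas disagree unless they coordinate on an order.
overwrite_r1 = merge_overwrites(order_seen_by_r1)
overwrite_r2 = merge_overwrites(order_seen_by_r2)

# Appends: both replicas end up with the same log without coordinating.
append_r1 = merge_appends(order_seen_by_r1)
append_r2 = merge_appends(order_seen_by_r2)

print(overwrite_r1, overwrite_r2)  # differ
print(append_r1 == append_r2)      # same log on both replicas
```

The overwrite case needs extra communication (locking or an agreed ordering) to converge, which is where the latency cost comes from; the append case gives up the POSIX idea of a single mutable byte range in exchange for needing much less coordination.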

Study tips

- Go through each paper and ask "Why do I care?"


Day Two

More questions

- Why have distOSs failed?

- What should a distOS do?

  • Networking (TCP/IP)
  • Administrative domains
    • Who administers the whole package? (compared to the internet where there IS no single administration system)
  • Share resources
    • If you participate in the internet, you are sharing resources (or at least, USING resources)
    • When you run javascript on a page, you are using your machine to run someone else's code
    • This philosophy is flawed!
      • Should someone not want to play the game nicely, they can consume more than their share
    • All these locks, permissions, etc are there to help make sure users only consume their share

- What is the state of the internet today?

  • Anarchy
    • There IS NO HIGHER STRUCTURE (no global coercion, no 'police force' or 'rule of law')
  • Are distOSs trying to implement 'communism' on the internet?
  • A rule of law is different than having good laws
    • A good set of rules implies that users live by a certain model
      • A good society is a society that isn't based on following the letter of the law, but of helping others
  • Traditional OSs try to implement a set of rules that DO NOT ALLOW others to do 'bad things'
  • We all have the capacity to cause harm to others, but our culture helps create a safe environment, that results in social enforcement. More often than not it isn't the law that discourages bad behaviour, but the judgments of others.
  • Related to DistOSs?
    • Each machine has the power to cause problems
    • Supposition: It is impossible to make fixed rules that limit a single machine's power without fundamentally ruining the internet.
  • Who defines appropriate behaviour?
    • The computers themselves, in some kind of distributed framework.
    • Perhaps a series of connected frameworks
    • Though this solution would need to be adaptive, and something that allows the computers to decide what is appropriate
  • How to enforce that behaviour?
    • Attribution - make cause/effect connections - who did what behaviour
      • This is not trivial, and in general not totally possible
    • Punishment
      • Some kind of prison, some kind of process that results in privilege reduction, or resource reduction
      • How would that work? You need a LOT of evidence before punishment can be enforced
        • Therefore there would need to be a lot of damage before punishment can be enforced
        • Shame, ostracism
  • A gossip mechanism
    • Spread the knowledge - spread 'interesting information' to other computers
    • Gossip is always suspect, but usually contains at least mildly correct information
    • Low level mechanism - what to do with the gossip? How to evaluate it?
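As a rough illustration of the "spread the knowledge" step of a gossip mechanism, here is a minimal push-gossip simulation. It is a sketch under stated assumptions, not a protocol from any of the papers: the function name `gossip_rounds`, the `fanout` parameter, and the choice of uniformly random peers are all hypothetical, and the question of how to evaluate what is received is deliberately left out.

```python
import random

def gossip_rounds(n_nodes, fanout, seed=0, max_rounds=50):
    """Simulate push gossip: each informed node tells `fanout` random peers.

    Returns (rounds_used, number_of_informed_nodes).
    Assumes fanout <= n_nodes. Node 0 starts with the information.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes and rounds < max_rounds:
        newly_informed = set()
        for _node in sorted(informed):
            # Peers are picked blindly; some will already know the gossip.
            newly_informed.update(rng.sample(range(n_nodes), fanout))
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)

rounds, count = gossip_rounds(n_nodes=32, fanout=2)
print(f"{count} of 32 nodes informed after {rounds} rounds")
```

No node needs a global view or a central administrator for the information to spread, which is what makes gossip attractive in a setting with no higher structure. The cost is redundant messages and no guarantee about exactly when every node has heard.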

If we don't find some way to 'teach the mob some manners', there will be a HUGE problem: the massive number of resources outside of the core infrastructure would easily be able to dominate these core systems.

Perhaps a model similar to an oligarchy

  • Machines locked down
  • Specific rules that require certain behaviour
  • A set of small organizations that decides these rules, and in theory punishes deviation from that behaviour
  • Would this work?
  • This is largely what we have now, and are moving more towards

Rather than get humans to behave properly, get the computers to behave properly.

Though in systems that are managed top-down there tend to be very many deficiencies.

Computers could develop 'opinions' of other computers, so that computers in similar environments might share their opinions, or tend to align themselves with certain other groups. When things happen, different behaviours can result across that whole group, which can then influence another group, and so on.

  • Though outside of your community your status might not be well defined, you might not get the same level of privileges.
  • Even to the point of stereotyping -> If I don't know you: Where are you coming from? What OS are you running? Use my default level of trust for BLA
  • The idea of rehabilitation? Computers don't have moral behaviour, or psychoses, given that a computer could always be re-imaged, or used by a different user... so it would need to work differently.
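The idea of computers holding 'opinions' of each other, with a default level of trust for strangers, can be sketched as follows. This is only one possible reading of the notes: the `Opinions` class, the category names, the numeric trust values, and the weighted-average update rule are all assumptions made for illustration.

```python
class Opinions:
    """One machine's opinions of its peers, as trust scores in [0, 1]."""

    def __init__(self, default_by_category, fallback=0.1):
        self.default_by_category = default_by_category  # 'stereotype' defaults
        self.fallback = fallback  # trust for peers we know nothing about
        self.category = {}        # peer -> self-reported category
        self.score = {}           # peer -> learned trust

    def meet(self, peer, category):
        """Record what a peer says about itself (e.g. where it comes from)."""
        self.category[peer] = category

    def trust(self, peer):
        """Learned trust if we have any, else the default for its category."""
        if peer in self.score:
            return self.score[peer]
        return self.default_by_category.get(self.category.get(peer), self.fallback)

    def hear(self, reporter, subject, claimed):
        """Gossip is suspect: weight a claim by how much we trust its source."""
        weight = self.trust(reporter)
        self.score[subject] = (1 - weight) * self.trust(subject) + weight * claimed

    def forget(self, peer):
        """A re-imaged machine falls back to its category default."""
        self.score.pop(peer, None)

ops = Opinions({"lab-cluster": 0.8, "unknown": 0.2})
ops.meet("alice", "lab-cluster")
ops.meet("bob", "lab-cluster")
ops.meet("mallory", "unknown")

before = ops.trust("bob")
ops.hear("mallory", "bob", claimed=0.0)   # low-trust source: small effect
after_mallory = ops.trust("bob")
ops.hear("alice", "bob", claimed=0.0)     # high-trust source: large effect
after_alice = ops.trust("bob")
print(before, after_mallory, after_alice)
```

A lowered score could then drive the privilege or resource reduction discussed earlier, and `forget` is one way to model the point that a machine, unlike a person, can be wiped and start again.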