Class Review, Future Directions

Exam Prep

Topics Covered

- DSM

- Distributed File Systems

GFS Major feature? FAULT TOLERANCE!

- RPC

- Process Migration

PlanetLab?
- Central admin control - this is a very restrictive model
  - Likely why only researchers are using this kind of system

- Fault tolerance

Some dealt with this extensively, others not much at all

- Security

Again, some systems used security as a tenet of design, while others simply said it was to be developed in the future and did nothing about it.

Possible Study Questions

1. Example paper, then quiz on the paper?

2. What were the systems that implemented DSM, and what were the problems they faced?

Cons
- Must remember all the facets of all the systems
- Mostly regurgitation - not really synthesizing new opinion or information

3. What kind of problems could you use DSM to help solve? What environments is DSM most suitable for?

4. Scenario based question - Suppose that you are a designer for a software project. You are charged with designing a system that implements some kind of DSM solution. Which system would you implement and what would the possible advantages/disadvantages be?

A circumstance where you would need enormous amounts of memory
Mostly reading - caching would be the win
DNS as a big DSM program - one gigantic table
- Good idea? Need something like signature-based submission to control and secure the system
- Access control and security would be a big problem
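
To make the "mostly reading - caching would be the win" point concrete, here is a minimal sketch of a read-mostly cache in front of one big lookup table. It is only an illustration under stated assumptions: the class name ReadMostlyCache, the fetch callback and the TTL-based expiry are all hypothetical, and nothing here is taken from any of the systems covered in class.

```python
import time


class ReadMostlyCache:
    """Hypothetical TTL cache in front of a slow, authoritative table."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # assumed callback: key -> value (the "remote" read)
        self._ttl = ttl_seconds    # how long a cached answer may be reused
        self._clock = clock        # injectable so the sketch can be tested
        self._entries = {}         # key -> (value, expiry time)
        self.remote_reads = 0      # counts how often we had to go remote

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]        # fresh local copy: no communication needed
        value = self._fetch(key)   # miss or expired: one remote read
        self.remote_reads += 1
        self._entries[key] = (value, now + self._ttl)
        return value
```

Repeated reads of the same key cost one remote read per TTL window, which is why a read-heavy workload benefits. The price is that a reader may see a stale value until the entry expires, and the sketch does nothing about the access control problem noted above.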

5. Scenario based question - Suggest a problem and request various solutions, requesting pros/cons of each - (RPC based, DSM based, etc)

6. Which was a more successful distributed operating system?

How do you define successful?
- Was this solely for research? Or real implementation?
- In terms of championing ideas? Or deployed implementations?

7. Evaluate past work - 'make you turn your head sideways' - evaluate from different perspective

8. Which system best captured "UNIX" in a distributed operating system?

Best captured the 'flavour' of unix, and which one least captured it?
Out of the following... which one is most 'unix-like' (Plan 9, Locus, Mach, etc)... which is least?
Out of the following... which distributed file system is most 'unix-like' (GFS, Locus, etc)...

9. Build something to solve X - or - Build X using Y

10. Opinion question

What was your favourite system that we covered?
- What were the key characteristics of this system that you liked?
- What criteria are you using to evaluate it?
- Why is your liking it not justified? I.e., how did this system fail in some way?
- Talk about at least two other systems that don't meet these same criteria
  - Criteria being the reasons that you prefer THIS operating system
What was your least favourite?
Take what you have chosen as your favourite, and then explain why it is the worst! (Will not do this, but great for debating)

11. What were the key problems addressed by most of these systems?

Which of these problems are most important to solve in today's computing environment?
- What is today's computing environment? Should we only be optimizing for clusters given that we are not building for systems that cross administrative boundaries? What technology would make these clusters better?

Answers

What do we know about building these systems? What can we do well?

Message passing
RPC
Local files
Distributed files? Depends on the scenario, and on what your file IS
- A normal POSIX file in a distributed environment? No, not really
  - Which semantics do you let slip?
- Append-only files? Sure
Single domain authentication
Distributed read-only anything (files, memory)
Concurrent writing? No, that's the hard part
- When you try to update the same piece of data from multiple locations, possibly at the same time
- We know that the less communication, the better
  - Reduces latency problem and minimizes multiple writes
Backwards compatibility
- Completely duplicating the specification of non-distributed systems is HARD (synchronicity = SLOW)
- Slip the standards enough, minimize changes required and most problems can be alleviated enough to make the system usable
- Metadata often a big problem
  - Needs higher visibility
  - Typically has higher contention than other data
- Some systems have abandoned backwards compatibility altogether
General purpose solutions are generally bad, in distinct contrast to the local case
- Specific solutions to solve specific problems
Security - not easily implemented in distributed OSes
- Crypto "ain't enough"
- Typically added on after the fact
  - Without security as a design tenet, design choices are often made along the way that make securing the system very difficult
- Very hard to test accurately: how do you plan to secure "any hole"?
- Often developed before security was a real concern
- Often adding security makes the system very slow
- We have yet to develop a good model for multiple administrative domains of control
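
The contrast above between "Append-only files? Sure" and "Concurrent writing? No, that's the hard part" can be sketched in a few lines. This is a single-process simulation under loud assumptions: the function names are hypothetical, there is no real network or concurrency, and the "writers" are just entries in a list. It only shows why overwriting shared data in place loses updates while appending does not.

```python
def overwrite_updates(initial, deltas):
    """Every writer reads the same starting value, then writes back a whole new value.

    Simulates writers at different locations updating the same datum at
    (roughly) the same time: the last write wins and earlier ones are lost.
    """
    snapshots = [initial for _ in deltas]  # all writers read before anyone writes
    value = initial
    for snapshot, delta in zip(snapshots, deltas):
        value = snapshot + delta           # overwrite in place, based on a stale read
    return value


def append_only_updates(initial, deltas):
    """Every writer appends its change to a log; readers fold the log."""
    log = []
    for delta in deltas:
        log.append(delta)                  # appends never clobber each other
    return initial + sum(log)
```

With a starting value of 100 and two writers adding 10 and 20, the overwrite version ends at 120 (the first update is lost) while the append-only version ends at 130. A real system would still have to agree on how appends are ordered and delivered, which this sketch deliberately ignores.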

Study tips

- Go through each paper and ask "Why do I care?"