Difference between revisions of "Class Review, Future Directions"




Revision as of 15:54, 31 March 2008

Exam Prep

Topics Covered

- DSM

- Distributed File Systems

  • GFS Major feature? FAULT TOLERANCE!

- RPC

- Process Migration

  • PlanetLab?
    • Central admin control - this is a very restrictive model
      • Likely why only researchers are using this kind of system

- Fault tolerance

  • Some dealt with this extensively, others not much at all

- Security

  • Again, some systems used security as a tenet of design, while others simply said it was to be developed in the future and did nothing for it.

Possible Study Questions

1. Example paper, then quiz on the paper?

2. What were the systems that implemented DSM, and what were the problems they faced?

  • Cons
    • Must remember all the facets of all the systems
    • Mostly regurgitation - not really synthesizing new opinion or information

3. What kind of problems could you use DSM to help solve? What environments is DSM most suitable for?

4. Scenario based question - Suppose that you are a designer for a software project. You are charged with designing a system that implements some kind of DSM solution. Which system would you implement and what would the possible advantages/disadvantages be?

  • A circumstance where you would need enormous amounts of memory
  • Mostly reading - caching would be the win
  • DNS as a big DSM program - one gigantic table
    • Good idea? Need something like signature-based submission to control and secure the system
    • Access control and security would be a big problem
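
The "mostly reading - caching would be the win" point above can be illustrated with a minimal sketch. It assumes a single authoritative home node and a write-invalidate policy. The names `HomeNode` and `CachingClient` are hypothetical and do not come from any system covered in the course.

```python
class HomeNode:
    """Holds the authoritative copy of every entry (hypothetical design)."""

    def __init__(self):
        self.pages = {}
        self.remote_reads = 0   # counts reads that had to cross the network
        self.caches = []        # clients to notify on writes

    def read(self, key):
        self.remote_reads += 1
        return self.pages.get(key)

    def write(self, key, value):
        self.pages[key] = value
        # Write-invalidate: every cached copy is dropped on update.
        for cache in self.caches:
            cache.invalidate(key)


class CachingClient:
    """Serves repeated reads locally; only misses go to the home node."""

    def __init__(self, home):
        self.home = home
        self.local = {}
        home.caches.append(self)

    def read(self, key):
        if key not in self.local:
            self.local[key] = self.home.read(key)
        return self.local[key]

    def invalidate(self, key):
        self.local.pop(key, None)


home = HomeNode()
home.write("example.org", "192.0.2.1")
client = CachingClient(home)

# A read-heavy workload: 1000 lookups cost a single remote read.
results = [client.read("example.org") for _ in range(1000)]
reads_before_update = home.remote_reads

# One write invalidates the cached copy, so the next read goes remote again.
home.write("example.org", "192.0.2.2")
fresh = client.read("example.org")
```

With many writers the invalidation traffic would grow quickly, which is why the win is tied to read-mostly workloads such as the DNS-style table suggested above.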

5. Scenario based question - Suggest a problem and request various solutions, requesting pros/cons of each - (RPC based, DSM based, etc)

6. Which was a more successful distributed operating system?

  • How do you define successful?
    • Was this solely for research? Or real implementation?
    • In terms of championing ideas? Or deployed implementations?

7. Evaluate past work - 'make you turn your head sideways' - evaluate from different perspective

8. Which system best captured "UNIX" in a distributed operating system?

  • Which best captured the 'flavour' of UNIX, and which one least captured it?
  • Out of the following... which one is most 'UNIX-like' (Plan 9, LOCUS, Mach, etc)... which is least?
  • Out of the following... which distributed file system is most 'UNIX-like' (GFS, LOCUS, etc)...

9. Build something to solve X - or - Build X using Y

10. Opinion question

  • What was your favourite system that we covered?
    • What were the key characteristics of this system that you liked?
    • What criteria are you using to evaluate it?
    • Why is your liking it not justified? I.e., how did this system fail in some way?
    • Talk about at least two other systems that don't meet these same criteria
      • Criteria being the reasons that you prefer THIS operating system
  • What was your least favourite?
  • Take what you have chosen as your favourite, and then explain why it is the worst! (Will not do this, but great for debating)

11. What were the key problems addressed by most of these systems?

  • Which of these problems are most important to solve in today's computing environment?
    • What is today's computing environment? Should we only be optimizing for clusters given that we are not building for systems that cross administrative boundaries? What technology would make these clusters better?

Answers

What do we know about building these systems? What can we do well?

  • Message passing
  • RPC
  • Local files
  • Distributed files? depending on scenario, depending on what your file IS
    • A normal POSIX file in a distributed environment? No, not really
      • Which semantics do you let slip?
    • Append-only files? Sure
  • Single domain authentication
  • Distributed read-only anything (files, memory)
  • Concurrent writing? No, that's the hard part
    • When you try to update the same piece of data from multiple locations, possibly at the same time
    • We know that the less communication, the better
      • Reduces the latency problem and minimizes multiple writes
  • Backwards compatibility
    • Completely duplicating the specification of non-distributed systems is HARD (synchronicity = SLOW)
    • Slip the standards enough and minimize the changes required, and most problems can be alleviated enough to make the system usable
    • Metadata often a big problem
      • Needs higher visibility
      • Typically has higher contention than other data
    • Some systems have abandoned backwards compatibility altogether
  • General purpose solutions are generally bad, in distinct contrast to the local case
    • Specific solutions to solve specific problems
  • Security - not easily implemented in distributed OSes
    • Crypto "ain't enough"
    • Typically added on after the fact
      • Without security as a design tenet, design choices are often made along the way that make securing the system very difficult
    • Very hard to test accurately: how do you plan to secure "any hole"?
    • Often developed before security was a real concern
    • Often adding security makes the system very slow
    • We have yet to develop a good model for multiple administrative domains of control
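
The contrast drawn in the list above between "Append-only files? Sure" and "Concurrent writing? No, that's the hard part" can be shown with a small deterministic sketch. It simulates two clients by hand rather than with real threads or a network, and the names `store` and `log` are illustrative only.

```python
# Case 1: a shared mutable value updated by read-modify-write.
store = {"counter": 0}

a_seen = store["counter"]        # client A reads 0
b_seen = store["counter"]        # client B also reads 0, before A writes back
store["counter"] = a_seen + 1    # A writes 1
store["counter"] = b_seen + 1    # B writes 1 as well, overwriting A's update
lost_update_result = store["counter"]

# Case 2: an append-only log. Clients never overwrite each other;
# each one only adds a record, and readers fold the records together.
log = []
log.append(("A", 1))             # client A appends its increment
log.append(("B", 1))             # client B appends its increment
append_only_result = sum(delta for _, delta in log)
```

In the first case one increment is silently lost unless the clients coordinate, and that coordination is exactly the communication the notes say to minimize. In the second case both updates survive regardless of arrival order, although whoever owns the log still has to apply each individual append atomically.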

Study tips

- Go through each paper and ask "Why do I care?"