WebOS, PlanetLab, Starfish

Readings

Amin Vahdat et al., "WebOS: Operating System Services for Wide Area Applications" (1998)

Adnan Agbaria and Roy Friedman, "Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations" (2003)

Larry Peterson et al., "Experiences Building PlanetLab" (2006)

Thomas Anderson and Timothy Roscoe, "Learning from PlanetLab" (2006)




WebOS

Key features

  • High Availability
  • Lower Latency
  • Fault Tolerance

Consensus: WebOS isn't really a distributed OS

Main components

  • Smart Client
  • WebFS
  • Global naming scheme based on URLs
  • Process control system
  • CRISIS authentication/authorization system (Certificates with ACLs)

Key ideas from WebOS that were/were not adopted

Adopted:

  • General idea of wide area dynamic distribution -> Akamai (but primarily for static content)
  • Global naming using URLs

Not Adopted:

  • CRISIS
  • WebFS (Although WebDAV could be said to be related)
  • Smart client (for web sites)

What are the pros and cons of using smart clients to do load balancing?

Pro:

  • Distributes computation
  • More flexible

Con:

  • Vulnerable to denial-of-service and other attacks
  • Extra network overhead to locate a service
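
The trade-off above can be made concrete with a minimal sketch of client-side load balancing with failover. This is not the WebOS implementation: the `SmartClient` class, the `send` callback, and the replica names are all hypothetical, and random ordering is just one simple balancing policy chosen for illustration.

```python
import random


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


class SmartClient:
    """Hypothetical client that picks a server replica itself (no central balancer)."""

    def __init__(self, replicas, send, rng=None):
        self.replicas = list(replicas)      # assumed: replica list is already known
        self.send = send                    # assumed: send(replica, payload) -> response
        self.rng = rng or random.Random()

    def request(self, payload):
        # Try replicas in random order so load spreads across servers.
        order = self.replicas[:]
        self.rng.shuffle(order)
        for replica in order:
            try:
                return self.send(replica, payload)
            except ConnectionError:
                continue                    # failover: move on to the next replica
        raise AllReplicasFailed("no replica answered")
```

The flexibility comes from the policy living in the client, where it can be changed per service. The costs are also visible: the client has to learn the replica list somehow (extra traffic to locate a service), and nothing stops a misbehaving client from ignoring the policy and hammering a single server.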


Starfish

Key features

MPI

  • Fast message passing
  • Gives programmers more control

OCaml

Pros:

  • Recursive algorithms
  • Uses bytecode - portable in theory
  • Same code on heterogeneous hardware
  • Well-known language

Cons:

  • Slow performance?
  • Not entirely portable; maybe Just-In-Time compilation?

Checkpointing

  • Save / resume state
  • Provides process migration
  • Maybe better suited to be implemented in the application, not OS?
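
To illustrate the application-level option raised in the last bullet, here is a minimal sketch of save/resume checkpointing. It is not how Starfish does it: the function names, the pickle file format, and the `stop_after` parameter (which simulates a crash) are assumptions made for this example.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic rename over the previous checkpoint


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run_sum(path, n, stop_after=None):
    """Sum 0..n-1, checkpointing after every step; stop_after simulates a failure."""
    state = load_checkpoint(path, {"i": 0, "total": 0})
    steps = 0
    while state["i"] < n:
        if stop_after is not None and steps == stop_after:
            return None  # simulated crash: the last checkpoint stays on disk
        state["total"] += state["i"]
        state["i"] += 1
        save_checkpoint(path, state)
        steps += 1
    return state["total"]
```

Calling `run_sum` again with the same path resumes from the saved state instead of starting over. Copying the checkpoint file to another machine and resuming there is the simplest form of the process migration mentioned above.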

Management

  • Had a Java interface that allowed any node to log in and see the full system status
  • Used the distributed nature of daemons to communicate this information


Experiences Building PlanetLab

Key ideas

  • Global platform to distribute/test network services
  • Scale
  • Should be able to monitor and stop disruptive traffic

Challenges

  • Resource allocation is a significant challenge, especially in a distributed system such as PlanetLab
  • Providing a global platform for long-term services
  • Implementing a trust relationship between node owners and service developers (users)

PLC (PlanetLab Central)

  • Fulfills the trust mechanism required for the network
  • Acts as a middle man / mediator between node owners and users
  • However, if someone breaks into PLC, the entire system is compromised
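
The mediator role and its single point of failure can be shown with a toy sketch. This is not PlanetLab's actual protocol: the `PLC` and `Node` classes, the ticket format, and the use of a shared HMAC secret are all assumptions chosen to keep the example short.

```python
import hashlib
import hmac
import json


class PLC:
    """Toy trusted intermediary: vouches for registered users to every node."""

    def __init__(self, secret):
        self._secret = secret          # assumed: one secret shared with all nodes
        self.users = set()

    def register_user(self, user):
        self.users.add(user)

    def issue_ticket(self, user, slice_name):
        if user not in self.users:
            raise PermissionError(f"unknown user: {user}")
        msg = json.dumps([user, slice_name]).encode()
        tag = hmac.new(self._secret, msg, hashlib.sha256).hexdigest()
        return (user, slice_name, tag)


class Node:
    """Toy node: never meets users, only checks that the authority signed the ticket."""

    def __init__(self, plc_secret):
        self._secret = plc_secret

    def admit(self, ticket):
        user, slice_name, tag = ticket
        msg = json.dumps([user, slice_name]).encode()
        expected = hmac.new(self._secret, msg, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, tag)
```

Node owners and users never need to trust each other directly, only the central authority. The flip side is the last bullet: anyone who obtains the secret can mint a ticket that every node will accept.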

PlanetLab Learning
