WebOS, PlanetLab, Starfish

From Soma-notes
Revisions by Rhooper and Jmahonin
 
Latest revision as of 19:55, 19 March 2008

Readings

Amin Vahdat et al., "WebOS: Operating System Services for Wide Area Applications" (1998)

Adnan Agbaria and Roy Friedman, "Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations" (2003)

Larry Peterson et al., "Experiences Building PlanetLab" (2006)

Thomas Anderson and Timothy Roscoe, "Learning from PlanetLab" (2006)




WebOS

Key features

  • High Availability
  • Lower Latency
  • Fault Tolerance

Consensus: WebOS isn't really a distributed OS

Main components

  • Smart Client
  • WebFS
  • Global naming scheme based on URLs
  • Process control system
  • CRISIS authentication/authorization system (Certificates with ACLs)

Key ideas that were/were not adopted from WebOS

Adopted:

  • General idea of wide-area dynamic distribution -> Akamai (but primarily for static content)
  • Global naming using URLs

Not Adopted:

  • CRISIS
  • WebFS (Although WebDAV could be said to be related)
  • Smart client (for web sites)

What are the pros and cons of using smart clients to do load balancing?

Pro:

  • Distributes computation
  • More flexible

Con:

  • Vulnerable to Denial of Service or other forms of attacks
  • Extra network overhead to locate a service
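
A minimal sketch of client-side load balancing, to make the trade-off above concrete. It is not WebOS code: the `Replica` type, the `reported_load` field, and the `send` callback are all hypothetical names chosen for illustration. The client sorts the replicas it knows about by their last reported load and falls through to the next one on failure, which is where the flexibility comes from. The load values are also the weak spot: the client must fetch them (extra network overhead) and must trust them (room for abuse).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Replica:
    url: str              # hypothetical replica address
    reported_load: float  # last load value the client heard about (assumed 0..1)


def pick_order(replicas: List[Replica]) -> List[Replica]:
    """Return the replicas sorted from least to most loaded."""
    return sorted(replicas, key=lambda r: r.reported_load)


def smart_request(replicas: List[Replica],
                  send: Callable[[str], str]) -> Tuple[str, str]:
    """Try replicas in load order; move on to the next one if a send fails."""
    for replica in pick_order(replicas):
        try:
            return replica.url, send(replica.url)
        except ConnectionError:
            continue  # this replica is unreachable, try the next one
    raise ConnectionError("no replica reachable")
```

Here `send` stands in for whatever actually contacts the server, so the selection logic can be tried out with a fake that raises `ConnectionError` for a "down" replica.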


Starfish

Key features

MPI

  • Fast message passing
  • Gives programmers more control

OCaml

Pros

  • Well suited to recursive algorithms
  • Uses bytecode, so portable in theory
  • Same code on heterogeneous hardware
  • Well-known language

Cons

  • Slow performance?
  • Not entirely portable, maybe Just-In-Time?

Checkpointing

  • Save / resume state
  • Provides process migration
  • Maybe better suited to be implemented in the application, not OS?
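
To illustrate the "implemented in the application" option, here is a rough sketch of application-level checkpointing. It is not how Starfish does it; the file layout, the `run_sum` workload, and the `stop_after` crash simulation are assumptions made up for this example. The application decides what its state is (a loop index and a running total), saves it after each step, and resumes from the saved copy on restart.

```python
import os
import pickle


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # rename, so a crash never leaves a half-written file


def load_checkpoint(path, default):
    """Return the saved state if a checkpoint exists, else the default."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return default


def run_sum(path, n, stop_after=None):
    """Sum 0..n-1, checkpointing after every step.

    stop_after simulates a crash after that many steps (returns None).
    """
    state = load_checkpoint(path, {"i": 0, "total": 0})
    steps = 0
    while state["i"] < n:
        if stop_after is not None and steps >= stop_after:
            return None  # simulated failure; the checkpoint file survives
        state["total"] += state["i"]
        state["i"] += 1
        save_checkpoint(path, state)
        steps += 1
    return state["total"]
```

Calling `run_sum` again with the same path picks up where the interrupted run stopped. Copying the checkpoint file to another machine and restarting there is, loosely, process migration.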

Management

  • Had a Java interface that let any node log in and see the full system status
  • Used the distributed nature of daemons to communicate this information


Experiences Building PlanetLab

Key ideas

  • Global platform to distribute/test network services
  • Scale
  • Should be able to monitor and stop disruptive traffic

Challenges

  • Resource allocation is a significant problem, especially in a distributed system such as PlanetLab
  • Providing a global platform for long-term services
  • Implementing a trust relationship between node owners and service developers (users)

PLC

Summary

  • Provides the trust mechanism required for the network
  • Acts as a middleman / mediator between node owners and users
  • However, if someone breaks into PLC, the entire system is compromised

Implementation

  • Nodes are split into slices using VServers, lightweight process groups

Criticisms

  • Bandwidth allocation was slice-based, not node-based
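
One way to read this criticism, as a toy sketch rather than PlanetLab's actual mechanism: with a cap per slice, the total leaving a node grows with the number of slices on it, while a cap per node bounds what the node owner sees. The function names, the units, and the choice of max-min fair sharing for the node-based case are all assumptions made for illustration.

```python
def per_slice_caps(demands, slice_cap):
    """Slice-based: every slice may send up to slice_cap, regardless of the others."""
    return {name: min(demand, slice_cap) for name, demand in demands.items()}


def per_node_cap(demands, node_cap):
    """Node-based: split node_cap among the slices with max-min fairness.

    Slices are served in order of increasing demand. Each gets its demand or
    an equal share of what is left, whichever is smaller.
    """
    alloc = {}
    remaining = node_cap
    pending = sorted(demands.items(), key=lambda kv: kv[1])
    while pending:
        share = remaining / len(pending)
        name, demand = pending.pop(0)
        given = min(demand, share)
        alloc[name] = given
        remaining -= given
    return alloc
```

With demands of 1, 4 and 10 units, a per-slice cap of 5 lets the node send 10 units in total, whereas a per-node cap of 9 holds the total at 9 by trimming only the largest sender.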

Learning from PlanetLab

Centralized trust

Pro

  • Easier to manage, one entity to trust

Con

  • One point of control, no competition as to who to trust

Centralized resource control

Pro

  • All resources are controlled by PLC

Con

  • No incentive for users or administrators to conserve resources

Decentralized management

Pro

  • Provide bare-bones management, try to foster competition between third-party services

Con

  • No motivation to do so, it's hard work

Treat bandwidth as free

Pro

  • Free bandwidth!

Con

  • No incentive to conserve bandwidth

Provide only best-effort service

Pro

  • No limit on the number of processes which can be run

Con

  • Some processes may crowd out the computation of others (CPU/disk hogging)
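
A toy model of the crowding-out problem, under the simplifying assumption that a best-effort scheduler gives every runnable process an equal slice of the CPU. The function names and the per-slice alternative are illustrative only and are not taken from the paper.

```python
def best_effort_shares(procs_per_slice):
    """Equal share per process: a slice's share grows with its process count."""
    total = sum(procs_per_slice.values())
    return {name: count / total for name, count in procs_per_slice.items()}


def per_slice_fair_shares(procs_per_slice):
    """Equal share per active slice, however many processes each one runs."""
    active = [name for name, count in procs_per_slice.items() if count > 0]
    return {
        name: (1 / len(active) if count > 0 else 0.0)
        for name, count in procs_per_slice.items()
    }
```

With one slice running 8 processes and another running 2, the first model hands 80% of the CPU to the larger slice, while the second gives each slice half.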

Linux is the execution environment

Pro

  • Provides a familiar programming environment

Con

  • Weak isolation between experiments
  • No global allocation for resources
  • Having a homogeneous test-bed is poor for distributed experimentation

Don't provide distributed OS services

Pro

Con

Evolve the API

Pro

  • Adaptable API, built from the ground up

Con

  • Never had a good API: inconsistent, ever-changing, and an unstable programming environment

Focus on the machine room

Pro

  • Allocate a big machine here, another there, etc.

Con

  • Bad for distributed OSes