
WebOS, PlanetLab, Starfish

From Soma-notes

Revision as of 19:55, 19 March 2008 by Jmahonin (talk | contribs) (→‎Learning from PlanetLab)


Readings

Amin Vahdat et al., "WebOS: Operating System Services for Wide Area Applications" (1998)

Adnan Agbaria and Roy Friedman, "Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations" (2003)

Larry Peterson et al., "Experiences Building PlanetLab" (2006)

Thomas Anderson and Timothy Roscoe, "Learning from PlanetLab" (2006)

WebOS

Key features

High Availability
Lower Latency
Fault Tolerance

Consensus: WebOS isn't really a distributed OS

Main components

Smart Client
WebFS
Global naming scheme based on URLs
Process control system
CRISIS authentication/authorization system (Certificates with ACLs)
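
The "certificates with ACLs" idea can be sketched roughly as follows. This is an illustrative Python sketch, not the CRISIS design itself: the `Certificate` fields, the `ACL` table, the file path, and the `is_authorized` helper are all hypothetical, and signature verification is omitted entirely. It only shows the two checks such a scheme combines: is the certificate still valid, and does the ACL grant the requested right to the named principal?

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Certificate:
    principal: str     # who the certificate names (hypothetical field)
    expires_at: float  # expiry time, in seconds since the epoch

# Hypothetical ACL: resource -> {principal: set of rights}
ACL = {
    "/webfs/data/report.txt": {"alice": {"read", "write"}, "bob": {"read"}},
}

def is_authorized(cert, resource, right, now):
    """Allow access only if the certificate is unexpired and the ACL grants the right."""
    if now >= cert.expires_at:
        return False
    return right in ACL.get(resource, {}).get(cert.principal, set())

bob = Certificate(principal="bob", expires_at=1_000.0)
print(is_authorized(bob, "/webfs/data/report.txt", "read", now=500.0))    # True
print(is_authorized(bob, "/webfs/data/report.txt", "write", now=500.0))   # False
print(is_authorized(bob, "/webfs/data/report.txt", "read", now=2_000.0))  # False
```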

Key ideas that were/were not adopted from WebOS

Adopted:

General idea of wide area dynamic distribution -> Akamai (but primarily for static content)
Global naming using URLs

Not Adopted:

CRISIS
WebFS (Although WebDAV could be said to be related)
Smart client (for web sites)

What are the pros and cons of using smart clients to do load balancing?

Pro:

Distributes computation
More flexible

Con:

Vulnerable to Denial of Service or other forms of attacks
Extra network overhead to locate a service
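
To make the trade-off concrete, here is a minimal Python sketch of client-side load balancing. The replica URLs, load figures, and the `pick_replica` helper are hypothetical and not taken from the WebOS paper; the point is only that the client, rather than a server-side front end, decides which replica to contact and skips ones it cannot reach.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    url: str         # hypothetical replica address
    load: float      # last load the replica reported (0.0 = idle)
    reachable: bool  # result of the client's last probe

def pick_replica(replicas):
    """Return the reachable replica with the lowest reported load."""
    candidates = [r for r in replicas if r.reachable]
    if not candidates:
        raise RuntimeError("no reachable replica")
    return min(candidates, key=lambda r: r.load)

replicas = [
    Replica("http://a.example.org/service", load=0.9, reachable=True),
    Replica("http://b.example.org/service", load=0.2, reachable=False),
    Replica("http://c.example.org/service", load=0.4, reachable=True),
]
chosen = pick_replica(replicas)
print(chosen.url)  # http://c.example.org/service
```

The sketch also hints at the cons: the client has to spend extra round trips learning load and reachability, and it has to trust whatever the replicas report.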

Starfish

Key features

MPI

Fast message passing
Allows the programmers more control

OCaml

Pros

Recursive algorithms
Uses Bytecodes - portable in theory
Same code on heterogeneous hardware
Well known language

Cons

Slow performance?
Not entirely portable, maybe Just-In-Time?

Checkpointing

Save / resume state
Provides process migration
Maybe better suited to be implemented in the application, not OS?
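
As a rough illustration of the "implement it in the application" option, the following Python sketch checkpoints a loop's state to a file and resumes from it. It is not how Starfish does checkpointing; the `save_checkpoint`, `load_checkpoint`, and `run` helpers, the state layout, and the file name are all assumptions made for the example.

```python
import os
import pickle
import tempfile

def save_checkpoint(path, state):
    """Write state atomically: dump to a temp file, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)

def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default

def run(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step."""
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done == stop_after:
            return None  # simulate a crash or a migration request
        state["acc"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done += 1
    return state["acc"]

ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt")
first = run(ckpt, 10, stop_after=4)  # interrupted partway through
second = run(ckpt, 10)               # resumes from the saved state
print(first, second)  # None 45
```

The application knows exactly which state matters (here, two integers), which is the usual argument for doing this above the OS; the cost is that every application has to write this logic itself.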

Management

Had a Java interface which allowed any node to log in and see full system status
Used the distributed nature of daemons to communicate this information

Experiences Building PlanetLab

Key ideas

Global platform to distribute/test network services
Scale
Should be able to monitor and stop disruptive traffic

Challenges

Resource allocation is significant, especially in a distributed system such as PlanetLab
Providing a global platform for long-term services
Implementing a trust relationship between node owners and service developers (users)

PLC

Summary

Fulfills the trust mechanism required for the network
Acts as a middle man / mediator between node owners and users
If someone breaks into the PLC, however, the entire system is compromised

Implementation

Nodes are split into slices using VServers, lightweight process groups

Criticisms

Bandwidth allocation was slice-based, not node-based
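
One reading of this criticism is that a per-slice cap does not bound what a node sends in total, since the aggregate grows with the number of active slices. The Python sketch below contrasts the two policies under that assumption; the function names, slice names, and bandwidth figures are made up for illustration and do not describe PlanetLab's actual mechanism.

```python
def per_slice_caps(demands, slice_cap):
    """Cap each slice independently; the node total still grows with the slice count."""
    return {s: min(d, slice_cap) for s, d in demands.items()}

def per_node_caps(demands, node_cap):
    """Scale all slices down proportionally so the node total never exceeds node_cap."""
    total = sum(demands.values())
    if total <= node_cap:
        return dict(demands)
    scale = node_cap / total
    return {s: d * scale for s, d in demands.items()}

demands = {"slice_a": 8.0, "slice_b": 8.0, "slice_c": 4.0}  # Mbit/s, made-up figures
by_slice = per_slice_caps(demands, slice_cap=5.0)
by_node = per_node_caps(demands, node_cap=10.0)
print(sum(by_slice.values()), sum(by_node.values()))  # 14.0 10.0
```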

Learning from PlanetLab

Centralized trust

Pro

Easier to manage, one entity to trust

Con

One point of control, no competition as to who to trust

Centralized resource control

Pro

All resources are controlled by PLC

Con

No incentive for users or administrators to conserve resources

Decentralized management

Pro

Provide bare-bones management, try to foster competition between 3rd party services

Con

No motivation to do so, it's hard work

Treat bandwidth as free

Pro

Free bandwidth!

Con

No incentive to conserve bandwidth

Provide only best-effort service

Pro

No limit on the number of processes which can be run

Con

Some processes may crowd out the computation of others (CPU/disk hogging)

Linux is the execution environment

Pro

Provides a familiar programming environment

Con

Weak isolation between experiments
No global allocation for resources
Having a homogeneous test-bed is poor for distributed experimentation

Don't provide distributed OS services

Pro

Con

Evolve the API

Pro

Adaptable API, ground-up

Con

Never had a good API: inconsistent, ever-changing, and an unstable programming environment

Focus on the machine room

Pro

Allocate a big machine here, another there, etc.

Con

Bad for distributed OSes

Retrieved from "https://homeostasis.scs.carleton.ca/wiki/index.php?title=WebOS,_PlanetLab,_Starfish&oldid=1816"