DistOS 2014W Lecture 14

OceanStore

What is the dream?

  • High availability, universally accessible.
  • Utility managed by multiple parties.
  • Highly redundant, fault tolerant
  • Basic assumption was that servers would NOT be trusted.
  • Highly persistent
    • Everything archived
    • Everything saved, nothing deleted. "Commits"
  • Service was untrusted
    • Held opaque/encrypted data.
  • Would have been used for more than files (e.g., databases).
  • Global ubiquitous persistent data storage
  • Nomadic data
  • Untrusted infrastructure
  • Cannot delete data, universal archive
    • The easier you delete stuff, the easier you lose stuff

Why did the dream die?

  • The biggest reason it died was its assumption that the actors could not be trusted.
    • Everything else they did was right.
  • Other successful distributed systems are built on a more trusted model.

Technology

  • The trust model was the most attractive feature, and also what ultimately killed it.
    • The untrusted assumption was a huge burden on the system. The technical limitations it forced made the system uncompetitive.
    • It is just easier to trust a given system. More convenient.
    • Every system can be compromised, even one built on mistrust.
  • The public-key system reduces usability
    • If you lose your key, you're S.O.L.
  • Security
    • There is no security mechanism on the server side.
    • Cannot know who accesses the data.
  • Economics
    • The economic model is unconvincing as defined. The authors suggest that a collection of companies will host OceanStore servers, and consumers will buy capacity (not unlike web-hosting of today).

Use Cases

  • A subset of the features already exists
    • Blackberry and Google offer similar services.
    • Each of these current services is owned by one company, not by many providers.
    • Cannot sell back your services as a user.
      • e.g., cannot sell your extra storage back to the utility.

Pond: What insights?

  • They actually built it.
  • Couldn't assume the use of any existing infrastructure, so they rebuilt everything!
    • Built over the internet.
    • Tapestry (routing).
    • GUID for object identification. Object naming scheme.
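
The notes do not say how the GUIDs are computed. A minimal sketch, assuming the GUID of a block is a secure hash of its content (a common choice when servers are untrusted, since a reader can re-hash what it receives and detect tampering). The function names and the choice of SHA-256 are illustrative assumptions, not taken from the paper.

```python
import hashlib


def block_guid(content: bytes) -> str:
    """Hypothetical content-derived GUID: the hex SHA-256 digest of the bytes."""
    return hashlib.sha256(content).hexdigest()


def verify_block(guid: str, content: bytes) -> bool:
    """Re-hash data returned by an untrusted server and compare with the GUID."""
    return block_guid(content) == guid


original = b"lecture notes v1"
guid = block_guid(original)

# A server that returns different bytes for the same GUID is caught by the reader.
tampered = b"lecture notes v1 (modified by server)"
print(verify_block(guid, original))
print(verify_block(guid, tampered))
```

With this kind of naming the name itself acts as the integrity check, so the reader does not need to trust whichever server answered the request.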

Benchmarks

  • Really good read speed, really bad write speed.

Storage overhead

  • How much do they increase the storage needed in order to implement their storage model?
  • A factor of 4.8x the space needed (only about 1/5th of the raw storage is usable)
  • Expensive, but good value (data is backed up, replicated, etc.)
  • Consider how important an update is before making it
    • More storage space is burned as more updates are made
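
The "1/5th" figure above is just the reciprocal of the overhead factor. A small worked sketch of that arithmetic follows; the function names are hypothetical, and the 4.8 is the only number taken from the notes.

```python
OVERHEAD_FACTOR = 4.8  # storage overhead quoted in the notes


def usable_fraction(overhead_factor: float) -> float:
    """Fraction of raw capacity that ends up holding user data."""
    if overhead_factor <= 0:
        raise ValueError("overhead factor must be positive")
    return 1.0 / overhead_factor


def raw_needed(logical_size: float, overhead_factor: float) -> float:
    """Raw storage consumed to hold `logical_size` units of user data."""
    return logical_size * overhead_factor


if __name__ == "__main__":
    print(f"usable fraction: {usable_fraction(OVERHEAD_FACTOR):.3f}")
    print(f"raw GB for 100 GB of data: {raw_needed(100, OVERHEAD_FACTOR):.1f}")
```

1 / 4.8 is about 0.208, which is where "roughly one fifth" comes from. Put the other way, 100 GB of user data consumes about 480 GB of raw storage.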

Update performance

  • No data is mutated. It is diffed and archived.
  • An update creates a new version of an object and distributes that version.
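
The "nothing is mutated" idea can be sketched as an append-only list of versions. This is a toy illustration under simplifying assumptions (a single process, whole versions kept in memory, diffs computed on demand with the standard library's `difflib`); the class and method names are hypothetical, and this is not how Pond actually stores or distributes data.

```python
import difflib
from typing import List


class VersionedObject:
    """Toy append-only object: updates add versions, nothing is overwritten."""

    def __init__(self, initial: str) -> None:
        self._versions: List[str] = [initial]  # archive of every version so far

    def update(self, new_content: str) -> int:
        """Archive a new version and return its version number."""
        self._versions.append(new_content)
        return len(self._versions) - 1

    def read(self, version: int = -1) -> str:
        """Read a specific version; the default is the latest one."""
        return self._versions[version]

    def diff(self, old: int, new: int) -> List[str]:
        """Line-based unified diff between two archived versions."""
        return list(difflib.unified_diff(
            self._versions[old].splitlines(),
            self._versions[new].splitlines(),
            lineterm="",
        ))


doc = VersionedObject("hello\nworld")
v1 = doc.update("hello\nocean")
print(doc.read(0))        # the old version is still readable
print(doc.read())         # the latest version
print(doc.diff(0, v1))
```

Because every update appends rather than overwrites, storage grows with the number of updates, which matches the concern raised under storage overhead.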

Benchmarks in a nutshell

  • Everything is expensive!
  • High latency

Other stuff

  • Byzantine fault tolerance
    • Assuming certain actors are malicious
  • Bitcoin
    • Trusted vs Untrusted.
    • It is considered to be untrusted, but it takes a huge amount of trust when exchanges are made.
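
On the Byzantine fault tolerance point above: the notes do not give the numbers, but the standard bound for Byzantine agreement is that tolerating f malicious replicas requires at least 3f + 1 replicas in total, with quorums of 2f + 1. A small sketch of that arithmetic, with hypothetical function names:

```python
def min_replicas(f: int) -> int:
    """Smallest group size that can tolerate f Byzantine (malicious) members."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faulty(n: int) -> int:
    """Largest number of Byzantine members a group of n replicas can tolerate."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


def quorum_size(f: int) -> int:
    """Matching replies needed so that any two quorums share an honest member."""
    return 2 * f + 1


for n in (4, 7, 10):
    f = max_faulty(n)
    print(f"n={n}: tolerates f={f}, quorum={quorum_size(f)}")
```

This is part of why assuming malicious actors is expensive: each additional tolerated fault costs three more replicas, and every agreement needs replies from most of them.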

What's worth salvaging from the dream?

  • Using spare resources in other locations.
  • Similar routing systems are used in large peer-to-peer systems.

How to read a research paper

  • Start with Intro
    • Figure out what the problem is
  • Then see the related work for context
  • Then go to the conclusion. Focus on results.
  • Then fill in the gaps by reading specific parts of the body