DistOS 2018F 2018-10-03

From Soma-notes

Readings

Farsite & OceanStore

Notes

In-class lecture notes:

OceanStore:
- large scale
- durable, archival
- distributed worldwide
- untrusted storage nodes

The client had the keys because the data storage nodes could not be trusted. How many keys did you need? Data could be stored on anything: untrusted storage, data-centre storage.
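One way to picture a client working with untrusted storage nodes is a store where the client, not the node, decides whether returned data is acceptable. The following is a minimal Python sketch under stated assumptions: `UntrustedNode`, `store`, and `fetch` are hypothetical names, and blocks are named by their SHA-256 hash so the client can detect tampering. Encryption is left out because the Python standard library has no cipher; in a real client the bytes handed to `store` would already be ciphertext produced with a key that only the client holds.

```python
import hashlib


class UntrustedNode:
    """Hypothetical storage node: keeps bytes by name and may misbehave."""

    def __init__(self):
        self.blocks = {}

    def put(self, name, data):
        self.blocks[name] = data

    def get(self, name):
        return self.blocks[name]


def store(node, data: bytes) -> str:
    """Store a block under the hash of its content and return that name."""
    name = hashlib.sha256(data).hexdigest()
    node.put(name, data)
    return name


def fetch(node, name: str) -> bytes:
    """Fetch a block and verify it on the client before trusting it."""
    data = node.get(name)
    if hashlib.sha256(data).hexdigest() != name:
        raise ValueError("block failed verification")
    return data
```

The client only needs to remember the block names (and its keys); the node can be any machine, since a modified block no longer matches its name.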

--> What is similar today? Fundamentally S3, but how does it compare?

S3 is a web interface to large files: immutable chunks of data, organized into buckets (which act like directories), etc. S3 is cheaper than virtual disks and really cheap for storing data. You can have new versions of a file, but each stored version is immutable. S3 fragments the data, but this is hidden from the user. S3 is not OceanStore in fundamental ways, though: in S3 there is either no encryption or, by default, the keys are with Amazon. OceanStore would be spread across multiple providers and organizations, while with Amazon everything is stored with Amazon. Other companies offer S3 compatibility, but that means selecting one organization, not automatically spreading the data around.
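The point that files can have new versions while the stored data stays immutable can be sketched with a small in-memory model. This is only an illustration: the `Bucket` class and its `put`/`get` methods are hypothetical and are not the real S3 API.

```python
class Bucket:
    """Hypothetical in-memory bucket: writes add versions, never overwrite."""

    def __init__(self):
        self._versions = {}  # key -> list of immutable byte strings

    def put(self, key, data):
        """Append a new version of `key` and return its version number."""
        versions = self._versions.setdefault(key, [])
        versions.append(bytes(data))  # bytes objects cannot be modified
        return len(versions) - 1

    def get(self, key, version=None):
        """Return the latest version, or a specific older one."""
        versions = self._versions[key]
        return versions[-1] if version is None else versions[version]


bucket = Bucket()
bucket.put("notes.txt", b"first draft")
bucket.put("notes.txt", b"second draft")
print(bucket.get("notes.txt"))     # b'second draft'
print(bucket.get("notes.txt", 0))  # b'first draft'
```

An "update" here is just another `put`; the earlier version remains readable and unchanged.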

Why are we not using OceanStore? There are legal issues (where is the data?): because the encryption is so strong, when you have data on your system you have no idea what it is. It could be nasty stuff, but you have no clue. OceanStore also never became significant because of usability: I have to manage my own keys! If that breaks, everything is dead. If you do not want to do that, you end up with a contractual relationship, and then the design does not make sense. It is easier to use S3; you might as well trust them. You are paying them money, it is a business relationship, and society is based on trust. The cloud is a high-trust environment. There is a lot of overhead in trying to eliminate trust, and the benefit is limited because people end up trusting their providers anyway. The Amazon model also allows for lock-in; OceanStore has no lock-in. In OceanStore the nodes never have the keys; they are just streaming the data.


Farsite:
- Distributed within an organization (smaller scale)
- “Use the disks on workstations”
- The directory server is what manages the keys and authentication codes.
- Microsoft uses Kerberos: it is all about keys; certain systems hold keys, issue keys, etc.
- Having a centralized authentication server was a key problem.
- How does security fall apart, especially anything based on crypto? TLS as a protocol is great, except for how certificates are managed.
- Unreliable nodes: not trusted, but still trusted somewhat because the nodes often have the keys.

- Today we have nothing close to Farsite in use. Why? Storage is cheap. If you want reliability etc., just set up servers, or go to the cloud and let someone else manage the servers. People outsource and use thin clients rather than this.

What happened with Pond? It ended up running on 40 machines as a prototype. It was built on Tapestry and ran on PlanetLab nodes: workstations everywhere, which universities would join in order to use them for experiments. PlanetLab was big for distributed systems, but it is not production use. From a production point of view, how did what they built look? Consider how it reads and how it writes: reads were really fast, writes were really slow. Why were reads fast and writes slow? They had parallel reads, which is different from NFS, so that is not a good comparison. Writes had to be voted on, with lots of overhead to sync.
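The gap between read and write speed can be illustrated with a toy latency model. This is a rough Python sketch, not Pond's actual protocol: it assumes a read is sent to all replicas in parallel and finishes with the first answer, while a write needs several voting rounds, each waiting for a quorum of 2f+1 out of at least 3f+1 replicas. The function names, the number of rounds, and the latency figures are all made up for illustration.

```python
def read_latency(replica_latencies_ms):
    """Parallel read: ask every replica and take the first answer."""
    return min(replica_latencies_ms)


def write_latency(replica_latencies_ms, f, rounds=3):
    """Voted write: each round waits for 2f+1 replicas to respond.

    `rounds` is an assumed number of message rounds per update.
    """
    if len(replica_latencies_ms) < 3 * f + 1:
        raise ValueError("need at least 3f+1 replicas to tolerate f faults")
    quorum = 2 * f + 1
    # A round is as slow as the slowest replica inside the fastest quorum.
    per_round = sorted(replica_latencies_ms)[quorum - 1]
    return rounds * per_round


latencies = [5, 20, 40, 80]  # hypothetical round-trip times in ms
print(read_latency(latencies))        # 5
print(write_latency(latencies, f=1))  # 120
```

With wide-area replicas the read is bounded by the nearest copy, while every write pays for the slower members of the quorum several times over.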

Farsite and tech transfer: Microsoft Research hired a bunch of academics, set them up with resources, and said, go have fun with your group. They had to show that they contributed to Microsoft. Cool tech came out of research but rarely went to the product side. The development process meant re-implementing components, and the researchers were off having fun for years, solving a problem (PACSOS). In retrospect, the Google File System was published years before PACSOS; other companies built the stuff while Microsoft researched.

For the exam: common themes and good questions; these might show up on the test.