Distributed OS: Winter 2023

==Other Readings==
NASD & GFS
* [https://homeostasis.scs.carleton.ca/~soma/distos/2008-03-10/gibson-nasd.pdf Garth A. Gibson et al., "A Cost-Effective, High-Bandwidth Storage Architecture" (1998)]
* [https://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., "The Google File System" (SOSP 2003)]
Chubby & ZooKeeper
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, "The Chubby Lock Service for Loosely-Coupled Distributed Systems" (OSDI 2006)]
* [http://static.usenix.org/event/atc10/tech/full_papers/Hunt.pdf Hunt et al., "ZooKeeper: Wait-free coordination for Internet-scale systems" (USENIX ATC 2010)] [https://www.usenix.org/legacy/multimedia/atc10hunt (video)]
BigTable & MapReduce
* [https://research.google.com/archive/bigtable-osdi06.pdf Chang et al., "Bigtable: A Distributed Storage System for Structured Data" (OSDI 2006)]
* [https://research.google.com/archive/mapreduce.html Dean & Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters" (OSDI 2004)] (be sure to read the paper)
Omega & Borg
* [https://research.google/pubs/pub41684.pdf Schwarzkopf et al., "Omega: flexible, scalable schedulers for large compute clusters" (EuroSys 2013)]
* [https://research.google/pubs/pub43438.pdf Verma et al., "Large-scale cluster management at Google with Borg" (EuroSys 2015)]
Cassandra & Dynamo
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman & Malik, "Cassandra - A Decentralized Structured Storage System" (LADIS 2009)]
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., "Dynamo: Amazon’s Highly Available Key-value Store" (SOSP 2007)]
Haystack & F4
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., "Finding a needle in Haystack: Facebook’s photo storage" (OSDI 2010)]
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., "f4: Facebook's Warm BLOB Storage System" (OSDI 2014)]
Spanner & TensorFlow
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., "Spanner: Google’s Globally-Distributed Database" (OSDI 2012)]
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., "TensorFlow: A System for Large-Scale Machine Learning" (OSDI 2016)]
Ceph
* [https://www.usenix.org/events/osdi06/tech/weil.html Weil et al., "Ceph: A Scalable, High-Performance Distributed File System" (OSDI 2006)]
* [https://homeostasis.scs.carleton.ca/~soma/distos/2021f/papers/weil2006-crush.pdf Weil et al., "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data" (SC 2006)]
OceanStore & BOINC
* [https://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale Persistent Storage" (SIGPLAN 2000)]
* [https://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., "Pond: the OceanStore Prototype" (FAST 2003)]
* [https://homeostasis.scs.carleton.ca/~soma/distos/fall2008/anderson-boinc.pdf Anderson, "BOINC: A System for Public-Resource Computing and Storage" (Grid Computing 2004)]
Tapestry & Delos
* [https://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al., "Tapestry: A Resilient Global-Scale Overlay for Service Deployment" (JSAC 2003)]
* [https://www.usenix.org/system/files/osdi20-balakrishnan.pdf Balakrishnan et al., "Virtual Consensus in Delos" (OSDI 2020)]
Background (optional but helpful):
* [https://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia's article on Distributed Hash Tables]

Revision as of 17:20, 6 February 2023

Course Outline

Here is the course outline.

Project Help

To develop a literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.

To begin selecting a paper, I suggest that you:

  • search on Google Scholar using keywords relating to your interests, and/or
  • browse the proceedings of major conferences that publish work related to distributed operating systems.

The main operating systems conferences are OSDI and ACM SOSP (sosp.org, ACM DL). Note that not all of the work published there is on distributed operating systems! Many other conferences, such as NSDI, also publish some work related to distributed operating systems.

To help you write a literature review or the background of a research paper, read the following:

Class Schedule & Readings

January 9, 2023

Introduction

January 11, 2023

Designing a distributed operating system

January 16, 2023

The Early Internet & Multics:

Optional: Browse around the Multicians website.

January 18, 2023

UNIX

Note that the video covers the main points of the UNIX paper.

January 23, 2023

The Mother of All Demos:

The Alto:

January 25, 2023

Distributed Shared Memory

January 30, 2023

Remote Procedure Calls

February 1, 2023

LOCUS & NFS

February 6, 2023

Sprite, AFS

February 8, 2023

Plan 9 & Inferno

February 13, 2023

Midterm review

February 15, 2023

Midterm exam

