Difference between revisions of "Distributed OS: Fall 2019"

From Soma-notes
 
(3 intermediate revisions by the same user not shown)
Line 2: Line 2:


[[Distributed OS: Fall 2019 Course Outline|Here]] is the course outline.
==Group Report Grading==
Make sure your reports have the following (10 points total):
* Briefly summarize the papers so the report can be read by anyone (2 points)
* Address most of the assigned discussion questions (4 points)
* Write in an understandable form using complete sentences and paragraphs (2 points)
* Reflect the discussion you had, reporting the questions that came up and points of debate (2 points)


==Project Help==
Line 94: Line 102:
===October 16, 2019 (in person)===


Mid-term exam (in class)
[https://homeostasis.scs.carleton.ca/~soma/distos/2019f/comp4000-2019f-midterm.pdf Midterm exam] (in class)


===[[DistOS 2019F 2019-10-28|October 28, 2019]] (online)===


BOINC & Tapestry
* Anderson, "BOINC: A System for Public-Resource Computing and Storage" (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&arnumber=1382809 (Proxy)]
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al., "Tapestry: A Resilient Global-Scale Overlay for Service Deployment" (JSAC 2003)]
Background (optional but helpful):
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia's article on Distributed Hash Tables]
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia's article on Kademlia]
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia's article on Tapestry]


===[[DistOS 2019F 2019-10-30|October 30, 2019]] (online)===


Project Proposal due.
Farsite & Oceanstore
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al., "FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment" (2002)]
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., "The Farsite Project: A Retrospective" (2007)]
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale Persistent Storage" (2000)]
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., "Pond: the OceanStore Prototype" (2003)]
 
Project Proposal due Nov. 3rd.


===[[DistOS 2019F 2019-11-04|November 4, 2019]] (online)===
GFS & Chubby
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., "The Google File System" (SOSP 2003)]
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, "The Chubby Lock Service for Loosely-Coupled Distributed Systems" (OSDI 2006)]


===[[DistOS 2019F 2019-11-06|November 6, 2019]] (online)===
MapReduce & BigTable
* [http://research.google.com/archive/mapreduce.html Dean & Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters" (OSDI 2004)] (be sure to read paper)
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., "BigTable: A Distributed Storage System for Structured Data" (OSDI 2006)]


===[[DistOS 2019F 2019-11-11|November 11, 2019]] (online)===
NASD & Ceph
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-03-10/gibson-nasd.pdf Garth A. Gibson et al., "A Cost-Effective, High-Bandwidth Storage Architecture" (1998)]
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., "Ceph: A Scalable, High-Performance Distributed File System" (OSDI 2006)]


===[[DistOS 2019F 2019-11-13|November 13, 2019]] (online)===
Cassandra & Dynamo
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman & Malik, "Cassandra - A Decentralized Structured Storage System" (LADIS 2009)]
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., "Dynamo: Amazon’s Highly Available Key-value Store" (SOSP 2007)]


===[[DistOS 2019F 2019-11-18|November 18, 2019]] (online)===
Haystack & F4
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., "Finding a needle in Haystack: Facebook’s photo storage" (OSDI 2010)]
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., "f4: Facebook's Warm BLOB Storage System" (OSDI 2014)]


===[[DistOS 2019F 2019-11-20|November 20, 2019]] (online)===
Spanner & Tensorflow
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., "Spanner: Google’s Globally-Distributed Database" (OSDI 2012)]
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., "TensorFlow: A System for Large-Scale Machine Learning" (OSDI 2016)]


===[[DistOS 2019F 2019-11-25|November 25, 2019]] (online)===
HTCondor
* [https://homeostasis.scs.carleton.ca/~soma/distos/2019f/thain2005-condor.pdf Thain et al., "Distributed Computing in Practice: The Condor Experience" (Concurrency and computation: practice and experience, Feb. 2005)]
* [https://research.cs.wisc.edu/htcondor/description.html HTCondor About Page]
* [https://en.wikipedia.org/wiki/HTCondor HTCondor on Wikipedia]


===[[DistOS 2019F 2019-11-27|November 27, 2019]] (online)===
SCOPE & Yarn
* [https://homeostasis.scs.carleton.ca/~soma/distos/2019f/chaiken2008-scope.pdf Chaiken et al., "SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets" (PVLDB 2008)]
* [https://homeostasis.scs.carleton.ca/~soma/distos/2019f/vavilapalli2013-yarn.pdf Vavilapalli et al., "Apache Hadoop YARN: Yet Another Resource Negotiator" (SoCC 2013)]
* Optional: [https://homeostasis.scs.carleton.ca/~soma/distos/2019f/isard2007-dryad.pdf Isard et al., "Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks" (Eurosys 2007)]


===[[DistOS 2019F 2019-12-02|December 2, 2019]] (online)===
Borg, Omega, Kubernetes
* [https://ai.google/research/pubs/pub41684.pdf Schwarzkopf et al., "Omega: flexible, scalable schedulers for large compute clusters" (EuroSys 2013)]
* [https://ai.google/research/pubs/pub43438.pdf Verma et al., "Large-scale cluster management at Google with Borg" (EuroSys 2015)]
* [https://ai.google/research/pubs/pub44843.pdf Burns et al., "Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade" (ACM Queue, Jan/Feb 2016)]


===[[DistOS 2019F 2019-12-04|December 4, 2019]] (online)===


===[[DistOS 2019F 2019-12-09|December 9, 2019]] (online)===
Zookeeper & Sapphire
* [http://static.usenix.org/event/atc10/tech/full_papers/Hunt.pdf Hunt et al., "ZooKeeper: Wait-free coordination for Internet-scale systems" (USENIX ATC 2010)] [https://www.usenix.org/legacy/multimedia/atc10hunt (video)]
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/zhang Zhang et al., "Customizable and Extensible Deployment for Mobile/Cloud Applications" (OSDI 2014)]


===[[DistOS 2019F 2019-12-06|December 6, 2019]] (online)===


===Other readings===
Wrap-up Discussion


Farsite & Oceanstore
===December 13, 2019===
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al., "FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment" (2002)]
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., "The Farsite Project: A Retrospective" (2007)]
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale Persistent Storage" (2000)]
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., "Pond: the OceanStore Prototype" (2003)]


* Anderson, "BOINC: A System for Public-Resource Computing and Storage" (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&arnumber=1382809 (Proxy)]
Final Exam Review, 3 PM in CB 2202 [https://homeostasis.scs.carleton.ca/~soma/distos/2019f/comp4000-20191213-finalreview.mp4 (audio)]


===December 15, 2019===


* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., "The Google File System" (SOSP 2003)]
Final Exam, 7 PM in AT 102
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, "The Chubby Lock Service for Loosely-Coupled Distributed Systems" (OSDI 2006)]


===December 22, 2019===


* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., "Ceph: A Scalable, High-Performance Distributed File System" (OSDI 2006)]
Final Projects due
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al., "Tapestry: A Resilient Global-Scale Overlay for Service Deployment" (JSAC 2003)]


Background (optional but helpful):
===Other readings===
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia's article on Distributed Hash Tables]
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia's article on Kademlia]
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia's article on Tapestry]
 
 
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., "BigTable: A Distributed Storage System for Structured Data" (OSDI 2006)]
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., "Spanner: Google’s Globally-Distributed Database" (OSDI 2012)]
 
 
* [http://research.google.com/archive/mapreduce.html Dean & Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters" (OSDI 2004)]
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., "TensorFlow: A System for Large-Scale Machine Learning" (OSDI 2016)]
 
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., "Dynamo: Amazon’s Highly Available Key-value Store" (SOSP 2007)]
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman & Malik, "Cassandra - A Decentralized Structured Storage System" (LADIS 2009)]
 
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., "Finding a needle in Haystack: Facebook’s photo storage" (OSDI 2010)]
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., "f4: Facebook's Warm BLOB Storage System" (OSDI 2014)]


Containers & Orchestration
* Wikipedia, [https://en.wikipedia.org/wiki/Operating-system-level_virtualization Operating-System-Level Virtualization]
* [https://en.wikipedia.org/wiki/Docker_(software) Wikipedia article on Docker]
* Burns et al., "Borg, Omega, and Kubernetes" (ACM Queue Jan/Feb 2016) [https://doi.org/10.1145/2898442.2898444 (DOI)]
* [https://docs.openshift.com/container-platform/3.11/architecture/index.html Openshift 3.11 Architecture]



Latest revision as of 22:13, 19 March 2020

Course Outline

Here is the course outline.

Group Report Grading

Make sure your reports have the following (10 points total):

  • Briefly summarize the papers so the report can be read by anyone (2 points)
  • Address most of the assigned discussion questions (4 points)
  • Write in an understandable form using complete sentences and paragraphs (2 points)
  • Reflect the discussion you had, reporting the questions that came up and points of debate (2 points)

Project Help

To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.

To begin selecting a paper, I suggest that you:

  • search on Google Scholar using keywords relating to your interests, and/or
  • browse the proceedings of major conferences that publish work related to distributed operating systems.

The main operating system conferences are OSDI and ACM SOSP (sosp.org, ACM DL). Note that not all the work here is on distributed operating systems! Also, many other conferences publish some work related to distributed operating systems, e.g. NSDI.

To help you write a literature review or the background of a research paper, read the following:

Class Schedule & Readings

September 4, 2019 (in person)

None

September 9, 2019 (online)

The Early Internet:

September 11, 2019 (online)

The Mother of All Demos:

September 16, 2019 (online)

The Alto:

September 18, 2019 (online)

Multics & UNIX:

Optional: Browse around the Multicians website.

September 23, 2019 (online)

LOCUS & NFS

September 25, 2019 (online)

Remote Procedure Calls

September 30, 2019 (online)

Distributed Shared Memory

October 2, 2019 (online)

Sprite, AFS, & Literature reviews

October 7, 2019 (online)

Amoeba & Clouds

October 9, 2019 (online)

Plan 9 & Inferno (no group report)

Second half of class will be a review for the midterm.

October 16, 2019 (in person)

Midterm exam (in class)

October 28, 2019 (online)

BOINC & Tapestry

Background (optional but helpful):

October 30, 2019 (online)

Farsite & Oceanstore

Project Proposal due Nov. 3rd.

November 4, 2019 (online)

GFS & Chubby

November 6, 2019 (online)

MapReduce & BigTable

November 11, 2019 (online)

NASD & Ceph

November 13, 2019 (online)

Cassandra & Dynamo

November 18, 2019 (online)

Haystack & F4

November 20, 2019 (online)

Spanner & Tensorflow

November 25, 2019 (online)

HTCondor

November 27, 2019 (online)

SCOPE & Yarn

December 2, 2019 (online)

Borg, Omega, Kubernetes

December 4, 2019 (online)

Zookeeper & Sapphire

December 6, 2019 (online)

Wrap-up Discussion

December 13, 2019

Final Exam Review, 3 PM in CB 2202 (audio)

December 15, 2019

Final Exam, 7 PM in AT 102

December 22, 2019

Final Projects due

Other readings

Containers & Orchestration

"Serverless Computing"