<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Sheldon</id>
	<title>Soma-notes - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://homeostasis.scs.carleton.ca/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Sheldon"/>
	<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php/Special:Contributions/Sheldon"/>
	<updated>2026-05-12T18:49:11Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.42.1</generator>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-28&amp;diff=22040</id>
		<title>DistOS 2018F 2018-11-28</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-28&amp;diff=22040"/>
		<updated>2018-11-28T17:12:03Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&amp;quot;Serverless Computing&amp;quot;&lt;br /&gt;
&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Serverless_computing Wikipedia article on Serverless Computing]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Google_App_Engine Wikipedia article on Google App Engine]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/AWS_Lambda Wikipedia article on AWS Lambda]&lt;br /&gt;
* [https://cloud.google.com/appengine/ Google App Engine]&lt;br /&gt;
* [https://docs.aws.amazon.com/lambda/latest/dg/welcome.html AWS Lambda Developer Guide] &lt;br /&gt;
* [https://serverless.com/framework/docs/ serverless documentation]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
The dream: put your app on the cloud and have it scale up automatically without having to worry about the implementation.&lt;br /&gt;
&lt;br /&gt;
This is not similar to UNIX at all; it is a very restricted run-time with specific semantics. Kinda the dream and kinda not. What is it all in service of? The web. &lt;br /&gt;
&lt;br /&gt;
In the marketing: Google App Engine, AWS Lambda, Google Cloud Functions. Google App Engine came first and AWS Lambda is more function-oriented, but they don't look that different. App Engine is more for web apps and Lambda is supposed to handle functions, but they are all RESTful, so they look like web clients. You send events and streams of data, so it is more than the web, but using web interfaces. The I/O is the web, and that is the key thing: the file system has been replaced by web URLs, RPCs have been replaced by web requests, and that is how it is all tied together, because it is kinda stateless. HTTP was a stateless protocol. That is how all of this is built and how the web won: simple and stateless, only adding in state as needed, so scalable yet functional. You can hack on functionality but not on scalability, otherwise it will fail badly.&lt;br /&gt;
&lt;br /&gt;
What is the architecture? What is the programming model? A language run-time environment that makes it easy to handle incoming requests: a function that will be called when a request is made, routed through who knows how many machines; if you need to access some sort of state, you access a database using standard libraries. So you manipulate state, and your code starts running on an incoming request. That is it, that is your model. In AWS Lambda and Google App Engine, code does not run if there is nothing to do; it is purely event driven. You bind to events and your function runs when requests come in. You don't control the part of the system that runs in response to requests; all you control is what you do in response to them. Other languages are now supported, but you still define a function that responds to events. There is no main loop; everything runs in response to events. You use a data store like SQL and standard routines to access it. Upload a file to the system and it is stored as a blob in the database: not a hierarchical file system but more of a key-value store, more of a database than a file system. That is the programming model, a model for a web application. When data becomes available, you get a call, your code can retrieve it, process it, and store results in a database. Does this model make sense for a long-running batch job? No, this is more for I/O-heavy work, dynamic websites and other business logic for an application. Most of the stuff we do now does not require a lot of cycles. &lt;br /&gt;
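&lt;br /&gt;
To make that concrete, a minimal sketch of such a handler, assuming an AWS-Lambda-style Python runtime; the 'photos' table and the field names are illustrative, not something from the lecture:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Event-driven model: no main loop, just a handler the platform calls per request.&lt;br /&gt;
import json&lt;br /&gt;
import boto3&lt;br /&gt;
&lt;br /&gt;
table = boto3.resource('dynamodb').Table('photos')  # state lives in a managed datastore&lt;br /&gt;
&lt;br /&gt;
def handler(event, context):&lt;br /&gt;
    # The platform routes the request here; we never see or manage the servers.&lt;br /&gt;
    body = json.loads(event.get('body') or '{}')&lt;br /&gt;
    table.put_item(Item={'id': body['id'], 'caption': body.get('caption', '')})&lt;br /&gt;
    return {'statusCode': 200, 'body': json.dumps({'stored': body['id']})}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;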
&lt;br /&gt;
This is the closest we have to the ideal of a platform for building distributed applications. Nothing academic about this. &lt;br /&gt;
How are they running something like NodeJS on the back-end? Containers (to share hardware), VMs (to allocate the hardware) and orchestration. Spin up instances to service requests until loaded, then spin up another one, etc. It is just the web, the same technology that has been in development for the last 20 years. That is the weird thing: nobody designed this. This is stuff that was lying around, and then people decided to build it by putting the pieces together: APIs, isolation, etc. So it is a complete mess. Kubernetes as a technology is very complicated and overkill for most organizations, so some folks deal with the pain of setting it up and then they sell it. You don't want to manage thousands of machines when you only need one; then when you need more, someone else can take care of that. But the environment being provided is a web server talking to a database. How do they do it? We have talked about it. Everything is stateless between these services. Have something else you want to do? Make it look like a web application: log an event and return a response, but instead of going back to the app that generated the event, store it in a database or generate an alert.&lt;br /&gt;
&lt;br /&gt;
That wasn't the goal or plan; no sane computer scientist would have built it this way, but when you put it together, it does work. Looking at the pieces individually it is understandable; put it all together and it is crazy. Is this proprietary? What is offered differs in details but is fundamentally the same. All the languages are open source and all the APIs are well known, and then people build copies with their own stuff on top. Tie yourself to API services and there might be some lock-in, but if it is open source, someone else may have used the same foundation and built more stuff on top, and you can move your code to them. &lt;br /&gt;
&lt;br /&gt;
Legal framework: with encryption you can make things private. &lt;br /&gt;
&lt;br /&gt;
Web assembly: JavaScript was the language of the web, then came Node.js. This sucks; we want to run all languages with the sandbox model, at machine-code speed. WebAssembly is assembly code for the web: a target for a compiler that runs within a sandbox, can integrate with the DOM and JavaScript, can share back and forth, running at near-native speed but isolated. This is how everything is going to run, period: server, client, everywhere. Because we want sandboxing in the browser and on the server, since it makes containers better; better than the stupid stuff being done in Linux with the POSIX API to isolate things. POSIX is going away; it will be there under the surface and for hosting, but we will not be talking about it. Like Go as a language: Go is all statically linked. A Go binary is a self-contained thing, statically linked, because they are thinking like containers: reliability and reproducibility, containerized from the get-go. A container contains one binary, really simple, with no dependencies beyond the Linux kernel binary interface. Containers and web assembly will likely merge in the future, and what will be preserved is web assembly, because you don't need POSIX; it mostly gets in the way. We just need to support the web technologies. Which is weird, since UNIX was so successful, but this will push it on its way out: pushed so far down that it is hard to see. &lt;br /&gt;
&lt;br /&gt;
Wrapping up state so that it can just move around. A container might be downloaded to a laptop or phone to run: bundles of state, but a cached version of what is elsewhere. Why store permanent state on your own device? The copy might be on a server or in the cloud; it does not matter, as long as it is there. &lt;br /&gt;
&lt;br /&gt;
Management of state is what is painful, so it is getting abstracted and pushed away; state management is becoming some sort of service. So this is fascinating, and there is real complexity underneath it. &lt;br /&gt;
&lt;br /&gt;
We find ways to punish when trust is violated, but we need to trust, otherwise we have a lot of waste and a lot of trouble. Cryptocurrencies try not to trust, but they just shifted trust elsewhere, and now there are lots of problems. &lt;br /&gt;
&lt;br /&gt;
Test:&lt;br /&gt;
&lt;br /&gt;
Question 2: f4 encrypted files, and people called it untrusted. No it's not; it is completely trusted, because they have all the keys and they have all the data. Encryption just makes it easier to delete. AFS did not trust clients, and OceanStore, Farsite and BOINC have the largest level of distrust. Otherwise f4 does trust everything.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-28&amp;diff=22039</id>
		<title>DistOS 2018F 2018-11-28</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-28&amp;diff=22039"/>
		<updated>2018-11-28T17:10:01Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Readings */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&amp;quot;Serverless Computing&amp;quot;&lt;br /&gt;
&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Serverless_computing Wikipedia article on Serverless Computing]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Google_App_Engine Wikipedia article on Google App Engine]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/AWS_Lambda Wikipedia article on AWS Lambda]&lt;br /&gt;
* [https://cloud.google.com/appengine/ Google App Engine]&lt;br /&gt;
* [https://docs.aws.amazon.com/lambda/latest/dg/welcome.html AWS Lambda Developer Guide] &lt;br /&gt;
* [https://serverless.com/framework/docs/ serverless documentation]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-26&amp;diff=22038</id>
		<title>DistOS 2018F 2018-11-26</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-26&amp;diff=22038"/>
		<updated>2018-11-26T18:47:09Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
Containers &amp;amp; Orchestration&lt;br /&gt;
* Wikipedia, [https://en.wikipedia.org/wiki/Operating-system-level_virtualization Operating-System-Level Virtualization]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Docker_(software) Wikipedia article on Docker]&lt;br /&gt;
* Burns et al., &amp;quot;Borg, Omega, and Kubernetes&amp;quot; (ACM Queue Jan/Feb 2016) [https://doi.org/10.1145/2898442.2898444 (DOI)]&lt;br /&gt;
* [https://docs.openshift.com/container-platform/3.11/architecture/index.html Openshift 3.11 Architecture]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
&lt;br /&gt;
Lecture Nov 26 - In-class notes&lt;br /&gt;
&lt;br /&gt;
language-based virtualisation&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 |&lt;br /&gt;
 |&lt;br /&gt;
 |&lt;br /&gt;
 |&lt;br /&gt;
 p p p  |   p p  |  p p&lt;br /&gt;
 r r r  |   r r  |  r r&lt;br /&gt;
 o o o  |   o o  |  o o&lt;br /&gt;
 c c c  |   c c  |  c c&lt;br /&gt;
 e e e  |   e e  |  e e&lt;br /&gt;
 s s s  |   s s  |  s s&lt;br /&gt;
 s s s  |   s s  |  s s&lt;br /&gt;
 ------------------------&lt;br /&gt;
 kernel | kernel | kernel&lt;br /&gt;
 ------------------------&lt;br /&gt;
        hypervisor&lt;br /&gt;
 ------------------------&lt;br /&gt;
        hardware&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Binary compilation on the fly: with JavaScript, people wanted to run arbitrary things, hence asm.js and Native Client.&lt;br /&gt;
WebAssembly is like Java virtual machine byte code, but instead of a byte code designed to run Java, it is designed to run almost anything: an environment for running code.&lt;br /&gt;
Hardware virtual machines (hyper-visors) --&amp;gt; &lt;br /&gt;
x86 assumed only one kernel would run on it; classic x86 CPUs did not have hardware virtualisation built in. So they rewrote binaries on the fly in software so the guest would not do privileged things directly, but now modern processors support this. &lt;br /&gt;
Page tables are managed by every kernel, so the hypervisor would have to keep a page table of page tables (shadow page tables).&lt;br /&gt;
With hardware virtualisation, the CPU itself can walk through those two levels. It is faster in hardware; we can do it all in software, just slower. A hypervisor this way is significantly faster, but there is always a memory overhead, because of the chunkiness: you need more memory to do this.&lt;br /&gt;
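&lt;br /&gt;
A toy sketch of the two-level translation being described, with made-up page numbers: the guest kernel maps guest-virtual to guest-physical, the hypervisor maps guest-physical to host-physical, and hardware nested paging walks both (without it, the hypervisor keeps shadow tables in software):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
PAGE = 4096&lt;br /&gt;
&lt;br /&gt;
guest_page_table = {0: 7, 1: 3}   # guest-virtual page number to guest-physical&lt;br /&gt;
host_page_table  = {7: 42, 3: 9}  # guest-physical page number to host-physical&lt;br /&gt;
&lt;br /&gt;
def translate(guest_vaddr):&lt;br /&gt;
    vpn, offset = divmod(guest_vaddr, PAGE)&lt;br /&gt;
    gppn = guest_page_table[vpn]   # first walk (the guest kernel's mapping)&lt;br /&gt;
    hppn = host_page_table[gppn]   # second walk (the hypervisor's mapping)&lt;br /&gt;
    return hppn * PAGE + offset&lt;br /&gt;
&lt;br /&gt;
print(translate(4100))   # guest-virtual 4100 lands in host-physical page 9&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;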
&lt;br /&gt;
OS-level virtualisation: put processes into namespaces, group them together so that the groups of processes cannot see each other; a chrooted environment, e.g. FTP servers. &lt;br /&gt;
Jails: the kernel is tweaked so that you cannot get out of your isolation.&lt;br /&gt;
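&lt;br /&gt;
A minimal sketch of that isolation idea, assuming a Linux host, root privileges and an illustrative '/srv/jail' directory; real container runtimes add PID, mount and network namespaces plus cgroups on top of this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import os&lt;br /&gt;
&lt;br /&gt;
def run_in_jail(root, argv):&lt;br /&gt;
    pid = os.fork()&lt;br /&gt;
    if pid == 0:                 # child: confine, then exec the payload&lt;br /&gt;
        os.chroot(root)          # the jail directory becomes the filesystem root&lt;br /&gt;
        os.chdir('/')&lt;br /&gt;
        os.execv(argv[0], argv)  # argv[0] must exist inside the jail&lt;br /&gt;
    os.waitpid(pid, 0)           # parent: wait for the jailed process&lt;br /&gt;
&lt;br /&gt;
# run_in_jail('/srv/jail', ['/bin/sh'])&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;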
&lt;br /&gt;
With hypervisors, you have a nice clean hardware-like interface to a kernel, a lower-level abstraction, so it is simpler and you can multiplex between kernels. The problem is that you are running multiple kernels, which is always overhead: one kernel for each separation you have made. Virtual appliances are kinda dumb and wasteful, and you need integration between them to share resources. Web hosting is OS-level virtualisation: kernel mods to run separate user lands, so you can be root on your web server but still be on the same kernel, despite being separated.&lt;br /&gt;
&lt;br /&gt;
With containers: what is a container? It is OS-level virtualisation, just the grouping, but managed. How do you get your application running in production? Why do apps break when you send them to another environment? Libraries that were never listed as dependencies, or a config the environment needed. Apps get built neither isolated nor packaged together.&lt;br /&gt;
&lt;br /&gt;
Linux distros, package management: lots of work to do correctly, and the people deploying the packages are not the developers of the apps; specialists in the distro package for the distro, i.e. people at Ubuntu. Containers let us get away from that: the app plus all its dependencies, files, etc. An OS image to deploy as a container. &lt;br /&gt;
&lt;br /&gt;
A developer can build it for their own machine, then distribute it to a cluster and it runs properly! That is great! &lt;br /&gt;
&lt;br /&gt;
Instead of process migration or entire virtual machines with kernels, we do containers: lightweight enough, yet they encompass all the dependencies.&lt;br /&gt;
&lt;br /&gt;
To deploy a distributed app, the components are containers: a service-oriented architecture instead of one process across multiple systems. Kubernetes is how you describe the containers, how they are deployed and talk to each other, how they grow based on load, instances, etc.; you declare the different containers and it orchestrates the running of all the instances. Is the Kubernetes infrastructure trusted or untrusted? Highly trusted, within a trusted environment; that is the security story with containers. &lt;br /&gt;
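&lt;br /&gt;
A sketch of the kind of declarative description Kubernetes works from, written here as a Python dict mirroring a Deployment manifest; the image name, label and replica count are illustrative, and Kubernetes itself consumes YAML or JSON:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
deployment = {&lt;br /&gt;
    'apiVersion': 'apps/v1',&lt;br /&gt;
    'kind': 'Deployment',&lt;br /&gt;
    'metadata': {'name': 'web'},&lt;br /&gt;
    'spec': {&lt;br /&gt;
        'replicas': 3,   # the orchestrator keeps three instances running&lt;br /&gt;
        'selector': {'matchLabels': {'app': 'web'}},&lt;br /&gt;
        'template': {&lt;br /&gt;
            'metadata': {'labels': {'app': 'web'}},&lt;br /&gt;
            'spec': {'containers': [{'name': 'web', 'image': 'example/web:1.0'}]},&lt;br /&gt;
        },&lt;br /&gt;
    },&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;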
&lt;br /&gt;
A hypervisor gives more isolation. There are attacks that breach the barrier between VMs, but AWS is based on running VMs for different customers: each gets their own kernel, and Amazon does it that way to enforce strong isolation, while a customer will use containers inside. &lt;br /&gt;
Containers should not come from untrusted sources; if one container attacks another, all bets are off. That is why you run a separate kernel. With an infrastructure-as-a-service provider you run a VM, a full Linux or Windows kernel, that you control; they don't trust you, so you are isolated, but you can run Kubernetes on top of it, because you are deploying the entire Kubernetes infrastructure yourself. &lt;br /&gt;
&lt;br /&gt;
Docker: the first to make containers (OS-level virtualisation) mainstream, and now everyone is doing it. There are competitors. You want to make containers, and then set things up to run containers in response to load, with monitoring and rules you specify: Kubernetes. &lt;br /&gt;
&lt;br /&gt;
An abstraction over the OS that actually works. Does Google use it? No. What do they use instead? Borg. With all the problems Borg has, why do they not use Kubernetes despite it being better engineered? Because they started with Borg and built their distributed systems on Borg; changing infrastructure over is difficult, so most of their infrastructure is still running on Borg. It is better at other places because Google is still using the legacy stuff, all broken, quirky and messed up, while it is nice and clean elsewhere. Folks at Amazon can actually switch because everything is separated: a service-oriented architecture, so they replace the services as they go.  &lt;br /&gt;
&lt;br /&gt;
Containers are a Linux technology; most run a Debian user land. There are different container formats and different ways to do it, with the full power of Linux; the Linux API is becoming the default execution environment. The bad thing about it: now you have to provide the entire OS. If you leave a mess inside, it might work, but what about security updates? It is a user land, so any libraries inside are frozen. How do you update the container? Deploy a new container: treat it as immutable, functional-like. Managing state: the application is what you manage, and the application is just a container. &lt;br /&gt;
&lt;br /&gt;
DevOps: what the developer does is what goes into production. That is good, but potentially not good if they do a silly thing. It makes updating infrastructure easier, and what the infrastructure must do becomes minimal: just support whatever OS features the containers need, with kernel backwards compatibility at the system-call level.&lt;br /&gt;
&lt;br /&gt;
The developer becomes responsible for deploying OSs, and when there is a problem and something breaks, it will come back to them; all operations knows how to do is run it. &lt;br /&gt;
&lt;br /&gt;
This ties into serverless computing. It will stick around, because it feels durable; it seems to work. It reuses old concepts inside the container, like users and groups, because again it is a whole OS. &lt;br /&gt;
&lt;br /&gt;
This is the future of deploying applications, the new trend in computing: load-balanced, etc. This is how you deploy a web app, and everything is becoming a web app. Containers may change, but in an evolutionary way, in terms of what OS runs inside them. The Linux user land might change, with web assembly binaries directly supported.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-26&amp;diff=22035</id>
		<title>DistOS 2018F 2018-11-26</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-26&amp;diff=22035"/>
		<updated>2018-11-26T18:26:15Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Readings */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
Containers &amp;amp; Orchestration&lt;br /&gt;
* Wikipedia, [https://en.wikipedia.org/wiki/Operating-system-level_virtualization Operating-System-Level Virtualization]&lt;br /&gt;
* [https://en.wikipedia.org/wiki/Docker_(software) Wikipedia article on Docker]&lt;br /&gt;
* Burns et al., &amp;quot;Borg, Omega, and Kubernetes&amp;quot; (ACM Queue Jan/Feb 2016) [https://doi.org/10.1145/2898442.2898444 (DOI)]&lt;br /&gt;
* [https://docs.openshift.com/container-platform/3.11/architecture/index.html Openshift 3.11 Architecture]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-19&amp;diff=22010</id>
		<title>DistOS 2018F 2018-11-19</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-19&amp;diff=22010"/>
		<updated>2018-11-19T17:28:21Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture Nov 19&lt;br /&gt;
&lt;br /&gt;
- integrate, don’t explain the systems. Talk about the systems and the concepts.&lt;br /&gt;
- essay form: intro, middle, conclusion.&lt;br /&gt;
- 3 examples....could mention 8...make references, name drop &lt;br /&gt;
&lt;br /&gt;
Question 1:&lt;br /&gt;
- Unix process model: what is a process model? The programming model; what a programmer sees when they run a program, what the virtual computer looks like and how it functions. It is not simple; it takes a lot of hardware to implement. The virtual CPU is different from the JVM or a higher-level language. UNIX abstracts things in a specific way, and the UNIX process model does not scale: you cannot run a process across many computers and have it behave like a process; you have to fiddle with it. NFS: POSIX is stateful across open and close, but NFS is stateless, hence the .nfs files it creates. AFS messed with the semantics of close. LOCUS had the run call: instead of fork, you could just run something on another node. Distributing makes it not look like UNIX, and how? You do not even distribute a process; you distribute at other levels of abstraction. You cannot do UNIX; you can do almost-UNIX, but...&lt;br /&gt;
&lt;br /&gt;
Question 2:&lt;br /&gt;
- Caching: improves performance, but it is not cost free. If you cache, you need to make sure things are consistent; how do you synchronize? You either sacrifice performance or add complexity. Talk about consistency and durability with caching. Spend more time talking about ideas as opposed to describing the systems; talk about the ideas first.&lt;br /&gt;
Question 3:&lt;br /&gt;
- trusted vs. Un-trusted&lt;br /&gt;
Farsite: untrusted compared to NFS. You trust the kernels but not the user space; you have to trust something and distrust something else, i.e. distrust user input but trust the computer to run the code properly. With trust you have less cost in terms of sharing: nodes follow the protocol, they may fail but are not malicious; if they are malicious, you cannot trust the computations they do. Most of the systems from the first half of the course trusted everything; OceanStore and Farsite were the most untrusted. But with AFS, people didn't get the trust boundaries: you trust the servers but not the workstations, which can access only a user's own files and only for a limited period of time. &lt;br /&gt;
Question 4:&lt;br /&gt;
Concurrent access: when you access a file from more than one reader and writer, what happens? Do you serialize, or do something else? &lt;br /&gt;
&lt;br /&gt;
Augmentation of human intelligence, the bicycle for the mind, demonstrated through the Mother of All Demos doing really ordinary things: lists and collaboration. Did it inform what we were reading about, UNIX? NO! Powerful, but not about HCI; lots of bad human factors. You can have user groups, but Multics and UNIX were about time-sharing; resource sharing in computer networks was about letting programmers run programs, not about living our lives on computers and collaborating. It was all about resource sharing, not the vision of Engelbart. What were Facebook's big fancy systems for? Photo sharing; how do we index the world, look up email, shopping carts. The web started out for collaboration, hypertext, sharing information and comments on web pages; two-way linking changed to one-way. That is the Mother of All Demos vision. Computer scientists were making systems that were pathetic in scale until they implemented his vision; then they built systems that really scaled. They were not close at first, because they let people work in their own bubbles, and working together and collaborating was not there. The architectures built by the big companies could have been implemented earlier, in the 80s, but nobody was trying to solve that problem, so their systems did not need to scale in that way. It is not his vision; it is the other vision, the one from the first half of the course. &lt;br /&gt;
Question 5:&lt;br /&gt;
accessing data from multiple readers and writers.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
BOINC:&lt;br /&gt;
&lt;br /&gt;
System to share resources to do computations. Central system that gives out work units.&lt;br /&gt;
It is concurrent, highly concurrent.&lt;br /&gt;
What is the process model? Small work units handed out redundantly. What system discussed earlier is closest in model? Map-reduce: a bunch of workstations, you give each a unit of work, and once done, each unit comes back and the results are combined. That reduce operation was often the bottleneck.&lt;br /&gt;
What is the key difference between map-reduce and BOINC? BOINC worker nodes are all untrusted, which is why work units are given out redundantly. Map-reduce does not do that: maybe a node might fail, but it does not assume someone has engineered their node to return fake results. That is what BOINC is designed for. What is the key feature that unifies map-reduce and BOINC? A problem they can divide up that does not require communication between the pieces while the calculations are being done: embarrassingly parallel. Map-reduce can be distributed to some degree, a functional operation where you get partial answers that can be combined together; the reduce is assumed to be centralized, but the power of the system is the distribution.&lt;br /&gt;
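&lt;br /&gt;
A toy sketch of that redundancy idea, with made-up answers and quorum size: the same work unit goes to several untrusted volunteers, and a result is only accepted when enough of them agree:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from collections import Counter&lt;br /&gt;
&lt;br /&gt;
def validate(replica_results, quorum=2):&lt;br /&gt;
    # replica_results: answers returned by different volunteers for one work unit&lt;br /&gt;
    answer, votes = Counter(replica_results).most_common(1)[0]&lt;br /&gt;
    if votes &amp;gt;= quorum:&lt;br /&gt;
        return answer&lt;br /&gt;
    return None                      # no agreement: reissue the work unit&lt;br /&gt;
&lt;br /&gt;
print(validate([42, 42, 41]))        # two volunteers agree, accept 42&lt;br /&gt;
print(validate([42, 41, 40]))        # no quorum, send the unit out again&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;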
&lt;br /&gt;
What have we solved and what have we not solved? We have not solved the state-sharing problem. It is unsolvable; sharing state is expensive. You can pretend it is trivial on a single box, working on zero-copy and cache-aware code, but if you want good performance it is expensive. Over a network link, sharing state is painful, so avoid it. Architectures that succeed put off and minimize sharing state: they turn things that would require coordination into things that can be done separately. A transaction log, timestamps, and then see how they all fit together: stay fast until you merge results, and share just enough state that we don't step on the toes of the other parts.&lt;br /&gt;
&lt;br /&gt;
BOINC: not sharing state, not trusting nodes, but it can still get work out of them. Not good for a weather simulation.&lt;br /&gt;
&lt;br /&gt;
Is a cluster of small computers a super computer? It is massively parallel: lots of processing power, lots of disk, memory and CPUs; but are lots of AWS instances a super computer? A super computer is a system with an architecture where exchanging state is less expensive: high bandwidth and low latency, sharing state efficiently. Some computations, like simulations, need it. The US Department of Energy uses super computers for simulations: weather, how things happen, coordinating state. &lt;br /&gt;
Facebook and Amazon, in the real world, don't need that; they don't need to share state like that. Minimize the sharing of state and you get scalability. It is a subclass of problems: the web maps very well onto this, but other things don't.&lt;br /&gt;
In the 80s and 90s some parallel algorithms got used, others were not.&lt;br /&gt;
A hot area that does require coordination of state: AI, matrix multiplication. Expensive, high-end computer science. Advances now come from parallel processing while minimizing the coordination of state: functional programming frameworks make things parallel by minimizing the sharing of state. Don't have mutable state, the enemy of scaling; that is what people are implementing. Learn functional programming? YES, it is worth spending time learning those ideas. What problems are we dealing with? Sharing state: we cannot do it well, so let's not. You can replicate state, but coordinating mutable state will kill performance. Everything kinda looks BOINC-like at the end. &lt;br /&gt;
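&lt;br /&gt;
A tiny illustration of that functional, no-shared-mutable-state style: a word count done as independent map steps whose partial results are merged at the end, rather than many workers updating one shared counter (the documents are made up):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
from collections import Counter&lt;br /&gt;
from functools import reduce&lt;br /&gt;
&lt;br /&gt;
docs = ['the web won', 'the web is stateless', 'state is the enemy']&lt;br /&gt;
&lt;br /&gt;
# map: each document is processed independently, with no shared mutable state&lt;br /&gt;
partials = [Counter(doc.split()) for doc in docs]&lt;br /&gt;
&lt;br /&gt;
# reduce: the immutable partial results are merged only at the end&lt;br /&gt;
total = reduce(lambda a, b: a + b, partials, Counter())&lt;br /&gt;
print(total['the'])   # 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;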
&lt;br /&gt;
Test 2:&lt;br /&gt;
&lt;br /&gt;
Questions to discuss:&lt;br /&gt;
&lt;br /&gt;
- Sharing state:&lt;br /&gt;
Strategies for sharing state: files sharing state between processes; otherwise we are just talking about networking.&lt;br /&gt;
Earlier in the semester, sharing state among programs looked like process migration: the memory stays put and the CPU state moves around, like how function calls work; the cloud just says it is across many systems, not just one system. Otherwise it is more like process migration. Distributed shared memory: virtual memory across a cluster, a multi-threaded process across many CPUs; page memory out of one machine and into another, maybe make a copy of it. It is all the same stuff, but done at the level of memory for a process. The way anything is fast is memory locality, and distributed shared memory hides memory locality from your program. When you access things that are not local you are screwed, so the program must be aware of locality, and then why are you bothering? You might as well manage state explicitly. With the process concept, you can access all memory fast, relatively speaking. &lt;br /&gt;
&lt;br /&gt;
Protein folding as a physics problem is a simulation, but the protein folding projects do not do that; they do similarity, one sequence compared to another sequence, a hypothesis. It is physics-based in the sense that you take a little part of a simulation, a bounded computation that can run on one computer, a couple of time steps.&lt;br /&gt;
&lt;br /&gt;
It depends on how local you can make it. Protein interactions are another problem; computer science is about how to solve the problem. Quantum computation, physics simulations.&lt;br /&gt;
&lt;br /&gt;
Sharing the state of a process we have seen, but mostly it is process migration. How do you share access to data? Data takes two basic forms, a file or a database of some kind: SQL, a key-value store, or a hierarchical file system.&lt;br /&gt;
&lt;br /&gt;
How do we access data in parallel? Embarrassingly parallel is the anti-social approach. The moment there are interactions, what do we do? Concurrent reads and writes: concurrent writes influence concurrent reads. One way is to serialize access. With lots and lots of reads, the moment a writer comes along you need locking, maybe serialize access to it; that is not scalable, but if the amount of collaboration in your system is small, it is fine.&lt;br /&gt;
&lt;br /&gt;
What is the other solution for sharing access to a file? Appends. What is appending, really? How is an append different from a write? And versioning: what is versioning? Taking shared state and parallelising it, so it is not really shared state any more; you are almost making it immutable. A file is modelled as a set of new versions, and the latest version is the copy of the file you use, like functional programming. Appending is a disciplined way of mutating data so that most of it stays immutable: the only mutation is adding something new (not deleting), which is easy to coordinate. Binding and scope are similar to versioning; appending is mutation, changing state, but almost immutable. Strict appending requires ordering, which is expensive at scale, so systems like GFS do not bother: you maybe get duplicates and then have to deal with that. These are tricks to avoid sharing state. If you really have to share state, then you have to serialize. &lt;br /&gt;
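&lt;br /&gt;
A toy, in-memory sketch of the append/versioning idea (not any particular system from the readings): state is never overwritten, every write adds a new immutable version, and readers take the latest:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
class VersionedFile:&lt;br /&gt;
    def __init__(self):&lt;br /&gt;
        self.versions = []              # append-only log of immutable versions&lt;br /&gt;
&lt;br /&gt;
    def append(self, data):&lt;br /&gt;
        self.versions.append(data)      # the only mutation is adding new state&lt;br /&gt;
        return len(self.versions) - 1   # version number&lt;br /&gt;
&lt;br /&gt;
    def read(self, version=None):&lt;br /&gt;
        if version is None:&lt;br /&gt;
            version = len(self.versions) - 1   # readers see the latest version&lt;br /&gt;
        return self.versions[version]&lt;br /&gt;
&lt;br /&gt;
f = VersionedFile()&lt;br /&gt;
f.append('v0 contents')&lt;br /&gt;
f.append('v1 contents')&lt;br /&gt;
print(f.read())      # 'v1 contents'&lt;br /&gt;
print(f.read(0))     # older versions remain readable&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;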
&lt;br /&gt;
Platforms vs. Products. What is the difference?&lt;br /&gt;
&lt;br /&gt;
Systems from Google: systems connected together in layers, built on top of previous solutions. Hierarchical, interdependent. Google's philosophy is closer to Windows in how it is built: it all fits together with dependencies on each other. When it all works together it works really well, but you have to be part of the system to use it; a whole way of doing things. Insiders vs. outsiders: if it were open to outsiders, they would find vulnerabilities and exploit the services, so there are trust boundaries. But trust enables a certain level of efficiency; it scaled in terms of computation, not in terms of developers. You get to work at higher levels of abstraction.&lt;br /&gt;
&lt;br /&gt;
Amazon: teams worked independently, building pieces and solutions made available to everyone. Teams provide services; individual services are self-contained and anyone can use them. It is not a higher level of abstraction, but you can assemble your own system from the services: learn just what you want, use the component and adopt it.&lt;br /&gt;
&lt;br /&gt;
This is why Google has been less successful in the cloud market. Google App Engine: run your app on Google infrastructure, with their database and process model; you write code to fit into their infrastructure, which scales, is reliable, etc., but you are writing to their specific system. With Amazon it is customized to you: some pieces independent, some dependent, but if you find the piece you want, you can master it. Mastering all of it is not possible. &lt;br /&gt;
&lt;br /&gt;
Companies ship their org chart; over time, that is how you end up doing things: groups not trusting each other, isolated, building services to deal with too many requests. Amazon has been trying to do lots of things in an aggressive way, a Darwinian process; Google doesn't have that, it is much more top-down development. At Amazon, if something survives, it survives; if it is dying, it dies. Amazon is more bottom-up, Google is much more top-down. &lt;br /&gt;
&lt;br /&gt;
Talk about concepts on the test and drop references as appropriate.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-14&amp;diff=21994</id>
		<title>DistOS 2018F 2018-11-14</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-14&amp;diff=21994"/>
		<updated>2018-11-14T18:13:14Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture Nov 14: Haystack &amp;amp; F4&lt;br /&gt;
&lt;br /&gt;
CDN: when talking about Haystack, we are talking about CDNs. What is a CDN? A Content Distribution Network: a large-scale cache for data. The idea is you have servers replicated around the world, close to people and to the things people want; instead of going to the main server, you go to a local server and get a copy from there. The pioneer was Akamai, folks from MIT; their model became the content distribution networks necessary for large-scale systems. It reduces the latency of page loads. They do not serve the entire website: CDNs are bad at serving dynamic content such as email. A web app is not in a CDN because it makes no sense; no one else should be asking for your email, so it should not be replicated, and what you see is specific to you. A CDN only makes sense for content reused across multiple page views; code on the client still has to fetch the custom data, and that is where a CDN does not work. See Figure 3: requests go to the CDN and to Haystack storage.&lt;br /&gt;
&lt;br /&gt;
The original solution: photos were the problem, and the original solution was NFS. This sucks, but why? Bad performance: too many disk accesses. Why? Metadata. Having to access the inode and then the actual contents of the file was too much. For a normal file system, of course you separate the metadata from the data, since metadata has different access patterns, but that did not make sense for Facebook; they didn't want separate reads for both. Why can you get away with merging metadata with data? The needle (figure 5): metadata and data are intertwined. Why can they get away with that format? The data is immutable. The game changes in how you deal with metadata and data because they don't change. Where is the photo? Find the photo and everything about it is in one place: keep track of the headers, then read everything about it in one go. You just need the offset; there is no separate inode and no pointer to an inode, just a big file and the offset of the photo within it, which reduces file operations and thus increases performance. It is just realizing that your data is immutable. Storing metadata and data together gives a fast access pattern. &lt;br /&gt;
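&lt;br /&gt;
A toy sketch of that layout, simplified rather than the actual needle format from the paper: photos are packed into one big append-only file, and a small in-memory index maps a photo id to its offset and size, so a read is one seek plus one sequential read with no separate metadata I/O:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
import struct&lt;br /&gt;
&lt;br /&gt;
HEADER = struct.Struct('=QI')        # photo id (8 bytes) + data length (4 bytes)&lt;br /&gt;
&lt;br /&gt;
class HaystackStore:&lt;br /&gt;
    def __init__(self, path):&lt;br /&gt;
        self.f = open(path, 'a+b')&lt;br /&gt;
        self.index = {}              # photo id to (offset, size), kept in memory&lt;br /&gt;
&lt;br /&gt;
    def put(self, photo_id, data):   # data: bytes&lt;br /&gt;
        self.f.seek(0, 2)            # append only; needles are immutable&lt;br /&gt;
        start = self.f.tell()&lt;br /&gt;
        self.f.write(HEADER.pack(photo_id, len(data)) + data)&lt;br /&gt;
        self.index[photo_id] = (start + HEADER.size, len(data))&lt;br /&gt;
&lt;br /&gt;
    def get(self, photo_id):&lt;br /&gt;
        offset, size = self.index[photo_id]&lt;br /&gt;
        self.f.seek(offset)          # one seek, one read, no metadata lookup on disk&lt;br /&gt;
        return self.f.read(size)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;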
&lt;br /&gt;
Having just a photo name gets you speed. It has to be immutable, and fast is not good if it is not durable; we need protection and redundancy: don't store every photo once, store it multiple times. The indexes are in memory.&lt;br /&gt;
&lt;br /&gt;
f4 is built on top of Hadoop (HDFS), and Haystack runs on XFS.&lt;br /&gt;
&lt;br /&gt;
You only have to touch the disk once to read, which is faster; at this scale they are not using solid-state disks, way too expensive. Haystack is great for serving photos; what are the problems with it? Replication factors: the reason for fractional numbers is that in practice you have failures, so you get fractions; it replicates between 3 and 4 times, so when you see 3.6 it means at least three copies on average. Too expensive: everyone's photos stored 3 to 4 times each; if you can remove one replica, you save on storage costs. 2010-2014 is when people started paying attention to Facebook, specifically privacy. A difference between Haystack and f4: f4 deletes quickly, while Haystack only marked items for deletion. Cheaper storage, better deletes.&lt;br /&gt;
&lt;br /&gt;
What is the trick, how did they do it? The same thing as is used in RAID5: parity across striped data, so a lost piece can be rebuilt without keeping a full extra copy. And encrypt everything: every photo has an encryption key stored in a separate database, and if you delete the encryption key, you delete the data. Because modern systems replicate data everywhere, logs, journals, etc., all over the place to guard against failures, there are copies on top of copies at every scale; if you encrypt and then delete the key, everything is gone. Haystack becomes the photo cache, for photos being accessed quickly; f4 is used for warm storage. Why not use f4 for everything? Because of the parity scheme it has fewer replicas to read from; with multiple full replicas you can read from them in parallel, so Haystack is good for hot content while f4 is better for colder, but not completely cold, content. &lt;br /&gt;
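&lt;br /&gt;
A minimal sketch of the parity idea: the RAID5-style trick of storing an extra parity block instead of a full extra copy. f4 itself uses Reed-Solomon coding over many blocks; XOR of two equal-sized blocks is the simplest case:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
def xor_blocks(a, b):&lt;br /&gt;
    return bytes(x ^ y for x, y in zip(a, b))&lt;br /&gt;
&lt;br /&gt;
block_a = b'hello world!'&lt;br /&gt;
block_b = b'more photos!'&lt;br /&gt;
parity  = xor_blocks(block_a, block_b)    # stored on a third machine&lt;br /&gt;
&lt;br /&gt;
# if block_a is lost, rebuild it from the parity and the surviving block,&lt;br /&gt;
# instead of having kept a full extra copy of it&lt;br /&gt;
recovered = xor_blocks(parity, block_b)&lt;br /&gt;
assert recovered == block_a&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;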
&lt;br /&gt;
Amazon Glacier: really cheap storage, but you cannot access it quickly; moving data from Glacier to S3 takes hours. It is not online, it might be on tapes sitting offline, so it is not good for data you need immediately. Going from f4 to Haystack takes longer than a cache hit, but not that long, a couple of seconds.  &lt;br /&gt;
&lt;br /&gt;
Cold storage is for disaster recovery: traditionally that is what cold storage is about, but it is not useful for online services. &lt;br /&gt;
Haystack has a durability guarantee with replication but also a performance benefit. How much engineering goes into these seemingly trivial uses.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-14&amp;diff=21993</id>
		<title>DistOS 2018F 2018-11-14</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-14&amp;diff=21993"/>
		<updated>2018-11-14T18:08:10Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture Nov 14: Haystack &amp;amp; F4&lt;br /&gt;
&lt;br /&gt;
CDN, when talking about Haystack, talking about CDNs. What is a CDN...Content Distribution Network. Large-scale cache for data. The idea is you have servers replicated around the world. Servers close to people and the things people want...instead of going to the main server, go to a local server and get a copy from there. Pioneered by Akamai....folks from MIT....models that became the content distribution networks necessary for large-scale systems. Reduces latency of page loads. Do not serve the entire website; CDNs are bad at serving dynamic content such as email...a web app is not in a CDN b/c it makes no sense...no one should be asking for the same email, so it should not be replicated....what you see is specific to you, so it does not make sense. Only makes sense if replicated across multiple page views. Code on the client is going to have to get custom data, and that is where a CDN does not work. See Figure 3...to CDN and to Haystack storage. Original solution....photos are the problem...original solution was NFS....this sucks but why? Bad performance....too many disk accesses, why? Metadata...having to go access the inode and then the actual contents of the file was too much...for a normal file system, of course you have to separate the metadata from the data....metadata has different access patterns, but that did not make sense for Facebook...didn’t want separate reads for both...why can you get away with merging metadata with data...the needle....figure 5....metadata and data are intertwined....why can they get away with the format....the data is immutable. The game changes with how you deal with metadata or data b/c they don’t change...Where is the photo, find the photo and then everything about it in one place. Keep track of headers then read everything about it in one go. Just need the offset, do not need a separate inode...no pointer to an inode, just need a big set of files, where it is and the offset of the photo...reduce file operations and thus increase performance. It is just realizing that your data is immutable. Storing metadata and data together, fast access pattern. &lt;br /&gt;
&lt;br /&gt;
Have a photo name, gets you speed. Has to be immutable, fast is not good if it is not durable. We need protection and redundancy. Don’t store every photo once, store multiple times. The indexes are in memory.&lt;br /&gt;
&lt;br /&gt;
F4 is built on top of Hadoop (HDFS); Haystack uses XFS.&lt;br /&gt;
&lt;br /&gt;
Only have to touch the disk once to read it, so it is faster...at this scale, not using solid state disks, way too expensive. Haystack is great for serving photos; what are the problems with it? Replication factors...the reason for fractional numbers: in practice, you have failures and so get fractions...replicate between 3-4 times and you get something like 3.6, on average at least three copies....too expensive, everyone’s photos stored 3-4 times each; if you can remove one replica, you save on storage costs....2010-2014 is when people started paying attention to Facebook, specifically privacy. Diff between Haystack and F4: F4 deletes quickly while Haystack only marked items for deletion. Cheaper storage, better deletes. What is the trick, how did they do it? Same thing as used in RAID5, parity to protect the data...erasure coding: stripe it, and a lost piece can be rebuilt from the others. Encrypt everything; every photo has an encryption key stored in a separate database. If you delete the encryption key, you delete the data. B/c modern systems replicate data everywhere, logs, journals, etc. All over the place to guard against failures...copies on top of copies on top of copies on every scale. If you encrypt and delete the key, everything is gone. Haystack becomes the photo cache, photos being accessed quickly. For WORM (write once, read many) storage, F4 is used...why not use F4 for everything? The parity stuff means it has fewer replicas to read from; with multiple replicas, you can read from them in parallel, so Haystack is good for hot stuff while F4 is better for the colder but not completely cold. &lt;br /&gt;
&lt;br /&gt;
Amazon Glacier...cheap storage, really cheap, but you cannot access it quickly; from Glacier to S3 it takes hours. Not online, might be on tapes sitting offline, so not good for data that needs immediate access. Going from F4 back to Haystack also takes longer, but not that long: a couple of seconds.  &lt;br /&gt;
&lt;br /&gt;
Cold storage for disaster recovery....traditionally what cold storage is about but not useful for online services. &lt;br /&gt;
Haystack has a durability guarantee with replication but also a performance benefit. How much engineering goes into these seemingly trivial uses.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-12&amp;diff=21980</id>
		<title>DistOS 2018F 2018-11-12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-12&amp;diff=21980"/>
		<updated>2018-11-12T18:23:53Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture: &lt;br /&gt;
&lt;br /&gt;
Dynamo:&lt;br /&gt;
&lt;br /&gt;
What is Dynamo the solution to? Shopping carts and web sessions.&lt;br /&gt;
&lt;br /&gt;
When you put something in the shopping cart, it stays, even months later. Shopping carts matter but have weird priorities compared to the past. First, care about availability. There is a session for it, and if not, give a new session. And this matters at peak times, e.g. Christmas. &lt;br /&gt;
&lt;br /&gt;
This does not look like how Google builds things, what is different?&lt;br /&gt;
What does BigTable depend on? GFS and Chubby. &lt;br /&gt;
Dynamo is a standalone service to solve the problem. Why doesn’t Google do that? It is about philosophy. The two-pizza rule...teams are to be no bigger than could be fed with two pizzas. A bunch of small teams; they want each team to be autonomous. They don’t want a hierarchy of management. Relatively small teams need to be able to work on a self-contained project that does not tightly couple you to another team. Service-oriented architecture. When other folks inside Amazon use your service, they use an API that could be exposed to the world.&lt;br /&gt;
Contrast with Google, where things are designed to run in a trusted environment that is not exposed to the world. Denial of service attacks have to be considered from the beginning at Amazon more than at Google. Amazon tries to build a platform, Google tries to provide a product. &lt;br /&gt;
Microsoft does this, kind of: internal APIs.  &lt;br /&gt;
AWS, this is the API we use, don’t get a different one. If you don’t like Amazon’s AWS, go do your own thing.&lt;br /&gt;
The number of services offered by AWS is insane, with redundancy and overlapping capabilities among them. Kind of Darwinian. The AWS webpage is a thin layer over internal APIs...org chart...look at a product: what they are selling is a reflection of the way the organization is organized. Amazon’s is very flat, Google’s is not. Google is hierarchical. Not that one is better than the other, just different. Amazon is winning the platform race. &lt;br /&gt;
&lt;br /&gt;
Facebook Cassandra:&lt;br /&gt;
&lt;br /&gt;
Inbox search, messages to search. What they developed....why is this a funny thing? Google is based on searching...Facebook, no. Social graph, Facebook wall...now they need to allow people to search. What is the search optimized for? Writes. Want to have inbox search running consistently. How often is it queried vs. written? Optimized for the writes, not for the queries...mostly data dumps, occasionally searching. Almost log-structured output; everything goes into memory for indexing and searching. Someone’s inbox can probably fit into a node. Not a big data problem in the sense of a Google search....limited search...load it into memory and do a quick search of it there, which is very different. Why does that matter? Solving a specialized problem....don’t need a relational database, no schema, free-form search. &lt;br /&gt;
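&lt;br /&gt;
A toy version of the write-optimized idea (nothing like Cassandra’s real storage engine): a write is just a cheap append to a per-user log, and the rare search pulls that one user’s messages into memory and scans them there.&lt;br /&gt;
&lt;pre&gt;
# Toy write-optimized inbox: fast appends, occasional in-memory search.
from collections import defaultdict

inbox_log = defaultdict(list)        # user -&gt; append-only list of messages

def write_message(user, message):
    inbox_log[user].append(message)  # the common case: just an append

def search_inbox(user, term):
    # The rare case: load one user's messages into memory and scan them.
    return [m for m in inbox_log[user] if term in m]

write_message('alice', 'lunch on friday?')
write_message('alice', 'haystack paper notes')
assert search_inbox('alice', 'haystack') == ['haystack paper notes']
&lt;/pre&gt;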
&lt;br /&gt;
What is similar in how they are implemented? Do these use Paxos? Cassandra uses a gossip protocol....a ring, n-1 nodes. Scuttlebutt....elects a leader. All the leader does is tell the nodes what ranges the replicas handle...the problem is partitioning. This is where consistent hashing comes in. The ring grows or shrinks and you don’t have to rehash everything. Otherwise, you have to reorganize the data. With consistent hashing, only a fraction of it moves (sketch below). Seeing how the same ideas are getting applied again and again. Gossip protocols, we have not seen that before. Paxos: a small number of nodes that maintain state, a consistent view of that which is replicated in a hierarchy. Harder to get strict guarantees out of a gossip kind of thing, but better performance to handle incoming data rather than consistency....Cassandra’s consistency model can change. Ring structures and gossip protocols go together. Talk to neighbours and a few more neighbours; communication is on the ring. Redundant but not too redundant.&lt;br /&gt;
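&lt;br /&gt;
A minimal consistent-hashing ring (a sketch with no virtual nodes or replication, so not what Cassandra actually ships): keys and nodes hash onto the same ring, a key belongs to the first node clockwise from it, and adding a node only moves the keys that land between it and its predecessor.&lt;br /&gt;
&lt;pre&gt;
# Minimal consistent hashing: adding a node remaps only a fraction of the keys.
import bisect, hashlib

def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)

    def add(self, node):
        bisect.insort(self.points, (h(node), node))

    def lookup(self, key):
        # First node clockwise from the key's position on the ring.
        i = bisect.bisect(self.points, (h(key), ''))
        return self.points[i % len(self.points)][1]

ring = Ring(['node-a', 'node-b', 'node-c'])
keys = ['user%d' % i for i in range(1000)]
before = dict((k, ring.lookup(k)) for k in keys)
ring.add('node-d')
moved = sum(1 for k in keys if ring.lookup(k) != before[k])
print(moved, 'of', len(keys), 'keys moved')   # only a fraction, not all of them
&lt;/pre&gt;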
&lt;br /&gt;
Diff between inbox searching and shopping cart...Dynamo was a key-value store, no search. Cassandra needs to do search on a node....optimized for writes, but when you do a search, it gets sucked into a node’s memory to do the searching. Not doing a lot of indexing, just metadata for where messages are stored. If doing relatively infrequent searches, might as well load it into memory....how often will they search their inbox? Otherwise, if infrequent, why bother generating an index? Specialized systems for solving specific problems. How much space there is in the design when you make specialized solutions. General solutions are either limited or broken. Compared to Ceph, this is simpler but optimized for a specific use case. Always in comp sci we want to make a general solution, but in experience, the only way to get systems useful for many different scenarios is to use them in many different scenarios. That is why large organizations have success at solving internal problems and then export them. Google had the scale, had the resources, but was not in the business of selling it (internal use only). Amazon did the opposite. Google is catching up to AWS. They don’t have the correct culture; it requires a major culture shift. &lt;br /&gt;
&lt;br /&gt;
Seeing patterns and how infrastructure is developing. The goal, if you see a new paper...how it fits in with the rest...if you see a design, let’s build this? What problem are you solving...you can go out there and have an idea of the systems that are there and what problem is being solved there....rather than building your own system. Solve a problem that doesn’t fit, maybe not an off-the-shelf solution, but need to be able to recognize it...availability, consistency....get away from “this is the perfect system or best solution.” What is right for the problem you are trying to solve....what infrastructure do you have? Do you want to use that infrastructure?&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-07&amp;diff=21959</id>
		<title>DistOS 2018F 2018-11-07</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-07&amp;diff=21959"/>
		<updated>2018-11-09T02:56:14Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
In-Class Lecture notes:&lt;br /&gt;
&lt;br /&gt;
MapReduce:&lt;br /&gt;
&lt;br /&gt;
Parallel computations&lt;br /&gt;
Some things are inherently serial, so some computations will take the same time regardless of the number of cores or processors.&lt;br /&gt;
Parallelization is hard; some amount of coordination between the systems is needed to allow them to work together.&lt;br /&gt;
Systems that scale to large numbers of cores and systems: the ones that minimize coordination succeed. &lt;br /&gt;
For instance, GFS, they did a lot of things to reduce coordination to allow it to scale....does cause a bit of a mess but to clean up the mess would slow things down.&lt;br /&gt;
Models of computation that take advantage of these systems, MapReduce is the cleanest analysis. What sort of analysis do they talk about doing. Indexing, counting across things, grep....a search engine is just a big grep. &lt;br /&gt;
Why not run grep individually on all the computers, why do you need the framework?&lt;br /&gt;
Coordinating and consolidating the results from the machines. All MapReduce is: computations done on pieces of data and then combined together.&lt;br /&gt;
Functional programming aspect....stateless...don’t maintain state. The state of variables does not change (only the binding can change). In a parallel system, coordination means maintaining state. If there is no state, you don’t need to coordinate. Map &amp;amp; Reduce...if stateful, might have to undo computations if they mess up, side-effects. Could not run the code over and over again on the same data. If made purely functional, the answer will be the same no matter how many times it is run. Fault tolerance. Duplicating work to make sure the overall computation finishes on time. Run multiple times to ensure the same answers. Aspects of fault tolerance show up when not doing functional programming. 10000 machines to run a job; computers can fail during the computation and you don’t care b/c we don’t care about maintaining state....the master server gathers everything together, a combining function to get results. When you do a search, everything you get has been pre-computed.&lt;br /&gt;
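&lt;br /&gt;
The model in miniature (a local toy, not Google’s framework): map is a pure function run independently per chunk with no shared state, so a failed or duplicated chunk can simply be re-run, and reduce merges the partial results at the end.&lt;br /&gt;
&lt;pre&gt;
# Toy MapReduce-style word count: stateless map per chunk, then one combine step.
from collections import Counter
from functools import reduce
from multiprocessing import Pool

def map_chunk(lines):
    # Pure function of its input: safe to re-run on failure or to run twice.
    return Counter(word for line in lines for word in line.split())

def reduce_counts(a, b):
    return a + b

if __name__ == '__main__':
    corpus = ['the quick brown fox', 'the lazy dog', 'the fox again'] * 1000
    chunks = [corpus[i:i + 500] for i in range(0, len(corpus), 500)]
    with Pool() as pool:
        partials = pool.map(map_chunk, chunks)             # the map phase, in parallel
    totals = reduce(reduce_counts, partials, Counter())    # the reduce phase
    print(totals['the'], totals['fox'])
&lt;/pre&gt;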
&lt;br /&gt;
MapReduce is an ad-hoc kind of thing that is not used anymore due to fundamental limitations, but it is the correct paradigm. Limitation: the problem has to fit into the map and reduce...only coordination on the reduce side, no coordination on the map side. TensorFlow is not embarrassingly parallel at all. Have to worry about interactions. What is the model to break them up? The unit is a Tensor, a multidimensional array. A lot of mathematics is defined on multidimensional arrays. TensorFlow will break up the computation into tensors. Functional programming in a sense, but in a different context. Doing mathematics: if you can define the math at a high level, the underlying system can refactor it at a high level. How it fits together is not our problem....divided up into Tensors, math ops on Tensors, and parallelism.&lt;br /&gt;
&lt;br /&gt;
AI is the primary task for TensorFlow. Why not do other things like this? Large-scale simulations are the other use case for clusters. They maintain a large amount of state; in those systems, partitioning is done based on state. &lt;br /&gt;
Game world...game parallelization. Divide the world into different places, going from one world to another could be going from one server to another. &lt;br /&gt;
&lt;br /&gt;
Why do we need insane computations for doing AI....optimizing it, and AI algorithms today are very stupid. Need lots and lots of data with lots and lots of samples. The learning is not advanced; the pattern abstraction is almost brute force. How many miles AI-driven cars have driven...to train a single model. How many miles does a person have to drive to learn how to drive?.....driving is a task based on a world model we have been building our whole lives. Self-driving cars: have to teach them how the world works...they don’t have a world model. Can you learn how the world works from driving a car? &lt;br /&gt;
&lt;br /&gt;
Check-pointing in MapReduce...why not?....failures that happen during the computation don’t matter, just restart and redo that part....but TensorFlow cares a lot more about that reliability thing. &lt;br /&gt;
The master: how long does a MapReduce job run? Not long, a few minutes to an hour tops. Parallelism has made everything quick. What kind of neural net is TensorFlow training? Facial recognition, recommendations, translations...models running for a long time with TBs of input for days or weeks...lots of state being created for the entire neural network and it needs to be saved along the way, so they implemented check-pointing. What do you care about saving? b/c saving all the state is expensive...save every hour, and save results that are good when the neural network is doing well. &lt;br /&gt;
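&lt;br /&gt;
A crude sketch of that checkpointing trade-off (made-up training loop, not TensorFlow’s API): save state only every so often, and only when the model actually improved, because dumping all the state constantly is too expensive.&lt;br /&gt;
&lt;pre&gt;
# Toy checkpointing for a long-running job; every name here is made up.
import pickle

def train(steps, checkpoint_every=1000):        # in practice: roughly every hour
    state = {'weights': 0.0}
    best_loss = float('inf')
    for step in range(1, steps + 1):
        state['weights'] += 0.001               # stand-in for one training step
        if step % checkpoint_every == 0:
            loss = abs(1.0 - state['weights'])  # stand-in for an evaluation run
            if loss &lt; best_loss:                # only keep checkpoints that improved
                best_loss = loss
                with open('ckpt-%d.pkl' % step, 'wb') as f:
                    pickle.dump(state, f)
    return state

train(5000)
&lt;/pre&gt;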
&lt;br /&gt;
Genetic algorithms...the next wave in AI. &lt;br /&gt;
&lt;br /&gt;
Fitness function: some layers are garbage, some have good fitness, and then you combine them together (using an operation called crossover)...mutation, flip some bits to reproduce solutions and then do it again...recompute fitness...an abstraction of natural evolution.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-28&amp;diff=21954</id>
		<title>DistOS 2018F 2018-11-28</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-28&amp;diff=21954"/>
		<updated>2018-11-06T19:13:21Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: Created page with &amp;quot;==Readings==   ==Notes==&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-26&amp;diff=21953</id>
		<title>DistOS 2018F 2018-11-26</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-26&amp;diff=21953"/>
		<updated>2018-11-06T19:13:12Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: Created page with &amp;quot;==Readings==   ==Notes==&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21952</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21952"/>
		<updated>2018-11-06T19:12:51Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 28, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tannenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al.,&amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Doward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-12|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-14|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-19|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-26|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-28|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21951</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21951"/>
		<updated>2018-11-06T19:12:33Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 26, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tannenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al.,&amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Doward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-12|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-14|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-19|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-26|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-19&amp;diff=21950</id>
		<title>DistOS 2018F 2018-11-19</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-19&amp;diff=21950"/>
		<updated>2018-11-06T19:12:02Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: Created page with &amp;quot;==Readings==  * Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.i...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21949</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21949"/>
		<updated>2018-11-06T19:11:43Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 19, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tannenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al.,&amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Doward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-12|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-14|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-19|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-14&amp;diff=21948</id>
		<title>DistOS 2018F 2018-11-14</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-14&amp;diff=21948"/>
		<updated>2018-11-06T19:11:15Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: Created page with &amp;quot;==Readings==  * [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21947</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21947"/>
		<updated>2018-11-06T19:10:55Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 14, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tanenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al., &amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et al., Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Dorward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-12|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-14|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-12&amp;diff=21946</id>
		<title>DistOS 2018F 2018-11-12</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-12&amp;diff=21946"/>
		<updated>2018-11-06T19:10:16Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: Created page with &amp;quot;==Readings==  * [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)] * [ht...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21945</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21945"/>
		<updated>2018-11-06T19:09:47Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 12, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tanenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al., &amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et al., Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Dorward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-12|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-07&amp;diff=21944</id>
		<title>DistOS 2018F 2018-11-07</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-07&amp;diff=21944"/>
		<updated>2018-11-06T19:08:37Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: Created page with &amp;quot;==Readings==  * [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)] * [https://www.useni...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21943</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21943"/>
		<updated>2018-11-06T19:07:46Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 7, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tanenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al., &amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et al., Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Dorward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-05&amp;diff=21942</id>
		<title>DistOS 2018F 2018-11-05</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-05&amp;diff=21942"/>
		<updated>2018-11-06T19:06:13Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture Notes:&lt;br /&gt;
&lt;br /&gt;
Lecture:&lt;br /&gt;
&lt;br /&gt;
Why did they build BigTable? Other apps besides the web crawler wanted to use GFS… they wanted BigTable to be fast and hold lots of data… a multidimensional sorted map (not a database).&lt;br /&gt;
&lt;br /&gt;
The ad people said that BigTable was dumb b/c they wanted to run queries to generate business reports… so now they have business reports. Every technology bears the fingerprints of its initial use cases… UNIX has it all over the place... weirdness everywhere.&lt;br /&gt;
&lt;br /&gt;
Here: what Google is determines the technology stack. How much code is at Google? The source code repository is in the billions of lines of code… What have they done? Not only complex apps, but the entire infrastructure for a distributed OS, services and everything, built internally mostly from scratch. Pioneers, so they now have the legacy code problems… Open source might have something better but they are locked in. They were at the forefront of building stuff at internet scale… built it and know how to do it. Now there is less competitive advantage b/c lots of orgs know how to build things that are similar, or even better.&lt;br /&gt;
&lt;br /&gt;
What’s BigTable built on top of? GFS and Chubby. Why not build from scratch? Chubby is specialized: what it does, it does well. One thing to note when looking at the papers: the Google technologies fit together… each builds on top of the others, which cuts down the amount of code they have to write. &lt;br /&gt;
&lt;br /&gt;
The core data structure is the SSTable: the crawler already stored stuff in SSTables, and BigTable was built on top to query them instead of using one-off specialized tools. How to make it database-like? Version the SSTables, then add index semantics and locality (a mapping) so you can request stuff, index it, and find it. &lt;br /&gt;
&lt;br /&gt;
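Rough sketch of the SSTable idea (toy code with made-up names, not Google’s actual implementation): an immutable run of sorted key/value pairs plus a sparse index from keys to positions, so a read is one index lookup plus a short scan.&lt;br /&gt;
&lt;pre&gt;
# Toy SSTable: immutable, sorted, with a sparse index (illustrative sketch only).
import bisect

class SSTable:
    def __init__(self, items, index_every=2):
        self.rows = sorted(items)          # (key, value) pairs, never modified
        self.positions = list(range(0, len(self.rows), index_every))
        self.index_keys = [self.rows[p][0] for p in self.positions]

    def get(self, key):
        # find the last index entry at or before key, then scan forward a little
        j = bisect.bisect_right(self.index_keys, key) - 1
        if j &lt; 0:
            return None
        for k, v in self.rows[self.positions[j]:]:
            if k == key:
                return v
            if k &gt; key:
                return None
        return None

t = SSTable([(3, 30), (1, 10), (2, 20)])
print(t.get(2))   # 20
print(t.get(5))   # None
&lt;/pre&gt;
&lt;br /&gt;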
Spanner was built on top of Colossus instead of GFS… the key problem with GFS is that it is batch-oriented... random access is not well optimized. Why Colossus? Don’t know. The systems being discussed are large systems, which is why this course is not a coding course. Big systems. What that means: papers published about big systems gloss things over. They don’t talk about everything, they have to leave a lot out. The paper says how to solve the key problem (how to implement SQL transactions on a global scale); how to implement the entire system, they don’t go through that. &lt;br /&gt;
&lt;br /&gt;
Memtable (5.3): what is it? It works like a log-structured file system. Writes don’t overwrite things that were written before: there might be 5 versions of a file, one of which is current and the rest old. Whatever is in memory gets dumped out again and again, but with a log structure what you do is keep appending new, new, new and then clean up the old stuff. There is a compaction step: take the current stuff, put it at the end, then clean up... this is an important pattern to see in these systems (see the sketch after the compaction discussion below). &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
How does a normal file system work? It has blocks. &lt;br /&gt;
Hard drives (mechanical): sequential reads and writes are fast, while random access is slow because moving the head between tracks is a slow operation. As density increases, drives get faster at the same RPM. &lt;br /&gt;
&lt;br /&gt;
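Ballpark arithmetic (my own illustrative numbers, not from the papers) for why random access hurts so much on a mechanical drive:&lt;br /&gt;
&lt;pre&gt;
# ~10 ms per random seek means roughly 100 small reads per second
seek_ms = 10.0
block_kb = 4.0
random_mb_s = (1000.0 / seek_ms) * block_kb / 1024.0   # about 0.4 MB/s of 4 KB random reads
sequential_mb_s = 150.0                                # typical streaming rate
print(random_mb_s, sequential_mb_s / random_mb_s)      # ~0.39 MB/s, ~384x slower than sequential
&lt;/pre&gt;
&lt;br /&gt;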
How to speed up? Keep data in RAM and then periodically write it to disk. Good for performance until you have a problem, like the system losing power. The big risk isn’t just losing some data; what if you corrupt the on-disk data structure? Corrupted, you can lose all the data on the disk. fsck (or chkdsk on Windows): all it does is go through and see if the file system is clean. If it is in a dirty state, it walks the file system and checks consistency... checking metadata blocks. The time on modern drives is enormous if you have to check the whole drive; on RAID arrays it takes so long you can’t do this, which is bad.&lt;br /&gt;
&lt;br /&gt;
                   &lt;br /&gt;
So, the idea of the journaled FS: set aside a portion of the disk where you do all your writes first… written sequentially… to make it fast. Once committed, if you lose power all you have to do is replay the journal. Great system when the workload is dominated by reads, but if you have a lot of writes this is bad b/c you have doubled the writes. What is the solution? Eliminate everything except the journal: just write records one after the other. A block might be on disk 50 times b/c you have written multiple versions, but then you go back later and reclaim the space.&lt;br /&gt;
                   &lt;br /&gt;
So the longer you go without cleanup, without committing to disk, the faster the performance. The new version supersedes the old one: with 20 copies of A and one version of B, cleanup keeps 1 copy of A and 1 of B in a pile and throws the rest out. BigTable does almost the same thing: it keeps writing new data while the old data might still be valid. This is the classic strategy when you have lots of writes and want to avoid random access, using sequential writes instead.&lt;br /&gt;
&lt;br /&gt;
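To make the log-structured pattern concrete, here is a toy append-only store with compaction (my sketch, made-up names, not how BigTable is actually written): every write is appended to the end of the log, a read takes the newest record for a key, and compaction rewrites the log keeping only the latest version of each key.&lt;br /&gt;
&lt;pre&gt;
# Toy log-structured store with compaction (illustrative sketch only).
class LogStore:
    def __init__(self):
        self.log = []                   # append-only list of (key, value) records

    def put(self, key, value):
        self.log.append((key, value))   # sequential write; old versions stay behind

    def get(self, key):
        # the newest record for a key wins; older ones are garbage until compaction
        for k, v in reversed(self.log):
            if k == key:
                return v
        return None

    def compact(self):
        # keep only the latest version of each key and drop superseded records
        latest = {}
        for k, v in self.log:
            latest[k] = v
        self.log = list(latest.items())

s = LogStore()
for i in range(20):
    s.put(1, i)          # 20 versions of key 1, like the 20 stale copies of A
s.put(2, 99)             # one version of key 2, like B
s.compact()
print(len(s.log), s.get(1), s.get(2))   # 2 19 99
&lt;/pre&gt;
&lt;br /&gt;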
So Spanner: they wanted a database with transactions. What is the hard part of a transaction? Having a consistent state. You have to have the same view globally, and globally literally means the globe. You want a consistent view of the data: if one place has the seats and another place has the payments and they aren’t consistent, you are selling seats you don’t have. How does time help you solve that? It creates a concept of before and after… a global order of events, which is what you need for global consistency. In a distributed system, ordering is not consistent by default, so if you have to maintain a global ordering you are screwed: one node sees A happen before B and another sees B happen before A. If it is inconsistent, you will have different versions of the data and not know which one came first. This is bad. By having synced clocks, and knowing the skew between them, you can assign a timestamp and compare what happened first. If there is any possibility that the ordering is ambiguous, they know it: they wait out the window of uncertainty until things settle down and then do the operation. Getting a global ordering of events by using accurate clocks. From a physics standpoint this is bizarre: relativity says simultaneity is undefined, it depends on the frame of reference. GPS is atomic clocks in space sending radio signals, essentially.&lt;br /&gt;
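&lt;br /&gt;
A tiny sketch of that wait-out-the-uncertainty idea (toy code with made-up numbers, not Spanner’s real API): a node knows a bound on its clock error, stamps a commit with the latest possible current time, and waits until that timestamp is definitely in the past before making the write visible, so every commit that starts later gets a larger timestamp.&lt;br /&gt;
&lt;pre&gt;
# Toy commit-wait on a bounded-uncertainty clock (illustrative sketch only).
import time

CLOCK_ERROR = 0.007      # assume this clock is within +/- 7 ms of true time

def now_interval():
    t = time.time()
    return (t - CLOCK_ERROR, t + CLOCK_ERROR)   # [earliest, latest] possible true time

def commit():
    timestamp = now_interval()[1]      # latest possible true time right now
    # commit wait: block until the earliest possible true time has passed the
    # chosen timestamp, so any later commit must pick a strictly larger one
    while now_interval()[0] &lt;= timestamp:
        time.sleep(0.001)
    return timestamp                   # now safe to make the write visible

ts1 = commit()
ts2 = commit()
print(ts2 &gt; ts1)   # True: timestamps agree with the real order of the commits
&lt;/pre&gt;&lt;/div&gt;</summary>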
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-05&amp;diff=21941</id>
		<title>DistOS 2018F 2018-11-05</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-05&amp;diff=21941"/>
		<updated>2018-11-06T19:05:54Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture Notes:&lt;br /&gt;
&lt;br /&gt;
Lecture:&lt;br /&gt;
&lt;br /&gt;
Why did they build big table: &lt;br /&gt;
    Other apps other than a web crawler wanted to use the GFS….BigTable, wanted it to be fast, lots of data … multidimensional sorted map (not a database)&lt;br /&gt;
&lt;br /&gt;
The Ad people said that BigTable was dumb b/c they wanted to do queries to generate business reports….so they now have business reports. Every technology bears the fingerprints of the initial use cases….unix has it all over the place...weirdness everywhere.&lt;br /&gt;
&lt;br /&gt;
Here: what is Google determines the technology stack. How much code is at Google, source code repository is in the billions of lines of code….what they have done? Not only complex apps, the entire infrastructure for distributed OS, services and everything built internally mostly from scratch. Pioneers, they now have the legacy code problems. …. Opensource might have something better but they are locked in. At the forefront of building stuff at web internet scale….built it and know how to do it. Now less competitive advantage b/c lots of orgs know how to build things similar, yet better.&lt;br /&gt;
&lt;br /&gt;
What’s big table built on-top of? GFS and Chubby. Why not build from scratch? Chubby is specialized, what it does, it does well. One thing to note: when look at papers, the Google techs fit together….building on top of other technologies. Advantages in the amount of code required to write. &lt;br /&gt;
&lt;br /&gt;
Data structure, the SSTable, the crawler stores stuff in SSTables and then this was built on top to query the SSTable rather specialized tools. How to make database-like. Versioning the SSTables and then adding index semantics and localities (a mapping).&lt;br /&gt;
Request stuff, how to index and find stuff. &lt;br /&gt;
&lt;br /&gt;
Spanner was built on top of Colossus instead of GFS….key problem with GFS, batch-oriented...random access is not well optimized. Why Colossus? Don’t know. The systems being discussed are large systems, why this course is not a coding course. Big systems. What that means, papers published about big systems, things were glossed over. Don’t talk about everything, have to leave a lot out. Paper is saying how to solve the problem. How to implement SQL transactions on a global scale. How to implement an entire system, they don’t go through that. &lt;br /&gt;
&lt;br /&gt;
Memtable (5.3): what is it? A log-structured file system. Writes, don’t overwrite things that have written directly. 5 versions of a file, which versions are the current one, the rest are old. Whatever is in memory, will dump again etc. but with a log structure, what you do is maintain new new new new new and then clean up the old stuff. Have a compaction thing. Take current stuff, put it at the end and the cleanup...this is an important pattern to see with of systems.   &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
How does a normal file system work? Have blocks &lt;br /&gt;
Hard drives (mechanical): sequential read and writes are fast. While random, moving head between tracks is a slow operation. Density increases, faster at the same rpm speed. &lt;br /&gt;
&lt;br /&gt;
How to speed up? Keep in ram and then periodically write to disk. Good for performance until you have a problem like a system loses power.  Big risk, might lose some data but what if you corrupt the data structure. Corrupted, lose all the data on disk. fsck (chkdsk on Windows), all they do, go through and see if it is a clean file system. If in a dirty state, walking the file system and ensure consistency...checking metadata blocks. The times on modern drives is enormous if have to check the whole drive. RAID arrays, lots of time, can’t do this, this is bad.&lt;br /&gt;
&lt;br /&gt;
                   &lt;br /&gt;
So idea of the journaled FS. Set aside a portion of the disk where you do all your writes first….written sequentially….to do it fast. Now committed, if lose power, all you have to do is replay the journal. Great system when workload is dominated by reads. But if you have a lot of writes, this is bad b/c have doubled the writes. What is the solution? Eliminate everything except the journal. Just write records one after the other. Block number might be on disk 50 times, b/c have written multiple versions but then, you go back and later reclaim it.&lt;br /&gt;
                   &lt;br /&gt;
So the longer you go without cleanup, without committing to disk, the faster the performance. New report supersedes the old report. 20 copies of A and one version of B...need to do a cleanup 1 copy for A, 1 for B in a pile and then throw the rest out. Big table does almost the same thing. Keeps writing new data and the old data might still be valid. Classic strategy when you have lots of writes and you want to avoid random access, using sequential writes instead.&lt;br /&gt;
&lt;br /&gt;
So Spanner, they wanted a database with transactions. What is the hard part of a transaction? Have a consistent state. Have to have the same view globally, globally literally means the globe. Want to have a consistent view of the data. One place has seats, the other place has payments, if not consistent, selling payments for seats you don’t have. How does time help you solve that? Creates a concept of a before and after….what is the global order of events. That is what you need for global consistencies. In a distributed system, ordering is not consistent by default. So if you have to maintain a global ordering, you are screwed. A will happen before B and B will happen before A. If it is inconsistent, will have different versions of the data, which one came first etc. This is bad. By having synced clocks, and the skew between them, can assign a timestamp and compare what happened first. If there is any possibility that the ordering is ambiguous, they know it. The window of uncertainty until things settle down and then do the operation. Getting a global ordering of events by using accurate clocks. From a physics standpoint is bizarre. Relativity. Simultaneous events are undefined, depends on the frame of reference. Ordering. GPS, atomic clocks in space sending radio signals essentially.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21940</id>
		<title>Distributed OS: Fall 2018</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=Distributed_OS:_Fall_2018&amp;diff=21940"/>
		<updated>2018-11-06T19:04:44Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* November 5, 2018 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Course Outline==&lt;br /&gt;
&lt;br /&gt;
[[Distributed OS: Fall 2018 Course Outline|Here]] is the course outline.&lt;br /&gt;
&lt;br /&gt;
==Project Help==&lt;br /&gt;
&lt;br /&gt;
To develop your literature review or research proposal, start with a single research paper that you find interesting and that is related to distributed operating systems in some way.&lt;br /&gt;
&lt;br /&gt;
To begin selecting a paper, I suggest that you:&lt;br /&gt;
* search on Google Scholar using keywords relating to your interests, and/or&lt;br /&gt;
* browse the proceedings of major conferences that publish work related to distributed operating systems.&lt;br /&gt;
&lt;br /&gt;
The main operating system conferences are [https://www.usenix.org/conferences/byname/179 OSDI] and ACM SOSP ([http://sosp.org/ sosp.org],[http://dl.acm.org/event.cfm?id=RE208&amp;amp;CFID=475138068&amp;amp;CFTOKEN=43996267 ACM DL]).  Note that not all the work here is on distributed operating systems!  Also, many other conferences publish some work related to distributed operating systems, e.g. [https://www.usenix.org/conferences/byname/178 NSDI].&lt;br /&gt;
&lt;br /&gt;
To help you write a literature review or the background of a research paper, read the following:&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://www.writing.utoronto.ca/advice/specific-types-of-writing/literature-review Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
==Assigned Readings==&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-10|September 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Early Internet:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/kahn1972-resource.pdf Robert E. Kahn, &amp;quot;Resource-Sharing Computer Communications Networks&amp;quot; (1972)]  [http://dx.doi.org/10.1109/PROC.1972.8911 (DOI)]&lt;br /&gt;
* [https://archive.org/details/ComputerNetworks_TheHeraldsOfResourceSharing Computer Networks: The Heralds of Resource Sharing (1972)] - video&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-12|September 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Mother of All Demos:&lt;br /&gt;
* [http://www.dougengelbart.org/firsts/dougs-1968-demo.html Doug Engelbart Institute, &amp;quot;Doug&#039;s 1968 Demo&amp;quot;].  You may want to focus on the [http://dougengelbart.org/events/1968-demo-highlights.html highlights] or the [http://sloan.stanford.edu/MouseSite/1968Demo.html annotated clips].&lt;br /&gt;
* [http://en.wikipedia.org/wiki/The_Mother_of_All_Demos Wikipedia&#039;s page on &amp;quot;The Mother of all Demos&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-17|September 17, 2018]]===&lt;br /&gt;
&lt;br /&gt;
The Alto:&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/alto.pdf Thacker et al., &amp;quot;Alto: A Personal computer&amp;quot; (1979)]  ([https://archive.org/details/bitsavers_xeroxparcttoAPersonalComputer_6560658 archive.org])&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-19|September 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Multics &amp;amp; UNIX:&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Multics Wikipedia article on Multics]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/unix.pdf Dennis M. Ritchie and Ken Thompson, &amp;quot;The UNIX Time-Sharing System&amp;quot; (1974)]&lt;br /&gt;
&lt;br /&gt;
Optional: Browse around [http://www.multicians.org/ the Multicians website].&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-24|September 24, 2018]]===&lt;br /&gt;
&lt;br /&gt;
LOCUS &amp;amp; Sprite:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/walker-locus.pdf Bruce Walker et al., &amp;quot;The LOCUS Distributed Operating System.&amp;quot; (1983)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/ousterhout-sprite.pdf John Ousterhout et al., &amp;quot;The Sprite Network Operating System&amp;quot; (1987)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-09-26|September 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tanenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-01|October 1, 2018]]===&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-03|October 3, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al.,&amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
===October 8, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Thanksgiving!&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-10|October 10, 2018]]===&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Doward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-15|October 15, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1 review&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===October 17, 2018===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Test 1&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-29|October 29, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.usenix.org/events/osdi06/tech/weil.html Weil et al., Ceph: A Scalable, High-Performance Distributed File System (OSDI 2006)].&lt;br /&gt;
* [http://pdos.csail.mit.edu/~strib/docs/tapestry/tapestry_jsac03.pdf Zhao et al, &amp;quot;Tapestry: A Resilient Global-Scale Overlay for Service Deployment&amp;quot; (JSAC 2003)]&lt;br /&gt;
&lt;br /&gt;
Background (optional but helpful):&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Distributed_hash_table Wikipedia&#039;s article on Distributed Hash Tables]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Kademlia Wikipedia&#039;s article on Kademlia]&lt;br /&gt;
* [http://en.wikipedia.org/wiki/Tapestry_%28DHT%29 Wikipedia&#039;s article on Tapestry]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-10-31|October 31, 2018]]===&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Project Proposal Due&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-05|November 5, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-07|November 7, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/mapreduce.html Dean &amp;amp; Ghemawat, &amp;quot;MapReduce: Simplified Data Processing on Large Clusters&amp;quot; (OSDI 2004)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi Martin Abadi et al., &amp;quot;TensorFlow: A System for Large-Scale Machine Learning&amp;quot; (OSDI 2016)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-12|November 12, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf DeCandia et al., &amp;quot;Dynamo: Amazon’s Highly Available Key-value Store&amp;quot; (SOSP 2007)]&lt;br /&gt;
* [http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf Lakshman &amp;amp; Malik, &amp;quot;Cassandra - A Decentralized Structured Storage System&amp;quot; (LADIS 2009)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-14|November 14, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* [http://static.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf Beaver et al., &amp;quot;Finding a needle in Haystack: Facebook’s photo storage&amp;quot; (OSDI 2010)]&lt;br /&gt;
* [https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar Muralidhar et al., &amp;quot;f4: Facebook&#039;s Warm BLOB Storage System&amp;quot; (OSDI 2014)]&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-19|November 19, 2018]]===&lt;br /&gt;
&lt;br /&gt;
* Anderson, &amp;quot;BOINC: A System for Public-Resource Computing and Storage&amp;quot; (Grid Computing 2004) [http://dx.doi.org/10.1109/GRID.2004.14 (DOI)] [http://ieeexplore.ieee.org.proxy.library.carleton.ca/stamp/stamp.jsp?tp=&amp;amp;arnumber=1382809 (Proxy)]&lt;br /&gt;
&lt;br /&gt;
===November 21, 2018===&lt;br /&gt;
&lt;br /&gt;
Test 2&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-26|November 26, 2018]]===&lt;br /&gt;
&lt;br /&gt;
===[[DistOS 2018F 2018-11-28|November 28, 2018]]===&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-05&amp;diff=21939</id>
		<title>DistOS 2018F 2018-11-05</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-11-05&amp;diff=21939"/>
		<updated>2018-11-06T19:04:26Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Lecture:&lt;br /&gt;
&lt;br /&gt;
Why did they build big table: &lt;br /&gt;
    Apps other than the web crawler wanted to use GFS….BigTable: they wanted it to be fast, to hold lots of data … a multidimensional sorted map (not a database)&lt;br /&gt;
&lt;br /&gt;
The Ad people said that BigTable was dumb b/c they wanted to do queries to generate business reports….so they now have business reports. Every technology bears the fingerprints of the initial use cases….unix has it all over the place...weirdness everywhere.&lt;br /&gt;
&lt;br /&gt;
Here, what Google is determines the technology stack. How much code is at Google? The source code repository is in the billions of lines of code….what have they done? Not only complex apps but the entire infrastructure for a distributed OS: services and everything, built internally, mostly from scratch. Pioneers, so they now have the legacy code problems…. Open source might have something better, but they are locked in. They were at the forefront of building stuff at internet scale….they built it and know how to do it. Now there is less competitive advantage b/c lots of orgs know how to build similar, or better, things.&lt;br /&gt;
&lt;br /&gt;
What’s BigTable built on top of? GFS and Chubby. Why not build it from scratch? Chubby is specialized: what it does, it does well. One thing to note when you look at the papers: the Google technologies fit together….each builds on top of other technologies, which cuts down the amount of code required to write. &lt;br /&gt;
&lt;br /&gt;
The core data structure is the SSTable: the crawler stores stuff in SSTables, and BigTable was built on top to query them, rather than using specialized one-off tools. How do you make it database-like? Version the SSTables and then add index semantics and locality (a mapping).&lt;br /&gt;
Request handling: how to index and find things. &lt;br /&gt;
&lt;br /&gt;
Spanner was built on top of Colossus instead of GFS….the key problem with GFS is that it is batch-oriented...random access is not well optimized. Why Colossus? We don’t really know. The systems being discussed are large systems, which is why this course is not a coding course. Big systems. What that means: papers published about big systems gloss things over. They don’t talk about everything; they have to leave a lot out. The paper says how they solved the problem, e.g. how to implement SQL transactions on a global scale; how the entire system is implemented, they don’t go through. &lt;br /&gt;
&lt;br /&gt;
Memtable (5.3): what is it? It behaves like a log-structured file system. On writes, you don’t overwrite things in place; there may be 5 versions of a record, one current and the rest old. Whatever is in memory gets dumped out periodically, but with a log structure what you do is keep appending new, new, new and then clean up the old stuff later. There is a compaction step: take the current stuff, write it out at the end, and reclaim the rest...this is an important pattern to watch for in these systems (see the sketch below).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
How does a normal file system work? It has blocks. &lt;br /&gt;
Hard drives (mechanical): sequential reads and writes are fast, while random access is slow because moving the head between tracks is a slow operation. As density increases, drives get faster at the same RPM. &lt;br /&gt;
&lt;br /&gt;
How to speed things up? Keep data in RAM and periodically write it to disk. Good for performance until you have a problem, like the system losing power. The big risk is not just losing some data: what if you corrupt the data structure? If it is corrupted, you can lose all the data on the disk. fsck on Unix and chkdsk on Windows just go through and see whether the file system is clean. If it is in a dirty state, they walk the file system and check consistency...checking metadata blocks. The time this takes on modern drives is enormous if you have to check the whole drive; on large RAID arrays it takes so long that you simply can’t do it, which is bad.&lt;br /&gt;
&lt;br /&gt;
                   &lt;br /&gt;
So, the idea of the journaled FS: set aside a portion of the disk where you do all your writes first….written sequentially….to make it fast. Once committed, if you lose power all you have to do is replay the journal. A great system when the workload is dominated by reads, but if you have a lot of writes this is bad b/c you have doubled the writes. What is the solution? Eliminate everything except the journal: just write records one after the other. A given block might be on disk 50 times, b/c you have written multiple versions, but you go back later and reclaim the old ones.&lt;br /&gt;
                   &lt;br /&gt;
So the longer you go without cleanup, the better the write performance. A new record supersedes the old record. With 20 copies of A and one version of B...when you do a cleanup you keep 1 copy of A and 1 of B and throw the rest out. BigTable does almost the same thing: it keeps writing new data while the old data might still be valid. A classic strategy when you have lots of writes and want to avoid random access: use sequential writes instead.&lt;br /&gt;
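&lt;br /&gt;
A minimal sketch of that append-then-compact pattern (illustrative Python, not BigTable’s actual code): every write is a sequential append, reads take the newest record, and compaction rewrites only the latest value for each key:&lt;br /&gt;
 # toy log-structured store: append only, compaction keeps the newest value per key&lt;br /&gt;
 class LogStore:&lt;br /&gt;
     def __init__(self):&lt;br /&gt;
         self.log = []                      # (key, value) records, oldest first&lt;br /&gt;
     def put(self, key, value):&lt;br /&gt;
         self.log.append((key, value))      # sequential append, never overwrite in place&lt;br /&gt;
     def get(self, key):&lt;br /&gt;
         for k, v in reversed(self.log):    # scan backwards so the most recent record wins&lt;br /&gt;
             if k == key:&lt;br /&gt;
                 return v&lt;br /&gt;
         return None&lt;br /&gt;
     def compact(self):&lt;br /&gt;
         latest = {}&lt;br /&gt;
         for k, v in self.log:              # later records overwrite earlier ones&lt;br /&gt;
             latest[k] = v&lt;br /&gt;
         self.log = list(latest.items())    # old versions reclaimed&lt;br /&gt;
With 20 stale copies of A and one of B, compact() keeps one record of each, which is exactly the cleanup step described above.&lt;br /&gt;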
&lt;br /&gt;
So Spanner: they wanted a database with transactions. What is the hard part of a transaction? Having a consistent state. You have to have the same view globally, and globally here literally means the globe. One place has seats, the other place has payments; if they are not consistent, you are taking payments for seats you don’t have. How does time help you solve that? It creates a concept of before and after….a global order of events, which is what you need for global consistency. In a distributed system, ordering is not consistent by default, so if you have to maintain a global ordering you are screwed: one node sees A happen before B while another sees B happen before A. If the ordering is inconsistent, you end up with different versions of the data and no way to say which came first. This is bad. By having synced clocks with a known bound on the skew between them, you can assign a timestamp to each operation and compare what happened first. If there is any possibility that the ordering is ambiguous, they know it: wait out the window of uncertainty and then do the operation. Getting a global ordering of events by using accurate clocks. From a physics standpoint this is bizarre: relativity says simultaneity is undefined, it depends on the frame of reference. GPS is atomic clocks in space sending radio signals, essentially.&lt;br /&gt;
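&lt;br /&gt;
A rough sketch of that commit-wait idea (illustrative Python, assuming a TrueTime-style clock that returns a timestamp interval; apply_write and EPSILON are placeholders, not Spanner’s actual API):&lt;br /&gt;
 import time&lt;br /&gt;
 EPSILON = 0.007                             # assumed bound on clock uncertainty, in seconds&lt;br /&gt;
 def truetime_now():&lt;br /&gt;
     t = time.time()&lt;br /&gt;
     return t - EPSILON, t + EPSILON         # real time is guaranteed to lie in this interval&lt;br /&gt;
 def apply_write(write, ts):&lt;br /&gt;
     pass                                    # placeholder: apply the mutation at timestamp ts&lt;br /&gt;
 def commit(write):&lt;br /&gt;
     earliest, latest = truetime_now()&lt;br /&gt;
     commit_ts = latest                      # pick a timestamp beyond the uncertainty window&lt;br /&gt;
     apply_write(write, commit_ts)&lt;br /&gt;
     # commit wait: do not acknowledge until commit_ts is definitely in the past, so any&lt;br /&gt;
     # later transaction gets a strictly larger timestamp -- a global order of events&lt;br /&gt;
     time.sleep(max(0.0, commit_ts - truetime_now()[0]))&lt;br /&gt;
     return commit_ts&lt;/div&gt;</summary>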
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-31&amp;diff=21924</id>
		<title>DistOS 2018F 2018-10-31</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-31&amp;diff=21924"/>
		<updated>2018-11-01T00:11:07Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [https://www.usenix.org/conference/osdi12/technical-sessions/presentation/corbett Corbett et al., &amp;quot;Spanner: Google’s Globally-Distributed Database&amp;quot; (OSDI 2012)]&lt;br /&gt;
* [http://research.google.com/archive/bigtable-osdi06.pdf Chang et al., &amp;quot;BigTable: A Distributed Storage System for Structured Data&amp;quot; (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
GFS: Why do they want to store stuff?&lt;br /&gt;
&lt;br /&gt;
What is it being used for?&lt;br /&gt;
&lt;br /&gt;
The search engine: a copy of the web, for the crawler and the indices built from the crawl. That motivated the design. Bring in lots of data and do something with it. The crawler goes out and downloads everything it can find. What kind of file system do you want? Lots of small files? No, not efficient. GFS had an atomic record append. Find things and put together a bunch of pages, a whole data structure...a record, an application object: what did the crawler find? Atomic append, got it, saved, done. That appending matters b/c it is not a byte stream, it is application objects appended one after another. &lt;br /&gt;
If you are crawling along and try to do a write and there is a problem with the write....you do it again and try a few times...how many times did you write it? It got written at least once. Record append gives no guarantee about exactly where at the end your record lands or how many copies of it there are; the application level has to detect duplicates, which makes sense in the context of a very large web crawler. Enormous files with lots of the internet in them, in one file. Large files. That informs the design: what is the chunk size? 64MB (designed for a large web crawler, want to save everything). Lots of files and metadata...no, this is important b/c they did a lot of work to minimize metadata by design choice....One master server stored all the metadata and you want it to be fast, it cannot be the bottleneck, so it is loaded up with RAM etc. and runs from memory: a metadata lookup, and then you go to the chunk server. &lt;br /&gt;
Ceph has an entire cluster to dynamically allocate metadata....GFS has one metadata server with replicas and that’s it. Seems dumb, why? Simpler, ridiculously simple...it is not a general purpose file system, not POSIX compatible, just developed for a web crawler. 64MB chunks and one metadata server...optimized for atomic append operations on objects that may be replicated. Why don’t they use GFS now? It is optimized for batch processing...build an index and make it available to the search engine....but now everything is oriented towards fresh data. Still, GFS is an interesting design from the perspective of specialized storage, designed for a specific application....lots of trade-offs, high performance but limited functionality. If you are doing a read, what are you doing? What is the process? Ask the master first which chunk you are looking for....great, here is the info on the chunks....for a byte range it gives the chunks, then the client gets the data from the chunkservers as fast as it can digest it. The master server is not involved most of the time. The coarse-grained nature of the chunks....at 64MB, one small metadata request covers a large amount of data, an amplification factor...the leverage that lets one master scale (see the sketch below). &lt;br /&gt;
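&lt;br /&gt;
A minimal sketch of that read path (illustrative Python; master.lookup and the chunkserver read call are assumed names, not the real GFS interface): the client turns a byte offset into a chunk index, asks the master once for the chunk handle and replica locations, then pulls data straight from a chunkserver:&lt;br /&gt;
 CHUNK_SIZE = 64 * 1024 * 1024     # 64 MB chunks, coarse on purpose&lt;br /&gt;
 def gfs_read(master, chunkservers, file_id, offset, length):&lt;br /&gt;
     data = bytearray()&lt;br /&gt;
     while length:&lt;br /&gt;
         chunk_index = offset // CHUNK_SIZE&lt;br /&gt;
         chunk_offset = offset % CHUNK_SIZE&lt;br /&gt;
         # one tiny metadata RPC, answered from RAM on the master&lt;br /&gt;
         handle, replicas = master.lookup(file_id, chunk_index)&lt;br /&gt;
         n = min(length, CHUNK_SIZE - chunk_offset)&lt;br /&gt;
         # the bulk data moves between client and chunkserver; the master is not involved&lt;br /&gt;
         data += chunkservers[replicas[0]].read(handle, chunk_offset, n)&lt;br /&gt;
         offset += n&lt;br /&gt;
         length -= n&lt;br /&gt;
     return bytes(data)&lt;br /&gt;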
&lt;br /&gt;
Writes are more complicated .... three is the magic number (3 replicas)...send out, wait for ack...might need to re-transmit until done....how big of a buffer do you need? A good size, several GB of RAM. Crawling application, use a lot of systems all writing to the same file, order in which they will be appended...who knows but order doesn’t matter...why it is a weird kind of file.&lt;br /&gt;
&lt;br /&gt;
Chubby, what is Chubby? A file system: a hierarchical key-value store. Can you store big files? No, 256k....but it’s a file system....why not use a regular file system, why Chubby? Consistency: a consistent distributed file system....a read and a write will look the same at any time; readers and writers see the same view. The systems before this were not completely consistent....it matters b/c it is going to be used for locks...lock files are fine on an individual system, but here you have a distributed file system that is completely consistent so it can be used for locking....not easy, kinda crazy. They use Paxos....the same state for everyone. How many failures can you tolerate? With 5 machines you can tolerate two, b/c the Paxos rule is 2n+1 replicas to tolerate n failures....a high level of reliability (see the sketch below).&lt;br /&gt;
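&lt;br /&gt;
The 2n+1 arithmetic as a tiny sketch (illustrative Python; the assumption is simple majority quorums, which is the idea behind a Chubby cell):&lt;br /&gt;
 def replicas_for(failures_tolerated):&lt;br /&gt;
     return 2 * failures_tolerated + 1   # 2n+1 replicas tolerate n failures&lt;br /&gt;
 def majority(total_replicas):&lt;br /&gt;
     return total_replicas // 2 + 1      # smallest group that overlaps every other majority&lt;br /&gt;
 # replicas_for(2) == 5 and majority(5) == 3: any two majorities share a node,&lt;br /&gt;
 # which is why every reader can be shown the same agreed value&lt;br /&gt;
&lt;br /&gt;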
GFS when it wants to elect a master at a cell....go to chubby, find out who the master is and start talking to the master....if master is down or not responding...ask chubby, elects a new master....reliable backbone to elect who is in-charge....when master fails, switch over and transmit state to everyone quickly....chubby is how it was done. &lt;br /&gt;
&lt;br /&gt;
Paxos does not scale...5 servers, sure; go to 50? NO, it is a chatty protocol...what you truly need consistency for, you can get, but you must pay for it...so the rest of the system must be less consistent. If you force consistency on everything all the time, you never get performance. Paxos is the algorithm Chubby uses to coordinate state in the file system. Managing uncertainty....everyone needs the same state...Paxos maintains a replicated data structure and agrees on updates to it....you can always ask Chubby for the current value and get the same value from any of the nodes...there might be a delay...it doesn’t know yet...you wait, and when it gives an answer, it is the correct answer.&lt;br /&gt;
&lt;br /&gt;
Sequencer in Chubby: consensus....what is the ordering of the states? Who goes first, second, third and fourth....if you enforce an ordering, you will pay a price. You can do this with a lock generation number. Ordering, temporal ordering, is more difficult than consensus on state. &lt;br /&gt;
&lt;br /&gt;
Hadoop ZooKeeper...the things people built after Google published...a whole set of technologies....not identical to GFS but inspired by GFS&lt;br /&gt;
&lt;br /&gt;
How does Hadoop compare to AWS....amazon has own version of this stuff. Can go to Amazon and tell them to give you a bunch of Vms and can deploy Hadoop.....what AWS does, they can do it for you. What they offer S3 (cheapest storage)&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-29&amp;diff=21912</id>
		<title>DistOS 2018F 2018-10-29</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-29&amp;diff=21912"/>
		<updated>2018-10-29T16:44:10Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
* [http://research.google.com/archive/gfs-sosp2003.pdf Sanjay Ghemawat et al., &amp;quot;The Google File System&amp;quot; (SOSP 2003)]&lt;br /&gt;
* [https://www.usenix.org/legacy/events/osdi06/tech/burrows.html Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI 2006)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
Lecture Notes:&lt;br /&gt;
&lt;br /&gt;
Peer-to-peer file sharing&lt;br /&gt;
Napster -&amp;gt; classic silicon valley, business model that makes no sense&lt;br /&gt;
Napster said, make all music available but not going to actually make the music available, use other people’s machines. Napster maintained a central directory but files stored on individual computers.&lt;br /&gt;
People still wanted to exchange music files but no longer had a centralized database of all the songs. DHTs are the technology...what is a hash table? You give it a string and it gives you something else back...where can I download this? So if you implement a distributed hash table, no one system controls the hash table...to take it down you would have to shut down a huge number of nodes (see the sketch below).&lt;br /&gt;
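&lt;br /&gt;
A toy distributed hash table lookup (illustrative Python, consistent-hashing style; this is not Tapestry’s actual routing, just the core idea): the key is hashed onto a ring and the node whose identifier follows it is responsible, so every participant computes the same answer without a central directory:&lt;br /&gt;
 RING = 2 ** 32&lt;br /&gt;
 def node_for(key, node_ids):&lt;br /&gt;
     # node_ids: the integer identifiers of the participating nodes&lt;br /&gt;
     point = hash(key) % RING&lt;br /&gt;
     # responsible node: the first id clockwise from the key, wrapping around the ring&lt;br /&gt;
     return min(((node - point) % RING, node) for node in node_ids)[1]&lt;br /&gt;
 # node_for(some_key, membership_list) returns the same node on every machine that&lt;br /&gt;
 # knows the membership list, and only nearby keys move when a node joins or leaves&lt;br /&gt;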
&lt;br /&gt;
ISP throttling...now they poisoned it (record companies)....download something, what the heck is that?&lt;br /&gt;
Idea behind Tapestry.....DHT as a service....An overlay network, what is it? A network that sits on top of another network. We have the internet, isn’t the internet good enough? The point is that you want a different topology than the one you have: the underlying network is based on geographic and organizational boundaries.&lt;br /&gt;
Overlay: redo the topology...send a message to neighbours, where neighbours are defined by the overlay network. &lt;br /&gt;
i.e. Tor...defines its own topology.&lt;br /&gt;
Facebook....social networks....send a message out, make a post, it gets routed to friends, neighbours, and they get it. The topology of the social graph: who connects to whom. Tapestry is an overlay network, but does it ignore geography? No, it makes use of it. Peers are nodes that are close by....the network is the set of systems running Tapestry, but it also distinguishes nearby nodes from further-away nodes....it is network-topology aware.&lt;br /&gt;
Why? In large scale, will have a lot of node additions and deletions. Peer to Peer was, lets do file sharing but that was not their goal...building distributed applications of some kind. &lt;br /&gt;
Pond...wanted something that provided a layer of messaging to their own nodes, that’s Tapestry .... table of hosts will not work....this provides the layer to find each other and send messages to each other in an efficient way. Lets build an infrastructure for sending messages, do we build apps on top of things like this today?&lt;br /&gt;
&lt;br /&gt;
Keep an eye out for this....DHTs will keep appearing, but there is a fundamental issue with DHTs, the LimeWire problem...they do very badly with untrusted nodes....one node can mess everyone else up by giving bad info to the network...you can stop one or two nodes, but attackers have significant resources, like a botnet, to attack your system. &lt;br /&gt;
&lt;br /&gt;
Single Tapestry Node figure 6....the OS...nothing fancy, like the regular internet except they are maintaining state using a distributed hash table...&lt;br /&gt;
Botnets...if they hard-code the IP address, how the botnet gets taken down...one way IRC, google search...use social media like an Instagram account....comments on celebrity Instagram feeds....spies used to send messages with ads in a newspaper or number stations (ham radio).&lt;br /&gt;
Tapestry, trust issues b/c with trusted infrastructure there are better ways of doing it. &lt;br /&gt;
&lt;br /&gt;
Ceph:&lt;br /&gt;
Ceph is crazy, very complicated .... Ceph is out there, people are building this, but really? What is CRUSH? Let’s assume you have data and know where to go to get the parts of the files...you would have to send a lot of data and update metadata every time you made changes to the file, so they said we are not going to do that. These are not blocks, they are objects...what do they mean? Basically a variable-length chunk of storage, rather than dividing the disk into fixed-size blocks...a file is some number of objects in some order. When you open a file, which objects is it stored in? Does the metadata server hand you the list of objects? No, an algorithm generates the names of the objects (see the sketch below). &lt;br /&gt;
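&lt;br /&gt;
A sketch of computing object names and locations instead of looking them up (illustrative Python; this is a stand-in for CRUSH, not the real algorithm, and OBJECT_SIZE is an assumed striping unit): given a file identifier, any client can derive the object names and hash each one onto an OSD without asking a metadata server where the data lives:&lt;br /&gt;
 OBJECT_SIZE = 4 * 1024 * 1024       # assumed stripe unit for the sketch&lt;br /&gt;
 def object_name(inode, index):&lt;br /&gt;
     return (inode, index)           # derived from the inode number, never stored anywhere&lt;br /&gt;
 def osd_for(name, num_osds):&lt;br /&gt;
     return hash(name) % num_osds    # placeholder for CRUSH: deterministic name-to-device map&lt;br /&gt;
 def objects_for_read(inode, offset, length, num_osds):&lt;br /&gt;
     first = offset // OBJECT_SIZE&lt;br /&gt;
     last = (offset + length - 1) // OBJECT_SIZE&lt;br /&gt;
     return [(object_name(inode, i), osd_for(object_name(inode, i), num_osds))&lt;br /&gt;
             for i in range(first, last + 1)]&lt;br /&gt;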
&lt;br /&gt;
Metadata....storing metadata in memory for every file....there are hot spots for metadata access....3 servers would be maxed out while part of the system is sleeping, so the tree is dynamically re-partitioned to respond to hot spots. &lt;br /&gt;
&lt;br /&gt;
Can have OSDs in parallel so asking for a file....distributed among lots of nodes so high performance ....many many computers talking to many many computers. &lt;br /&gt;
&lt;br /&gt;
POSIX compatible (Ceph) impressive...POSIX compatibility is painful on writes...need to coordinate (centralize writes but that is slow). Can tell Ceph to be lazy.  Take home lesson from Ceph....(all trusted, POSIX in distributed OS, can do it but OMG the admin overhead). &lt;br /&gt;
&lt;br /&gt;
Tapestry...take home lesson...centralize node (trust)&lt;br /&gt;
&lt;br /&gt;
Compare GFS with Ceph and Chubby (politically correct FAT storage :P) with Tapestry&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-10&amp;diff=21845</id>
		<title>DistOS 2018F 2018-10-10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-10&amp;diff=21845"/>
		<updated>2018-10-10T20:48:50Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Doward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
In-class lecture notes:&lt;br /&gt;
&lt;br /&gt;
Where did the people who built these systems go? Are they still at Bell Labs? They are the people who made UNIX many years ago, then developed Plan 9 and Inferno, and then they went to Google and developed GO. Inferno was the late 1990s/early 2000s. There is a direct continuation between Inferno and GO. GO is garbage collected vs. RUST; automatic reference counting is not garbage collection, it is memory management. GO has no pointer arithmetic, RUST gives you explicit control over pointers. GO produces binaries that don’t have any dependencies....similar to Objective-C, smaller than a Java run-time.&lt;br /&gt;
&lt;br /&gt;
Inferno was like Java&lt;br /&gt;
&lt;br /&gt;
What is Plan 9? Why did they need to develop something better than Unix? Force people to share resources? It assumed that everything is run by the same administrator, with access to resources network-transparent. Why did they want to fix UNIX, what went wrong with UNIX? Graphics and networking and everything else were just a hack on top of UNIX. Berkeley developed networking, MIT developed graphics, and from Bell Labs’ perspective, they messed it up. It is gross and messed up. Plan 9 looked at everything bolted onto UNIX and made a new system that fixed it. Everything is a file: classic UNIX started with that and then diverged, so Plan 9 went back and fixed the hacks. Why did Plan 9 not take off? Backwards compatibility. It violates evolution....it never took off. Was Plan 9 a waste? Do modern systems have anything related to Plan 9? /proc ... all kernel resources can be accessed as files. Ioctls (IO control system calls) were used in older Linux (they were cryptic and dumb)....Linux made everything a file like Plan 9 (/proc) and then added sysfs (a key-value store...easier to use and better organized than /proc). /proc is still used for processes; the rest was separated out into sysfs.&lt;br /&gt;
Unified protocol for doing things....http (has a name-space called domains)&lt;br /&gt;
Plan 9 was a nice clean design, all really nice, but it doesn’t solve a problem that people actually had...people already had shared file systems (NFS) and could access remote printers. &lt;br /&gt;
&lt;br /&gt;
Nothing too different from Sun’s solution. Did not solve any new problem. Just a cleaner take on old problems. Does not win when there is established players...some force that pushes you towards it. &lt;br /&gt;
&lt;br /&gt;
Plan9 lets fix UNIX&lt;br /&gt;
Inferno:&lt;br /&gt;
- a better JAVA. Time-frame when it came out....Bell Labs, we can do better than JAVA.&lt;br /&gt;
- Limbo ... kinda C like&lt;br /&gt;
&lt;br /&gt;
JAVA has two fundamental blemishes...it is huge as a language....lots of libraries is OK, but JAVA has a lot of syntax (it all serves a purpose, but all the explicit typing....modern languages use type inference, such as Swift and RUST)...Java is almost impossible to write without an IDE. The advantage of C is that it is a small language; you can understand and use most of it on a regular basis...JAVA is not as bad as C++ but it is very verbose....it forced object orientation and inheritance onto everything. &lt;br /&gt;
C is interfaces and protocols vs. Objects while JAVA forces objects. &lt;br /&gt;
Other languages, object orientation is much more subtle.&lt;br /&gt;
&lt;br /&gt;
When the web was young....we need portable run-time to run code securely across the internet...a JAVA Applet....required a plugin to work.&lt;br /&gt;
Inferno was going for the same market... but failed b/c JAVA had time share....only secure in enterprise applications and everywhere else is not secure. Even Android might move away from it.&lt;br /&gt;
&lt;br /&gt;
System that you have into the system you want to program....as long as it provides a common set of functionality. Inferno same as UNIX but was supposed to be portable similar to JAVA. &lt;br /&gt;
&lt;br /&gt;
From distributed OS, what problem are they solving?&lt;br /&gt;
	Plan9 Work-group sharing&lt;br /&gt;
	Inferno Applications on a web-browser but not solving how to get lots of computers together to do big jobs are not what they are solving. Distributed OS for scientific purposes....nuclear simulation on 10,000 CPUS (those OS are small, specialized and exchange messages between nodes fast...straightforward design...zero copying for high-speed networking). Time it takes to send messages to nodes is less than the time it takes to copy data from memory. Low-latency networking, how long does it take to send and receive 1 byte of data. Point A to B with as few copies as possible. Kernel copies a buffer, it has lost high-speed networking. Classic high performance computers, and programming layers to exchange messages. Not really our focus....not really the nature of the system we have built...instead have the cloud. &lt;br /&gt;
&lt;br /&gt;
Project questions? &lt;br /&gt;
&lt;br /&gt;
Test on Wednesday the 17th ...next class, review...a good way to study: think about questions that might be asked....short essays. Choose from a small choice of questions.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-10&amp;diff=21844</id>
		<title>DistOS 2018F 2018-10-10</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-10&amp;diff=21844"/>
		<updated>2018-10-10T20:45:22Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
Plan 9 &amp;amp; Inferno&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/presotto-plan9.pdf Presotto et. al, Plan 9, A Distributed System (1991)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2014w/pike-plan9.pdf Pike et al., Plan 9 from Bell Labs (1995)]&lt;br /&gt;
* [https://homeostasis.scs.carleton.ca/~soma/distos/2018f/doward97-inferno.pdf Doward et al., The Inferno Operating System (1997)]&lt;br /&gt;
* [http://www.inferno-os.info/inferno/ Inferno OS website] (browse)&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
In-class lecture notes:&lt;br /&gt;
&lt;br /&gt;
Where did the people who built these systems go? Are they still at bell labs? They are the people who made UNIX many years ago, then developed plan 9 and inferno then they went to Google and developed GO. Inferno was early 2000s. A direct continuation between Inferno and GO. GO is garbage collected vs. RUST. Automatic reference counting is not garbage collection, it is memory management. GO no pointers, RUST has pointers. GO produces binaries that don’t have any dependencies....similar to objective C, smaller than a Java run-time.&lt;br /&gt;
&lt;br /&gt;
Inferno was like Java&lt;br /&gt;
&lt;br /&gt;
What is Plan9? Why did they need to develop something better than Unix? Force people to share resources? Assumed that everything is run by the same administrator. Access to resources and network transparent. Why did they want to fix UNIX, what went wrong with UNIX? Graphics and networking and everything else were just a hack on top of UNIX. Berkeley developed networking, MIT developed graphics and from Bell Labs perspective, they messed it up. It is gross and messed up. Plan 9 is looking at everything bolted onto UNIX and made a new system that fixed it. Everything is a file, so they started to fix the hacks....classic UNIX started with that and then diverged. Plan 9 did not take off? Backwards compatibility. Violates evolution....never took off. Was Plan 9 a waste? Do modern systems have anything related to Plan 9? Proc ... all kernel resources can be accessed as files. Ioctls (IO control system calls) were used in older Linux (it was cryptic and dumb)....made everything a file like Plan 9 (/proc) then they added sysfs (a key value store...easier to use and better organized than /proc). Proc is still used for processes but separated it out into sysfs.&lt;br /&gt;
Unified protocol for doing things....http (has a name-space called domains)&lt;br /&gt;
Plan9 was a nice clean design, all really nice but doesn’t solve a problem that people had...shared file systems people had (nfs) and could access remote printers. &lt;br /&gt;
&lt;br /&gt;
Nothing too different from Sun’s solution. Did not solve any new problem. Just a cleaner take on old problems. Does not win when there is established players...some force that pushes you towards it. &lt;br /&gt;
&lt;br /&gt;
Plan9 lets fix UNIX&lt;br /&gt;
Inferno:&lt;br /&gt;
- a better JAVA. Time-frame when it came out....Bell Labs, we can do better than JAVA.&lt;br /&gt;
- Limbo ... kinda C like&lt;br /&gt;
&lt;br /&gt;
JAVA has two fundamental blemishes...it is huge as a language....lots of libraries is OK, but JAVA has a lot of syntax (it all serves a purpose, but all the explicit typing....modern languages use type inference, such as Swift and RUST)...Java is almost impossible to write without an IDE. The advantage of C is that it is a small language; you can understand and use most of it on a regular basis...JAVA is not as bad as C++ but it is very verbose....it forced object orientation and inheritance onto everything. &lt;br /&gt;
C is interfaces and protocols vs. Objects while JAVA forces objects. &lt;br /&gt;
Other languages, object orientation is much more subtle.&lt;br /&gt;
&lt;br /&gt;
When the web was young....we need portable runtime to run code securely across the internet...a JAVA Applet....required a plugin to work.&lt;br /&gt;
Inferno was going for the same market... but failed b/c JAVA had time share....only secure in enterprise applications and everywhere else is not secure. Even Android might move away from it.&lt;br /&gt;
&lt;br /&gt;
System that you have into the system you want to program....as long as it provides a common set of functionality. Inferno same as UNIX but was supposed to be portable similar to JAVA. &lt;br /&gt;
&lt;br /&gt;
From distributed OS, what problem are they solving?&lt;br /&gt;
	Plan9 Work-group sharing&lt;br /&gt;
	Inferno Applications on a web-browser but not solving how to get lots of computers together to do big jobs are not what they are solving. Distributed OS for scientific purposes....nuclear simulation on 10,000 CPUS (those OS are small, specialized and exchange messages between nodes fast...straightforward design...zero copying for high-speed networking). Time it takes to send messages to nodes is less than the time it takes to copy data from memory. Low-latency networking, how long does it take to send and receive 1 byte of data. Point A to B with as few copies as possible. Kernel copies a buffer, it has lost high-speed networking. Classic high performance computers, and programming layers to exchange messages. Not really our focus....not really the nature of the system we have built...instead have the cloud. &lt;br /&gt;
&lt;br /&gt;
Project questions? &lt;br /&gt;
&lt;br /&gt;
Test on Wednesday the 17th ...next class, review...a good way to study: think about questions that might be asked....short essays. Choose from a small choice of questions.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-03&amp;diff=21833</id>
		<title>DistOS 2018F 2018-10-03</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-03&amp;diff=21833"/>
		<updated>2018-10-09T15:09:22Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
Farsite &amp;amp; Oceanstore&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/adya-farsite-intro.pdf Atul Adya et al.,&amp;quot;FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment&amp;quot; (2002)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/bolosky-farsite-retro.pdf William J. Bolosky et al., &amp;quot;The Farsite Project: A Retrospective&amp;quot; (2007)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/oceanstore-sigplan.pdf John Kubiatowicz et al., &amp;quot;OceanStore: An Architecture for Global-Scale Persistent Storage&amp;quot; (2000)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/fall2008/fast2003-pond.pdf Sean Rhea et al., &amp;quot;Pond: the OceanStore Prototype&amp;quot; (2003)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
in-class Lecture Notes:&lt;br /&gt;
Oceanstore:&lt;br /&gt;
- large scale&lt;br /&gt;
- durable, archive&lt;br /&gt;
- distributed world wide&lt;br /&gt;
-untrusted storage nodes&lt;br /&gt;
	Client had the keys, b/c could not trust the data storage nodes&lt;br /&gt;
	How many keys did you need? Could store them on anything.&lt;br /&gt;
	untrusted storage&lt;br /&gt;
- data centre storage&lt;br /&gt;
&lt;br /&gt;
--&amp;gt; what is similar: fundamentally s3 but how does it compare&lt;br /&gt;
&lt;br /&gt;
S3 is a web interface to a large file...immutable chunk of data with directories (buckets) etc. S3 is cheaper than virtual disks, really cheap for storing data. Can have new versions of files but it is immutable.  S3 fragments but unknown to user....but S3 is not OceanStore in fundamental ways; in S3 no encrypting or the keys are with Amazon...by default. OceanStore, would be across multiple providers and organization while with Amazon, all stored with Amazon. Other companies offer S3 compatibility but that is selecting an organization not automatically spread around. &lt;br /&gt;
&lt;br /&gt;
Why are we not doing OceanStore? Legal issues (where’s the data?): b/c the encryption is so strong, when you have data on your system you have no idea what it is. It could be nasty stuff and you would have no clue. But that is not the biggest issue; usability is...I have to manage my keys! If this breaks, everything is dead. If you don’t want to do that, you need a contractual relationship, at which point the scheme does not make sense. It is easier to manage S3; might as well trust them...you are paying them money, a business relationship...society is based on trust. The cloud is a high-trust environment. Lots of overhead to try to eliminate trust, and the benefit is limited b/c people end up trusting their providers anyway. And the Amazon model allows for lock-in; OceanStore has no lock-in. OceanStore nodes never have the keys, they just stream the data. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Farsite&lt;br /&gt;
- Distributed – in an organization (smaller scale)&lt;br /&gt;
- “Use disks on workstations”&lt;br /&gt;
- directory server is what manages the keys and authentication codes.&lt;br /&gt;
- at Microsoft they use Kerberos ... it’s about keys: certain systems hold keys, issue keys, etc.&lt;br /&gt;
- having a centralized authentication server was a key problem&lt;br /&gt;
- How does security fall apart, especially anything based on crypto&lt;br /&gt;
- TLS as a protocol is great except how certificates are managed&lt;br /&gt;
- Unreliable, nodes not trusted but still trusted somewhat b/c the nodes often have the keys&lt;br /&gt;
&lt;br /&gt;
-today, we have nothing close to Farsite in use...why? Storage is cheap. If you want reliability etc. Just setup servers....or go to cloud, let someone else manage the server. Outsources, thin clients rather than this. &lt;br /&gt;
&lt;br /&gt;
What happened with Pond. Ended up running on 40 machines, prototype. &lt;br /&gt;
	Built on Tapestry and PlanetLab nodes...workstations everywhere...universities would join to use it for experiments....big for distributed systems. But it is not production use. From a production point of view, how did what they built look? How it reads and how it writes: reads were really fast, writes were really slow. Why? They had parallel reads, which is different from NFS....not a good comparison. They had to vote on changes, and there was lots of overhead to sync. &lt;br /&gt;
&lt;br /&gt;
Farsite – tech transfer &lt;br /&gt;
	Microsoft research ... hired a bunch of academics and set them up with resources and said go have fun with your group. Have to show that you contributed to Microsoft. Cool tech that came from research but rarely went to the product side. &lt;br /&gt;
	Development process, re-implementation of components...off having fun for years....solving a problem PACSOS....retrospective, Google File System published years before PACSOS....other companies built the stuff while Microsoft researched. &lt;br /&gt;
&lt;br /&gt;
For exam: common themes, good questions....might show up on test&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-09-26&amp;diff=21821</id>
		<title>DistOS 2018F 2018-09-26</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-09-26&amp;diff=21821"/>
		<updated>2018-10-03T02:09:57Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
V, Amoeba, Clouds:&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-21/cheriton-v.pdf David R. Cheriton, &amp;quot;The V Distributed System.&amp;quot; (1988)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/tanenbaum-amoeba.pdf Andrew Tanenbaum et al., &amp;quot;The Amoeba System&amp;quot; (1990)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-01-28/clouds-dasgupta.pdf Partha Dasgupta et al., &amp;quot;The Clouds Distributed Operating System&amp;quot; (1991)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
in-class lecture notes:&lt;br /&gt;
- V – fast RPC; they wanted the network to act as a software backplane&lt;br /&gt;
&lt;br /&gt;
None of them said let’s use UNIX; they tried other things, but these did not catch on&lt;br /&gt;
&lt;br /&gt;
Amoeba:&lt;br /&gt;
&lt;br /&gt;
Wanted to cluster computers so that users at terminals could send load-balanced jobs to the cluster; easier to implement than a peer-to-peer system.&lt;br /&gt;
&lt;br /&gt;
Bunch of servers do things, and file servers and then clients&lt;br /&gt;
&lt;br /&gt;
So distributed but specialized, not peer-to-peer&lt;br /&gt;
&lt;br /&gt;
File system implemented as a set of processes&lt;br /&gt;
&lt;br /&gt;
Had Unix interface but removed file system approach...had a notion of files but their files were different. Contiguous blocks but they were also immutable files .... replicated at will...powerful idea b/c we will see again...immutability why? Avoid the problem of shared state...when you replicate it, you replicate it and your done, don’t need to synchronize. If want something different, make a new file or segment. Distributed files, will see immutable stuff like Amazon S3; have versions but the underlying system is immutable. Git is immutable underneath b/c taking these things identified by hashes and that is being stored. &lt;br /&gt;
&lt;br /&gt;
Mutability to immutability with versions....can use kinda the same but the underlying system is very different. &lt;br /&gt;
&lt;br /&gt;
Amoeba is influential b/c they did a couple of things in an interesting way. Lessons learned from the dream of a distributed OS. Key lesson: how did they do permissions? Access control lists? No, they had capabilities...a big idea in access control, b/c normally when we think of UNIX permissions we think of access control lists....what is the object being controlled and who can access it. With a capability model we instead have a string of bits, and if you have those bits you can access the resource; if you don’t have them, you can’t. Get the capability, and once you have it you can use it to access that resource. It means that permission granting is separate from the permission check...the server doesn’t need to know who you are, only that you have the correct capability. To make it work well you need something cryptographic...Amoeba didn’t have strong cryptography (48-bit check fields), but it had the conceptual model...distributed access control (see the sketch below). &lt;br /&gt;
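&lt;br /&gt;
A toy version of such a capability (illustrative Python; it uses an HMAC as the check field, which is stronger than Amoeba’s 48-bit check, and all names here are assumptions): the server mints an unforgeable bit string granting rights on an object, and later verifies it without caring who presents it:&lt;br /&gt;
 import os, hmac, hashlib&lt;br /&gt;
 SERVER_SECRET = os.urandom(32)            # only the object server knows this key&lt;br /&gt;
 def mint_capability(object_id, rights):&lt;br /&gt;
     msg = repr((object_id, rights)).encode()&lt;br /&gt;
     check = hmac.new(SERVER_SECRET, msg, hashlib.sha256).digest()&lt;br /&gt;
     return (object_id, rights, check)     # handed to the client; possession grants access&lt;br /&gt;
 def verify_capability(cap):&lt;br /&gt;
     object_id, rights, check = cap&lt;br /&gt;
     msg = repr((object_id, rights)).encode()&lt;br /&gt;
     good = hmac.new(SERVER_SECRET, msg, hashlib.sha256).digest()&lt;br /&gt;
     # no identity lookup: the check field alone proves these rights were granted&lt;br /&gt;
     return hmac.compare_digest(check, good)&lt;br /&gt;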
&lt;br /&gt;
Linux capabilities are not the same thing; it is the same term but only some of the notions....a capability here is an unforgeable bit string used to gain access to a resource. Keys today are not used this way, b/c of performance reasons.&lt;br /&gt;
&lt;br /&gt;
Active directory in Windows is based on Kerberos and the server gives a ticket that allows access to resources...basically capabilities.&lt;br /&gt;
&lt;br /&gt;
Process model:&lt;br /&gt;
&lt;br /&gt;
Clouds&lt;br /&gt;
&lt;br /&gt;
What went wrong? Execution contexts are not tied to a single address space; they move between address spaces. Normally execution context and state are connected. If you blow that up, what’s your security model?&lt;br /&gt;
&lt;br /&gt;
Why didn’t UNIX die? Programs and development environment would have to all be developed again from scratch. Get new OS that do new things when you have a reason, a use case. Unix is below, everything else on top.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-01&amp;diff=21820</id>
		<title>DistOS 2018F 2018-10-01</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-01&amp;diff=21820"/>
		<updated>2018-10-03T02:04:25Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
in-class Lecture notes:&lt;br /&gt;
&lt;br /&gt;
FS client-server file sharing &lt;br /&gt;
&lt;br /&gt;
Server has directory, you can access it&lt;br /&gt;
&lt;br /&gt;
How much work were they trying to do? They were trying to do as little work as possible.&lt;br /&gt;
&lt;br /&gt;
Simple to implement and allow clients to load files&lt;br /&gt;
&lt;br /&gt;
How did they break POSIX (the portable OS interface), classic UNIX semantics? It was stateless, and that implementation choice caused them to break things. In NFS the things they effectively took out were open and close; a read or a write carries everything the server needs, b/c it is stateless. &lt;br /&gt;
&lt;br /&gt;
The paper we read today is not NFS today, describing NFS v1 or v2....v3 changed a lot and v4 changed a lot more.&lt;br /&gt;
Stateless b/c they didn’t want the server to keep track of the clients so all the client had to do was send a read or a write and then it was done.&lt;br /&gt;
The original NFS allowed limited caching and seeking (changing the file pointer), but still no state on the server; a seek changes subsequent reads and writes, which the client kernel must keep track of (the byte offset is sent with each NFS request rather than held on the server). The path name is translated into a token (a file handle) for later lookups so you don’t need to parse strings every time, but the server needs to be able to map the token back to the file (see the sketch below). &lt;br /&gt;
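&lt;br /&gt;
A sketch of what that statelessness looks like to a client (illustrative Python; the server methods are assumed names, not the actual NFS RPC set): every request carries the file handle, offset and count, so the server remembers nothing between calls and the client kernel keeps the file pointer itself:&lt;br /&gt;
 class NfsClient:&lt;br /&gt;
     def __init__(self, server):&lt;br /&gt;
         self.server = server&lt;br /&gt;
         self.offsets = {}                 # client-side file pointers, per handle&lt;br /&gt;
     def lookup(self, dir_handle, name):&lt;br /&gt;
         # returns an opaque file handle; the server never remembers that we hold it&lt;br /&gt;
         return self.server.lookup(dir_handle, name)&lt;br /&gt;
     def read(self, handle, count):&lt;br /&gt;
         offset = self.offsets.get(handle, 0)&lt;br /&gt;
         data = self.server.read(handle, offset, count)   # self-contained, repeatable request&lt;br /&gt;
         self.offsets[handle] = offset + len(data)        # seek state lives on the client&lt;br /&gt;
         return data&lt;br /&gt;
&lt;br /&gt;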
Corner case: Unix and open. What happens when you delete a file that is open for writing in Unix? The program using it still has it; it takes up space on disk until the file is closed by the process....you delete the file but the disk is still full, so you must kill the process to actually free the log files (e.g. with a signal). For NFS this doesn’t work, so they had a crude hack....if you remove the last link to an open file, it is renamed to .nfsXXXX, which has to be cleaned up later.&lt;br /&gt;
It works locally because the client kernel knows the file is still open and knows the reference count. In Unix there is no delete, only unlink....it looks at the link count on the inode. &lt;br /&gt;
&lt;br /&gt;
Key thing to know about NFS....how do you determine who has access to what file? In Unix, by user ID and group ID...the request comes in for a specific user, so who does the permission check in NFS? The client kernel, b/c it is assumed that the user IDs and group IDs are synced with the server; if the client wants to access a file and is not allowed, the client kernel denies it. When you mount it, the local kernel can pretend to be any user ID on the remote server...except the root user (but root is a chameleon)...NFS: No File Security&lt;br /&gt;
&lt;br /&gt;
Why no file security, why transparent? Clients were trusted by a central admin...on the wire things could be encrypted with secure RPC using the Data Encryption Standard...back in the 80s, but they would not ship that enabled b/c they could not export it without a license. It was turned off; there was an option to enable it but no one did.&lt;br /&gt;
&lt;br /&gt;
NFS was widely used with large installations but, it sucks b/c major problems. Scalability, and security....trust client computers to do anything with the file system...every file access is going through the server instead of cached...not safe with NFS. &lt;br /&gt;
&lt;br /&gt;
Every read with NFS generates an RPC...every file access, every read/write, doing an RPC...doing network traffic...NFS was designed in a world where they said the “network is the computer”. Built systems that didn’t have local hard drives and all file access was remote. Disk-less workstations, 10-20 systems, it would work well. With AFS, don’t want to do network traffic on every read and write, only on open and close thereby reducing network traffic required...cache of file to validate or maybe only have to send changes back.&lt;br /&gt;
&lt;br /&gt;
When you changed from NFS to AFS...with NFS, if the network goes down, what happens? The system freezes; you can’t do any reads or writes, it just waits (blocks until the network comes back), so you kinda know when things are bad and don’t lose much data, but you can’t do anything more. &lt;br /&gt;
With AFS, you do your reads and writes, then do a close, but the network is down: the close fails and you lose all your changes...close has a return value, close can fail, it is part of the POSIX standard...but here close is a commit, so whether it succeeds or not matters! The API changed in a way that is not obvious, which breaks things...b/c it can fail on close, programs needed to change to check the result of close...still POSIX, but weird...it is hard to know what your assumptions are until they are violated...change how it works and the mental model is potentially wrong. Locally, close just tells the kernel you are done. And there can be conflicts due to multiple edits to the file when the network comes back (see the sketch below).&lt;br /&gt;
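&lt;br /&gt;
The close-is-a-commit point as a tiny sketch (illustrative Python; with an AFS-style file the store-back to the server happens at close, so a program that ignores the return value cannot tell its changes were lost):&lt;br /&gt;
 def save_report(cache_file, data):&lt;br /&gt;
     cache_file.write(data)          # goes to the local cache, almost never fails&lt;br /&gt;
     try:&lt;br /&gt;
         cache_file.close()          # on AFS the write-back to the server happens here&lt;br /&gt;
     except OSError:&lt;br /&gt;
         return False                # a dead network shows up here: treat it as a failed commit&lt;br /&gt;
     return True&lt;br /&gt;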
&lt;br /&gt;
With NFS, how were user IDs synced? A classic system called YP (Yellow Pages)...you still see references to YP...it was a trademark, so it was renamed Network Information Service (NIS)...a thing for syncing password file entries. &lt;br /&gt;
&lt;br /&gt;
With AFS, the client is not trusted...you must authenticate using Kerberos ... talk to an auth server, authenticate to it, and it gives you a ticket (a temporary key, good for about 8 hours); then you go to the network and get resources (files and mail). The auth server only has to be involved when you log in to get the tickets that let you do everything. AFS depends on you having tickets...if the tickets expire, all of a sudden you cannot access anything...you must renew the tickets with a command or log out and log in again.&lt;br /&gt;
&lt;br /&gt;
AFS scalability tricks beyond open/close and caching...it had volumes, so files can be moved around in the file system without clients having to notice...an abstraction in the file server, with multiple copies of the same volume. Read-only replicas, and later read/write replicas...replication and fault tolerance were added in. And a globally unique namespace....AFS was cool b/c you could navigate to /afs/athena.mit.edu&lt;br /&gt;
/afs/andrew.cmu.edu &lt;br /&gt;
Why don’t we use AFS? ...b/c we use the web...everything was slow, and if you went outside your AFS cell, things would break.&lt;br /&gt;
&lt;br /&gt;
One of the biggest things: they did not build a web browser on top of it. &lt;br /&gt;
&lt;br /&gt;
AFS was not easy to set up; NFS took a few seconds to set up. &lt;br /&gt;
&lt;br /&gt;
AFS had cool ideas, but the web took off b/c it was easy to set up.&lt;br /&gt;
&lt;br /&gt;
AFS was a Multics-like thing vs. NFS, which was like Unix....when you have access control lists....look for an overly complex security system. &lt;br /&gt;
&lt;br /&gt;
Literature Review:&lt;br /&gt;
&lt;br /&gt;
Pick an area related to distributed OS and do a literature review of it. &lt;br /&gt;
Scope too broadly, would have to write a book about it&lt;br /&gt;
Wants it 10-20 pages....proper review of topic picked.&lt;br /&gt;
Trick to pick a narrow enough topic you find interesting&lt;br /&gt;
How to limit scope?&lt;br /&gt;
Don’t start with high level and gather papers, wrong approach. Incomplete view of chosen topic.&lt;br /&gt;
Too random.&lt;br /&gt;
What to do instead?&lt;br /&gt;
Pick a paper. Pick one paper that you think is interesting, look for things that are related to it, look at who it cites, and grow outward. Basically, if you are going to talk about this narrow thing...what do the papers have in common? &lt;br /&gt;
How do you find the paper? Find one paper that you understand, have read carefully, and that is reasonably well cited.&lt;br /&gt;
Go through major conferences and pick a paper. You cannot pick a paper we looked at in class; those papers are not narrow enough, you need something more specialized. You are better off taking something related to what you find interesting, i.e. cryptocurrencies, or graphics or usability if that is your thing...your area of comp sci.&lt;br /&gt;
&lt;br /&gt;
If going to get something done by the end of the term....test is on the 17th, just before the break so, when the break comes, spend time finding papers. &lt;br /&gt;
&lt;br /&gt;
Outline: an abstract, an outline and 10 references. Going from 1 to 10 references is not hard if you have a good paper to start with.  &lt;br /&gt;
&lt;br /&gt;
In the cloud space...IPFS, Solid, etc. For distributed computation...what is the history? Where does it come from? Tell a factual story.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
	<entry>
		<id>https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-01&amp;diff=21819</id>
		<title>DistOS 2018F 2018-10-01</title>
		<link rel="alternate" type="text/html" href="https://homeostasis.scs.carleton.ca/wiki/index.php?title=DistOS_2018F_2018-10-01&amp;diff=21819"/>
		<updated>2018-10-03T02:01:03Z</updated>

		<summary type="html">&lt;p&gt;Sheldon: /* Notes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Readings==&lt;br /&gt;
&lt;br /&gt;
NFS &amp;amp; AFS (+ Literature reviews)&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/sandberg-nfs.pdf Russel Sandberg et al., &amp;quot;Design and Implementation of the Sun Network Filesystem&amp;quot; (1985)]&lt;br /&gt;
* [http://homeostasis.scs.carleton.ca/~soma/distos/2008-02-11/howard-afs.pdf John H. Howard et al., &amp;quot;Scale and Performance in a Distributed File System&amp;quot; (1988)]&lt;br /&gt;
&lt;br /&gt;
* Harvey, &amp;quot;What Is a Literature Review?&amp;quot; [http://www.cs.cmu.edu/~missy/WritingaLiteratureReview.doc (DOC)] [http://www.cs.cmu.edu/~missy/Writing_a_Literature_Review.ppt (PPT)]&lt;br /&gt;
* [http://advice.writing.utoronto.ca/types-of-writing/literature-review/ Taylor, &amp;quot;The Literature Review: A Few Tips On Conducting It&amp;quot;] [http://homeostasis.scs.carleton.ca/~soma/distos/2018f/taylor-literature-review.pdf (PDF)]&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
in-class Lecture notes:&lt;br /&gt;
&lt;br /&gt;
FS client-server file sharing &lt;br /&gt;
&lt;br /&gt;
Server has directory, you can access it&lt;br /&gt;
&lt;br /&gt;
How much work were they trying to do? They were trying to do as little work as possible.&lt;br /&gt;
&lt;br /&gt;
Simple to implement and allow clients to load files&lt;br /&gt;
&lt;br /&gt;
How did they break POSIX (portable OS interface), classic UNIX semantics? ... It was stateless, which was an implementation choice that caused them to break things. In NFS, the things they took out were open and close, b/c it is stateless. &lt;br /&gt;
&lt;br /&gt;
The paper we read today is not NFS as it exists today; it describes NFS v1 or v2....v3 changed a lot and v4 changed a lot more.&lt;br /&gt;
Stateless b/c they didn’t want the server to keep track of the clients so all the client had to do was send a read or a write and then it was done.&lt;br /&gt;
Original NFS allowed for limited caching and seeking (changing the file pointer), but still no state on the server; a seek changes subsequent reads and writes, which the client kernel must keep track of (the byte offset is carried in each NFS request rather than stored on the server). The path name is translated to a token (a file handle) for later lookups so you don’t need to parse strings every time, but the server needs to be able to translate the token back to a file. &lt;br /&gt;
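A sketch of what stateless means on the wire (hypothetical request shape and an assumed send_rpc helper, not the actual NFS encoding):&lt;br /&gt;
&lt;pre&gt;
# Hypothetical illustration of a stateless NFS-style read (not the real
# protocol encoding): every request carries the file handle and the byte
# offset, so the server never remembers an open file or a file position.

def make_read_request(file_handle, offset, count):
    # Everything the server needs is inside the request itself.
    return {'op': 'READ', 'fh': file_handle, 'offset': offset, 'count': count}

class ClientOpenFile:
    def __init__(self, file_handle):
        self.fh = file_handle
        self.pos = 0            # the client keeps the offset, not the server

    def read(self, count):
        req = make_read_request(self.fh, self.pos, count)
        data = send_rpc(req)    # assumed transport helper
        self.pos += len(data)
        return data
&lt;/pre&gt;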
Corner case: Unix and open...what happens when you delete a file that is open for writing in Unix? The program using it still exists, and the file takes up space on disk until it is closed by the process....you delete the file but the disk is still full, so you must kill the process (send it a signal) to actually free the log files. For NFS this doesn’t work, so they had a crude hack....if you remove the last link to an open file, the client renames it to .nfsXXXX, which needs to be cleaned up later.&lt;br /&gt;
It works locally because the client kernel knows the file is still open and knows the reference count. In Unix there is no delete, only unlink....the kernel looks at the number of links to the inode. &lt;br /&gt;
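You can see the local Unix behaviour directly (Python sketch, hypothetical path below); the .nfsXXXX rename is the NFS client faking the same semantics:&lt;br /&gt;
&lt;pre&gt;
# Local Unix semantics (Python sketch, hypothetical path): unlink removes
# the name, but the inode lives on while a process still has the file open.
import os

path = '/tmp/demo-unlink.txt'
f = open(path, 'w')
f.write('still here after unlink')
os.unlink(path)                        # name gone, data not yet
print(os.fstat(f.fileno()).st_nlink)   # 0 links, but still usable via f
f.close()                              # now the space is actually freed
# On NFS the client instead renames the file to something like .nfsXXXX
# and removes it later, since the stateless server has no idea it is open.
&lt;/pre&gt;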
&lt;br /&gt;
Key thing to know about NFS....how to determine who had access to what file? In Unix, by userID and groupID...a request would come in for a specific user, so who does the permission check in NFS? The client kernel, b/c it is assumed that the user IDs and group IDs are synced with the server; if a client wants to access a file it is not allowed to, the client kernel denies it. Once it is mounted, the local kernel can pretend to be any user ID on the remote server...except the root user (but root is a chameleon)...NFS...No File Security&lt;br /&gt;
&lt;br /&gt;
Why no file security, why transparent? Clients were trusted by a central admin...on the wire things could be encrypted with Secure RPC using the Data Encryption Standard (DES)...this was back in the 80s, but it would not ship enabled b/c it could not be exported without a license. It was turned off; there was an option to enable it but no one did.&lt;br /&gt;
&lt;br /&gt;
NFS was widely used with large installations but it sucks b/c of major problems: scalability and security....you trust client computers to do anything with the file system...and every file access goes through the server instead of being cached...not safe with NFS. &lt;br /&gt;
&lt;br /&gt;
Every read with NFS generates an RPC...every file access, every read/write, is doing an RPC...doing network traffic...NFS was designed in a world where they said the “network is the computer”. They built systems that didn’t have local hard drives and all file access was remote. Disk-less workstations, 10-20 systems...it would work well. With AFS, you don’t want to do network traffic on every read and write, only on open and close, thereby reducing the network traffic required...keep a cache of the file to validate, or maybe only have to send changes back.&lt;br /&gt;
&lt;br /&gt;
When they changed from NFS to AFS...with NFS, if the network goes down, what happens? The system freezes, you can’t do any reads/writes, it just waits (blocks until the network comes back), so you kinda know when things are bad; you don’t lose much data but you can’t do anything more. &lt;br /&gt;
With AFS, you do reads/writes etc., then you do a close, but the network is down, the close fails, and you lose all your changes...close has a return value, close can fail, it is part of the POSIX standard...here close is a commit, so whether it succeeds or not matters! The API changed in a way that is not obvious, which breaks things...b/c it can fail on close...programs needed to change to check close’s return value...still POSIX but weird...hard to know what your assumptions are until they are violated...change how it works and the mental model is potentially wrong. Locally, close just tells the kernel you are done. And there can be conflicts due to multiple edits to the file when the network comes back.&lt;br /&gt;
&lt;br /&gt;
With NFS, the way user IDs were synced...a classic system called YP (Yellow Pages)...you still see references to YP...it was a trademark, so it was renamed Network Information Service (NIS)...a thing for syncing password file entries. &lt;br /&gt;
&lt;br /&gt;
With AFS, the client is not trusted...you must authenticate using Kerberos...talk to an auth server, authenticate to it and it gives you a ticket (a temporary key, good for about 8 hours), and then you go to the network and get resources (files and mail); the auth server only has to be involved when you log in to get the tickets that let you do everything. AFS depends on you having tickets...if the tickets expire, all of a sudden you cannot access anything...you must renew the tickets with a command or log out and log in again.&lt;br /&gt;
&lt;br /&gt;
AFS scalability tricks beyond open/close and caching...it had volumes, so files can be moved around in the file system without disrupting clients...an abstraction in the file server, with multiple copies of the same volume. Read-only replicas and then later read/write replicas...replication and fault tolerance were added in. And a globally unique file namespace....AFS was cool b/c you could nav to /afs/athena.mit.edu or /afs/andrew.cmu.edu&lt;br /&gt;
Why don’t we use AFS? ...b/c we use the web...everything was slow, and if you went outside your AFS cell, things would break.&lt;br /&gt;
&lt;br /&gt;
One of the biggest things: they did not build a web browser on top of it. &lt;br /&gt;
&lt;br /&gt;
AFS was not easy to set up; NFS took a few seconds to set up. &lt;br /&gt;
&lt;br /&gt;
AFS had cool ideas, but the web took off b/c it was easy to set up.&lt;br /&gt;
&lt;br /&gt;
AFS was a Multics-like thing vs. NFS, which was like Unix....when you have access control lists....look for an overly complex security system. &lt;br /&gt;
&lt;br /&gt;
Literature Review:&lt;br /&gt;
&lt;br /&gt;
Pick an area related to distributed OS and do a literature review of it. &lt;br /&gt;
Scope too broadly, would have to write a book about it&lt;br /&gt;
Wants it 10-20 pages....proper review of topic picked.&lt;br /&gt;
Trick to pick a narrow enough topic you find interesting&lt;br /&gt;
How to limit scope?&lt;br /&gt;
Don’t start with high level and gather papers, wrong approach. Incomplete view of chosen topic.&lt;br /&gt;
Too random.&lt;br /&gt;
What to do instead?&lt;br /&gt;
Pick a paper. Pick one paper that you think is interesting, look for things that are related to it, look at who it cites, and grow outward. Basically, if you are going to talk about this narrow thing...what do the papers have in common? &lt;br /&gt;
How do you find the paper? Find one paper that you understand, have read carefully, and that is reasonably well cited.&lt;br /&gt;
Go through major conferences and pick a paper. You cannot pick a paper we looked at in class; those papers are not narrow enough, you need something more specialized. You are better off taking something related to what you find interesting, i.e. cryptocurrencies, or graphics or usability if that is your thing...your area of comp sci.&lt;br /&gt;
&lt;br /&gt;
If going to get something done by the end of the term....test is on the 17th, just before the break so, when the break comes, spend time finding papers. &lt;br /&gt;
&lt;br /&gt;
Outline: an abstract, an outline and 10 references. Going from 1 to 10 references is not hard if you have a good paper to start with.  &lt;br /&gt;
&lt;br /&gt;
In the cloud space...IPFS, Solid, etc. For distributed computation...what is the history? Where does it come from? Tell a factual story.&lt;/div&gt;</summary>
		<author><name>Sheldon</name></author>
	</entry>
</feed>