Operating Systems 2015F Lecture 19

From Soma-notes
Revision as of 17:39, 18 November 2015 by Soma (talk | contribs)

Video

The video from the lecture given on November 18, 2015 is now available.


Notes

Linux Kernel map (to be discussed later, but take a look!).

Lecture 19
----------

Cloud computing
 - client-server computing taken to an extreme
 - servers "on demand"
 - someone else takes care of the servers
 - "utility"-style computing (you pay for what you use)

At a hardware level...
 - cloud computing is a heavy industry
   - very resource intensive: power & cooling mainly
   - lots and lots of computers

At a software level, the challenges are
 - coordination, workload distribution
 - reliability - computers will fail during routine tasks
 - maintenance

Companies with big computational needs built these "cloud infrastructures" for internal use, then made them available to outsiders.


What sort of abstractions do cloud providers, well, provide?

1. Infrastructure as a service
 - virtual machines on demand, e.g. OpenStack
 - operating system as a "process" running on a "kernel"
   - i.e. a virtual machine running on a hypervisor
   - hypervisors multiplex regular kernels

This kind of virtualization started on mainframes, but came to PCs with VMware in the late 1990s.
 - VMware managed to virtualize an architecture, the PC, which wasn't designed for virtualization

2. Platform as a service
 - run your app on a standard "platform" where you don't see or manage the OS
 - e.g. Google App Engine, AWS Elastic Beanstalk
 - trade off lower-level control for ease of use
 - platforms are normally cheaper than infrastructure as a service

3. Software as a service
 - most things on the web
 - Netflix, Google Apps, Facebook, Salesforce, ...
 - for example, Netflix makes use of Amazon's cloud


The technology behind each of these is a bit different
 - hardware virtualization (hypervisors): multiplex kernels/operating systems
 - OS virtualization: run multiple userlands on one kernel
   - web hosting
   - "simple" hack on regular kernels, add multiple namespaces for process IDs, user IDs, files, etc.
   - much lighter weight than running multiple kernels
 - platforms as a service are mostly proprietary versions of open source runtimes
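
To make the "multiple namespaces" idea concrete, here is a toy sketch in Python. It is not how the Linux kernel implements PID namespaces; the class PidNamespace and the host PID counter starting at 1000 are made up for illustration. It only shows the core trick: each userland gets its own numbering, while the single kernel keeps one global view.

```python
import itertools

class PidNamespace:
    """Toy model of a PID namespace (hypothetical, for illustration only)."""

    def __init__(self, host_pids):
        self._host_pids = host_pids            # counter shared by the one real kernel
        self._next_local = itertools.count(1)  # each namespace starts again at PID 1
        self.local_to_host = {}                # what the kernel actually tracks

    def spawn(self):
        """Create a process; return the PID as seen inside this namespace."""
        local = next(self._next_local)
        self.local_to_host[local] = next(self._host_pids)
        return local

host_pids = itertools.count(1000)  # assumed starting value, arbitrary
web_a = PidNamespace(host_pids)
web_b = PidNamespace(host_pids)

a_init = web_a.spawn()  # PID 1 inside userland A
b_init = web_b.spawn()  # also PID 1 inside userland B
```

Both userlands believe they own PID 1, but the kernel sees two distinct processes. No second kernel is needed, which is why this is much lighter weight than a hypervisor.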


Kernels want to mess with
 - trap/interrupt tables
 - page tables
 - I/O devices

With a virtualized kernel, what the guest kernel sees as physical memory is actually virtual!
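
A minimal sketch of what that means, assuming 4 KB pages and single-level page tables modelled as Python dicts (real hardware uses multi-level tables, and the page numbers below are invented). An address goes through two translations: the guest kernel's page table, then the hypervisor's.

```python
PAGE_SIZE = 4096  # assumed page size

def translate(table, addr):
    """Translate an address through one page table (page number -> frame number)."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise KeyError(f"page fault at address {addr:#x}")
    return table[page] * PAGE_SIZE + offset

# Managed by the guest kernel: guest virtual page -> guest "physical" page
guest_page_table = {0: 5, 1: 2}
# Managed by the hypervisor: guest "physical" page -> real host frame
host_page_table = {5: 40, 2: 17}

def guest_to_host(vaddr):
    """Two-stage translation: guest virtual -> guest physical -> host physical."""
    guest_phys = translate(guest_page_table, vaddr)
    return translate(host_page_table, guest_phys)
```

The guest kernel only ever edits guest_page_table and thinks page 5 is real RAM; the hypervisor decides where page 5 actually lives.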


When you have a hypervisor, someone has to control it
 - typically, you have a "host OS" (a VM with special privileges), or you have a remote management console
 - when you have a host OS, it provides the drivers for networking, disk, and other devices (e.g. Xen)
 - commercial hypervisors try to get away from a host OS by...sticking device drivers into the hypervisor!

Filesystems, or how do you store files in the cloud?
 - every virtual machine has its own local filesystem
 - but real data is stored remotely in some sort of distributed storage

For really big filesystems, the files can be terabytes in size
 - think a web index
 - have to allow for parallel access
   - thousands of computers accessing the "same" file
 - standard filesystem semantics break down
   - no canonical ordering, duplicated data, not a bytestream but divided into records
 - starts looking more like a database

Google File System (Hadoop's HDFS is an open source implementation of the same design)
 - runs on Linux systems, using collections of regular files
 - "block size" (chunks) is 64 MB
 - optimized for record append operations
 - records require a unique ID so duplicates can be detected
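
Why unique IDs matter can be sketched as follows. This is a toy model, not the real GFS API: the log is just a Python list, and the functions make_record, append_with_retry and read_unique, plus the lost_ack flag, are hypothetical names. The assumption is that a writer who never receives an acknowledgement simply appends again, so the same record may land in the file twice and readers must filter.

```python
import uuid

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as in the notes

def chunk_index(offset):
    """Which chunk a given byte offset of a file falls into."""
    return offset // CHUNK_SIZE

def make_record(payload):
    """Tag a payload with a unique ID so duplicates can be detected later."""
    return (uuid.uuid4().hex, payload)

def append_with_retry(log, record, lost_ack=False):
    """Append a record; if the ack is lost (simulated), the writer appends again."""
    log.append(record)
    if lost_ack:
        log.append(record)  # retry produces a duplicate in the file

def read_unique(log):
    """Reader side: skip any record whose ID has already been seen."""
    seen = set()
    for rec_id, payload in log:
        if rec_id not in seen:
            seen.add(rec_id)
            yield payload

log = []
append_with_retry(log, make_record(b"page-1"), lost_ack=True)
append_with_retry(log, make_record(b"page-2"))
unique_payloads = list(read_unique(log))
```

The file ends up holding three records but readers see only two. The filesystem gives up exactly-once semantics in exchange for cheap, parallel appends.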

What I want you to notice
 - at cloud scale, standard OS abstractions don't work
 - at best, they can be building blocks for other abstractions
 - cloud abstractions tend to be functional, rather than based on state

Example: mapreduce
 - want to calculate the frequency of words on the web
 - take a cache of the web, calculate word frequencies on every page
 - merge counts between pages
 - frequency on each page: map operation
 - merge: reduce
 - order of operations doesn't matter
 - failed operations don't matter, just do them again
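
The word-frequency example can be sketched on a single machine in Python. This only shows the shape of the computation, not a real MapReduce framework; the function names and the three sample "pages" are made up, and a real system would run the map calls on many machines in parallel.

```python
from collections import Counter
from functools import reduce

def map_page(text):
    """Map step: count word frequencies on one page, independent of all other pages."""
    return Counter(text.lower().split())

def merge_counts(a, b):
    """Reduce step: merge two sets of counts (commutative and associative)."""
    return a + b

def word_frequencies(pages):
    """Map every page, then fold the per-page counts together."""
    return reduce(merge_counts, map(map_page, pages), Counter())

pages = ["the cloud is big", "the web is bigger", "big big data"]  # stand-in for a web cache
total = word_frequencies(pages)
```

Because map_page has no side effects and merge_counts is commutative and associative, pages can be processed in any order, and a failed map_page call can just be rerun without affecting the result. That is the "functional, rather than based on state" property from above.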