Notes
Web Scale
---------
* Midterm grading is ongoing; hopefully it will be finished this week
* Proposal deadline extended to Friday, will try to give you some material this week to help
* I've been ignoring some of you on Teams; I will be replying today
Up to this point in the class, we've really been focused on distributed systems for running classic UNIX-like workloads
- an individual developer/engineer working at a workstation on their own code and data
Key problems here are:
- making small files available to any workstation
- authentication at a workstation (Kerberos, out of scope)
- running jobs on remote computers
- process migration is nice but not so common a need in practice
We've had hints at larger problems
- DSM can be used for large-scale scientific applications
  (scientific applications will consume as many resources as you throw at them)
- but mostly they've been addressed through specialized solutions
- classic method is distributed apps built on MPI (a message passing API)
- look up "Beowulf clusters" for classic implementations
These aren't so "distributed OS"-like, because it is like coding in assembly language - not much abstraction beyond what UNIX+networking provides (see the sketch below)
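To make that concrete, here is a minimal MPI sketch in C (my illustration, not from the lecture): each process learns its rank, sums its own slice of the work, and the partial results are combined with an explicit MPI_Reduce. Notice how manual everything is: data placement and communication are entirely the programmer's problem.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I? */
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs); /* how many processes total? */

        /* each process sums its own slice of 0..999999 */
        long local = 0;
        for (long i = rank; i < 1000000; i += nprocs)
            local += i;

        /* explicit communication: combine partial sums onto rank 0 */
        long total = 0;
        MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("total = %ld\n", total);

        MPI_Finalize();
        return 0;
    }

You would build and launch this with the standard tools, something like mpicc followed by mpirun -np 8 ./a.out. There is no scheduler, no fault tolerance, no data abstraction: if a node dies mid-run, that's your problem.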
But the web came in the 1990s and changed everything
Why did the web change the computing landscape? There were new problems to solve!
Today it is commonplace to have web applications that are accessed by millions of people concurrently.
* Initially web applications ran on a single computer with just a web server process.
* Then, the web server invoked Perl scripts (via CGI; sketch below) that talked to a backend database (MySQL).
* This was all great for a modest number of users. But then the world got on the world wide web and we needed to SCALE
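For reference, CGI is a very simple mechanism: the web server forks one process per request, passes the request details in environment variables, and returns to the client whatever the process prints. The lecture's stack used Perl, but a minimal C version (my sketch, not a class example) shows the same idea, and hints at why it stopped scaling: a fresh process (plus a database connection) per request.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* the web server sets QUERY_STRING before exec'ing us */
        char *qs = getenv("QUERY_STRING");

        /* anything printed to stdout goes back to the browser */
        printf("Content-Type: text/plain\r\n\r\n");
        printf("you asked for: %s\n", qs ? qs : "(nothing)");
        return 0;
    }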
The first company to really try to scale with the growth of the web was Google
- before, companies bought the biggest, most expensive computers they could and load balanced between them carefully
- Google realized that you could use lots of cheap computers if you could deal with their cheapness in software (i.e., distributing the workload, dealing with failures)
- and this really mattered for search engines
Search engines were the first hard problem because to index the web you had to download it, and the web was growing exponentially in size
- but how do you build an application layer for software to run on
that takes advantage of many, many computers?
Papers for the second half of the semester are to give you an idea of how large-scale systems are built to support web applications. We need:
- filesystems
- databases
- computation/data processing
- plumbing to support the above (coordination services)
Do these things make a "distributed operating system"?
- yes and no
- yes: real applications are built on top of these abstractions
- no: abstractions aren't general purpose, need different ones
for different applications (an OS normally has one set of abstractions that
everyone uses)
Why can't we just have one unifying distributed OS then?
- communication and fault tolerance
- communication: it is always expensive, and how to minimize it
depends on the application. (Parallel programming is always hard)
- really, we turn most of the app into being embarrassingly parallel
(requiring no coordination), and the remainder becomes the
hard part that requires special engineering (see the sketch below)
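Here is a single-machine sketch in C of that split (my example, assuming a made-up count-the-evens job): the scan is embarrassingly parallel because each thread touches only its own slice of the input and its own output slot, so no locks are needed; the only coordination is the join-and-combine at the end. In a real distributed system, that combine step is where the special engineering (shuffles, retries, consensus) lives.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define N 1000000

    static int data[N];
    static long partial[NTHREADS];

    /* embarrassingly parallel part: each thread scans its own slice,
     * writing only to its own slot in partial[], so no coordination */
    static void *count_evens(void *arg) {
        long t = (long)arg;
        long lo = t * (N / NTHREADS), hi = lo + N / NTHREADS;
        for (long i = lo; i < hi; i++)
            if (data[i] % 2 == 0)
                partial[t]++;
        return NULL;
    }

    int main(void) {
        for (long i = 0; i < N; i++)
            data[i] = (int)i;

        pthread_t tid[NTHREADS];
        for (long t = 0; t < NTHREADS; t++)
            pthread_create(&tid[t], NULL, count_evens, (void *)t);

        /* the "hard part": coordination. Here it is just a join and a
         * sum; at web scale this is where the real engineering goes */
        long total = 0;
        for (long t = 0; t < NTHREADS; t++) {
            pthread_join(tid[t], NULL);
            total += partial[t];
        }
        printf("evens = %ld\n", total);
        return 0;
    }

(Compile with cc -pthread. The job itself is trivial on purpose; the point is the shape: a big coordination-free middle with a small, hard combine at the end.)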