DistOS 2021F 2021-11-18

From Soma-notes
Revision as of 02:26, 19 November 2021 by Soma

Notes

Lecture 18
----------
 - proposals, I'm grading, that is my tomorrow
 - but tonight, I'll get you readings, hopefully for rest of term
 - experience 2 is almost done, should be released over the weekend
   - you'll have more time than we originally planned
      - probably till the final exam
   - he'll then get to grading experience 1

Delos
 - why did I pick it?  in part, "power of abstraction"
 - consensus protocols keep being used
   - there are many of them
   - each is very technical

abstracting consensus to a shared log file
 - consensus over the order of events

log files are central to maintaining state
 - in log-based filesystems, the log is everything
 - this is because the log determines how state is changed
   - if you can agree on the order of changes,
     (and what those changes are)
     you can agree on the final state
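
A minimal sketch of the point above: two replicas that replay the same log in the same order end up with the same state, and reordering the same changes can give a different result. The operation format ("set"/"delete" tuples) and the apply_log name are assumptions made up for illustration, not anything from Delos.

```python
def apply_log(log):
    """Replay an ordered list of operations and return the resulting state."""
    state = {}
    for op, key, value in log:
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


# Hypothetical log contents, purely for illustration.
log = [
    ("set", "x", 1),
    ("set", "y", 2),
    ("set", "x", 3),
    ("delete", "y", None),
]

# Same log, same order: both replicas agree on the final state.
replica_a = apply_log(log)
replica_b = apply_log(log)
assert replica_a == replica_b == {"x": 3}

# Same operations, different order: the final state differs.
reordered = [log[2], log[0], log[1], log[3]]
assert apply_log(reordered) == {"x": 1}
```

The replay function has to be deterministic for this to work; anything that depends on local clocks or randomness would have to be recorded in the log entry itself.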

Chubby said the central consensus mechanism should be a file
 - Delos says it should be a log
 - and you can use a log to make a consensus filesystem if you want!


The abstraction says 'you have a log'
 - below: how do I make a log?
 - above: how do I use the log?
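
One way to picture the "below" and "above" split is an interface with an implementation underneath and a service on top. This is only a sketch: SharedLog, InMemoryLog, and KeyValueStore are hypothetical names, not the Delos API, and the in-memory list stands in for whatever consensus protocol a real system would put below the abstraction.

```python
from abc import ABC, abstractmethod


class SharedLog(ABC):
    """The abstraction: an append-only, totally ordered log."""

    @abstractmethod
    def append(self, entry):
        """Add an entry and return its position in the log."""

    @abstractmethod
    def read_from(self, position):
        """Return all entries at or after the given position."""


class InMemoryLog(SharedLog):
    """'Below': a toy single-process log (assumption: no real consensus)."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)
        return len(self._entries) - 1

    def read_from(self, position):
        return list(self._entries[position:])


class KeyValueStore:
    """'Above': a service that only knows it has a log."""

    def __init__(self, log):
        self._log = log
        self._state = {}
        self._next = 0  # position of the next entry not yet applied

    def put(self, key, value):
        self._log.append((key, value))

    def get(self, key):
        self._sync()
        return self._state.get(key)

    def _sync(self):
        # Apply every entry we have not seen yet, in log order.
        for key, value in self._log.read_from(self._next):
            self._state[key] = value
            self._next += 1


log = InMemoryLog()
store_a = KeyValueStore(log)
store_b = KeyValueStore(log)
store_a.put("leader", "node-1")
store_b.put("leader", "node-2")
assert store_a.get("leader") == store_b.get("leader") == "node-2"
```

KeyValueStore never looks inside the log, so the implementation below could be swapped for a different one without changing the code above.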


DHTs & Tapestry

Overlay networks
 - network on a network

We have the Internet, why do we need anything else on top?

an overlay network is adding a level of indirection to the networking problem
 - and that's how you solve problems in CS, right?

You use an overlay network when you don't have an easy way of knowing the destination beforehand
 - where you communicate is based on what is being communicated

On the Internet, I can find someplace by its IP address or DNS name
 - and if I don't know either, I use a search engine

But a search engine is centralized
 - and we want a decentralized way

So an overlay network is a way of routing information where
 - you don't know the location of the destination a priori
 - and you don't have a centralized authority to tell you

So we define a new topology, and the purpose of that topology is to facilitate the kind of information exchange we want to do
 - "neighbors" have a relationship that helps communication
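
A rough sketch of "where you communicate is based on what is being communicated": hash the content's name to an ID and let the node with the closest following ID be responsible for it. This is a consistent-hashing style ring, which is a different scheme from Tapestry's, and the node names, the 16-bit ID space, and the function names are all assumptions for illustration. It also cheats by looking at the full node list; a real overlay has to find the responsible node by routing through neighbors.

```python
import hashlib
from bisect import bisect_left


def ident(name, bits=16):
    """Hash a string to an identifier in [0, 2**bits)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def responsible_node(key, node_names):
    """Return the node whose ID is the first at or after the key's ID,
    wrapping around to the smallest ID at the end of the ring."""
    ring = sorted((ident(n), n) for n in node_names)
    ids = [node_id for node_id, _ in ring]
    idx = bisect_left(ids, ident(key)) % len(ring)
    return ring[idx][1]


# Hypothetical node names.
nodes = ["node-a", "node-b", "node-c", "node-d"]
owner = responsible_node("lecture18.txt", nodes)

# Everyone computes the same owner, no matter how they list the nodes,
# so no central authority has to hand out the answer.
assert owner in nodes
assert owner == responsible_node("lecture18.txt", list(reversed(nodes)))
```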

Social networks are a kind of overlay network
 - social graph overlaid on the Internet
 - but, mostly ends up just being a centralized service nowadays
    - but you could make a decentralized one

On the Internet, how does data get from A to B?
 - A sends info to local router R_1
 - R_1 sends it to R_2
 - ...
 - R_(n-1) sends it to R_n
 - R_n sends it to B

So data goes from source to routers to destinations

Routers know about each other and how to route data by using
routing tables
 - spanning trees
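
The hop-by-hop forwarding described above can be sketched with one next-hop table per router. The router names and the topology are made up, and real routing tables match on address prefixes rather than exact destination names.

```python
# Each router maps a destination to the next hop toward it.
# Hypothetical topology: A -> R1 -> R2 -> R3 -> B.
routing_tables = {
    "A": {"B": "R1"},
    "R1": {"B": "R2"},
    "R2": {"B": "R3"},
    "R3": {"B": "B"},
}


def route(source, destination, tables, max_hops=16):
    """Follow next-hop entries from source until destination is reached."""
    path = [source]
    current = source
    while current != destination:
        if len(path) > max_hops:
            raise RuntimeError("too many hops (routing loop?)")
        next_hop = tables.get(current, {}).get(destination)
        if next_hop is None:
            raise LookupError(f"{current} has no route to {destination}")
        path.append(next_hop)
        current = next_hop
    return path


assert route("A", "B", routing_tables) == ["A", "R1", "R2", "R3", "B"]
```

No single router knows the whole path; each one only knows which neighbor is the next step.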

BGP is a protocol for updating routing tables

So with an overlay network, we do the same thing, just among the overlay's nodes instead of routers
 - note that Tapestry even takes into account local topology
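
A simplified sketch of digit-by-digit prefix routing, in the spirit of Tapestry-style overlays: every hop forwards to a node that matches at least one more leading digit of the destination ID. The node IDs are made up, the tables are built with global knowledge for brevity, and the sketch ignores the locality mentioned above (choosing nearby nodes) as well as what to do when no matching entry exists.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_table(node, all_nodes):
    """table[(level, digit)] -> some node that shares `level` leading digits
    with `node` and has `digit` at position `level`."""
    table = {}
    for other in sorted(all_nodes):
        if other == node:
            continue
        level = shared_prefix_len(node, other)
        table.setdefault((level, other[level]), other)
    return table


def prefix_route(source, dest, tables):
    """Route from source to dest, fixing at least one more digit per hop."""
    path = [source]
    current = source
    while current != dest:
        level = shared_prefix_len(current, dest)
        next_hop = tables[current].get((level, dest[level]))
        if next_hop is None:
            raise LookupError(f"{current} has no entry toward {dest}")
        path.append(next_hop)
        current = next_hop
    return path


# Hypothetical 4-digit hex node IDs (distinct, same length).
node_ids = ["1234", "4000", "4200", "4220", "4227", "42A2", "9F00"]
tables = {n: build_table(n, node_ids) for n in node_ids}

path = prefix_route("1234", "4227", tables)
assert path == ["1234", "4000", "4200", "4220", "4227"]
```

Because each hop matches at least one more digit, a route takes at most as many hops as there are digits in an ID, and each node only needs a small table instead of a list of every node.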

We don't use this sort of stuff as much anymore
 - overlay networks are essential to any P2P system
 - but we don't use P2P so much nowadays

Let's recall Skype, a great example of an overlay & P2P network back in the day

Originally, Skype had very few central servers, yet it wanted to route calls for the planet
 - idea: make it a P2P network!
 - clients would become hosts, everyone would share in maintaining the directory and routing calls
    - actual call would take place directly between two parties
      unless there were networking issues (firewalls, NAT)
    - but connecting the two parties, finding them via
      their usernames took serious resources

Turns out university network admins HATED Skype
 - it would suck up their bandwidth
 - it was also stealthy, so they couldn't easily block it

When Microsoft bought Skype, they dismantled the P2P network
and replaced it with centralized servers
 - and made it much worse for end users

What P2P apps do we use nowadays?
 - BitTorrent?
 - some games?
 - Windows Update?

but our home network connections are *so* much better nowadays than they were back then, and computers are so much more powerful
 - would be so easy to do P2P today

mobile devices aren't the best for P2P
 - intermittent connectivity

But the real problem is one of economics and sociology, not technology
 - P2P networks are heavily used for illegal activity
 - unclear how to monetize any P2P application, so who develops it?
 - nobody knows how to really stop the abuse of P2P systems
    - too easy for bad actors to mess everything up

You could say that cryptocurrencies are the new P2P systems
 - but miners aren't something most people want to run