DistOS 2021F 2021-09-23: Difference between revisions

From Soma-notes

Latest revision as of 23:26, 23 September 2021

Discussion questions

  • What is the basic idea of distributed shared memory?
  • How does distributed shared memory compare to virtual memory on a single CPU system? How about a system with many cores?
  • How aware does a programmer need to be of DSM to use it? How aware to use it efficiently?
  • What are the key mechanisms supporting DSM?
  • How common do you think DSM is in the cloud today? Why?

Notes

Lecture 5
---------

Group reports
 - so far they look mostly good
 - please try to add some structure to them
   - organizing around questions asked/topics discussed helps
   - section headings are nice
 - 1-2 pages is what most of you are turning in, which is good
 - but please use complete sentences, not just bullet points/phrases.
   Fragments are harder to understand and more ambiguous

DSM
What is the basic idea of distributed shared memory?
 - a process should be able to run across multiple computers
   - different threads on different hosts
   - but all sharing the same memory/address space
   - instead of multiple processes communicating over the network,
     we have one process sharing info with itself through shared memory,
     just like any multithreaded program

   - (we can just share part of a process's address space, then it is just
      like two processes sharing part of their memory on a single system)
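The single-system case in the parenthetical above can be sketched with Python's standard `multiprocessing.shared_memory` module. This is only an illustration of the programming model, not of DSM itself: both handles are opened in one Python process to keep the example short, and the names `owner`, `peer`, and `seen_by_peer` are made up for this sketch.

```python
from multiprocessing import shared_memory

# One side creates a named shared region...
owner = shared_memory.SharedMemory(create=True, size=16)
# ...and the other side maps the same region by name.
# (Both handles live in one Python process here only to keep the sketch short;
# normally the second handle would be opened by a different process.)
peer = shared_memory.SharedMemory(name=owner.name)

owner.buf[0] = 42            # a plain memory write, no message is sent
seen_by_peer = peer.buf[0]   # a plain memory read observes the write
print(seen_by_peer)          # 42

peer.close()
owner.close()
owner.unlink()               # free the region once everyone is done
```

DSM aims to keep exactly this read/write interface while the two sides sit on different hosts.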

Do we like to program multithreaded programs?
 - in general, it is the hardest way to implement things
 - but on multicore systems, can be the fastest
    - because shared memory is a fast way to share state
       - avoid copies of data from messages

Does this apply to a cluster of systems?
 - NO, not at all
 - because "shared memory" is an illusion implemented by COPYING DATA OVER THE NETWORK
    - so we can never be faster than just sending messages back and forth
 - so, why do DSM if not performance?
    - ease of use, or
    - legacy code

How does distributed shared memory compare to virtual memory on a single CPU system?
 - very similar
 - basically we're swapping across the network instead of to disk
   - but unlike a page on disk, the data can be changed (by another host) while it is swapped out
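A minimal sketch of the "swapping across the network" idea, simulated in one process. The class `ToyDSM`, its fields, and the write-invalidate rule are assumptions chosen for illustration, not a protocol from the lecture; each "page" holds a single value, and only page fetches are counted as network traffic.

```python
class ToyDSM:
    """Toy write-invalidate page protocol, simulated in one process."""

    def __init__(self, num_pages):
        self.home = {p: 0 for p in range(num_pages)}  # authoritative page contents
        self.copies = {}     # (node, page) -> locally cached value
        self.transfers = 0   # simulated page fetches over the network

    def read(self, node, page):
        if (node, page) not in self.copies:              # "page fault"
            self.copies[(node, page)] = self.home[page]  # fetch the page remotely
            self.transfers += 1
        return self.copies[(node, page)]

    def write(self, node, page, value):
        # Invalidate every other node's copy so nobody keeps reading stale data.
        stale = [k for k in self.copies if k[1] == page and k[0] != node]
        for key in stale:
            del self.copies[key]
        if (node, page) not in self.copies:
            self.transfers += 1                          # fetch the page before writing
        self.copies[(node, page)] = value
        self.home[page] = value                          # write through to the home copy


dsm = ToyDSM(num_pages=1)
first = dsm.read("A", 0)     # A faults and fetches page 0
dsm.write("B", 0, 7)         # B fetches the page and invalidates A's copy
second = dsm.read("A", 0)    # A faults again: the page changed while "swapped out"
print(first, second, dsm.transfers)   # 0 7 3
```

The second read by node A is the case a disk-backed swap never has to handle: the page came back with different contents because another host wrote to it.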


How about a system with many cores?
 - very similar, except...
 - everything implemented in hardware
 - VERY fast network, ultra low latency
 - don't have to worry about network failures
    - if the interconnect fails, the whole system is dead anyway
 - copying has to happen all the time
    - between caches of cores and main memory

How common do you think DSM is in the cloud today? Why?
 - In a sense, DSM is alive and well, just on multicore systems
 - but everywhere else...not so much
   - DSM implemented in software just isn't more efficient
   - better to just send messages using some other abstraction

Isn't a distributed cache a DSM?
 - not really, it is much more specialized
 - think content distribution networks (CDNs)

How aware does a programmer need to be of DSM to use it?
 - not at all, it is transparent

How aware to use it efficiently?
 - VERY aware; if you aren't careful, performance will go into the toilet
 - and it is hard to tell that you're making things slow, because we don't think of memory access that way most of the time
   - think about how difficult it is to write cache-efficient code
   - add in a network and complexity goes up, as does the cost of failing to do
     the right thing
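One way to see how innocent-looking code can become slow is to count page movements when two hosts update unrelated variables that happen to share a page. The function below is a sketch under a deliberately simple, assumed rule (a page lives on one node at a time and migrates whenever a different node touches it); `count_page_transfers` and the layouts are hypothetical names for this example.

```python
def count_page_transfers(accesses, page_of):
    """Count page migrations under a toy rule: a page lives on one node at a
    time and moves whenever a different node touches it."""
    owner = {}       # page -> node currently holding it
    transfers = 0
    for node, var in accesses:
        page = page_of[var]
        if owner.get(page) != node:
            if page in owner:      # first touch allocates locally, no transfer
                transfers += 1
            owner[page] = node
    return transfers


# Node A only ever touches x, node B only ever touches y, 100 times each.
accesses = [("A", "x"), ("B", "y")] * 100

same_page = count_page_transfers(accesses, {"x": 0, "y": 0})
separate_pages = count_page_transfers(accesses, {"x": 0, "y": 1})
print(same_page, separate_pages)   # 199 0
```

The program logic is identical in both runs; only the memory layout differs, which is why this kind of slowdown is hard to spot from the source code.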

Eventual consistency vs strong consistency
 - if two nodes access the same memory, do they HAVE to see the same thing immediately?
 - if they don't have to, you can get away with eventual consistency and improve performance
     - but then, why use shared memory?
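The trade-off can be sketched with a toy replicated variable. Everything here is an assumption made for illustration: the class `Replicated`, a single writer, and an eventual mode that queues updates and ships only the latest value when `sync()` is called.

```python
class Replicated:
    """One variable replicated on several nodes (single writer, toy model)."""

    def __init__(self, nodes, strong):
        self.strong = strong
        self.local = {n: 0 for n in nodes}   # each node's copy of the variable
        self.pending = []                    # queued updates (eventual mode only)
        self.messages = 0                    # simulated network messages

    def write(self, node, value):
        self.local[node] = value
        if self.strong:
            # Strong: push every write to every other replica before returning.
            for n in self.local:
                if n != node:
                    self.local[n] = value
                    self.messages += 1
        else:
            # Eventual: remember the update and propagate it later.
            self.pending.append((node, value))

    def read(self, node):
        return self.local[node]

    def sync(self):
        # Ship only the most recent queued value, one message per other replica.
        if self.pending:
            src, value = self.pending[-1]
            for n in self.local:
                if n != src:
                    self.local[n] = value
                    self.messages += 1
            self.pending.clear()


strong = Replicated(["A", "B"], strong=True)
eventual = Replicated(["A", "B"], strong=False)
for v in (1, 2, 3):
    strong.write("A", v)
    eventual.write("A", v)

strong_read = strong.read("B")     # 3: B always sees the latest write
stale_read = eventual.read("B")    # 0: B has not heard about any write yet
eventual.sync()
fresh_read = eventual.read("B")    # 3: replicas converge after propagation
print(strong.messages, eventual.messages)   # 3 1
```

The eventual version sends fewer messages, but a reader can observe stale data, which is exactly the behavior a shared-memory program does not expect.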

If you're tuning your system, is it easier to
 - play with memory placement, DSM algorithms, or
 - optimize network usage?

Note the trend here
 - when we try to fool the developer into thinking there isn't a network, we get
   performance & scaling bottlenecks
 - need abstractions that are inherently network-aware
    - account for latency, bandwidth, reliability, (& security issues)
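As a rough illustration of what "network-aware" can mean at the API level, the sketch below makes failure and retry visible to the caller, which a plain memory load cannot do. `FlakyLink` and `remote_get` are hypothetical names invented for this example, and the link is simulated rather than real.

```python
class FlakyLink:
    """Hypothetical stand-in for a network link that drops the first few requests."""

    def __init__(self, drops):
        self.drops = drops
        self.store = {"x": 7}

    def request(self, key):
        if self.drops > 0:
            self.drops -= 1
            raise TimeoutError("no reply")
        return self.store[key]


def remote_get(link, key, max_attempts):
    """Return (value, attempts_used); raise if the network never answers."""
    for attempt in range(1, max_attempts + 1):
        try:
            return link.request(key), attempt
        except TimeoutError:
            continue   # a real client would back off before retrying
    raise ConnectionError(f"gave up on {key!r} after {max_attempts} attempts")


value, attempts = remote_get(FlakyLink(drops=2), "x", max_attempts=3)
print(value, attempts)   # 7 3

try:
    remote_get(FlakyLink(drops=5), "x", max_attempts=3)
    gave_up = False
except ConnectionError:
    gave_up = True       # the caller is told the network failed
```

With DSM the same access would look like an ordinary read of `x`, leaving the program no place to express a retry budget or handle the failure.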