DistOS 2021F 2021-09-23
Discussion questions
- What is the basic idea of distributed shared memory?
- How does distributed shared memory compare to virtual memory on a single CPU system? How about a system with many cores?
- How aware does a programmer need to be of DSM to use it? How aware to use it efficiently?
- What are the key mechanisms supporting DSM?
- How common do you think DSM is in the cloud today? Why?
Notes
Lecture 5
---------
Group reports
- so far they look mostly good
- please try to add some structure to them
- organizing around questions asked/topics discussed helps
- section headings are nice
- 1-2 pages is what most of you are turning in, which is good
- but please use complete sentences; don't just use bullet points/phrases.
  Fragments are harder to understand and more ambiguous
DSM
What is the basic idea of distributed shared memory?
- a process should be able to run across multiple computers
- different threads on different hosts
- but all sharing the same memory/address space
- instead of multiple processes communicating over the network,
  we have one process sharing info with itself through shared memory,
  just like any multithreaded program (see the sketch below)
- (we can also share just part of a process's address space; then it is
  just like two processes sharing part of their memory on a single system)
Do we like writing multithreaded programs?
- in general, it is the hardest way to implement things
- but on multicore systems, can be the fastest
- because shared memory is a fast way to share state
- avoid copies of data from messages
Does this apply to a cluster of systems?
- NO, not at all
- because "shared memory" is an illusion implemented by COPYING DATA OVER THE NETWORK
- so we can never be faster than just sending messages back and forth
- so, why do DSM if not performance?
- ease of use, or
- legacy code
How does distributed shared memory compare to virtual memory on a single CPU system?
- very similar
- basically we're swapping across the network instead of to disk
- but unlike disk swapping, the data can change while it is "swapped
  out", because another node may write to the page (mechanism sketched below)
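Page-based DSM implementations typically reuse exactly this
virtual-memory machinery. Below is a rough, assumed sketch of the
classic trick (in the style of IVY-like systems; the network fetch is
stubbed out): mark a page inaccessible, catch the fault, "fetch" the
page, and let the faulting access retry.

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;

/* On a fault, a real DSM system would ask the page's current owner
   for its contents over the network; here we just unprotect the page.
   (Calling mprotect in a signal handler is not strictly POSIX-safe,
   but it is the classic implementation trick on Linux.) */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    void *start = (void *)((uintptr_t)info->si_addr &
                           ~((uintptr_t)page_size - 1));
    mprotect(start, page_size, PROT_READ | PROT_WRITE);
    /* ...a real system would copy in the bytes received from the owner... */
}

int main(void)
{
    page_size = sysconf(_SC_PAGESIZE);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    /* PROT_NONE stands in for "this page currently lives on another node" */
    char *page = mmap(NULL, page_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    page[0] = 'x';   /* faults; handler "fetches" the page; write retries */
    printf("wrote '%c' after fault-in\n", page[0]);
    return 0;
}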
How about a system with many cores?
- very similar, except...
- everything implemented in hardware
- VERY fast network, ultra low latency
- don't have to worry about partial network failures
  - if the interconnect fails, the whole system is dead anyway
- copying has to happen all the time
- between caches of cores and main memory
How common do you think DSM is in the cloud today? Why?
- In a sense, DSM is alive and well, just on multicore systems
- but everywhere else...not so much
- just isn't more efficient in software
- better to just send messages using some other abstraction
Isn't a distributed cache a DSM?
- not really, it is much more specialized
- think content distribution networks (CDNs)
How aware does a programmer need to be of DSM to use it?
- not at all, it is transparent
How aware to use it efficiently?
- VERY aware; if you aren't careful, performance will go into the toilet
- and it is hard to tell when you're making things slow; we don't think of memory access that way most of the time
- think about how difficult it is to write cache-efficient code (see the
  sketch below)
- add in a network and the complexity goes up, as does the cost of
  failing to do the right thing
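As a hypothetical illustration of how easy this is to get wrong: the
two threads below update logically independent counters, but the
counters share one cache line, so the line ping-pongs between cores on
every write (false sharing). Under page-based DSM the same access
pattern makes whole pages ping-pong over the network, which is far more
expensive.

#include <pthread.h>

/* Two independent counters that happen to share one cache line (and,
   under page-based DSM, would share one page). The fix is padding so
   each counter gets its own line/page, e.g.:
   volatile long a; char pad[64]; volatile long b; */
struct counters {
    volatile long a;
    volatile long b;
};

static struct counters c;

static void *bump_a(void *arg)
{
    (void)arg;
    for (long i = 0; i < 100000000; i++)
        c.a++;   /* each write invalidates the other core's copy of the line */
    return NULL;
}

static void *bump_b(void *arg)
{
    (void)arg;
    for (long i = 0; i < 100000000; i++)
        c.b++;
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}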
Eventual consistency vs strong consistency
- if two nodes access the same memory, do they HAVE to see the same thing immediately?
- if they don't, you can get away with eventual consistency and improve performance (loose analogy sketched below)
- but then, why use shared memory?
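A loose single-machine analogy (an assumption, not from the lecture):
C11 memory orders mirror the same trade-off. memory_order_seq_cst
behaves like strong consistency (all threads agree on one order of
writes), while memory_order_relaxed lets a reader see a stale value for
a while, which is closer in spirit to eventual consistency, and is
cheaper for the same reason.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic int x = 0;

static void *writer(void *arg)
{
    (void)arg;
    /* cheap store: atomic, but no ordering/visibility-timing guarantees */
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, writer, NULL);

    /* legal outcomes: 0 (stale) or 1 -- the reader is not promised to
       see the write immediately, only eventually */
    int seen = atomic_load_explicit(&x, memory_order_relaxed);
    printf("reader saw %d\n", seen);

    pthread_join(t, NULL);
    return 0;
}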
If you're tuning your system, is it easier to
- play with memory placement, DSM algorithms, or
- optimize network usage?
Note the trend here
- when we try to fool the developer into thinking there isn't a network,
  we get performance & scaling bottlenecks
- need abstractions that are inherently network-aware
- account for latency, bandwidth, reliability, (& security issues)