Operating Systems 2020W Lecture 14
Video
The video from the lecture given on March 4, 2020 is now available.
Notes
Lecture 14
----------
Topics
* midterm
* tutorials (meta)
* concurrency
* producer/consumer
* pipes
* shared memory with mmap
* Tutorial 6
* pattern matching versus model building
Concurrency
- doing multiple things "in parallel" that interact
- note signal handlers add a kind of concurrency
- we really can't do it well, can only do it in limited circumstances
We see concurrency issues when there are multiple computations that have
to communicate & coordinate
- multiple threads
- multiple processes
- processes running on different systems
- single process that gets interrupted by signal handlers
Key task with concurrency is managing interactions
Two basic strategies
- messages
- shared data structures (memory)
Networking is messaging-based concurrency
So is pipes
We can use messages to simulate shared state
- think google docs
But where we can, shared state is more efficient,
but is much more dangerous
Pipes are a weird kernel thing
- we are pausing the process doing writes when the reader
isn't ready for them
Pausing writes is easy
- just make the write system call take a long time
- writes block after all
Note that the same is true for reads, they can block
We coordinate by using a shared buffer
- buffer is in kernel space
- writes put data into the buffer
- reads take data out
- when buffer is full, writer is paused on write
- when buffer is empty, reader is paused on read
A pipe is an example of the producer-consumer problem
- classic design pattern for concurrency
pipes show the line between message passing and shared state isn't
always so clear
So, are pipes the way to go for IPC on a single host?
- well, there is significant overhead
- the overhead is in the system calls (switching back and forth to the kernel)
and copying data
- (high performance communication makes use of "zero copy" mechanisms)
So, we can just share memory right?
Note that child processes get a copy of a parent's memory
- NOT shared
You get shared memory with either
- making a thread (all memory is shared)
- shared memory ranges using mmap
- do an anonymous mmap that is shared
- will be shared from parent to child, won't be duplicated
So we have shared memory, are we done? NO
- shared memory is WEIRD on modern systems
- changes to memory made by one process won't necessarily be
seen immediately by other processes
When we share memory, we are fighting the memory hierarchy
- caches get in the way!
Memory hierarchy
- registers
- L1 cache
- L2 cache
- L3 cache volatile, fast
- main memory (DRAM)
--------------------
- SSD
- hard drives persistent, slow
- tapes
L1 caches are normally per core
If CPU cores A and B both try modifying X
A will have one version of X, B will have another version
and they won't necessarily be the same
You can only share memory safely between processes by using
special hardware mechanisms which makes sure data is consistent
- not all data, just selected data
What do we need? special CPU instructions that make sure
changes to memory are instantly shared across all CPUs
- need hardware-provided atomic operations
But we don't want to use such mechanisms directly
need special language support (just like with system calls)
semaphores are an abstraction of this sort of coordination mechanism,
using shared memory that is changed atomically
atomic operation - indivisible operation
Semaphores come from railroad signals
- want to make sure shared track is never occupied by two trains
- semaphore indicates when the track is "taken"
Basic algorithm for accessing shared state
- try to grab semaphore
(check if zero, if zero, make one)
- if taken, wait until it becomes available, then take it
(watch semaphore until it turns to zero, then make it one)
- when finished, release semaphore
(set semaphore to zero)
(this describes a binary semaphore, also known as a mutex)
(can also have counting semaphores)
You can't implement a semaphore safely with a simple byte or integer variable,
using standard programming constructs
Need to be able to do test and set (check value and change it) as ONE instruction