Operating Systems 2020W Lecture 14
Video
The video from the lecture given on March 4, 2020 is now available.
Notes
Lecture 14 ---------- Topics * midterm * tutorials (meta) * concurrency * producer/consumer * pipes * shared memory with mmap * Tutorial 6 * pattern matching versus model building Concurrency - doing multiple things "in parallel" that interact - note signal handlers add a kind of concurrency - we really can't do it well, can only do it in limited circumstances We see concurrency issues when there are multiple computations that have to communicate & coordinate - multiple threads - multiple processes - processes running on different systems - single process that gets interrupted by signal handlers Key task with concurrency is managing interactions Two basic strategies - messages - shared data structures (memory) Networking is messaging-based concurrency So is pipes We can use messages to simulate shared state - think google docs But where we can, shared state is more efficient, but is much more dangerous Pipes are a weird kernel thing - we are pausing the process doing writes when the reader isn't ready for them Pausing writes is easy - just make the write system call take a long time - writes block after all Note that the same is true for reads, they can block We coordinate by using a shared buffer - buffer is in kernel space - writes put data into the buffer - reads take data out - when buffer is full, writer is paused on write - when buffer is empty, reader is paused on read A pipe is an example of the producer-consumer problem - classic design pattern for concurrency pipes show the line between message passing and shared state isn't always so clear So, are pipes the way to go for IPC on a single host? - well, there is significant overhead - the overhead is in the system calls (switching back and forth to the kernel) and copying data - (high performance communication makes use of "zero copy" mechanisms) So, we can just share memory right? Note that child processes get a copy of a parent's memory - NOT shared You get shared memory with either - making a thread (all memory is shared) - shared memory ranges using mmap - do an anonymous mmap that is shared - will be shared from parent to child, won't be duplicated So we have shared memory, are we done? NO - shared memory is WEIRD on modern systems - changes to memory made by one process won't necessarily be seen immediately by other processes When we share memory, we are fighting the memory hierarchy - caches get in the way! Memory hierarchy - registers - L1 cache - L2 cache - L3 cache volatile, fast - main memory (DRAM) -------------------- - SSD - hard drives persistent, slow - tapes L1 caches are normally per core If CPU cores A and B both try modifying X A will have one version of X, B will have another version and they won't necessarily be the same You can only share memory safely between processes by using special hardware mechanisms which makes sure data is consistent - not all data, just selected data What do we need? special CPU instructions that make sure changes to memory are instantly shared across all CPUs - need hardware-provided atomic operations But we don't want to use such mechanisms directly need special language support (just like with system calls) semaphores are an abstraction of this sort of coordination mechanism, using shared memory that is changed atomically atomic operation - indivisible operation Semaphores come from railroad signals - want to make sure shared track is never occupied by two trains - semaphore indicates when the track is "taken" Basic algorithm for accessing shared state - try to grab semaphore (check if zero, if zero, make one) - if taken, wait until it becomes available, then take it (watch semaphore until it turns to zero, then make it one) - when finished, release semaphore (set semaphore to zero) (this describes a binary semaphore, also known as a mutex) (can also have counting semaphores) You can't implement a semaphore safely with a simple byte or integer variable, using standard programming constructs Need to be able to do test and set (check value and change it) as ONE instruction