Operating Systems 2020W Lecture 14

From Soma-notes
Revision as of 22:35, 19 March 2020 by Soma (talk | contribs) (Created page with "==Video== The video from the lecture given on March 4, 2020 [https://homeostasis.scs.carleton.ca/~soma/os-2020w/lectures/comp3000-2020w-lec14-20200304.m4v is now available]....")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Video

The video from the lecture given on March 4, 2020 is now available.

Notes

Lecture 14
----------

Topics
* midterm
* tutorials (meta)
* concurrency
* producer/consumer
* pipes
* shared memory with mmap
* Tutorial 6

* pattern matching versus model building

Concurrency
 - doing multiple things "in parallel" that interact
    - note signal handlers add a kind of concurrency
 - we really can't do it well, can only do it in limited circumstances


We see concurrency issues when there are multiple computations that have
to communicate & coordinate
 - multiple threads
 - multiple processes
 - processes running on different systems
 - single process that gets interrupted by signal handlers

Key task with concurrency is managing interactions

Two basic strategies
 - messages
 - shared data structures (memory)

Networking is messaging-based concurrency
So is pipes

We can use messages to simulate shared state
 - think google docs

But where we can, shared state is more efficient,
but is much more dangerous

Pipes are a weird kernel thing
 - we are pausing the process doing writes when the reader
   isn't ready for them

Pausing writes is easy
 - just make the write system call take a long time
 - writes block after all

Note that the same is true for reads, they can block

We coordinate by using a shared buffer
 - buffer is in kernel space
 - writes put data into the buffer
 - reads take data out
 - when buffer is full, writer is paused on write
 - when buffer is empty, reader is paused on read


A pipe is an example of the producer-consumer problem
 - classic design pattern for concurrency


pipes show the line between message passing and shared state isn't
always so clear

So, are pipes the way to go for IPC on a single host?
 - well, there is significant overhead
 - the overhead is in the system calls (switching back and forth to the kernel)
   and copying data
 - (high performance communication makes use of "zero copy" mechanisms)

So, we can just share memory right?

Note that child processes get a copy of a parent's memory
 - NOT shared

You get shared memory with either
 - making a thread (all memory is shared)
 - shared memory ranges using mmap
   - do an anonymous mmap that is shared
   - will be shared from parent to child, won't be duplicated

So we have shared memory, are we done?  NO
 - shared memory is WEIRD on modern systems
 - changes to memory made by one process won't necessarily be
   seen immediately by other processes

When we share memory, we are fighting the memory hierarchy
 - caches get in the way!

Memory hierarchy
 - registers
 - L1 cache
 - L2 cache
 - L3 cache                 volatile, fast
 - main memory (DRAM)
--------------------
 - SSD
 - hard drives              persistent, slow
 - tapes

L1 caches are normally per core

If CPU cores A and B both try modifying X
 A will have one version of X, B will have another version
 and they won't necessarily be the same

You can only share memory safely between processes by using
special hardware mechanisms which makes sure data is consistent
  - not all data, just selected data

What do we need?  special CPU instructions that make sure
changes to memory are instantly shared across all CPUs
 - need hardware-provided atomic operations

But we don't want to use such mechanisms directly
need special language support (just like with system calls)

semaphores are an abstraction of this sort of coordination mechanism,
using shared memory that is changed atomically

atomic operation - indivisible operation

Semaphores come from railroad signals
 - want to make sure shared track is never occupied by two trains
 - semaphore indicates when the track is "taken"

Basic algorithm for accessing shared state
 - try to grab semaphore
     (check if zero, if zero, make one)
 - if taken, wait until it becomes available, then take it
     (watch semaphore until it turns to zero, then make it one)
 - when finished, release semaphore
     (set semaphore to zero)

 (this describes a binary semaphore, also known as a mutex)
 (can also have counting semaphores)

You can't implement a semaphore safely with a simple byte or integer variable,
using standard programming constructs

Need to be able to do test and set (check value and change it) as ONE instruction