Operating Systems 2017F Lecture 10
Assignment is due Thursday
Concurrency - more than one thing happening at the same time
The challenge with concurrency is keeping shared state consistent:
-files that are written to by multiple processes
-two or more processes communicating
-multiple threads within a process
if two things access the same thing at the same time, which one gets there first?
Concurrency leads to race conditions:
-the order of interleaved actions affects the output
-ex. did process A or B modify the file first?
(the interleaving is effectively random, so we get different results on different runs, which we don't want) We want deterministic behavior from concurrent computations -How do we make sure they don't corrupt each other? -Need to have some sort of coordination
To get consistent computation, we need to coordinate (the computations need to somehow interact with each other) -but you can't coordinate via regular variables (IMPORTANT)
Regular variables are subject to the memory hierarchy -variables are stored in registers, L1 cache, L2 cache, L3 cache, main memory -can be deeper in some contexts -so what if the coordination variable is in the L1 cache for one thread and in main memory for another? -hardware maintains consistency… eventually
How caching works
CPU caches are small pools of memory that store information the CPU is most likely to need next. Which information is loaded into cache depends on sophisticated algorithms and certain assumptions about programming code. The goal of the cache system is to ensure that the CPU has the next bit of data it will need already loaded into cache by the time it goes looking for it (also called a cache hit).
A cache miss, on the other hand, means the CPU has to go scampering off to find the data elsewhere. This is where the L2 cache comes into play — while it’s slower, it’s also much larger. Some processors use an inclusive cache design (meaning data stored in the L1 cache is also duplicated in the L2 cache) while others are exclusive (meaning the two caches never share data). If data can’t be found in the L2 cache, the CPU continues down the chain to L3 (typically still on-die), then L4 (if it exists) and main memory (DRAM).
Variable: a range of addresses, stored in multiple places. L1: fastest, smallest, closest to the core; L2 and L3 are progressively larger and slower. Mechanisms for coordinating concurrent computations must bypass the memory hierarchy, instead using: -filesystem locking primitives (kernel) -special machine language instructions
We're going to use abstractions to manage concurrency
operating systems - kernels in particular - are highly concurrent -this makes them painful to code -you always have to worry about having exclusive (or safe) access to data, and have to make sure it's done properly
Abstractions help us think about concurrency. Abstractions for managing concurrency: -atomic variables: guarantee strict ordering of operations (only one can modify the variable at a time)
Ex. Variable x: ****** <- range of memory locations
What does the CPU do to modify variable x? -load x into a register -modify the register -store the register back to x
Can I just modify it directly in memory? Yes, but it is much slower than using a register. Registers are very small relative to memory. (More on registers vs. memory: http://techdifferences.com/difference-between-register-and-memory.html)
What if two cores try to modify variable x at the same time? -both could have x in a register
Let's say x is 5 -core A increments it by 1 -core B decrements it by 1
What's x after both have run? Does it just go back to 5? -If B changes x but doesn't see A's change, you get 4. If A changes x but doesn't see B's change… you get 6 -This is bad.
What do I need? I need x to be atomic; the above shouldn't be possible. -the hardware has to make sure only one core accesses x at a time
-only one core at a time can load x and modify it. Caches and memory just have different speeds and different sizes. (When you access a register, there's no memory address to load from; the data is already in the register.)
Classic primitive: atomic variables are typically used to build semaphores… but what we really want is mutual exclusion -only one thread/process/CPU can access the resource at a time -if A is there, B can't be there; it's exclusive access -we want mutual exclusion for larger resources, e.g. data structures, devices
Higher-level languages directly support mutual exclusion, e.g. Java synchronized methods (you have an object, it is synchronized, so its methods can only run one at a time). Semaphores are analogous to a railroad control mechanism -only one train on the track
Trains run in both directions on that one track; otherwise, collision. Railroads use a light signal: if one train is going south, another can't go north. If nobody holds the track, I can grab it; until I am done, no one else can get it. A semaphore is used to implement a lock; it's the more primitive concept.
when a train enters the shared track -grab the semaphore (increment the atomic variable) -if not available, wait (check first to see if it is 0)
when a train leaves the shared track -release the semaphore (decrement the atomic variable)
lesson: don’t implement semaphores yourself, please!
In the Linux kernel -locks/semaphores of various types are associated with all kinds of data structures, e.g. inodes -if you want to access and modify an inode (a struct), you need to take its lock first; otherwise someone else could modify it at the same time, which would be dangerous. Key question: when you can't get the lock, what do you do? -Sleep
(tell the scheduler to wake you up when the lock is available)
Can the scheduler go to sleep? No. What does code that can't sleep do when it can't get a resource? -deal with not getting the resource -or spin (sit in a null loop, going in circles, until the lock is free)
If the wait time is short, spinning is the fastest thing you can do -what happens when you sleep? (a context switch) -when will your sleep be woken up?
Where possible, everything should be exclusive; minimize sharing
but in userspace… -shared memory? NOooo -unless you use atomic variables, or something higher level like semaphores -all the problems the kernel has to deal with, you now have to deal with -in general, shared memory is painful -a better way in general: use messages -a series of messages can be used to synchronize state
3-way handshakes, e.g. opening a TCP connection (every time you connect to a website, the two sides have to synchronize state):
-client -> server: SYN (synchronize) packet
-server -> client: SYN-ACK (acknowledge the SYN)
-client -> server: ACK (acknowledge)
Why does it have to be 3-way? The first packet is necessary to open the communication; with the second, the server tells the client "I got your packet, all good"; the third confirms the client heard the server. What happens if the SYN-ACK gets lost?
there is message passing happening underneath many things.
Where do I want to send messages? I can send messages between processes, using… -Signals! -other ways too (lots of different ways)
Concurrency has classic patterns that are used as building blocks for solutions, the same way sorting has classic algorithms.
Concurrency pattern: producer/consumer -two processes, P and C -P 'makes' things, C 'consumes' them -we want to coordinate P and C -P shouldn't produce too much (C gets too far behind) -C shouldn't waste time waiting for P. Think of stuffing envelopes: someone is opening envelopes, someone is stuffing them, someone is sealing them. Sometimes the stuffer is really slow while the opener is very fast, or too much paper piles up waiting, which is wasteful.
we want a signal that says: hey, hold on - when you get to some point, just stop, let the other person catch up, or switch jobs.
-shared circular buffer for storing the output of P and the input of C -when the buffer is empty, C should sleep -when the buffer is full, P should sleep (the solution is surprisingly tricky) -what happens if the consumer is sleeping and the producer wakes it up? how do you avoid corrupting the buffer? how do you minimize the cost? Anil will talk about this next time.
Producer/consumer in real life: `ls | more` -the pipe connects a producer and a consumer: one process is the producer, the other is the consumer, and there is a buffer between the two -the kernel maintains this buffer. How does it handle the coordination? The consumer just reads from the buffer (via stdin/stdout); the kernel knows about the pipe and implements the coordination.