Operating Systems 2017F Lecture 10

From Soma-notes
 
Assignment is due Thursday
Latest revision as of 12:03, 20 October 2017

Video

Video from the lecture given on October 10, 2017 is now available at http://homeostasis.scs.carleton.ca/~soma/os-2017f/lectures/comp3000-2017f-lec10-10Oct2017.mp4. (The audio worked!)

Notes

In-Class

Lecture 10
----------

Concurrency
 - more than one thing happening at the same time

challenge with concurrency is keeping shared state consistent
 * files that are written to by multiple processes
 * two or more processes communicating
 * multiple threads within a process

concurrency leads to race conditions
 - order of interleaved actions affects output
 - ex. did process A or B modify the file first?

We want deterministic behavior from concurrent computations

To get consistent computation, we need to coordinate
 - but you can't interact via regular variables

Regular variables are subject to the memory hierarchy
 - variables are stored in registers, L1 cache, L2 cache, L3 cache, main memory
 - can be deeper in some contexts
 - so what if coordination variable is in L1 cache for one thread and in main memory for another?
 - hardware maintains consistency...eventually

mechanisms for coordinating concurrent computations must bypass the memory hierarchy

Instead use
 - filesystem locking primitives (kernel)
 - special machine language instructions

Operating systems - kernels in particular - are highly concurrent
 - makes them painful to code
 - have to always worry about having exclusive (or safe) access to data, resources


Abstractions for managing concurrency
 - atomic variables guarantee strict ordering of operations

Example

Variable X:  ***** <- range of memory locations

To modify X, you...
  - load X into a register
  - modify the register
  - save the register to X

What if two cores try to modify X at the same time?
  - both could have X in a register at the same time

X is 5
core A increments by 1
core B decrements by 1

What's X after both have run?

If B changes X but doesn't see A's change...you get 4
If A changes X but doesn't see B's change...you get 6

For X to be atomic, ^^^ shouldn't be possible
 - has to make sure only one core accesses X at a time

Atomic variables are typically used to build semaphores...

But what we really want is mutual exclusion

 - only one thread/process/CPU can access the resource at a time

Want mutual exclusion for larger resources, e.g. data structures, devices

Higher-level languages directly support mutual exclusion, e.g.
Java synchronized methods

Semaphores are analogous to a railroad control mechanism
 - only one train on the track!

When a train enters the shared track
 - grab the semaphore (increment atomic variable)
 - if not available, wait (check first to see if it is 0)

When a train leaves the shared track
 - release the semaphore (decrement atomic variable)

Lesson: don't implement semaphores yourself, please!

In the Linux kernel
 - locks using semaphores of various types are associated with all kinds
   of data structures
   e.g. inodes

When you can't get the lock, what do you do?
 - sleep (tell the scheduler to wake you up when the lock is available)
 - deal with not getting the resource
 - spin (do a null loop)

If the wait time is short, spinning is the fastest thing you can do

Where possible, everything should be exclusive, minimize sharing

But in userspace...
 - shared memory...NOOOOOOO
    - unless you use atomic variable, or something higher level
      like a semaphore
 - messages
    - series of messages can be used to synchronize state

3-way handshakes, e.g. opening a TCP connection
 - SYN (synchronize) client->server
 - SYN-ACK (acknowledge SYN) server->client
 - ACK (acknowledge) client->server

Can send messages between processes
 - signals!
 - other ways too


concurrency pattern: producer/consumer
 * two processes P and C
 * P "makes" things, C "consumes" them
 * want to coordinate P and C
   - P shouldn't produce too much (C gets too far behind)
   - C shouldn't waste time waiting for P

 * shared circular buffer for storing output of P and input of C
 * when buffer is empty, C should sleep
 * when buffer is full, P should sleep

For those who have used Mutexes before, a mutex is essentially a binary semaphore (hence the name, mutual exclusion).

A mutex allows only a single thread to access a resource, whereas a semaphore can allow, and keep count of, multiple threads accessing a certain resource at once.

Mutexes are more common in practice; for some examples of when to use a semaphore, please see:
    http://www.freertos.org/Real-time-embedded-RTOS-Counting-Semaphores.html

When using a semaphore, you wait on it using sem_wait(); an analogy is waiting for a key to go to the washroom.
When you're finished with the resource, you return the key using sem_post() for someone else to use.