Basic Synchronization Principles
This article discusses approaches and issues involved with managing concurrency. The objective is to be able to do multiple things at the same time (i.e. using multiple threads), while avoiding coordination problems.
Three Concurrency Problems to Address
Synchronization
Synchronization, a.k.a. serialization, is making tasks run in a serial sequence (one at a time) instead of all at once, because of dependencies between the tasks. An example of this is that when building a house the person building basement needs to tell person building ground floor when the basement is done, so that work on the ground floor can start.
In a more complex case, it may be that parts of the ground floor are only dependent on parts of the basement, so the person building the ground floor may be able to begin work sooner. This more complex case is related to the Producer-Consumer pattern discussed below.
Mutual Exclusion
Mutual exclusion is controlling access to shared resources. Continuing with the construction example, only one person can use the same drill at the same time, or things get ugly. Likewise, things like a disk or a data structure often require a limit of one thread accessing it at a time to ensure the desired effect.
Data Sharing
Data sharing, which is closely related to mutual exclusion, is managing the duplication of data structures or devices. These duplicated resources would in turn send updates to a manager of sorts, and would be informed of updates by the manager.
An example of this is what Massively-Multiplayer Online (MMO) games do to avoid certain problems associated with network lag (the delay between sending a notification of action to the server and receiving the result of that action). If this was done with all world data processing done on the server, a player wouldn't know whether or not an opponent had been defeated until the network responds with the result. With data sharing, the client program can immediately determine, based on its copy of the local environment, that it is likely that the opponent has been defeated, and then receive either confirmation or rejection of that assumption when the server responds with its "official version" of events.
Message Passing is this way of managing duplication of state via coordinated communication between copies. Shared Memory is a way to simulate duplication of state in a way that in some circumstances, can allow concurrent access, and generally access with less overhead than message passing.
Tackling Mutual Exclusion
A Broken Attempt
linked_list list;
int using_list = 0;
void use_list() {
while (using_list) {
// Just waiting here, so nothing goes in the loop
// Could sleep, but this is a broken attempt anyway
}
using_list = 1;
// use the list
// this part of the code is known as a critical region
using_list = 0;
}
What's the problem?
TOCTOU: Time Of Check to Time Of Use
If the thread went to sleep immediately before running the “using_list = 1;” line, another thread might also get access to the list, since using_list is still zero.
Fixing the Broken Attempt
Atomic operation: an operation that cannot be “split apart”
One machine instruction can be made atomic easily, but any more are not guaranteed to be atomic in hardware. As such, special machine instructions are needed to handle this.
Single-Processor System
- One can simply disable interrupts while acquiring access, since then no threads can jump in between the checking of using_list and the setting of using_list.
- Interrupts should only be disabled for so long, because devices depend on the system reacting to interrupts.
linked_list list;
int using_list = 0;
void use_list() {
asm { // disable interrupts
cli
}
while (using_list) {
asm { // enable interrupts, because access wasn't acquired this time
sti
}
sleep(0);// go to sleep so that other threads will run
asm { // disable interrupts to test again
cli
}
}
using_list = 1;
asm { // enable interrupts, because this thread now has mutually exclusive access
sti
}
// use the list
using_list = 0;
}
Multi-Processor System
- For this, a special test-and-set instruction or a special swap instruction are needed.
- The operations need to be bus-locked, which the processor does using its bus snooping mechanism, to make sure that cache coherence is maintained (e.g. all caches get the updated value of using_list before the operation is considered complete).
- An alternative approach to the one presented below, (which can still be implemented using test-and-set or swap), is known as spinlock, which is to keep checking repeatedly until the thread gets access. This works for cases of “low contention” i.e. when you won’t have to be checking too long, because another processor has the access, and it will only have access for a short period of time.
test-and-set: sets (to 1) a bit and indicates whether it was set (1) or clear (0) before swap: one can swap a variable with the value 1, get back its previous value, and check whether it was 0 or 1 before
Using test-and-set:
linked_list list;
int using_list = 0;
void use_list() {
asm {
CheckForAccess:
lock bts using_list,0 ;bus-locked bit test-and-set on bit 0 of using_list
jnc GotAccess ;got access if the bit was 0 at the beginning of the test-and-set
;previous value of bit is in carry flag, so no carry (nc) means bit was 0 before
}
sleep(0); // go to sleep so that other threads will run
asm {
jmp CheckForAccess
GotAccess:
}
// use the list
using_list = 0;
}
Using swap:
linked_list list;
int using_list = 0;
void use_list() {
asm {
CheckForAccess:
mov eax,1
lock xchg using_list,eax ;bus-locked exchange (swap)
test eax,eax ;check if eax is 0 (i.e. check if using_list was 0)
jz GotAccess ;got access if the using_list was 0 at the beginning of the exchange
}
sleep(0); // go to sleep so that other threads will run
asm {
jmp CheckForAccess
GotAccess:
}
// use the list
using_list = 0;
}
Making a massively parallel system with separate memories isn’t hard, but making it work with shared memory is very difficult.
Abstractions
Test-and-set and swap are quite low-level and painful to use in complex cases, so we want to wrap them in an abstraction.
Semaphore
A semaphore is the simplest abstraction. It is a counter variable with two associated functions, up and down.
The initial value of the counter variable indicates how many threads are allowed to gain access concurrently. This initial value is 1 for a binary semaphore, a.k.a. a mutex, a.k.a a simple lock.
down(sem): This atomically tries to decrement the counter, and if it can't (i.e. the counter was already 0), it waits until it can (i.e. until the counter is above 0 again). down is called to gain access. up(sem): This atomically increments the counter. up is called to release access.
linked_list list;
int list_semaphore = 1;
void use_list() {
down(list_semaphore); // Gain access to the list
// use the list
up(list_semaphore); // Release access to the list
}
Producer-Consumer
The (Bounded-Buffer) Producer-Consumer problem itself is more of an abstraction of problems rather than an abstraction of solutions, but its common solutions are abstractions as well. Suppose that there is a thread (or multiple threads) producing items in a buffer, and a thread (or multiple threads) consuming items in a buffer. An example of this is an assembly line (referring to manufacturing, not assembly language).
NOTE: I'll add the rest of the notes later today.