Operating Systems 2019W Lecture 5

From Soma-notes

Video

Video from the lecture given on January 21, 2019 is now available.

Notes

Readings

Summary Notes

If you want to see how changes in C source code map to assembly code, generate assembly for both versions (using gcc -O2 -S) and then compare the output using diff -c. The -c option ensures that diff adds context, which makes the diff easier to read.

Assignment 1 has a question about the allocation of local variables in assembly. The rest of the lecture was devoted to explaining how the function call stack works and how it relates to local variables.

The function call stack (for x86/x86-64)

  • Stored in a process's address space
  • The address of the top of the stack is always held in the stack pointer register, %rsp on x86-64
  • The stack grows "down", meaning to push n bytes onto the stack:
    • decrement %rsp by n
    • new value of %rsp is address for storing the n bytes
  • When a function is called, the call instruction pushes the address of the instruction that follows the call onto the stack (this is the value in %rip, the instruction pointer register, once the call has been fetched). This is the return address (where control should return after the function terminates).
  • When a function returns, the ret instruction pops the return address off of the stack and puts it into %rip, thus causing the CPU to run the code just after the preceding call instruction.

The function call stack is also used to store local variables:

  • On function entry, the current %rsp value is copied into the base pointer register %rbp (the caller's %rbp is first pushed onto the stack so it can be restored later)
  • Other registers to be saved (if any) are pushed onto the stack
  • The stack pointer is decremented to make room for local variables
  • Before returning, the saved registers are restored and %rsp is reset to the value it had on function entry, so the top of the stack again holds the return address pushed by the call instruction.

Stack-based storage is very cheap to allocate; however, variables allocated on the stack go away when the function that allocated them exits. If you want data that persists even after the function terminates, you need the heap.

Once you understand how stack allocation works, stack-based buffer overflows are easy to understand: writing past the end of a stack-allocated buffer will eventually overwrite the return address stored on the stack. If the attacker can change the return address, they can make the program jump to any part of the process's memory. (See "Smashing the Stack for Fun and Profit" under readings.)

Note that modern operating systems have multiple defences against stack-based buffer overflow attacks. We will discuss them later in the semester.

We'll talk about how the OS helps with managing the heap in the next lecture. Later we will also discuss in detail how system calls differ from function calls.

In Class

Lecture 5
---------

Stack manipulation


FFFF


C000   <-- old %rsp 



B100  <-- enter a function X
B000  <-- start of local variables



0000

When something of size n is "pushed" onto the stack
 - end of its storage is the current (old) value of %rsp
 - new top of stack is old %rsp - n
 - new item starts at new %rsp, has size n



n = 16 (10 in hex)

old %rsp = C000
new %rsp = BFF0
new pointer = BFF0

call instruction pushes %rip to stack
 - causes %rsp to be decremented by 8 on x86-64
 - return address is stored at new value of %rsp

stack allocation of memory never fragments
 - because you always deallocate in the reverse of the order you allocated
   (last in, first out), so free space is always one contiguous region


When a function starts, it
 - saves stack pointer to base pointer (%rsp->%rbp)
 - saves registers to the stack
 - allocates space for local variables by decrementing the stack pointer
   (%rsp)

When a function exits, it
 - restores registers from the stack (reverse of pushing order),
   USING %rbp
 - returns, popping return address off stack