Operating Systems 2021F Lecture 8

From Soma-notes

Video

Video from the lecture given on October 5, 2021 is now available:

Video is also available through Brightspace (Resources->Class zoom meetings->Cloud Recordings tab)

Notes

Lecture 8
---------
* Assignment 1 solutions released (will discuss at end)
* Tutorial 3 & gdb
* Tutorial 4

Using gdb
---------
- To allow attaching to processes that aren't gdb's children, do the following:

  sudo -i
  echo 0 > /proc/sys/kernel/yama/ptrace_scope
  exit    # to become a regular user again

  (if you try doing the attach without doing this, you'll get an error
   in gdb telling you about this file)

- Compile with -g to get debugging symbols (you can keep -O)
- connect in two windows
- run the program you want to watch in one window
- in the other, find out its pid (e.g. using ps aux | grep)
- run gdb on the binary, then attach the PID ("attach <PID>")
- set a breakpoint (probably at a function) so execution stops
  at a point of interest
- do "tui enable" to get a little text-mode interface that shows you code
- note gdb will only follow one process at a time
   - so you have to decide whether you want to follow the parent or child
     on fork
   - by default, follows the parent
   - "set follow-fork-mode child" to follow child
- remember that gdb has extensive help and command completion
    - tab is your friend!
- n = next statement
  c = continue until next breakpoint/signal/program termination
  s = next statement, but going into functions
  print = view state of variables
  x = examine memory
  b = breakpoint (by line or function name)
  catch syscall = see every system call entered and exited (like strace but
                  slower)

- you can't run the program backwards
  (there are cool things that can, but not standard tools)
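
The attach steps above can be sketched as a few shell helpers. This is a
minimal sketch, not part of the course materials: the function names are made
up for illustration, pgrep stands in for "ps aux | grep", and "gdb -p PID" is
used as a shortcut for starting gdb and typing "attach <PID>". The meanings of
the ptrace_scope values 0-3 follow the kernel's Yama documentation.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical.

# Translate a Yama ptrace_scope value into words.
ptrace_scope_meaning() {
    case "$1" in
        0) echo "classic: may attach to any process you own" ;;
        1) echo "restricted: may only attach to descendants (by default)" ;;
        2) echo "admin-only: attaching needs CAP_SYS_PTRACE" ;;
        3) echo "disabled: no attaching at all" ;;
        *) echo "unknown" ;;
    esac
}

# Report the current setting, if this kernel exposes the file.
show_ptrace_scope() {
    f=/proc/sys/kernel/yama/ptrace_scope
    if [ -r "$f" ]; then
        v=$(cat "$f")
        echo "ptrace_scope=$v ($(ptrace_scope_meaning "$v"))"
    else
        echo "no $f on this system"
    fi
}

# Attach gdb to an already-running program by exact name.
# If several processes match, this picks the newest one.
attach_to() {
    pid=$(pgrep -n -x "$1") || { echo "no process named $1" >&2; return 1; }
    gdb -p "$pid"
}

# Typical use, from a second terminal:
#   show_ptrace_scope
#   attach_to 3000shell
```

If show_ptrace_scope reports 1, attach_to will fail for a program that gdb did
not start itself, which is the error described at the top of this section.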

GDB is a powerful tool, lots to play with and master
 - For this class I don't care about you learning gdb per se
 - rather, it is a tool for you to understand how
   processes work


Question: how does gdb actually work?
 - aren't processes separate?
    - they each have their own address space
    - how is one process controlling another?
 - how can ltrace and strace watch another process?
 - ONLY WAY: ask the kernel for help
 - they use ptrace
 - ptrace can only follow one process at a time
 - it is also very intrusive, can change program behavior
   - you don't want to use it when someone cares about the program
     continuing to work
 - you use ptrace-based tools to debug programs
   - but what if you want to debug in production?
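
One way to see that the kernel is mediating all of this: for every process,
Linux records the pid of whatever is currently ptrace-ing it, in the TracerPid
field of /proc/<pid>/status. A minimal sketch that pulls out that field (the
function name is made up for illustration):

```shell
#!/bin/sh
# Sketch only: tracer_pid_from_status is a hypothetical helper name.

# Read the text of a /proc/<pid>/status file on stdin and print its
# TracerPid field. 0 means nothing is tracing the process.
tracer_pid_from_status() {
    awk '$1 == "TracerPid:" { print $2 }'
}

# Typical use: start a program, attach gdb or strace to it from another
# terminal, then run (replacing <PID> with the program's pid):
#   tracer_pid_from_status < /proc/<PID>/status
```

While gdb or strace is attached, the number printed should be the pid of the
gdb or strace process; once it detaches, the field goes back to 0.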

 - traditionally, to debug in production you'd just look at logs & crash dumps
 - but now we have something better: eBPF
 - "extended Berkeley Packet Filter" (name is almost meaningless now)
   - allows us to add code to the kernel safely to interact with the system

 - if your VM does not have /usr/local/share/bpftrace/tools, you're running the wrong VM (it should be the 2021 OS VM)
    - this is the one I created for the class

 - VM is all set for bpftrace except for one thing
    - WRONG KERNEL
    - kvm kernels (for some strange reason) don't have full eBPF support
       - error in its configuration
    - so you need to install the generic kernel
       - instructions in Tutorial 4
    - if you have problems please let me know!
 - get the generic kernel running before doing Tutorial 4
    - check that /proc/version says something like
        Linux version 5.11.0-37-generic
      NOT -kvm
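
The kernel check above can be sketched as a small shell function. This is only
an illustration (the function name is made up); it relies on the third word of
/proc/version being the kernel release string, as in the example line above.

```shell
#!/bin/sh
# Sketch only: kernel_flavour is a hypothetical helper name.

# Classify a kernel release string such as "5.11.0-37-generic".
kernel_flavour() {
    case "$1" in
        *-generic) echo "generic" ;;
        *-kvm)     echo "kvm" ;;
        *)         echo "other" ;;
    esac
}

# Typical use (the third word of /proc/version is the release string):
#   kernel_flavour "$(awk '{ print $3 }' /proc/version)"
```

Anything other than "generic" means the VM is not yet ready for Tutorial 4.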


 - unlike ptrace-based tools such as strace, eBPF-based ones must run as root
    - they can SEE ALL, so it makes sense

 - what's great about bpftrace is it lets you see what is happening anywhere on
   the system
     - so you can watch specific system calls, who is making them, and when
     - but you can also watch functions in userspace & kernelspace

 - yes in an attacker's hands this is potentially very bad
    - that's why only root can do it
    - lots of other ways for root to get this kind of info,
      this is just crazy easy
 - I will add to the tutorial the header file with the system call numbers
   - so you can interpret the output of syscalls.bt
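
Until that header is in the tutorial, a system call number can be looked up in
the header already installed on the system. A minimal sketch, assuming the
header has lines of the form "#define __NR_read 0"; the function name is made
up, and the header path in the usage comment is an assumption that varies by
distribution and architecture.

```shell
#!/bin/sh
# Sketch only: syscall_name is a hypothetical helper name.

# Read a unistd-style header on stdin (lines like "#define __NR_read 0")
# and print the name of the system call whose number is $1.
syscall_name() {
    awk -v n="$1" 'NF >= 3 && $1 == "#define" && $2 ~ /^__NR_/ && $3 == n {
        sub(/^__NR_/, "", $2); print $2
    }'
}

# Typical use on an x86-64 Ubuntu system (path is an assumption):
#   syscall_name 257 < /usr/include/x86_64-linux-gnu/asm/unistd_64.h
```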


bpftrace works by attaching "probes" to specific tracepoints
 - events that can be monitored
 - a probe runs when the event happens
 - you can see a list of possible events with bpftrace -l
   - but you can also do uprobes of arbitrary
     userspace functions
   - run "sudo bpftrace -l | wc" to see how many, I see 50K+
      - use grep to search!
 - to see what probes are being used, run bpftrace -v
   (verbose)
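
To get a feel for what is in that long probe list, it helps to summarize it by
probe type (the part of each name before the first colon). A minimal sketch;
the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: count_probe_types is a hypothetical helper name.

# Read probe names (one per line, as printed by "bpftrace -l") on stdin and
# print how many there are of each type, as "type count".
count_probe_types() {
    cut -d: -f1 | sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use (needs root and a kernel with full eBPF support):
#   sudo bpftrace -l | count_probe_types
#   sudo bpftrace -l 'tracepoint:syscalls:*' | grep openat
```

The second usage line narrows the listing with a pattern before grepping,
which is much faster than searching all 50K+ lines.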

I don't expect you to understand how bpftrace works
 - it is pretty magical

But I do expect you to get an understanding of what it is showing you
 - files being opened, programs being run, signals being sent
 - perspective on everything we've covered up to now

eBPF is a hot technology in the cloud today
 - major companies use all kinds of eBPF-based tools to monitor their
   infrastructure, track down bugs, and even secure systems
 - look up cilium.io to see the kinds of things being enabled with eBPF

(bpftrace is just one eBPF-based tool by the way)

Later you'll try writing your own bpftrace scripts
 - but for now, if there is something you'd like to see, ask for
   it on Teams, I can try putting something together

by default, a bpftrace script watches the whole system
  - you have to add logic to limit what you see
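
As a sketch of what "adding logic" looks like: bpftrace lets you put a filter
between slashes after the probe name, and the action only runs when the filter
is true. The helper below (its name is made up for illustration) prints a
one-line bpftrace program that reports openat() calls from a single pid only.
It is an illustration under those assumptions, not one of the Tutorial 4
scripts.

```shell
#!/bin/sh
# Sketch only: openat_program_for_pid is a hypothetical helper name.

# Print a bpftrace program that reports files opened via openat(), but only
# by the process whose pid is $1. The /pid == .../ part is the filter.
openat_program_for_pid() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: openat_program_for_pid PID" >&2; return 1 ;;
    esac
    prog="tracepoint:syscalls:sys_enter_openat /pid == $1/ { printf(\"%s %s\\n\", comm, str(args->filename)); }"
    printf '%s\n' "$prog"
}

# Typical use (as root), watching only the current shell:
#   sudo bpftrace -e "$(openat_program_for_pid $$)"
```

Without the filter, the same probe would report every openat() call made by
every process on the system.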
  
If you have time to spend learning bpftrace, go ahead, but it won't be covered directly on the midterm
  - it is its own language, not fully documented
  - I want you to understand the output of the bpftrace scripts asked about
    in Tutorial 4

Other cool things in eBPF:
  - bcc, the eBPF compiler collection (python + C)
  - cilium (cloud monitoring)
  - bpfcontain (William Findlay, my PhD student, doing container security)


We're going to use eBPF to learn how the kernel works later
 (after the midterm)

eBPF is a tool you can use to find out the overhead of other tools

(Try running "gdb 3000shell" and then type "run" at the gdb prompt.  See how well things work)

Midterm is not proctored, but I will do randomized & selected interviews after
 - online proctoring is ridiculous
 - you'll submit a text file via Brightspace
    - open book, open note, open internet, just NO COLLABORATION
      (you only have 80 minutes so collaboration would mean cheating,
       don't do that)
 - you may volunteer for interviews
   (good way to make sure you got all the points you should)
     - I'll post a schedule once midterms are graded

A2 will be due by class time on the 14th, along with tutorials 3 & 4.