Operating Systems 2021F Lecture 8
Video
Video from the lecture given on October 5, 2021 is now available:
Video is also available through Brightspace (Resources->Class zoom meetings->Cloud Recordings tab)
Notes
Lecture 8
---------

* Assignment 1 solutions released (will discuss at end)
* Tutorial 3 & gdb
* Tutorial 4

Using gdb
---------
 - To allow attaching to processes that aren't gdb's children, do the following:

     sudo -i
     echo 0 > /proc/sys/kernel/yama/ptrace_scope
     exit   # to become a regular user again

   (if you try doing the attach without doing this, you'll get an error in gdb
    telling you about this file)
 - Compile with -g (to get debugging symbols) (keep -O)
 - connect in two windows
   - run the program you want to watch in one window
   - in the other, find out its pid (e.g., using ps aux | grep)
   - run gdb on the binary, then attach the PID ("attach <PID>")
   - set a breakpoint (probably at a function) so execution stops at a point of interest
   - do "tui enable" to get a little text-mode interface that shows you code
 - note gdb will only follow one process at a time
   - so you have to decide whether you want to follow the parent or child on fork
   - by default, follows the parent
   - "set follow-fork-mode child" to follow child
 - remember that gdb has extensive help and command completion
   - tab is your friend!
 - n = next statement
   c = continue until next breakpoint/signal/program termination
   s = next statement, but going into functions
   print = view state of variables
   x = examine memory
   b = breakpoint (by line or function name)
   catch syscall = see every system call entered and exited (like strace but slower)
 - you can't run the program backwards
   (there are cool things that can, but not standard tools)

GDB is a powerful tool, lots to play with and master
 - For this class I don't care about you learning gdb per se
 - rather, it is a tool for you to understand how processes work

Question: how does gdb actually work?
 - aren't processes separate?
   - they each have their own address space
 - how is one process controlling another?
 - how can ltrace and strace watch another process?
 - ONLY WAY: ask the kernel for help
   - they use ptrace
 - ptrace can only follow one process at a time
 - it is also very intrusive, can change program behavior
   - you don't want to use it when someone cares about the program continuing to work
 - you use ptrace-based tools to debug programs
   - but what if you want to debug in production?
 - traditionally, to debug in production you'd just look at logs & crash dumps
 - but now we have something better: eBPF
   - "extended Berkeley Packet Filter" (name is almost meaningless now)
   - allows us to add code to the kernel safely to interact with the system
 - if your vm does not have /usr/local/share/bpftrace/tools, you're running the wrong VM
   (it should be the 2021 os VM)
   - this is the one I created for the class
 - VM is all set for bpftrace except for one thing - WRONG KERNEL
   - kvm kernels (for some strange reason) don't have full eBPF support
     - error in their configuration
   - so you need to install the generic kernel
     - instructions in Tutorial 4
     - if you have problems please let me know!
 - get the generic kernel running before doing Tutorial 4
   - check /proc/version that it says something like
       Linux version 5.11.0-37-generic
     NOT -kvm
 - unlike ptrace-based tools such as strace, eBPF-based ones must run as root
   - they can SEE ALL, so it makes sense
 - what's great about bpftrace is it lets you see what is happening anywhere on the system
   - so you can watch specific system calls, who is making them and when
   - but you can also watch functions in userspace & kernelspace
 - yes in an attacker's hands this is potentially very bad
   - that's why only root can do it
   - lots of other ways for root to get this kind of info, this is just crazy easy
 - I will add to the tutorial the header file with the system call numbers
   - so you can interpret the output of syscalls.bt

bpftrace works by attaching "probes" to specific tracepoints
 - events that can be monitored
 - a probe runs when the event happens
 - you can see a list of possible events with bpftrace -l
   - but you can also do uprobes of arbitrary userspace functions
   - run "sudo bpftrace -l | wc" to see how many, I see 50K+
   - use grep to search!
 - to see what probes are being used, run bpftrace -v (verbose)

I don't expect you to understand how bpftrace works
 - it is pretty magical
But I do expect you to get an understanding of what it is showing you
 - files being opened, programs being run, signals being sent
 - perspective on everything we've covered up to now

eBPF is a hot technology in the cloud today
 - major companies use all kinds of eBPF-based tools to monitor their infrastructure,
   track down bugs, and even secure systems
 - look up cilium.io to see the kinds of things being enabled with eBPF
   (bpftrace is just one eBPF-based tool by the way)

Later you'll try writing your own bpftrace scripts
 - but for now, if there is something you'd like to see, ask for it on Teams,
   I can try putting something together

By default, a bpftrace script watches the whole system
 - you have to add logic to limit what you see

If you have time to spend learning bpftrace, go ahead, but it won't be covered directly on the midterm
 - it is its own language, not fully documented
 - I want you to understand the output of the bpftrace scripts asked about in Tutorial 4

Other cool things in eBPF:
 - bcc, the BPF Compiler Collection (python + C)
 - cilium (cloud monitoring)
 - bpfcontain (William Findlay, my PhD student, doing container security)

We're going to use eBPF to learn how the kernel works later (after the midterm)

eBPF is a tool you can use to find out the overhead of other tools
  (Try running "gdb 3000shell" and then type "run" at the gdb prompt.
   See how well things work)

Midterm is not proctored, but I will do randomized & selected interviews after
 - online proctoring is ridiculous
 - you'll submit a text file via brightspace
 - open book, open note, open internet, just NO COLLABORATION
   (you only have 80 minutes so collaboration would mean cheating, don't do that)
 - you may volunteer for interviews
   (good way to make sure you got all the points you should)
 - I'll post a schedule once midterms are graded

A2 will be due by class time on the 14th, along with tutorials 3 & 4.
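
To go with the Tutorial 4 setup notes above (right VM, generic kernel), here is a minimal shell sketch of those two checks. The function name kernel_flavor is made up for illustration and is not part of the class VM. It only looks for the -generic or -kvm suffix mentioned above in a version string, so treat it as a rough check rather than a definitive one.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not from the class VM):
# classify a kernel version string by the flavor suffix it contains.
kernel_flavor() {
    case "$1" in
        *-generic*) echo "generic" ;;
        *-kvm*)     echo "kvm" ;;
        *)          echo "unknown" ;;
    esac
}

# Report on the running system, if /proc/version is readable.
if [ -r /proc/version ]; then
    echo "kernel flavor: $(kernel_flavor "$(cat /proc/version)")"
fi

# The notes say the class VM has the bpftrace tools in this directory.
if [ -d /usr/local/share/bpftrace/tools ]; then
    echo "bpftrace tools: found"
else
    echo "bpftrace tools: missing (wrong VM?)"
fi
```

This runs fine as a regular user, since it only reads /proc/version and checks for a directory. If it reports kvm, follow the generic kernel instructions in Tutorial 4 before trying bpftrace.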