Operating Systems 2020W Lecture 15
Latest revision as of 02:35, 20 March 2020
Video
Video for the lecture given on March 6, 2020 is now available: https://homeostasis.scs.carleton.ca/~soma/os-2020w/lectures/comp3000-2020w-lec15-20200306.m4v
Notes
Lecture 15
----------
- midterms mostly graded, all should be done by end of weekend
- midterms will be returned through your TA
- if you go over your midterm with your TA, you get 2 bonus marks
- solutions are posted, will go through them at a later date
Kernel modules & eBPF
---------------------
What do you do if system calls aren't enough?
- not fast enough
- insufficient functionality
- insufficient visibility
Consider web browsers
- web pages can do some things
- browser extensions can do other things
Browser extensions can be very dangerous
- see every page you visit
- change contents of any page
- change interface elements -> change how your banking website works
- send arbitrary data to other systems
- spyware!
Price for added functionality is more risk
How can we extend the operating system, specifically the Linux kernel?
Firefox long had very powerful extensions, but more recently
adopted a much more restrictive extension interface
- one reason: mostly compatible with Chrome
- but big reason: much safer
- also, the old extensions made it hard to change the browser, because
Firefox had to preserve internal interfaces for external consumers
Of course, any program running on a system "extends" it
- but extensions make use of privileged APIs to allow for
tighter integration
- also, extensions tend to run in the same address space as
the main program, so can manipulate main program state
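The address-space point can be sketched in user space. This is only an illustration: `struct host_state`, `extension_hook`, `hijack_homepage`, `run_as_extension`, and `run_in_process` are hypothetical names made up here, not part of any browser or kernel API. The same hook is run once inside the host's address space and once in a forked process with its own copy of memory.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical "main program" state that an extension might touch. */
struct host_state {
    int homepage_id;
};

/* An extension is just code the host calls with a pointer to its state. */
typedef void (*extension_hook)(struct host_state *);

/* Example extension: silently changes the host's state. */
void hijack_homepage(struct host_state *s)
{
    s->homepage_id = 99;
}

/* Run the hook inside the host's own address space: the change sticks. */
int run_as_extension(extension_hook hook, int initial)
{
    struct host_state s = { initial };
    hook(&s);
    return s.homepage_id;
}

/* Run the same hook in a forked child: it only modifies the child's copy. */
int run_in_process(extension_hook hook, int initial)
{
    struct host_state s = { initial };
    pid_t pid = fork();
    assert(pid >= 0);          /* fork failure is fatal in this sketch */
    if (pid == 0) {
        hook(&s);              /* changes the child's private copy only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return s.homepage_id;      /* the parent's copy is untouched */
}
```

Calling `run_as_extension(hijack_homepage, 1)` returns the modified value, while `run_in_process(hijack_homepage, 1)` still returns 1. Code loaded into the kernel is in the first situation, with the whole kernel as the host state.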
To extend the Linux kernel, we want code running in the kernel
- so, in the address space of the kernel
- but this means the code can mess with arbitrary parts of the kernel
- so you can see and modify any process, act as any user
- interact with any device in arbitrary ways
- allocate any resources you want
Classic bad thing to do with a kernel module: a rootkit
- hide processes
- hide files
- add backdoors that allow you to bypass mainline authentication
- think "joshua" (the backdoor password in the movie WarGames)
How do we control what code gets loaded?
- normally requires root privileges
- on some systems, code must be signed
Code running in the kernel IS NOT running as root!
- root is the maximally privileged user
- but root is just a label for processes
- the kernel is what implements the process abstraction
Kernel modules have maximum flexibility and maximum pain
- one coding error can lead to system crashes and corrupted devices
But what about adding code to the kernel in a safer way?
When you run code in a web browser, where does it run?
- in the address space of the browser
- (compiled JavaScript)
- this is safe because code is "sandboxed"
- (sandboxing is not a technical term, it is an aspiration)
We don't sandbox code loaded into a kernel
- we already have processes
But we can verify/check code to make sure it is safe (for some
approximation of safe)
Standard kernel modules have no verification
eBPF does!
eBPF is based on the Berkeley Packet Filter (BPF), but extended
- idea was to run code in the kernel to filter packets
trace uses eBPF
strace and gdb use ptrace (a system call)
- designed for debugging one process at a time
eBPF is very fussy about the code it accepts
- all loops must clearly terminate!
- no arbitrary memory access
- eBPF code runs in kernel space, but is verified to make sure it is safe
  - supervisor mode on the CPU, in the kernel address space
- kernel modules run in kernel space, and ARE NOT verified
  - but they may be signed (so inauthentic modules will be rejected)
  - supervisor mode on the CPU, in the kernel address space
- processes run in userspace, not verified but are "sandboxed" to a degree
  - user mode on CPU
  - own address space
- eBPF is a new thing, separate from regular functionality we've covered
- NOT used to implement system calls (at this time)
- NOT used for device drivers
eBPF is a safer way to add code to kernel space
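To make "verified before it runs" concrete, here is a toy sketch in user-space C. It is NOT the real eBPF instruction set or verifier: the three-instruction machine, `toy_verify`, `toy_run`, and `TOY_PKT_MIN` are all hypothetical and invented for illustration. The real verifier does far more (for example, it tracks register state, and newer kernels accept some bounded loops). The sketch enforces only the two rules from the notes: programs must clearly terminate (only forward jumps are allowed) and memory access is restricted (loads must fall inside the packet).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction set (hypothetical, far simpler than real eBPF). */
enum toy_op { TOY_LOAD, TOY_JEQ, TOY_RET };

struct toy_insn {
    enum toy_op op;
    int32_t arg;   /* LOAD: packet offset; JEQ: value to compare; RET: verdict */
    int32_t skip;  /* JEQ only: instructions to skip when acc == arg */
};

/* The verifier assumes every packet has at least this many bytes. */
#define TOY_PKT_MIN 16

/* Return 1 if the program is safe to run, 0 otherwise. */
int toy_verify(const struct toy_insn *prog, size_t n)
{
    if (n == 0 || prog[n - 1].op != TOY_RET)
        return 0;                       /* must not fall off the end */
    for (size_t i = 0; i < n; i++) {
        switch (prog[i].op) {
        case TOY_LOAD:                  /* no arbitrary memory access */
            if (prog[i].arg < 0 || prog[i].arg >= TOY_PKT_MIN)
                return 0;
            break;
        case TOY_JEQ:                   /* forward, in-bounds jumps only: no loops */
            if (prog[i].skip < 0 || (size_t)prog[i].skip >= n - i - 1)
                return 0;
            break;
        case TOY_RET:
            break;
        default:
            return 0;                   /* unknown instruction */
        }
    }
    return 1;
}

/* Run a verified program on a packet of at least TOY_PKT_MIN bytes. */
int toy_run(const struct toy_insn *prog, size_t n, const uint8_t *pkt)
{
    int32_t acc = 0;
    size_t pc = 0;
    for (;;) {
        assert(pc < n);                 /* guaranteed by toy_verify */
        switch (prog[pc].op) {
        case TOY_LOAD:
            acc = pkt[prog[pc].arg];
            pc++;
            break;
        case TOY_JEQ:
            pc += 1 + (acc == prog[pc].arg ? (size_t)prog[pc].skip : 0);
            break;
        case TOY_RET:
            return prog[pc].arg;
        default:
            return 0;
        }
    }
}
```

Because the program counter only moves forward and the last instruction is a return, every accepted program finishes in at most n steps. That is why `toy_run` needs no runtime bounds checks beyond the sanity assertion.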
API exposed to userspace is stable
- mainly system calls, but also device files
but internal kernel APIs are not stable
- kernel modules and eBPF programs have to be compiled anew for each
new kernel
- easier with eBPF, because it is designed to be compiled at runtime
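As a small illustration of the stable userspace API (system calls plus device files), the sketch below reads from a device file using only `open`, `read`, and `close`. The helper name `read_device` is made up for this example. `/dev/zero` is used in the usage note because its contents are predictable.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes from a device file using only system calls.
 * Returns the number of bytes read, or -1 if the file cannot be
 * opened or read.
 */
ssize_t read_device(const char *path, unsigned char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

For example, `read_device("/dev/zero", buf, 8)` fills `buf` with eight zero bytes. A program written this way keeps working across kernel versions, whereas code inside the kernel has no such guarantee about the internal interfaces it calls.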
Monolithic vs microkernels
- difference is in what runs in kernel vs user space
- monolithic kernels run most networking, filesystems, and device drivers
in kernel space
- microkernels try to put these all in processes
Advantages of microkernels
- potentially more stable
- easier to debug
- most OS code is in processes, so can use conventional tools to debug
Key disadvantage
- performance
(Security benefit of microkernels is quite arguable)
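The tradeoff can be sketched in user space, with ordinary processes standing in for OS services. Everything here is hypothetical: `service_handle` is a made-up "service" with a deliberate bug, `call_direct` models the monolithic style (a plain function call), and `call_via_ipc` models the microkernel style (the service runs in its own process and is reached by messages over pipes).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical OS service: doubles the request, but crashes on negative input. */
int service_handle(int request)
{
    if (request < 0)
        abort();                    /* deliberate bug for the demo */
    return request * 2;
}

/* Monolithic style: the service is an ordinary function call. */
int call_direct(int request)
{
    return service_handle(request);
}

/*
 * Microkernel style: the service runs in its own process, and the
 * request and reply are messages sent over pipes.
 * Returns -1 if the service dies without replying.
 */
int call_via_ipc(int request)
{
    int to_srv[2], from_srv[2], reply = -1;
    int rc = pipe(to_srv);
    assert(rc == 0);                /* setup failures are fatal in this sketch */
    rc = pipe(from_srv);
    assert(rc == 0);
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {                 /* server process */
        int req;
        close(to_srv[1]);
        close(from_srv[0]);
        if (read(to_srv[0], &req, sizeof req) == (ssize_t)sizeof req) {
            int rep = service_handle(req);
            if (write(from_srv[1], &rep, sizeof rep) != (ssize_t)sizeof rep)
                _exit(1);
        }
        _exit(0);
    }
    close(to_srv[0]);               /* client side */
    close(from_srv[1]);
    if (write(to_srv[1], &request, sizeof request) == (ssize_t)sizeof request) {
        int rep;
        if (read(from_srv[0], &rep, sizeof rep) == (ssize_t)sizeof rep)
            reply = rep;
    }
    close(to_srv[1]);
    close(from_srv[0]);
    waitpid(pid, NULL, 0);
    return reply;
}
```

Both calls give the same answer for a valid request, but the IPC version pays for two messages and at least two context switches. With a negative request, `call_direct` would take the whole program down, while `call_via_ipc` loses only the server process and returns -1 to the caller.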
Linux does have userspace drivers
- NTFS for example (main Windows file system), commonly provided by the FUSE-based ntfs-3g driver
eBPF reduces disadvantages of monolithic kernels
- safe mechanism for extensions that is faster than processes