Operating Systems 2019W Lecture 23

Video

The video for the lecture given on April 8, 2019 is now available.

Notes

Lecture 23
----------

Systems security

 - not crypto(graphy)

Cryptography is amazing technology, but it is very brittle
 - almost nothing is proved secure
 - implementation is very tricky: it is easy to make a mistake
   that undermines all security guarantees

Debian openssl predictable PRNG flaw
 - someone was trying to get rid of valgrind warnings
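 - the flagged code mixed uninitialized memory into the entropy pool, but
   the patch also disabled the call that added real entropy, so keys were
   seeded by little more than the process ID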
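
To see why a generator seeded by little more than the process ID is fatal, here is a minimal sketch in Python. It is not OpenSSL code: `weak_key`, `recover_pid`, and the use of Python's `random.Random` are stand-ins chosen for illustration, and `MAX_PID = 32768` assumes the default Linux PID limit. The point is only that the attacker can enumerate every possible seed.

```python
import random
from typing import Optional

MAX_PID = 32768  # assumed default Linux pid_max; the whole seed space


def weak_key(pid: int, nbytes: int = 16) -> bytes:
    """Derive a 'key' from a generator seeded only with the process ID."""
    rng = random.Random(pid)  # stand-in for the crippled PRNG, not OpenSSL
    return bytes(rng.randrange(256) for _ in range(nbytes))


def recover_pid(key: bytes) -> Optional[int]:
    """Try every possible PID until one reproduces the observed key."""
    for pid in range(1, MAX_PID + 1):
        if weak_key(pid, len(key)) == key:
            return pid
    return None


if __name__ == "__main__":
    victim_key = weak_key(4242)      # hypothetical victim process ID
    print(recover_pid(victim_key))   # the search takes well under a second
```

With at most 32768 candidate seeds, every key such a generator can produce can be precomputed once and then looked up.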

Can we make perfect software?
 - if you can prove it correct...maybe?
 - proofs can be flawed and can make false assumptions

When we find vulnerabilities, we're engaging in a process that never ends.

But any vulnerability could undermine a whole system's security

On Linux, the "trusted computing base" (TCB) is
 - Linux kernel
 - bootloader
 - every process running as root (started at boot or setuid root)
 - some partially privileged processes/executables
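
One rough way to get a feel for the size of the TCB is to count the setuid root executables on a machine. The sketch below is an illustration, not a complete audit: `find_setuid` is a hypothetical helper, it only looks at the setuid bit and file owner, and it ignores file capabilities and root daemons started at boot.

```python
import os
import stat


def find_setuid(root_dir, owner_uid=0):
    """Return paths of regular files under root_dir that have the setuid
    bit set and are owned by owner_uid (0 is root)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if (stat.S_ISREG(st.st_mode)
                    and st.st_mode & stat.S_ISUID
                    and st.st_uid == owner_uid):
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for path in find_setuid("/usr/bin"):
        print(path)
```

Every program this prints runs with root privileges no matter who starts it, so a flaw in any one of them is a flaw in the TCB.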

Standard security practice is that we want as small a TCB as possible
 - but this isn't practical, at least with current development practices

How can we have security with flawed software?!

I believe this is possible, because other systems are highly flawed yet
are reasonably secure: living systems
 - we all have a variety of imperfect defenses
 - but they work well enough to keep most of us alive most of the time

Modern computer antivirus is like requiring a vaccine against every virus

Living systems use a combination of
 - barriers & general defenses
 - adaptive defense
 - diversity


The adaptive immune system
 - notices damage & other strange behavior
 - tries many possible solutions
 - ramps up solutions that work

Say I want a system to detect malicious system calls
 - if I do large-scale learning (monitor millions of systems, get their
   system calls, look for bad ones), I will face certain fundamental problems
 - you have to stay small if you want the accuracy to be usable

My strategy
 - monitor locally
 - do stupid learning (table lookup)
 - don't do anything too dangerous in response to detected problems
   (don't kill processes or delete data)
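
A minimal sketch of what "stupid learning" by table lookup could look like. The notes only say "table lookup", so the details here are assumptions: the table stores short windows of consecutive system calls seen locally for one program, and `LookupProfile`, `windows`, and the window length `n=3` are hypothetical names and values.

```python
def windows(trace, n=3):
    """Yield every run of n consecutive system calls in the trace."""
    for i in range(len(trace) - n + 1):
        yield tuple(trace[i:i + n])


class LookupProfile:
    """Per-program table of short system call sequences seen so far."""

    def __init__(self, n=3):
        self.n = n
        self.seen = set()

    def train(self, trace):
        """Record every window of the trace as normal behavior."""
        self.seen.update(windows(trace, self.n))

    def anomalies(self, trace):
        """Count windows in the trace that are absent from the table."""
        return sum(1 for w in windows(trace, self.n) if w not in self.seen)


if __name__ == "__main__":
    profile = LookupProfile(n=3)
    profile.train(["open", "read", "read", "close"])
    print(profile.anomalies(["open", "read", "read", "close"]))    # 0
    print(profile.anomalies(["open", "read", "execve", "close"]))  # 2
```

There is no model to tune: training is set insertion, detection is set membership, and the table is built only from what this one machine has seen.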

For example, slow down unusually behaving processes
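
As a sketch of such a response, the delay added before each system call could grow with the number of recent anomalies. The exponential shape, the `base` and `cap` values, and the name `delay_seconds` are assumptions made for illustration, not something stated in the lecture.

```python
def delay_seconds(recent_anomalies, base=0.01, cap=10.0):
    """Delay to apply before each system call of a suspicious process.

    Zero anomalies means no delay; each further anomaly doubles the
    delay, up to cap seconds so the process is never stopped outright.
    """
    if recent_anomalies <= 0:
        return 0.0
    return min(cap, base * 2 ** recent_anomalies)


if __name__ == "__main__":
    for count in (0, 1, 3, 20):
        print(count, delay_seconds(count))
```

A process that misbehaves once barely notices, while one that keeps producing anomalies slows to a crawl. Nothing is killed or deleted, so a false positive costs time rather than data.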

But the understandability problem is fundamental