Operating Systems 2020W Lecture 8
Video
The video from the lecture given on January 31, 2020 is now available.
In-Class Notes
Lecture 8
---------

Topics for today
 - ssh
 - usernames, groups, uid, gid
 - process user IDs: uid, euid, fsuid
 - file permissions: read, write, execute
 - directory permissions
 - setuid, setgid
 - /etc/passwd, /etc/group, /etc/shadow
 - login process

Different kinds of user IDs associated with a process:
 - uid: the original (real) user ID
 - euid: "effective" uid, the one used for determining privileges
   (e.g., what processes you can terminate)
 - fsuid: the uid used for accessing files

fsuid is mostly used by file server programs, e.g. NFS, Samba (CIFS on Linux)
 - we'll ignore it from here on

Normally euid=uid, but when a setuid executable is run, euid becomes the uid of the owner of the executable file, while the uid stays the same.
 - so euid generally becomes 0 (root) while the uid stays that of a regular user

When a process does a system call, the kernel has to decide whether that system call is authorized
 - normally it checks the process's euid (and normally euid=uid) to see if this is allowed
 - if euid=0, almost everything is allowed (as this is root's user ID)
 - for regular users, the kernel has to check the file permissions (for file access) or the uid of the target process (for sending signals)

To the operating system, there are only uids and gids, no usernames or group names
 - we have files defining the mapping of uid<->usernames, gid<->groups

User accounts are in /etc/passwd
Groups are in /etc/group

On most modern systems, there are NO PASSWORDS in /etc/passwd
 - there used to be
 - but having passwords in a file that is readable by everyone on the system is a bit of a security risk
 - nowadays, passwords are stored in /etc/shadow (a file that has restricted access)

It is a pain to manually edit /etc/shadow and /etc/passwd and keep things in sync.
So, if you want to do edits...
 - "shadowconfig off"
 - edit /etc/passwd, /etc/group
 - "shadowconfig on"

No practical limit to the number of groups a user can be in, not sure about the limit for how many users a group can have
 - the lookups are done in userspace, so this mostly depends on the utilities, not the kernel

How are passwords stored in /etc/shadow?
 - the $6$ prefix means they are hashed using SHA-512 (a member of the SHA-2 family)
 - the DES variant of the original crypt is horribly insecure
 - and actually SHA-512 isn't really that great for this, web apps normally use other functions

You should have access to the "crypt(3)" man page, but it wasn't installed on my desktop for some reason

The point of a secure hash:
 - easy to compute the hash
 - hard (infeasible) to find something that hashes to a given value

So technically, you don't have to guess someone's password, you just have to guess a string that has the same SHA-512 hash as their password
 - in practice this is the same thing

If you stole password hashes (stole /etc/shadow, or /etc/passwd with password hashes stored in it), you can do an offline attack to guess passwords
 - you just guess possible passwords, compute their hashes, and compare with the hashes in the password database
 - "salt" (a known string prefix) is added to each password so that two people with the same password don't have the same hash, e.g.
     alice has "banana" as password with salt "tacos": hash is of "tacosbanana"
     bob has "banana" as password with salt "pizza": hash is of "pizzabanana"
   alice and bob don't have to remember their salt, it is generated automatically and stored in the password file (in the same field as the hash), so it isn't any more secret than the hash

NOTE THAT PASSWORD GUESSING PROGRAMS ARE VERY GOOD NOW
 - look up "john the ripper"
 - good at guessing common substitutions, e.g.
   numbers for letters
 - knows many dictionaries
 - and modern computers, with GPUs, can do a ridiculous number of guesses
 - and there are rainbow tables (precomputed password-to-hash tables covering passwords up to some length, most useful against unsalted hashes)

This means that if a password hash database is compromised, assume the attackers will crack all the passwords
 - so you hash, but try to keep the hashes private

Note that no hash function has been proved to be hard to reverse, we just think they are...until they aren't
 - cryptographers keep breaking old hash functions
 - so always build your applications so they can change hash functions

Note that secure hash functions are designed so a one bit change in input changes half the bits of the output on average
 - so small changes should lead to big hash differences

So once you type your password, the login process (or ssh or however you got in) has to:
 - create a child process
 - have it set up the environment for the new user
 - change the uid, gid to the new user
 - execve the user's shell/startup program

To perform administrative tasks as a regular user that have to modify files owned by someone else (generally root)
 - the program binary should be setuid root (or setuid the user who has access to the file, or setgid to the group who has access)

Note that secure hashes are often used as unique identifiers
 - even though technically they aren't unique
 - e.g., certificates, change sets in git
 - when you look, secure hashes are everywhere

But now let's talk about connecting to your VMs without passwords

To do this, we identify ourselves using public key cryptography
 - instead of a password, we generate a public/private key pair
 - we are identified by the public key
 - we prove we are who we are by using the private key to answer challenges generated using the public key

Better than passwords because if you tell anyone your password they know your password (and thus can impersonate you)
 - note that the process on a system that accepts a password always sees its
   plaintext. You don't store it on disk (you just store the hash) but if the process is compromised the attacker can get the actual password
 - with public key crypto, the remote system NEVER has the private key, so there is nothing to steal that is confidential

Whenever someone talks about digital signatures, code signing, certificates, TLS/SSL
 - they are talking about technologies built on public key cryptography

For example, let's talk about SSH
 - can identify users using a password, but better to identify using a public-private key pair
 - generate keypair on local machine using ssh-keygen
 - copy public key to remote system
   - locally it is id_rsa.pub or similar
   - on remote system, add it to authorized_keys (can have multiple keys, one per line)
 - all files are in .ssh (note this directory should only be accessible by the user)

authorized_keys stores public keys associated with a user (on remote system)
known_hosts stores public keys associated with remote hosts (on local system)

With ssh, remote hosts are identified by public/private key pairs
OPTIONALLY, users can be identified by public/private key pairs

this is the same with SSL/TLS
 - but almost nobody identifies users with public/private key pairs (i.e., certificates, not passwords)
 - if you use a secure token, though, this may be possible

I expect you to know how to set up passwordless connections using ssh
 - because that is a super common use case in the cloud
 - the rest I'm not going to ask about

And I expect you to understand accounts, uid, gid, etc
 - just not most of the crypto
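
To make the uid/euid/gid discussion concrete, here is a minimal sketch (not from the lecture) of how a process can inspect its own IDs and map them back to names. It assumes a Unix-like system with Python 3. The helper names (name_for_uid, name_for_gid, describe_ids) are made up for illustration. The lookups go through the system's account database, which is normally /etc/passwd and /etc/group but could be something else.

```python
import grp
import os
import pwd


def name_for_uid(uid):
    """Map a numeric uid to a username, or None if no account entry exists."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def name_for_gid(gid):
    """Map a numeric gid to a group name, or None if no group entry exists."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return None


def describe_ids():
    """Collect the IDs the kernel tracks for this process, plus their names."""
    uid, euid = os.getuid(), os.geteuid()
    gid, egid = os.getgid(), os.getegid()
    return {
        "uid": uid,                      # real user ID
        "euid": euid,                    # effective user ID (used for privilege checks)
        "gid": gid,
        "egid": egid,
        "user": name_for_uid(uid),       # the name is a userspace lookup, not a kernel concept
        "effective_user": name_for_uid(euid),
        "group": name_for_gid(gid),
        "supplementary_gids": sorted(os.getgroups()),
        "uid_differs_from_euid": uid != euid,  # what you would see inside a setuid program
    }


if __name__ == "__main__":
    for key, value in describe_ids().items():
        print(f"{key}: {value}")
```

Run as a regular user, uid and euid should normally be the same. They only differ when something like a setuid binary is involved. Note that the kernel only ever deals with the numbers; the names come from the userspace lookups.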