WebFund 2024F Lecture 16
Video
Video from the lecture for November 12, 2024 is now available:
Notes
Lecture 16
----------

 - midterm is being graded, should hopefully be done by the end of the week
 - Assignment 2 should finally be uploaded later today (sorry!)
 - Q7 on A3: your analysis page should have 5 or 6 extra numbers, one per question
   (5 or 6 depending on whether you are modifying your code with the extra question or not), e.g.

     Q3 has 5 blank submissions
     Q5 has 20 blank submissions

   Since you also report the total number of submissions, with these stats
   you can see what fraction of students left a question blank

--------------

Today: cookies & TLS

We previously discussed how http is a stateless protocol

But sometimes we need state
 - most commonly, when we "log in" to a website
 - but it is also used for site preferences (theme) and other tasks
 - and of course, tracking for ad targeting

The most common mechanism for adding state to http is the "cookie" mechanism

But what is a cookie?
 - data set by the server with a "Set-Cookie:" header
 - later requests to the server get the added header "Cookie:" with the values
   of the cookies that were previously set
 - allows the server to "save" data on the client (browser) that will be given back to it
 - remember, servers talk to many clients, so things like cookies help the server
   distinguish them from each other
 - But remember, cookies are just data included in an HTTP header
   - so anyone can set any cookie
   - the only way to be secure is to
     1) make sure it isn't sent to the wrong sites and
     2) make sure it isn't guessable

 Web Server                              Browser

                             <---        GET /index.html
 contents of index.html      ---->       renders index.html, saves user=bob
  + Content-Type: text/html              cookie for this server
  + Set-Cookie: user=bob
                             <---        GET /index.html
                                          + Cookie: user=bob
 contents of index.html      --->
 for the user bob

Any browser can set any cookie (it can send whatever "Cookie:" header it likes)
 - if you can guess a cookie that allows you to get confidential information,
   that cookie is insecure (as is the web app)
 - if you can steal the cookie, you have some sort of hijacking attack
   - and a cookie can potentially be stolen just by impersonating the right site

So why is it called a cookie?
 - this is actually an old term
 - used by the X Window system, which predates the web
 - used as a general term for stored data that is opaque to the storer
   & used for authentication & session management

Note that a browser has NO OBLIGATION to store a cookie for a website
 - it can forget it immediately
 - it can forget it later

Browsers generally shouldn't modify cookies, but they can easily forget them

Why is it a cookie?
 - I think someone was hungry

What does it mean for a cookie to be secure?
 - that is separate from secure cookie handling, which needs other technology
 - essentially, the cookie should not be guessable by unauthorized parties
 - so if you have a cookie that represents that a user is logged in,
   unauthorized users shouldn't be able to guess it, otherwise they can be
   logged in as well, even if they don't know the password

This is why you have to keep logging in on the web
 - everyone is paranoid about cookies being stolen, so they make them valid
   only for a limited period of time

But remember, http is sent in the clear over the Internet
 - anyone could be snooping
 - and if they are listening in, they can grab any important cookies and
   use them for bad purposes
 - doesn't matter if they are secure cookies or not!
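
The cookie exchange and the "not guessable" requirement above can be sketched in a few lines of Python. This is only a minimal sketch using the standard library, not how any particular server does it: the names sessions, login, and user_for_request, the cookie name "session", and the one-hour lifetime are all assumptions made up for illustration.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session table kept by the server: token -> username.
sessions = {}

def login(username):
    """Server side: make an unguessable token and return a Set-Cookie header value."""
    token = secrets.token_urlsafe(32)      # random, so it cannot be guessed
    sessions[token] = username
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["max-age"] = 3600    # assumed lifetime: valid for one hour only
    return cookie["session"].OutputString()

def user_for_request(cookie_header):
    """Server side: map the value of a request's Cookie header to a user, or None."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("session")
    if morsel is None:
        return None
    return sessions.get(morsel.value)

# Browser side: remember the name=value pair and send it back on later requests.
set_cookie_value = login("bob")
cookie_to_send = set_cookie_value.split(";")[0]

print(user_for_request(cookie_to_send))        # the server recognizes bob
print(user_for_request("session=my-guess"))    # a guessed token maps to no user
print(user_for_request("user=bob"))            # claiming to be bob is not enough
```

Compare this with the user=bob cookie in the diagram: anyone can send "Cookie: user=bob", but nobody can realistically guess a random token. None of this protects the token from someone listening in on plain http, though.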
This is why almost all web traffic today is encrypted, using the protocol https
 - https is HTTP over TLS
 - TLS = transport layer security
 - used to be SSL = secure sockets layer
   - decided to have a more general name for a more general mechanism

TLS can be used to secure any TCP/IP data stream
 - email (POP3, IMAP, SMTP) uses TLS today
 - basically any regular protocol can be put over TLS to make it "secure"

But the security guarantees of TLS are very specific and have very strict requirements

So if the assumptions of TLS hold, then you get
 - no eavesdropping (confidentiality)
   - recorded traffic can't be decoded later, even if both parties are
     compromised afterwards (perfect forward secrecy)
 - no undetectable tampering (integrity)
   - modifications just end communication with an error
 - one-sided or two-sided authentication
   - you either know who the server is, or you know who both the server & client are
   - on the web today we almost always do one-sided authentication
     (username/password/2FA are used for the other direction, not TLS)

In TLS, entities are identified with certificates. A certificate has
 - the public key of the entity
 - associated metadata (name of organization/server, how long valid, etc.)

In public key cryptography, you have public keys and private keys
 - public keys are given away
 - private keys must be private!

A private key is used to decrypt or sign data
A public key is used to encrypt or verify signatures

If you publish your public key
 - anyone can send you a secret message, but only you can read it
 - only you can sign a document as you, but anyone can verify that signature

Digital signatures are cool because they say who signed it and that the document
has not been modified at all
 - much better than signatures on paper!
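
The public/private key roles above can be tried out with a toy version of RSA. This is a sketch for illustration only: the primes are tiny textbook values, a "message" is just a small integer, and real systems use vetted libraries with large keys, padding, and hashing. The function names are made up here, and the three-argument pow with exponent -1 assumes Python 3.8 or later.

```python
# Toy RSA with tiny primes: for illustration only, trivially breakable.
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # (n, e) is the public key, given away
d = pow(e, -1, phi)          # (n, d) is the private key, must stay private

def encrypt(m):
    """Anyone can encrypt with the public key."""
    return pow(m, e, n)

def decrypt(c):
    """Only the private key holder can decrypt."""
    return pow(c, d, n)

def sign(m):
    """Only the private key holder can sign."""
    return pow(m, d, n)

def verify(m, signature):
    """Anyone can check a signature with the public key."""
    return pow(signature, e, n) == m

message = 65                 # in this toy, messages are integers in range(n)

# Secret message: encrypted with the public key, read with the private key.
assert decrypt(encrypt(message)) == message

# Signature: made with the private key, checked with the public key.
signature = sign(message)
assert verify(message, signature)

# A modified document no longer matches the signature.
assert not verify(message + 1, signature)
```

The last assertion is the "has not been modified at all" property: change the message even slightly and verification fails.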