WebFund 2015W Lecture 12
Video
The video from the lecture given on February 23, 2015 is now available.
Notes
On-Screen Notes
Lecture 12 ---------- Midterms to be returned Wednesday Today: MongoDB and databases Parts of a web application: * Front End: client-side code (HTML, CSS, JavaScript) - runs in the browser * Back End: server-side code - web server: PHP, Python, Java, *JavaScript* (node) - database: MySQL, Oracle, PostgresQL, MongoDB, ... Early web servers just used a filesystem - good for static content - bad for concurrent access Why concurrency? - many different web clients trying to modify the "same" resources - e.g., passwords stored in a file - *really* bad when using multiple web servers Databases are good at - fine-grained data concurrency - not as easy to "scale" (cause concurrency is hard) - that's SOMEBODY ELSE's PROBLEM Why not just have web clients talk to the database? (cut out the middle man) - security: don't want all clients to access ALL data - databases do have access control but generally not suitable for most web applications - also, data validation - scalability - databases are relatively slow - often stick a caching layer in front/separate out static content Types of databases ------------------ SQL versus NoSQL databases Actually, many kinds of NoSQL databases Difference is *transactions* SQL => relational databases Data is divided into tables tables have columns (fields) and rows (records) Columns have types (date, string, number, etc) Table for customers fields (columns) - Customer ID (primary key) - name - street - city - province - etc. Table for invoices - Invoice ID (primary key) - Customer ID - what they purchased - amount Relations connect tables Relations often work best with unique keys - fields where every record has a unique value Scenario - creating first invoice for a new customer - problem: system crashes after adding the invoice but before adding the customer's info - transaction is what makes sure this never happens - if just an invoice or a customer change, transaction is not complete and data is thrown away Transaction: - start transaction - make your changes - add the invoice - add the customer - commit transaction if all changes succeeded, otherwise abort Relational databases have *overhead* and *complexty* and *security issues* NoSQL => no transactions We're using MongoDB - "document store" - i.e., JSON store (Technically, BSON) - no built-in relations - however, MongoDB can run code...JavaScript In SQL databases - fields are pre-defined and relatively rigid In MongoDB - can have any properties in any document - no consistency requirement Indexing - records/documents are stored in order of primary keys - fast to find by primary key - what if I want to find records/documents by something that *isn't* a primary key - e.g., invoices by what was purchased, customers by city basic search - if sorted, then you do binary search - maybe store so you can index into an arry/hash table - otherwise...exhaustive search Indexing is a way to get around exhaustive search An index is a data structure associating: * a field or fields with * the primary key Index by city in customers - sorted list of cities and customer IDs Why not index on every field? - space - time of insert/add operation