WebFund 2016W Lecture 11
Video
The video for the lecture given on February 23, 2016 is now available.
Notes
In-class Notes
Lecture 11 ---------- Databases /var/spool/mail - where email is stored - one file per user my mail inbox in /var/spool/mail/soma Who accesses this file? - mail client - mail server What happens if the mail server delivers mail while I'm going through my email (deleting, filing) - unless you lock it, you lose mail, mail client will overwrite inbox and delete new messages - solution: lock the inbox create a soma.lock file (inside it has the process ID of the accessing process) This is a bad idea that does not scale Instead, use a database - concurrent access - data persistence What kind of database? - key-value stores like a filesystem for very small files, with search - document store <-- we'll use this values have structure can index on values - relational database sets of tables tables have rows and columns relations between tables example: names and addresses - rows are individual people - columns are first name, last name, address, city, etc. relation example: customers and purchases - each purchase (invoice) has item purchased, price, and customer ID - customer table connects customer ID and name, address, etc We're going to use MongoDB, a document store, not a relational database. - stores JSON-like documents - scriptable in JavaScript If we were to use a relational database, we'd have to learn SQL - Structured Query Language - many man implementations Oracle, MySQL (MariaDB), PostgreSQL, MS SQL Server, SQLite, etc - you mainly need a relational database to handle transactions Transaction - what if a change to a database involves multiple tables? - a transaction ensures all tables are changed or none are Classic example: airline tickets Tables: - seat reservations - payment - customer info When you buy a ticket, you want to change all three A transaction ensures atomic behavior all happen or none happen MongoDB Relational DB Document => row Collection => table Web Architecture ---------------- web browser (interprets HTML, CSS, JavaScript) | | HTTP | web server (runs arbitrary code, in this class JavaScript in node) | | SQL or MongoDB protocol | database (persistent store, allows concurrent access safely)
Student Notes
Databases
Problems with Examforms
- Previously, we used and array (
var state = [];
) to store the data entered in the form - The data stored in
state
is not persistent meaning it is lost when the server is shut down (stored in RAM) - We could try storing the data in a file instead
- This quickly becomes problematic when we have multiple machines trying to work with the same file (concurrent access)
General Motivation
/var/spool/mail
- The place where email is stored and one file is assigned per user
- My Inbox:
/var/spool/mail/soma
- Who have the ability to access the file?
- Mail client
- Mail server
- What happens if the mail server delivers mail while I’m going through my email?
- (deleting, filing)?
- Unless we don’t lock it, we will lose the mail
- The mail client will overwrite inbox and delete new incoming messages
- Solution: lock the inbox
- Create a soma.lock file (Inside it has the process id of the accessing process)
- This is bad idea that does not scale
A Database As a Solution
- A database offers:
- Concurrent access
- Data persistence
- Now we know what the database does and why should we use it. But what kind of database should we use?
Key Value Store
- Simple association of keys with values
- Like a file system for a small files, with search
Document Store
- Similar to key value stores
- Values have a structure
- Able to do indexing
- We’re going to use MongoDB, a document store
- Stores JSON-like documents (actually BSON)
- Scriptable in JavaScript
Relational Database (Gold Standard)
- Sets of tables
- Tables have rows and columns
- Relations between tables
- Example: names and addresses
- Where rows are individual people and columns are first name, last name, address, city etc.
- Another example: customers and purchases
- Each purchase is a row in the purchases table (the row has item purchased, price and customer ID)
- The customer table connects name, address etc via the customer ID
- If we were to use a relational database, we’d have to lean SQL (Structured Query Language)
- There are a lot of implementations:
- Oracle
- MySQL(MariaDB)
- PostgresQL
- MS SQL Server
- SQLite
- There are a lot of implementations:
- You mainly need a relational database to handle transactions
- What if a change to a database involves a change to multiple tables?
- A transaction ensures all tables are changed or none are
- Classic example: airline tickets
- Tables:
- Seat reservations
- Payment
- Customer info
- When you buy a ticket, you would want to change all three tables
- A transaction ensures atomic behavior
- All changes happen or none happen
- Tables:
Storageline.js
MongoDB
- We will be working with MongoDB as the database that our web applications will use
- When compared to a relational database, the collections of MongoDB are like tables and the documents of MongoDB are like rows
- MongoDB is already installed and is running on the course VM
- If you are using your own machine, you will have to install it
- You can run MongoDB by typing
mongod
in the terminal
- MongoDB listens on 27017 port by default
- You can run a Mongo client from the terminal by typing
mongo
- From here you can interact with the database
Interacting With the Database from a Script
- We use the npm module
mongodb
to make our web application interact as a client with the MongoDB server - The Mongo server (MongoDB) must be running in order for the application to be able to communicate with it (the same goes for the terminal client)
- In this script, we are saving log file entries into the database
- Logs are records left by the system and other applications
- We will be using a syslog-like logs in the tutorial
- Our goal is to parse and store it the log entries in a Mongo collection
- Notice the use of
mc.connect()
in the script which is used to establish a connection to the database. It takes 2 arguments:- A string with the address of the database to access
- A callback function which runs once the connection is established (or the application has failed to establish a connection)
- From within the callback function, we can now access the collections of the database that we have connected to
- Here, we use the
insert()
method of the collection to store a log entry as a document - Every document in a collection has to have a unique ID
- The ID is generated automatically during an insertion
- It is also known as a primary key which helps us to distinguish between documents
Checking the Content of a Collection
- An example of looking up documents in a collection might look like:
mongo //launches the Mongo client
show dbs //lists all databases
use demo-log //indicates which database we wish to use
show collections //lists all collections in the current databse
db.logs.find() //shows all documents in the logs collection
Web Architecture
- The database acts as a server which our web server talks to
- Our web server is thus a client of the database server
- A web browser talks to our server using HTTP
- Our server talks to the Database using SQL or a database-specific protocol (such as the MongoDB protocol)