WebFund 2016W Lecture 11: Difference between revisions

Latest revision as of 21:49, 28 February 2016

Video

The video for the lecture given on February 23, 2016 is now available.

Notes

In-class Notes

Lecture 11
----------

Databases

/var/spool/mail
  - where email is stored
  - one file per user

my mail inbox in /var/spool/mail/soma

Who accesses this file?
  - mail client
  - mail server

What happens if the mail server delivers mail while I'm going
through my email (deleting, filing)
 - unless you lock it, you lose mail, mail client will overwrite
   inbox and delete new messages
 - solution: lock the inbox
     create a soma.lock file
     (inside it has the process ID of the accessing process)

This is a bad idea that does not scale
Instead, use a database
 - concurrent access
 - data persistence



What kind of database?
 - key-value stores
    like a filesystem for very small files,
    with search
 - document store  <-- we'll use this
    values have structure
    can index on values
 - relational database
    sets of tables
    tables have rows and columns
    relations between tables
    example: names and addresses
      - rows are individual people
      - columns are first name, last name, address, city, etc.
    relation example: customers and purchases
      - each purchase (invoice) has item purchased, price, and
        customer ID
      - customer table connects customer ID and name, address, etc

We're going to use MongoDB, a document store, not a relational
database.
  - stores JSON-like documents
  - scriptable in JavaScript

If we were to use a relational database, we'd have to learn SQL
  - Structured Query Language
  - many man implementations
     Oracle, MySQL (MariaDB), PostgreSQL, MS SQL Server, SQLite, etc
  - you mainly need a relational database to handle transactions

Transaction
 - what if a change to a database involves multiple tables?
 - a transaction ensures all tables are changed or none are

Classic example: airline tickets

Tables:
 - seat reservations
 - payment
 - customer info

When you buy a ticket, you want to change all three

A transaction ensures atomic behavior
  all happen or none happen

MongoDB        Relational DB
Document   =>  row
Collection =>  table


Web Architecture
----------------

web browser (interprets HTML, CSS, JavaScript)

  |
  |  HTTP
  |

web server (runs arbitrary code, in this class JavaScript in node)

  |
  |  SQL or MongoDB protocol
  |

database (persistent store, allows concurrent access safely)

Student Notes

Databases

Problems with Examforms

Previously, we used and array (var state = [];) to store the data entered in the form
The data stored in state is not persistent meaning it is lost when the server is shut down (stored in RAM)
We could try storing the data in a file instead
- This quickly becomes problematic when we have multiple machines trying to work with the same file (concurrent access)

General Motivation

/var/spool/mail
- The place where email is stored and one file is assigned per user
- My Inbox: /var/spool/mail/soma
Who have the ability to access the file?
- Mail client
- Mail server
What happens if the mail server delivers mail while I’m going through my email?
- (deleting, filing)?
- Unless we don’t lock it, we will lose the mail
- The mail client will overwrite inbox and delete new incoming messages
Solution: lock the inbox
- Create a soma.lock file (Inside it has the process id of the accessing process)
- This is bad idea that does not scale

A Database As a Solution

A database offers:
- Concurrent access
- Data persistence
Now we know what the database does and why should we use it. But what kind of database should we use?

Key Value Store

Simple association of keys with values
Like a file system for a small files, with search

Document Store

Similar to key value stores
Values have a structure
Able to do indexing
We’re going to use MongoDB, a document store
- Stores JSON-like documents (actually BSON)
- Scriptable in JavaScript

Relational Database (Gold Standard)

Sets of tables
Tables have rows and columns
Relations between tables
Example: names and addresses
- Where rows are individual people and columns are first name, last name, address, city etc.
Another example: customers and purchases
- Each purchase is a row in the purchases table (the row has item purchased, price and customer ID)
- The customer table connects name, address etc via the customer ID
If we were to use a relational database, we’d have to lean SQL (Structured Query Language)
- There are a lot of implementations:
  - Oracle
  - MySQL(MariaDB)
  - PostgresQL
  - MS SQL Server
  - SQLite
You mainly need a relational database to handle transactions
- What if a change to a database involves a change to multiple tables?
- A transaction ensures all tables are changed or none are
- Classic example: airline tickets
  - Tables:
    - Seat reservations
    - Payment
    - Customer info
  - When you buy a ticket, you would want to change all three tables
  - A transaction ensures atomic behavior
  - All changes happen or none happen

Storageline.js

MongoDB

We will be working with MongoDB as the database that our web applications will use
When compared to a relational database, the collections of MongoDB are like tables and the documents of MongoDB are like rows
MongoDB is already installed and is running on the course VM
- If you are using your own machine, you will have to install it
- You can run MongoDB by typing mongod in the terminal
MongoDB listens on 27017 port by default
You can run a Mongo client from the terminal by typing mongo
- From here you can interact with the database

Interacting With the Database from a Script

We use the npm module mongodb to make our web application interact as a client with the MongoDB server
The Mongo server (MongoDB) must be running in order for the application to be able to communicate with it (the same goes for the terminal client)
In this script, we are saving log file entries into the database
- Logs are records left by the system and other applications
- We will be using a syslog-like logs in the tutorial
- Our goal is to parse and store it the log entries in a Mongo collection
Notice the use of mc.connect() in the script which is used to establish a connection to the database. It takes 2 arguments:
- A string with the address of the database to access
- A callback function which runs once the connection is established (or the application has failed to establish a connection)
From within the callback function, we can now access the collections of the database that we have connected to
Here, we use the insert() method of the collection to store a log entry as a document
Every document in a collection has to have a unique ID
- The ID is generated automatically during an insertion
- It is also known as a primary key which helps us to distinguish between documents

Checking the Content of a Collection

An example of looking up documents in a collection might look like:

mongo //launches the Mongo client
show dbs //lists all databases
use demo-log //indicates which database we wish to use
show collections //lists all collections in the current databse
db.logs.find() //shows all documents in the logs collection

Web Architecture

The database acts as a server which our web server talks to
Our web server is thus a client of the database server
A web browser talks to our server using HTTP
Our server talks to the Database using SQL or a database-specific protocol (such as the MongoDB protocol)

@@ Line 106: / Line 106: @@
 database (persistent store, allows concurrent access safely)
 </pre>
+===Student Notes===
+====Databases====
+=====Problems with Examforms=====
+*Previously, we used and array (<code>var state = [];</code>) to store the data entered in the form
+*The data stored in <code>state</code> is not persistent meaning it is lost when the server is shut down (stored in RAM)
+*We could try storing the data in a file instead
+**This quickly becomes problematic when we have multiple machines trying to work with the same file (concurrent access)
+=====General Motivation=====
+*<code>/var/spool/mail</code>
+**The place where email is stored and one file is assigned per user
+**My Inbox: <code>/var/spool/mail/soma</code>
+*Who have the ability to access the file?
+**Mail client
+**Mail server
+*What happens if the mail server delivers mail while I’m going through my email?
+**(deleting, filing)?
+**Unless we don’t lock it, we will lose the mail
+**The mail client will overwrite inbox and delete new incoming messages
+*Solution: lock the inbox
+**Create a soma.lock file (Inside it has the process id of the accessing process)
+**This is bad idea that does not scale
+=====A Database As a Solution=====
+*A database offers:
+**Concurrent access
+**Data persistence
+*Now we know what the database does and why should we use it. But what kind of database should we use?
+======Key Value Store======
+*Simple association of keys with values
+*Like a file system for a small files, with search
+======Document Store======
+*Similar to key value stores
+*Values have a structure
+*Able to do indexing
+*We’re going to use MongoDB, a document store
+**Stores JSON-like documents (actually BSON)
+**Scriptable in JavaScript
+======Relational Database (Gold Standard)======
+*Sets of tables
+*Tables have rows and columns
+*Relations between tables
+*Example: names and addresses
+**Where rows are individual people and columns are first name, last name, address, city etc.
+*Another example: customers and purchases
+**Each purchase is a row in the purchases table (the row has item purchased, price and customer ID)
+**The customer table connects name, address etc via the customer ID
+*If we were to use a relational database, we’d have to lean SQL (Structured Query Language)
+**There are a lot of implementations:
+***Oracle
+***MySQL(MariaDB)
+***PostgresQL
+***MS SQL Server
+***SQLite
+*You mainly need a relational database to handle transactions
+**What if a change to a database involves a change to multiple tables?
+**A transaction ensures all tables are changed or none are
+**Classic example: airline tickets
+***Tables:
+****Seat reservations
+****Payment
+****Customer info
+***When you buy a ticket, you  would want to change all three tables
+***A transaction ensures atomic behavior
+***All changes happen or none happen
+====Storageline.js====
+=====MongoDB=====
+*We will be working with MongoDB as the database that our web applications will use
+*When compared to a relational database, the collections of MongoDB are like tables and the documents of MongoDB are like rows
+*MongoDB is already installed and is running on the course VM
+**If you are using your own machine, you will have to install it
+**You can run MongoDB by typing <code>mongod</code> in the terminal
+*MongoDB listens on 27017 port by default
+*You can run a Mongo client from the terminal by typing <code>mongo</code>
+**From here you can interact with the database
+=====Interacting With the Database from a Script=====
+*We use the npm module <code>mongodb</code> to make our web application interact as a client with the MongoDB server
+*The Mongo server (MongoDB) must be running in order for the application to be able to communicate with it (the same goes for the terminal client)
+*In this script, we are saving log file entries into the database
+**Logs are records left by the system and other applications
+**We will be using a syslog-like logs in the tutorial
+**Our goal is to parse and store it the log entries in a Mongo collection
+*Notice the use of <code>mc.connect()</code> in the script which is used to establish a connection to the database. It takes 2 arguments:
+**A string with the address of the database to access
+**A callback function which runs once the connection is established (or the application has failed to establish a connection)
+*From within the callback function, we can now access the collections of the database that we have connected to
+*Here, we use the <code>insert()</code> method of the collection to store a log entry as a document
+*Every document in a collection has to have a unique ID
+**The ID is generated automatically during an insertion
+**It is also known as a primary key which helps us to distinguish between documents
+=====Checking the Content of a Collection=====
+*An example of looking up documents in a collection might look like:
+<code><pre>
+mongo //launches the Mongo client
+show dbs //lists all databases
+use demo-log //indicates which database we wish to use
+show collections //lists all collections in the current databse
+db.logs.find() //shows all documents in the logs collection
+</pre></code>
+=====Web Architecture=====
+*The database acts as a server which our web server talks to
+*Our web server is thus a client of the database server
+*A web browser talks to our server using HTTP
+*Our server talks to the Database using SQL or a database-specific protocol (such as the MongoDB protocol)