WebFund 2016W Lecture 11

From Soma-notes
Jump to navigation Jump to search

Video

The video for the lecture given on February 23, 2016 is now available.

Notes

In-class Notes

Lecture 11
----------

Databases

/var/spool/mail
  - where email is stored
  - one file per user

my mail inbox in /var/spool/mail/soma

Who accesses this file?
  - mail client
  - mail server

What happens if the mail server delivers mail while I'm going
through my email (deleting, filing)
 - unless you lock it, you lose mail, mail client will overwrite
   inbox and delete new messages
 - solution: lock the inbox
     create a soma.lock file
     (inside it has the process ID of the accessing process)

This is a bad idea that does not scale
Instead, use a database
 - concurrent access
 - data persistence



What kind of database?
 - key-value stores
    like a filesystem for very small files,
    with search
 - document store  <-- we'll use this
    values have structure
    can index on values
 - relational database
    sets of tables
    tables have rows and columns
    relations between tables
    example: names and addresses
      - rows are individual people
      - columns are first name, last name, address, city, etc.
    relation example: customers and purchases
      - each purchase (invoice) has item purchased, price, and
        customer ID
      - customer table connects customer ID and name, address, etc

We're going to use MongoDB, a document store, not a relational
database.
  - stores JSON-like documents
  - scriptable in JavaScript

If we were to use a relational database, we'd have to learn SQL
  - Structured Query Language
  - many man implementations
     Oracle, MySQL (MariaDB), PostgreSQL, MS SQL Server, SQLite, etc
  - you mainly need a relational database to handle transactions

Transaction
 - what if a change to a database involves multiple tables?
 - a transaction ensures all tables are changed or none are

Classic example: airline tickets

Tables:
 - seat reservations
 - payment
 - customer info

When you buy a ticket, you want to change all three

A transaction ensures atomic behavior
  all happen or none happen

MongoDB        Relational DB
Document   =>  row
Collection =>  table


Web Architecture
----------------

web browser (interprets HTML, CSS, JavaScript)

  |
  |  HTTP
  |

web server (runs arbitrary code, in this class JavaScript in node)

  |
  |  SQL or MongoDB protocol
  |

database (persistent store, allows concurrent access safely)

Student Notes

Databases

Problems with Examforms
  • Previously, we used and array (var state = [];) to store the data entered in the form
  • The data stored in state is not persistent meaning it is lost when the server is shut down (stored in RAM)
  • We could try storing the data in a file instead
    • This quickly becomes problematic when we have multiple machines trying to work with the same file (concurrent access)
General Motivation
  • /var/spool/mail
    • The place where email is stored and one file is assigned per user
    • My Inbox: /var/spool/mail/soma
  • Who have the ability to access the file?
    • Mail client
    • Mail server
  • What happens if the mail server delivers mail while I’m going through my email?
    • (deleting, filing)?
    • Unless we don’t lock it, we will lose the mail
    • The mail client will overwrite inbox and delete new incoming messages
  • Solution: lock the inbox
    • Create a soma.lock file (Inside it has the process id of the accessing process)
    • This is bad idea that does not scale
A Database As a Solution
  • A database offers:
    • Concurrent access
    • Data persistence
  • Now we know what the database does and why should we use it. But what kind of database should we use?
Key Value Store
  • Simple association of keys with values
  • Like a file system for a small files, with search
Document Store
  • Similar to key value stores
  • Values have a structure
  • Able to do indexing
  • We’re going to use MongoDB, a document store
    • Stores JSON-like documents (actually BSON)
    • Scriptable in JavaScript
Relational Database (Gold Standard)
  • Sets of tables
  • Tables have rows and columns
  • Relations between tables
  • Example: names and addresses
    • Where rows are individual people and columns are first name, last name, address, city etc.
  • Another example: customers and purchases
    • Each purchase is a row in the purchases table (the row has item purchased, price and customer ID)
    • The customer table connects name, address etc via the customer ID
  • If we were to use a relational database, we’d have to lean SQL (Structured Query Language)
    • There are a lot of implementations:
      • Oracle
      • MySQL(MariaDB)
      • PostgresQL
      • MS SQL Server
      • SQLite
  • You mainly need a relational database to handle transactions
    • What if a change to a database involves a change to multiple tables?
    • A transaction ensures all tables are changed or none are
    • Classic example: airline tickets
      • Tables:
        • Seat reservations
        • Payment
        • Customer info
      • When you buy a ticket, you would want to change all three tables
      • A transaction ensures atomic behavior
      • All changes happen or none happen

Storageline.js

MongoDB
  • We will be working with MongoDB as the database that our web applications will use
  • When compared to a relational database, the collections of MongoDB are like tables and the documents of MongoDB are like rows
  • MongoDB is already installed and is running on the course VM
    • If you are using your own machine, you will have to install it
    • You can run MongoDB by typing mongod in the terminal
  • MongoDB listens on 27017 port by default
  • You can run a Mongo client from the terminal by typing mongo
    • From here you can interact with the database
Interacting With the Database from a Script
  • We use the npm module mongodb to make our web application interact as a client with the MongoDB server
  • The Mongo server (MongoDB) must be running in order for the application to be able to communicate with it (the same goes for the terminal client)
  • In this script, we are saving log file entries into the database
    • Logs are records left by the system and other applications
    • We will be using a syslog-like logs in the tutorial
    • Our goal is to parse and store it the log entries in a Mongo collection
  • Notice the use of mc.connect() in the script which is used to establish a connection to the database. It takes 2 arguments:
    • A string with the address of the database to access
    • A callback function which runs once the connection is established (or the application has failed to establish a connection)
  • From within the callback function, we can now access the collections of the database that we have connected to
  • Here, we use the insert() method of the collection to store a log entry as a document
  • Every document in a collection has to have a unique ID
    • The ID is generated automatically during an insertion
    • It is also known as a primary key which helps us to distinguish between documents
Checking the Content of a Collection
  • An example of looking up documents in a collection might look like:
mongo //launches the Mongo client
show dbs //lists all databases
use demo-log //indicates which database we wish to use
show collections //lists all collections in the current databse
db.logs.find() //shows all documents in the logs collection
Web Architecture
  • The database acts as a server which our web server talks to
  • Our web server is thus a client of the database server
  • A web browser talks to our server using HTTP
  • Our server talks to the Database using SQL or a database-specific protocol (such as the MongoDB protocol)