WebFund 2016W Lecture 11: Difference between revisions

From Soma-notes
Zero (talk | contribs)
Added partial in class notes.
 
LeeCroft (talk | contribs)
No edit summary
 
(4 intermediate revisions by 2 users not shown)
Line 1: Line 1:
==Video==
The video for the lecture given on February 23, 2016 [http://homeostasis.scs.carleton.ca/~soma/webfund-2016w/lectures/comp2406-2016w-lec11-23Feb2016.mp4 is now available].
==Notes==
==Notes==


Line 32: Line 36:
  - data persistence
  - data persistence


What kind of database?
- key-value stores
    like a filesystem for very small files,
    with search
- document store  <-- we'll use this
    values have structure
    can index on values
- relational database
    sets of tables
    tables have rows and columns
    relations between tables
    example: names and addresses
      - rows are individual people
      - columns are first name, last name, address, city, etc.
    relation example: customers and purchases
      - each purchase (invoice) has item purchased, price, and
        customer ID
      - customer table connects customer ID and name, address, etc
We're going to use MongoDB, a document store, not a relational
database.
  - stores JSON-like documents
  - scriptable in JavaScript
If we were to use a relational database, we'd have to learn SQL
  - Structured Query Language
  - many man implementations
    Oracle, MySQL (MariaDB), PostgreSQL, MS SQL Server, SQLite, etc
  - you mainly need a relational database to handle transactions
Transaction
- what if a change to a database involves multiple tables?
- a transaction ensures all tables are changed or none are
Classic example: airline tickets
Tables:
- seat reservations
- payment
- customer info
When you buy a ticket, you want to change all three
A transaction ensures atomic behavior
  all happen or none happen
MongoDB        Relational DB
Document  =>  row
Collection =>  table
Web Architecture
----------------
web browser (interprets HTML, CSS, JavaScript)
  |
  |  HTTP
  |
web server (runs arbitrary code, in this class JavaScript in node)
  |
  |  SQL or MongoDB protocol
  |
database (persistent store, allows concurrent access safely)
</pre>
</pre>
===Student Notes===
====Databases====
=====Problems with Examforms=====
*Previously, we used and array (<code>var state = [];</code>) to store the data entered in the form
*The data stored in <code>state</code> is not persistent meaning it is lost when the server is shut down (stored in RAM)
*We could try storing the data in a file instead
**This quickly becomes problematic when we have multiple machines trying to work with the same file (concurrent access)
=====General Motivation=====
*<code>/var/spool/mail</code>
**The place where email is stored and one file is assigned per user
**My Inbox: <code>/var/spool/mail/soma</code>
*Who have the ability to access the file?
**Mail client
**Mail server
*What happens if the mail server delivers mail while I’m going through my email?
**(deleting, filing)?
**Unless we don’t lock it, we will lose the mail
**The mail client will overwrite inbox and delete new incoming messages
*Solution: lock the inbox
**Create a soma.lock file (Inside it has the process id of the accessing process)
**This is bad idea that does not scale
=====A Database As a Solution=====
*A database offers:
**Concurrent access
**Data persistence
*Now we know what the database does and why should we use it. But what kind of database should we use?
======Key Value Store======
*Simple association of keys with values
*Like a file system for a small files, with search
======Document Store======
*Similar to key value stores
*Values have a structure
*Able to do indexing
*We’re going to use MongoDB, a document store
**Stores JSON-like documents (actually BSON)
**Scriptable in JavaScript
======Relational Database (Gold Standard)======
*Sets of tables
*Tables have rows and columns
*Relations between tables
*Example: names and addresses
**Where rows are individual people and columns are first name, last name, address, city etc.
*Another example: customers and purchases
**Each purchase is a row in the purchases table (the row has item purchased, price and customer ID)
**The customer table connects name, address etc via the customer ID
*If we were to use a relational database, we’d have to lean SQL (Structured Query Language)
**There are a lot of implementations:
***Oracle
***MySQL(MariaDB)
***PostgresQL
***MS SQL Server
***SQLite
*You mainly need a relational database to handle transactions
**What if a change to a database involves a change to multiple tables?
**A transaction ensures all tables are changed or none are
**Classic example: airline tickets
***Tables:
****Seat reservations
****Payment
****Customer info
***When you buy a ticket, you  would want to change all three tables
***A transaction ensures atomic behavior
***All changes happen or none happen
====Storageline.js====
=====MongoDB=====
*We will be working with MongoDB as the database that our web applications will use
*When compared to a relational database, the collections of MongoDB are like tables and the documents of MongoDB are like rows
*MongoDB is already installed and is running on the course VM
**If you are using your own machine, you will have to install it
**You can run MongoDB by typing <code>mongod</code> in the terminal
*MongoDB listens on 27017 port by default
*You can run a Mongo client from the terminal by typing <code>mongo</code>
**From here you can interact with the database
=====Interacting With the Database from a Script=====
*We use the npm module <code>mongodb</code> to make our web application interact as a client with the MongoDB server
*The Mongo server (MongoDB) must be running in order for the application to be able to communicate with it (the same goes for the terminal client)
*In this script, we are saving log file entries into the database
**Logs are records left by the system and other applications
**We will be using a syslog-like logs in the tutorial
**Our goal is to parse and store it the log entries in a Mongo collection
*Notice the use of <code>mc.connect()</code> in the script which is used to establish a connection to the database. It takes 2 arguments:
**A string with the address of the database to access
**A callback function which runs once the connection is established (or the application has failed to establish a connection)
*From within the callback function, we can now access the collections of the database that we have connected to
*Here, we use the <code>insert()</code> method of the collection to store a log entry as a document
*Every document in a collection has to have a unique ID
**The ID is generated automatically during an insertion
**It is also known as a primary key which helps us to distinguish between documents
=====Checking the Content of a Collection=====
*An example of looking up documents in a collection might look like:
<code><pre>
mongo //launches the Mongo client
show dbs //lists all databases
use demo-log //indicates which database we wish to use
show collections //lists all collections in the current databse
db.logs.find() //shows all documents in the logs collection
</pre></code>
=====Web Architecture=====
*The database acts as a server which our web server talks to
*Our web server is thus a client of the database server
*A web browser talks to our server using HTTP
*Our server talks to the Database using SQL or a database-specific protocol (such as the MongoDB protocol)

Latest revision as of 21:49, 28 February 2016

Video

The video for the lecture given on February 23, 2016 is now available.

Notes

In-class Notes

Lecture 11
----------

Databases

/var/spool/mail
  - where email is stored
  - one file per user

my mail inbox in /var/spool/mail/soma

Who accesses this file?
  - mail client
  - mail server

What happens if the mail server delivers mail while I'm going
through my email (deleting, filing)
 - unless you lock it, you lose mail, mail client will overwrite
   inbox and delete new messages
 - solution: lock the inbox
     create a soma.lock file
     (inside it has the process ID of the accessing process)

This is a bad idea that does not scale
Instead, use a database
 - concurrent access
 - data persistence



What kind of database?
 - key-value stores
    like a filesystem for very small files,
    with search
 - document store  <-- we'll use this
    values have structure
    can index on values
 - relational database
    sets of tables
    tables have rows and columns
    relations between tables
    example: names and addresses
      - rows are individual people
      - columns are first name, last name, address, city, etc.
    relation example: customers and purchases
      - each purchase (invoice) has item purchased, price, and
        customer ID
      - customer table connects customer ID and name, address, etc

We're going to use MongoDB, a document store, not a relational
database.
  - stores JSON-like documents
  - scriptable in JavaScript

If we were to use a relational database, we'd have to learn SQL
  - Structured Query Language
  - many man implementations
     Oracle, MySQL (MariaDB), PostgreSQL, MS SQL Server, SQLite, etc
  - you mainly need a relational database to handle transactions

Transaction
 - what if a change to a database involves multiple tables?
 - a transaction ensures all tables are changed or none are

Classic example: airline tickets

Tables:
 - seat reservations
 - payment
 - customer info

When you buy a ticket, you want to change all three

A transaction ensures atomic behavior
  all happen or none happen

MongoDB        Relational DB
Document   =>  row
Collection =>  table


Web Architecture
----------------

web browser (interprets HTML, CSS, JavaScript)

  |
  |  HTTP
  |

web server (runs arbitrary code, in this class JavaScript in node)

  |
  |  SQL or MongoDB protocol
  |

database (persistent store, allows concurrent access safely)

Student Notes

Databases

Problems with Examforms
  • Previously, we used and array (var state = [];) to store the data entered in the form
  • The data stored in state is not persistent meaning it is lost when the server is shut down (stored in RAM)
  • We could try storing the data in a file instead
    • This quickly becomes problematic when we have multiple machines trying to work with the same file (concurrent access)
General Motivation
  • /var/spool/mail
    • The place where email is stored and one file is assigned per user
    • My Inbox: /var/spool/mail/soma
  • Who have the ability to access the file?
    • Mail client
    • Mail server
  • What happens if the mail server delivers mail while I’m going through my email?
    • (deleting, filing)?
    • Unless we don’t lock it, we will lose the mail
    • The mail client will overwrite inbox and delete new incoming messages
  • Solution: lock the inbox
    • Create a soma.lock file (Inside it has the process id of the accessing process)
    • This is bad idea that does not scale
A Database As a Solution
  • A database offers:
    • Concurrent access
    • Data persistence
  • Now we know what the database does and why should we use it. But what kind of database should we use?
Key Value Store
  • Simple association of keys with values
  • Like a file system for a small files, with search
Document Store
  • Similar to key value stores
  • Values have a structure
  • Able to do indexing
  • We’re going to use MongoDB, a document store
    • Stores JSON-like documents (actually BSON)
    • Scriptable in JavaScript
Relational Database (Gold Standard)
  • Sets of tables
  • Tables have rows and columns
  • Relations between tables
  • Example: names and addresses
    • Where rows are individual people and columns are first name, last name, address, city etc.
  • Another example: customers and purchases
    • Each purchase is a row in the purchases table (the row has item purchased, price and customer ID)
    • The customer table connects name, address etc via the customer ID
  • If we were to use a relational database, we’d have to lean SQL (Structured Query Language)
    • There are a lot of implementations:
      • Oracle
      • MySQL(MariaDB)
      • PostgresQL
      • MS SQL Server
      • SQLite
  • You mainly need a relational database to handle transactions
    • What if a change to a database involves a change to multiple tables?
    • A transaction ensures all tables are changed or none are
    • Classic example: airline tickets
      • Tables:
        • Seat reservations
        • Payment
        • Customer info
      • When you buy a ticket, you would want to change all three tables
      • A transaction ensures atomic behavior
      • All changes happen or none happen

Storageline.js

MongoDB
  • We will be working with MongoDB as the database that our web applications will use
  • When compared to a relational database, the collections of MongoDB are like tables and the documents of MongoDB are like rows
  • MongoDB is already installed and is running on the course VM
    • If you are using your own machine, you will have to install it
    • You can run MongoDB by typing mongod in the terminal
  • MongoDB listens on 27017 port by default
  • You can run a Mongo client from the terminal by typing mongo
    • From here you can interact with the database
Interacting With the Database from a Script
  • We use the npm module mongodb to make our web application interact as a client with the MongoDB server
  • The Mongo server (MongoDB) must be running in order for the application to be able to communicate with it (the same goes for the terminal client)
  • In this script, we are saving log file entries into the database
    • Logs are records left by the system and other applications
    • We will be using a syslog-like logs in the tutorial
    • Our goal is to parse and store it the log entries in a Mongo collection
  • Notice the use of mc.connect() in the script which is used to establish a connection to the database. It takes 2 arguments:
    • A string with the address of the database to access
    • A callback function which runs once the connection is established (or the application has failed to establish a connection)
  • From within the callback function, we can now access the collections of the database that we have connected to
  • Here, we use the insert() method of the collection to store a log entry as a document
  • Every document in a collection has to have a unique ID
    • The ID is generated automatically during an insertion
    • It is also known as a primary key which helps us to distinguish between documents
Checking the Content of a Collection
  • An example of looking up documents in a collection might look like:
mongo //launches the Mongo client
show dbs //lists all databases
use demo-log //indicates which database we wish to use
show collections //lists all collections in the current databse
db.logs.find() //shows all documents in the logs collection
Web Architecture
  • The database acts as a server which our web server talks to
  • Our web server is thus a client of the database server
  • A web browser talks to our server using HTTP
  • Our server talks to the Database using SQL or a database-specific protocol (such as the MongoDB protocol)