WebFund 2016W Lecture 20: Difference between revisions

From Soma-notes
LeeCroft (talk | contribs)
(One intermediate revision by the same user not shown)
==Student Notes==


===OpenStack===
===Assignment 5 – Last Minute Question===


*There is no need to use the course VM to set up or connect to an instance
*How should the download functionality display the query result?
**This is intended as a substitute for the VM
**Display it as a plain-text page, like a text file. Call <code>res.send()</code> with one big string in which each log entry is terminated by a newline. The response should be sent as plain text.
*There are various options to get your web application on the instance:
**WINSCP
**wget [URL]
**Others...
*You can create multiple sessions to the same instance with SSH


===Assignment 5===
===Assignment 6 Hints===


*Extended until Sunday (March 27) at 11:55 pm
*This will involve making an application that works in a similar fashion to exam-storage (a single-page app)
*Extra functionality (2% bonus) will be at the TA’s discretion
*You will be doing the same things that are already implemented in assignment 5 but you need to change it to work from a single page
*If things are running slowly for you, try testing with files no larger than 1000 entries
*If database has too many entries, drop them
**To drop: <code>db.logs.drop()</code>
***This is done from the Mongo terminal client
*“Show” shows the entries in a scrollable textbox


===MongoDB Storage===
===Exam Storage===


*By default, documents in MongoDB can be up to 16 megabytes
*A little bit more complicated than interactive-demo.
*For what we are doing, nothing should ever exceed this size since we are breaking files up into multiple documents
*The file structure of the application is modified to clarify the division between what runs on the client and what runs on the server
*GridFS can be used to store larger documents
**public: contains everything that goes to the client directly (CSS and JavaScript files)
**It will split these larger documents into chunks sized 255 kB each
**server: contains all the JavaScript files that will run on the server
**start.js is a little JavaScript file that you will use to start the server (this is replacing bin/www)
**views contains all the templates for generating html
**keys: cryptographic keys
*Note that you can use <code>npm rebuild</code> to recompile any binary dependencies in the application modules (in case something is not working on your machine)


===Assignment 6===
====start.js====


*This will involve making assignment 5 into a single-page web application
*The app.js file is now located in the server directory
*Look at exam-storage to see an example of how to make a single-page application using client-side code
**The call to <code>require()</code> in start.js is therefore changed to: <code>var app = require('./server/app');</code>
**We then set the port, create the server, and start it listening on port 3000.


===Tutorial 9===
====app.js====


*The base application analyzes text entered by the user and counts word frequencies to display in a graph
*Things are fairly similar to our previous applications
*Division between client and server:
**We are specifying the ssl keys and setting up the same modules we've previously used including one for sessions
**Server: dumb web server (one route to render index)
**<code>app.use(express.static('public'));</code> serves everything in the public directory
**Client: everything (all the functionality, ie. interact.js)
*In index.jade, we can see that some client-side scripts are linked and many of the HTML elements have IDs
**These IDs are used in the client-side script to easily access the elements using jQuery
*It is important to understand the control flow when clicking on the update button
**The click event for the button causes a function to be called
**In this function, the user-entered data is taken from the page using jQuery
**Based on the settings specified by this data, word frequencies are counted and the graph is created
*What does the map method do?
**It processes each element of an array, passing that element in to a function
**The return value of the function is put into a new array


===Other Notes===
====routes.js ====
 
*Starts with setting up a connection to the database.
**The users collection contains documents which have a username and a hashed password
*Many routes defined in here are similar to what we have done before
*The <code>storeSaltedpassword()</code> function in the /register route is something new
**This stores the username and hashed and salted password in the users collection
**The hashing and salting is done using bcrypt
*In the /login route, we are again using bcrypt to compare a password that a user has entered with a hashed version stored in the database
**Why do we have the same error message for both incorrect password and invalid username?
***If someone is trying to break into your system, you don’t want to reveal that they have found a valid account and now only need to figure out its password. They should learn only that the username and password combination is wrong. Therefore, we use the same error string for an invalid username and for an incorrect password.
 
====Password Hashing====
 
*The password you type when you log in to the virtual machine is checked against information stored on the virtual machine itself
**In /etc there is a file called passwd that lists all of the accounts on the system
**In old-school UNIX installations, this file contains the hash of each password; on the virtual machine it shows an 'x' instead, because the hashes are kept in a separate file (/etc/shadow)
**To show the hashes in passwd, you can run <code>sudo shadowconfig off</code>; the file should then contain the hash of your password instead of the 'x'
*For any file you create, you can compute a sha1sum (hash) of that file, which is a long hexadecimal string
**If you change one bit in the file (the input), the output changes drastically
**SHA-1 (sha1sum) should no longer be used, since it is no longer considered secure; you can use SHA-256 (sha256sum) instead
*Bcrypt is what we are using to hash passwords before storing them in the database
**Bcrypt is a hash function, but an unusual one: it is slow by design and is memory intensive.
**Storing your passwords using bcrypt is a good choice today, but in general it is a good idea to use some kind of framework, such as Passport.
***These types of frameworks take care of making decisions that you might not know how to make and they help avoid using outdated and insecure technology
***Passport is a middleware authentication module for Node.js
*Why do we want to use hashing for passwords in the first place? Why do we not want to store them in the plain text?
**We want to avoid a breach in security caused by an attacker obtaining the stored passwords
**The hashing approach should be irreversible, which is essential for password storage, and it should have adequate defenses against precomputation and a performance cost high enough to hinder dictionary attacks
**Just by looking at a hashed string, no one should be able to guess the original value and get your password
***Instead, an attacker would have to attack the hash function itself or plug in candidate values and try every single possibility, which should hopefully take them a very long time
*In our web application, it is important to do the hashing on the server as opposed to on the client's machine even though doing this on the client would reduce the computational burden for the server
**This is because we cannot trust anything that is sent to us from the client
**Rather than running the hash function and sending us the resulting hash, a malicious client could simply iterate over candidate hash values and send those to us
***This would allow the client to avoid the need to spend time actually running the hash function
 
====Salting====
 
*If user 1 and user 2 have the same password, they will also have the same hash
**We don’t want that, as it makes things easier for an attacker
**Instead, before hashing we add a different string to each password: salt1 for user 1 and salt2 for user 2
**Now the two hashes are completely different even though the password is the same
**Every user should have a different hash
*Salting also defeats precomputation
**An attacker could use a precomputed dictionary of hash values allowing them to quickly look up the input that was used to create the hash
**By hashing a salted password, we are able to render the lookup tables useless
 
====account.js====
 
*<code>updateFileList()</code> is used to update the files displayed on the page
**It first sends an ajax GET request for /getFileStats to the server to get the updated file list
***When the server receives this request, it first checks to see if the user is logged in
***If the user is logged in, it queries the database to get their uploaded files
****When doing a query, we can optionally specify a projection object
****The projection is used to specify which information (document attributes) you want to retrieve in the query
****We specify the projection object as <code>{content:0}</code>, meaning that we do not want to include the content attribute in the documents that we retrieve
**In the <code>doUpdateFileList()</code> callback function, we update the DOM with the data received in the response
*<code>updateFileList()</code> is called when the page is initially loaded and every time a file is uploaded
**This means that the content on the page will always be up to date
*When a user requests a file download, how do we know which file to give them?
**We use the <code>downloadFile()</code> function to create and return a function which will be used as a callback function for the click event of the link to download a specific file
**By passing the ID of the file into <code>downloadFile()</code> we are able to create a callback function specific to each file so that we know which one to download when it gets called for the click event
*Notice that the href attribute for the download links is set to #
**This causes the link to take the user to the top of the page
**This is useful when we want a link that will not cause the user to go to a different page
*<code>saveDownloadedFile()</code> is the callback function for the POST request that a download link's click handler sends to the server when a file is requested
**This function handles saving the file that gets sent to the client from the server
**It relies on the <code>saveAs()</code> function, which lets us save the file to a folder/location of our choice without changing the page.
*The reason why you want to manipulate things on the browser instead of on the server and the differences involved in making this change are important parts of exam-storage that you should understand
 
===Types of Questions for the Exam===
 
*Why are we using https?
**To protect data going between the browser and the server, just to make sure everything is encrypted there
*Why are we using bcrypt?
**To make sure that the password is not stored in plain text on the server
*What if in routes.js I just compare the password and the user.password instead of doing bcrypt?
 
===Next Week===
 
*High level topics which aren’t included in the final
*How do you make web applications pretty without being a web designer?


*If VM runs out of space, things may break
**Be careful, clean up space
**See the forums for a way to minimize the space MongoDB takes up


==Code==


In this lecture we discussed [http://homeostasis.scs.carleton.ca/~soma/webfund-2016w/code/exam-storage.zip exam-storage], the code for [[WebFund 2016W: Tutorial 10|Tutorial 10]]

Latest revision as of 14:04, 31 March 2016

Video

The video for the lecture given on March 24, 2016 is now available.

Student Notes

Assignment 5 – Last Minute Question

  • How should the download functionality display the query result?
    • Display it as a plain-text page, like a text file. Call res.send() with one big string in which each log entry is terminated by a newline. The response should be sent as plain text.
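
A minimal sketch of building that response body, assuming the query result is an array of log-line strings. The helper name formatLogsAsText and the sample entries are hypothetical, and the Express calls appear only in a comment because no server is started here.

```javascript
// Join log entries into one string, terminating every entry with a newline.
function formatLogsAsText(entries) {
  return entries.map((entry) => entry + '\n').join('');
}

// Hypothetical query result: each element stands for one stored log line.
const sampleEntries = [
  'Mar 20 10:01:02 host sshd[123]: Accepted password for student',
  'Mar 20 10:05:40 host sudo: student : TTY=pts/0 ; COMMAND=/bin/ls',
];

const body = formatLogsAsText(sampleEntries);
console.log(body);

// In an Express route handler (not run here), the string could be sent with:
//   res.type('text/plain');
//   res.send(body);
```

Setting the content type to text/plain keeps the browser from interpreting the log text as HTML.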

Assignment 6 Hints

  • This will involve making an application that works in a similar fashion to exam-storage (a single-page app)
  • You will be doing the same things that are already implemented in assignment 5 but you need to change it to work from a single page

Exam Storage

  • A little bit more complicated than interactive-demo.
  • The file structure of the application is modified to clarify the division between what runs on the client and what runs on the server
    • public: contains everything that goes to the client directly (CSS and JavaScript files)
    • server: contains all the JavaScript files that will run on the server
    • start.js is a little JavaScript file that you will use to start the server (this is replacing bin/www)
    • views contains all the templates for generating html
    • keys: cryptographic keys
  • Note that you can use npm rebuild to recompile any binary dependencies in the application modules (in case something is not working on your machine)

start.js

  • The app.js file is now located in the server directory
    • The call to require() in start.js is therefore changed to: var app = require('./server/app');
    • We then set the port, create the server, and start it listening on port 3000.
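
The steps above might look roughly like the following sketch. It is not the actual start.js from exam-storage: the app function is a hypothetical stand-in for require('./server/app'), reading the port from a PORT environment variable is an assumption, and plain http is used even though the real application also configures SSL keys.

```javascript
// Sketch of a start.js-style launcher using only Node's built-in http module.
const http = require('http');

// Hypothetical stand-in for require('./server/app'). An Express app is itself
// a (req, res) handler function, so a plain function can play the same role.
function app(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello from the stand-in app\n');
}

// Assumption: read the port from the PORT environment variable, else use 3000.
function choosePort(env) {
  const parsed = parseInt(env.PORT, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 3000;
}

// Create the server around the app; it does not accept connections yet.
const port = choosePort(process.env);
const server = http.createServer(app);

// Calling start() begins listening; it is left uncalled so the sketch exits.
function start() {
  server.listen(port, () => console.log('Listening on port ' + port));
}
```

Calling start() at the end of the file would make the server begin accepting connections on the chosen port.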

app.js

  • Things are fairly similar to our previous applications
    • We are specifying the ssl keys and setting up the same modules we've previously used including one for sessions
    • app.use(express.static('public')); serves everything in the public directory

routes.js

  • Starts with setting up a connection to the database.
    • The users collection contains documents which have a username and a hashed password
  • Many routes defined in here are similar to what we have done before
  • The storeSaltedpassword() function in the /register route is something new
    • This stores the username and hashed and salted password in the users collection
    • The hashing and salting is done using bcrypt
  • In the /login route, we are again using bcrypt to compare a password that a user has entered with a hashed version stored in the database
    • Why do we have the same error message for both incorrect password and invalid username?
      • If someone is trying to break into your system, you don’t want to reveal that they have found a valid account and now only need to figure out its password. They should learn only that the username and password combination is wrong. Therefore, we use the same error string for an invalid username and for an incorrect password.
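
The uniform error message can be illustrated with a small sketch. This is not the code from routes.js: it uses an in-memory Map instead of the users collection, Node's built-in scrypt as a stand-in for bcrypt, and the names checkLogin, hashPassword and LOGIN_ERROR are hypothetical.

```javascript
const crypto = require('crypto');

// One message for both failure cases, so a caller cannot tell them apart.
const LOGIN_ERROR = 'Invalid username or password';

// Stand-in for bcrypt (assumption): derive a salted hash with Node's scrypt.
function hashPassword(password, salt) {
  return crypto.scryptSync(password, salt, 32).toString('hex');
}

// Hypothetical in-memory "users collection" with one registered account.
const users = new Map();
const aliceSalt = crypto.randomBytes(16).toString('hex');
users.set('alice', {
  salt: aliceSalt,
  hash: hashPassword('correct horse', aliceSalt),
});

function checkLogin(username, password) {
  const user = users.get(username);
  if (!user) {
    return { ok: false, error: LOGIN_ERROR }; // unknown username
  }
  const candidate = Buffer.from(hashPassword(password, user.salt), 'hex');
  const stored = Buffer.from(user.hash, 'hex');
  if (!crypto.timingSafeEqual(candidate, stored)) {
    return { ok: false, error: LOGIN_ERROR }; // wrong password, same message
  }
  return { ok: true };
}

console.log(checkLogin('alice', 'wrong guess'));
console.log(checkLogin('nobody', 'wrong guess'));
```

Both failed attempts produce the same object, so the response text alone does not reveal whether the account exists.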

Password Hashing

  • The password you type when you log in to the virtual machine is checked against information stored on the virtual machine itself
    • In /etc there is a file called passwd that lists all of the accounts on the system
    • In old-school UNIX installations, this file contains the hash of each password; on the virtual machine it shows an 'x' instead, because the hashes are kept in a separate file (/etc/shadow)
    • To show the hashes in passwd, you can run sudo shadowconfig off; the file should then contain the hash of your password instead of the 'x'
  • For any file you create, you can compute a sha1sum (hash) of that file, which is a long hexadecimal string
    • If you change one bit in the file (the input), the output changes drastically
    • SHA-1 (sha1sum) should no longer be used, since it is no longer considered secure; you can use SHA-256 (sha256sum) instead
  • Bcrypt is what we are using to hash passwords before storing them in the database
    • Bcrypt is a hash function, but an unusual one: it is slow by design and is memory intensive.
    • Storing your passwords using bcrypt is a good choice today, but in general it is a good idea to use some kind of framework, such as Passport.
      • These types of frameworks take care of making decisions that you might not know how to make and they help avoid using outdated and insecure technology
      • Passport is a middleware authentication module for Node.js
  • Why do we want to use hashing for passwords in the first place? Why do we not want to store them in the plain text?
    • We want to avoid a breach in security caused by an attacker obtaining the stored passwords
    • The hashing approach should be irreversible, which is essential for password storage, and it should have adequate defenses against precomputation and a performance cost high enough to hinder dictionary attacks
    • Just by looking at a hashed string, no one should be able to guess the original value and get your password
      • Instead, an attacker would have to attack the hash function itself or plug in candidate values and try every single possibility, which should hopefully take them a very long time
  • In our web application, it is important to do the hashing on the server as opposed to on the client's machine even though doing this on the client would reduce the computational burden for the server
    • This is because we cannot trust anything that is sent to us from the client
    • Rather than running the hash function and sending us the resulting hash, a malicious client could simply iterate over candidate hash values and send those to us
      • This would allow the client to avoid the need to spend time actually running the hash function
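
The claim that changing one bit of the input makes a huge change in the output can be checked with a short sketch using Node's built-in crypto module and SHA-256. The input text and the helper names sha256Hex and countDifferingHexDigits are hypothetical, chosen only for illustration.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a Buffer or string.
function sha256Hex(input) {
  return crypto.createHash('sha256').update(input).digest('hex');
}

// Count the positions at which two equal-length hex strings differ.
function countDifferingHexDigits(a, b) {
  let count = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) count++;
  }
  return count;
}

const original = Buffer.from('hello world');
const flipped = Buffer.from(original); // copy, so the original is untouched
flipped[0] ^= 0x01; // flip the lowest bit of the first byte

const hashA = sha256Hex(original);
const hashB = sha256Hex(flipped);

console.log(hashA);
console.log(hashB);
console.log('differing hex digits:', countDifferingHexDigits(hashA, hashB), 'of', hashA.length);
```

Running it prints two 64-digit hashes that have little visibly in common, even though the inputs differ by a single bit.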

Salting

  • If user 1 and user 2 have the same password, they will also have the same hash
    • We don’t want that, as it makes things easier for an attacker
    • Instead, before hashing we add a different string to each password: salt1 for user 1 and salt2 for user 2
    • Now the two hashes are completely different even though the password is the same
    • Every user should have a different hash
  • Salting also defeats precomputation
    • An attacker could use a precomputed dictionary of hash values allowing them to quickly look up the input that was used to create the hash
    • By hashing a salted password, we are able to render the lookup tables useless
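
A small sketch of the effect of salting, under stated assumptions: Node's built-in scrypt stands in for bcrypt, the salts are generated randomly (the notes only say that each user gets a different string), and the password and helper names are hypothetical.

```javascript
const crypto = require('crypto');

// Stand-in for bcrypt (assumption): scrypt is another deliberately slow hash.
function hashPassword(password, salt) {
  return crypto.scryptSync(password, salt, 32).toString('hex');
}

const password = 'hunter2'; // both hypothetical users pick the same password

// With one shared salt, identical passwords give identical hashes.
const shared1 = hashPassword(password, 'shared');
const shared2 = hashPassword(password, 'shared');

// With a different salt per user, the hashes no longer match.
const salt1 = crypto.randomBytes(16).toString('hex');
const salt2 = crypto.randomBytes(16).toString('hex');
const hash1 = hashPassword(password, salt1);
const hash2 = hashPassword(password, salt2);

console.log('shared salt, equal hashes:', shared1 === shared2);
console.log('per-user salts, equal hashes:', hash1 === hash2);

// Verifying a login re-hashes the attempt with that user's stored salt.
function verify(attempt, storedSalt, storedHash) {
  return hashPassword(attempt, storedSalt) === storedHash;
}
```

The salt is stored next to the hash, so verification still works, while a table precomputed without that salt no longer matches anything in the database.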

account.js

  • updateFileList() is used to update the files displayed on the page
    • It first sends an ajax GET request for /getFileStats to the server to get the updated file list
      • When the server receives this request, it first checks to see if the user is logged in
      • If the user is logged in, it queries the database to get their uploaded files
        • When doing a query, we can optionally specify a projection object
        • The projection is used to specify which information (document attributes) you want to retrieve in the query
        • We specify the projection object as {content:0}, meaning that we do not want to include the content attribute in the documents that we retrieve
    • In the doUpdateFileList() callback function, we update the DOM with the data received in the response
  • updateFileList() is called when the page is initially loaded and every time a file is uploaded
    • This means that the content on the page will always be up to date
  • When a user requests a file download, how do we know which file to give them?
    • We use the downloadFile() function to create and return a function which will be used as a callback function for the click event of the link to download a specific file
    • By passing the ID of the file into downloadFile() we are able to create a callback function specific to each file so that we know which one to download when it gets called for the click event
  • Notice that the href attribute for the download links is set to #
    • This causes the link to take the user to the top of the page
    • This is useful when we want a link that will not cause the user to go to a different page
  • saveDownloadedFile() is the callback function for the POST request that a download link's click handler sends to the server when a file is requested
    • This function handles saving the file that gets sent to the client from the server
    • It relies on the saveAs() function, which lets us save the file to a folder/location of our choice without changing the page.
  • The reason why you want to manipulate things on the browser instead of on the server and the differences involved in making this change are important parts of exam-storage that you should understand
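
The way downloadFile() produces one callback per file can be sketched as follows. This is not the code from account.js: requestDownload is a hypothetical stand-in for the POST request, the file list is made up, and the handlers are called directly instead of being attached to click events, so the sketch runs without a browser.

```javascript
// Records which file IDs were requested, in order.
const requested = [];

// Hypothetical stand-in for the POST request sent to the server.
function requestDownload(fileId) {
  requested.push(fileId);
}

// Returns a new function that remembers fileId through its closure.
function downloadFile(fileId) {
  return function onClick() {
    requestDownload(fileId);
  };
}

// Hypothetical file list, shaped like the data used to build the links.
const files = [
  { _id: 'a1', name: 'notes.txt' },
  { _id: 'b2', name: 'exam.txt' },
];

// One handler per file; in the browser each would be bound to a link's click.
const handlers = files.map((file) => downloadFile(file._id));

handlers[1](); // simulate clicking the second link
handlers[0](); // then the first
console.log(requested); // [ 'b2', 'a1' ]
```

Each returned function carries its own fileId, so no lookup is needed at click time to decide which file to request.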

Types of Questions for the Exam

  • Why are we using https?
    • To protect data going between the browser and the server, just to make sure everything is encrypted there
  • Why are we using bcrypt?
    • To make sure that the password is not stored in plain text on the server
  • What if in routes.js I just compare the password and the user.password instead of doing bcrypt?

Next Week

  • High level topics which aren’t included in the final
  • How do you make web applications pretty without being a web designer?


Code

In this lecture we discussed exam-storage, the code for Tutorial 10