Operating Systems 2017F Lecture 7: Difference between revisions

From Soma-notes
CindyYu (talk | contribs)
Significant formatting improvement, some content omissions, spelling errors
 
Video from the lecture given on September 28, 2017 [http://homeostasis.scs.carleton.ca/~soma/os-2017f/lectures/comp3000-2017f-lec07-28Sep2017.mp4 is now available].
==Code==
Code and files from the lecture (captured as they were at the end) are available [http://homeostasis.scs.carleton.ca/~soma/os-2017f/code/lec07/ here].
Latest revision as of 05:16, 4 October 2017


Notes

COMP 3000, Sep 28. Next week: Anil is away from Sunday to Thursday morning. Review section.

Topic today: File systems

What is a file?

It is a key-value pair.

  • Key: hierarchical filename (pathname)
  • Value: arbitrary number of bytes

In principle, you can use files to store very small values, but it is wasteful:

  • most filesystems allocate space in units of at least 1K, often 4K or more
  • a file containing one byte of data wastes a lot of real disk space
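
The gap between a file's logical size and the space allocated for it can be observed directly. The following is a minimal sketch, assuming a Linux system where `st_blocks` is counted in 512-byte units; the file name `tiny.txt` is made up for illustration, and the allocated figure depends on the filesystem (some store tiny files inline and report 0).

```python
import os
import tempfile

def size_and_allocation(path):
    """Return (logical size in bytes, bytes allocated on disk)."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "tiny.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"x")                   # one byte of data
        f.flush()
        os.fsync(f.fileno())            # make sure it reaches the filesystem
    logical, allocated = size_and_allocation(path)

print(f"logical size: {logical} byte(s), allocated: {allocated} byte(s)")
```

On many disk-backed filesystems the allocated figure is a full block (for example 4096 bytes) even though the file holds a single byte.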

A set of files 'stored' together, sharing a common hierarchical root, is a filesystem. Why quote 'stored'?

  • Because there is no storage for virtual filesystems (such as /proc and /sys)
  • When you first start a UNIX system, you start with the ‘root’ filesystem (this has nothing to do with the root user).
  • Other filesystems can be put on top by "mount"-ing them.

Demonstration

Run mount. It shows

 /dev/vda1 on / type ext4 (rw,relatime,data=ordered)
  • /dev/vda1 is the device that holds the root filesystem
  • / is the top of the hierarchy
  • ‘rw’ means read-write
  • /dev/vda1 is a special file; files in /dev are special files.

(Special files are actual files. Historically they might have been stored on disk; nowadays they are not.)
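
To make the fields of a mount line explicit, here is a small illustrative parser. It is a sketch that assumes the `device on mountpoint type fstype (options)` layout shown in the demonstration and does not handle mount points containing spaces; `parse_mount_line` is a name invented for this example.

```python
import re

# Layout assumed from the demonstration output:
#   <device> on <mountpoint> type <fstype> (<comma-separated options>)
MOUNT_LINE = re.compile(
    r"^(?P<device>\S+) on (?P<mountpoint>\S+) type (?P<fstype>\S+) "
    r"\((?P<options>[^)]*)\)$"
)

def parse_mount_line(line):
    """Split one line of `mount` output into its named parts."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised mount line: {line!r}")
    info = match.groupdict()
    info["options"] = [opt.strip() for opt in info["options"].split(",")]
    return info

entry = parse_mount_line("/dev/vda1 on / type ext4 (rw,relatime,data=ordered)")
print(entry)
```

Here `device` is the special file, `mountpoint` is where the filesystem appears in the hierarchy, and `options` includes `rw` for read-write.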

Run df . (df displays the amount of available disk space on the filesystems that the invoking user has read access to.)

  • There is a filesystem called ‘udev’, which is a virtual filesystem with zero space used, since it is created automatically at run time

Run who in another terminal

  • it displays soma tty7 2017-09-28 12:59 (:0)

Run ls /dev

  • Instead of vda, it shows sda. The ‘v’ in vda stands for ‘virtual’. The ‘s’ in sda stands for SCSI (Small Computer System Interface).
  • Any device you want to access on a Unix system has an entry in /dev

Run ls -la in /dev (the -l flag is what shows the permissions)

  • Looking at the permissions (e.g. ‘crw-rw----’), the first character for a normal file is ‘-’, for a directory it is ‘d’, ‘c’ is for character devices, and ‘b’ is for block devices
  • Character device examples: keyboard, mouse, /dev/random
    - read/write a single byte (character) at a time
  • What makes block devices different? Caching.
    - Hard drives and optical drives are examples of block devices
  • Character and block devices are accessed via special files, denoted by 'c' and 'b' in the file metadata
  • Block devices are for block storage, where results can be cached
    - It is faster to read or write large chunks of data (e.g. 4KiB) than one byte at a time
    - Traditional filesystems are stored on block devices
  • However, a filesystem can be anything that provides a filesystem interface.
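
The type character that `ls -l` prints comes from the file's mode bits. The sketch below reads it with Python's `stat.filemode`, assuming a Unix-like system where `/dev/null` exists as a character device; `type_letter` and `plain.txt` are names made up for the example.

```python
import os
import stat
import tempfile

def type_letter(path):
    """Return the first character `ls -l` would print for path."""
    mode = os.lstat(path).st_mode      # lstat: do not follow symbolic links
    return stat.filemode(mode)[0]

with tempfile.TemporaryDirectory() as d:
    regular = os.path.join(d, "plain.txt")  # hypothetical file name
    open(regular, "w").close()
    letters = {
        "regular file": type_letter(regular),
        "directory": type_letter(d),
        "/dev/null": type_letter("/dev/null"),
    }

print(letters)
```

A block device such as a disk would report `b` in the same position.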

file interface basics:

  • open, read, write, seek, close
  • open directory, read/write directory, close directory
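
These calls map closely onto Python's `os` module, which wraps the underlying system calls. A minimal sketch, using a temporary directory and a made-up file name `demo.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.txt")                  # hypothetical file name

    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)   # open
    os.write(fd, b"hello, filesystem")                  # write
    os.lseek(fd, 7, os.SEEK_SET)                        # seek to byte offset 7
    tail = os.read(fd, 100)                             # read from that offset
    os.close(fd)                                        # close

    # Directories follow the same pattern: open, read entries, close.
    # The with block closes the directory when it ends.
    with os.scandir(d) as entries:
        names = sorted(entry.name for entry in entries)

print(tail, names)
```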

Block size issues:

  • larger blocks, less overhead
    - there is a fixed cost for each block access
    - large files span many blocks
  • smaller blocks, more efficient space usage
    - fewer files end up smaller than one block, so less space is wasted
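
The trade-off can be put into numbers. The sketch below assumes the simplest possible model, in which a non-empty file occupies a whole number of blocks and nothing else; real filesystems also spend space on metadata. The function name `blocks_and_waste` is invented for this example.

```python
def blocks_and_waste(file_size, block_size):
    """Return (blocks needed, bytes wasted in the last block).

    Assumes file_size >= 1 and that a file occupies whole blocks only.
    """
    blocks = -(-file_size // block_size)      # ceiling division
    waste = blocks * block_size - file_size
    return blocks, waste

for block_size in (1024, 4096, 65536):
    for file_size in (1, 5000, 1_000_000):
        blocks, waste = blocks_and_waste(file_size, block_size)
        print(f"block={block_size:>6} file={file_size:>8} "
              f"blocks={blocks:>4} wasted={waste:>6}")
```

Larger blocks mean fewer block accesses for the large file but more wasted bytes for the small ones.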

Filesystem interface basics:

  • mount and unmount

The filesystem block size is a multiple of the disk block (sometimes called sector or cluster) size

 e.g. an SSD with 512-byte sectors holding an EXT3 filesystem with 8,192-byte (8K) blocks
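
Using the numbers from the example, the relationship is simple arithmetic. The sketch also queries the block size of the filesystem holding the current directory via `os.statvfs`, which is available on Unix-like systems; the value it reports depends on the machine.

```python
import os

SECTOR_SIZE = 512        # disk sector size from the example
FS_BLOCK_SIZE = 8192     # filesystem block size from the example

sectors_per_block, remainder = divmod(FS_BLOCK_SIZE, SECTOR_SIZE)
print(f"{sectors_per_block} sectors per filesystem block, remainder {remainder}")

# Block size reported for the filesystem that holds the current directory.
vfs = os.statvfs(".")
print("filesystem block size here:", vfs.f_bsize)
```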


Inodes

How do you structure a filesystem? First, you have to understand inodes.

Demonstration

1. Run cp lec07.txt test.txt

- There is no difference between the contents of those files

2. Run ln test.txt duplicate.txt

- There is a difference in the listing: the number shown for test.txt changed from ‘1’ to ‘2’. That number is the inode's link count
- ln stands for link

3. Run ln duplicate.txt alsodup.txt

- The link count changed from ‘2’ to ‘3’

4. Run ln -s duplicate.txt another.txt

- There is no difference between the contents of another.txt and lec07.txt

5. Run ls -la, which does show a difference

- It shows another.txt -> duplicate.txt (a symbolic link pointing from one to the other)

6. Run rm duplicate.txt

- The arrow is still pointing, but the link count goes down (from ‘3’ to ‘2’)

7. Run cat another.txt

- It reports that there is no such file, because the file the symbolic link points to no longer exists

  • ‘ln’ creates hard links to an item by default.
  • A file name is a pointer to an inode.
  • ‘rm’ does not remove, it just unlinks, which reduces the link count for the inode
  • Storage is cleared when an inode's link count goes to 0: once no file name links to it, the system removes it.
  • When a user opens a file, a reference count is incremented.
  • A filesystem doesn’t have to be on a block device. It can be stored anywhere as long as you provide the API.
  • A filesystem can be made in a file (the basis of virtual machines)
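
The demonstration can be replayed programmatically. The sketch below repeats the same steps in a temporary directory using `os.link`, `os.symlink` and `os.unlink`, and records the link count (`st_nlink`) after each one. It assumes a filesystem that supports hard and symbolic links; the `log` dictionary and the helper names are made up for this example.

```python
import os
import shutil
import tempfile

def nlink(path):
    """Number of hard links (file names) pointing at path's inode."""
    return os.stat(path).st_nlink

log = {}
with tempfile.TemporaryDirectory() as d:
    def p(name):
        return os.path.join(d, name)

    with open(p("lec07.txt"), "w") as f:
        f.write("lecture notes\n")

    shutil.copy(p("lec07.txt"), p("test.txt"))           # cp: new inode
    log["copy_shares_inode"] = (
        os.stat(p("lec07.txt")).st_ino == os.stat(p("test.txt")).st_ino
    )

    os.link(p("test.txt"), p("duplicate.txt"))           # ln
    log["after_first_ln"] = nlink(p("test.txt"))
    os.link(p("duplicate.txt"), p("alsodup.txt"))        # ln
    log["after_second_ln"] = nlink(p("test.txt"))

    os.symlink("duplicate.txt", p("another.txt"))        # ln -s
    log["symlink_target"] = os.readlink(p("another.txt"))
    log["after_symlink"] = nlink(p("test.txt"))          # unchanged by ln -s

    os.unlink(p("duplicate.txt"))                        # rm is unlink
    log["after_rm"] = nlink(p("test.txt"))
    # The symbolic link still exists but its target name is gone.
    log["symlink_dangling"] = (
        os.path.islink(p("another.txt")) and not os.path.exists(p("another.txt"))
    )
    with open(p("alsodup.txt")) as f:                    # data still reachable
        log["data_via_other_name"] = f.read()

print(log)
```

The data survives the `rm` because two other names still point at the same inode; only the symbolic link, which stores a name rather than an inode, is left dangling.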

One other file API: mmap

Inodes store permissions

mmap, munmap: map or unmap files or devices into memory. This is the piece that unifies the filesystem with process memory.

What ‘read’ does: it copies file data into a buffer in the process's memory (RAM)
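
The difference between the two can be seen with Python's `mmap` module. In this sketch, which uses a made-up file name `mapped.txt`, the file is mapped into memory, read and modified through ordinary slicing, and then read back with a normal `read` call.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "mapped.txt")        # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"hello mmap")

    with open(path, "r+b") as f:
        # Length 0 maps the whole file into the process's address space.
        with mmap.mmap(f.fileno(), 0) as m:
            before = m[:5]        # reading memory reads the file
            m[0:5] = b"HELLO"     # writing memory writes the file
            m.flush()             # push changes back to the file

    with open(path, "rb") as f:
        after = f.read()          # read copies the data into a buffer

print(before, after)
```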

Run top:

  • There are ‘VIRT’, ‘RES’ and ‘SHR’ columns:
    - VIRT stands for the virtual size of a process
    - RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming.
    - SHR indicates how much of the VIRT size is actually sharable (memory or libraries).
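
On Linux, similar figures can be read from /proc. The sketch below assumes a Linux system where `/proc/<pid>/statm` lists the total, resident and shared sizes in pages as its first three fields; the mapping onto top's column names is approximate, and `memory_summary` is a name invented for this example.

```python
import os

def memory_summary(pid="self"):
    """Approximate top's VIRT, RES and SHR (in bytes) for a process.

    Assumption: Linux /proc/<pid>/statm, whose first three fields are
    total program size, resident set size and shared pages, in pages.
    """
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as f:
        size, resident, shared = (int(x) for x in f.read().split()[:3])
    return {
        "VIRT": size * page_size,
        "RES": resident * page_size,
        "SHR": shared * page_size,
    }

summary = memory_summary()
print(summary)
```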