Operating Systems 2017F Lecture 7: Difference between revisions
Latest revision as of 05:16, 4 October 2017
Video
Video from the lecture given on September 28, 2017 is now available at http://homeostasis.scs.carleton.ca/~soma/os-2017f/lectures/comp3000-2017f-lec07-28Sep2017.mp4.
Code
Code and files from the lecture (captured as they were at the end) are available at http://homeostasis.scs.carleton.ca/~soma/os-2017f/code/lec07/.
Notes
3000 Sep28 Next week: Anil is away from Sunday to Thursday morning. Review section.
Topic today: File systems
What is a file?
A file is a key-value pair.
- Key: hierarchical filename (pathname)
- Value: arbitrary number of bytes
In principle, you can use files to store very small values, but doing so is wasteful:
- most filesystems allocate space in units of at least 1K, often 4K or more
- a file containing one byte of data therefore wastes a lot of real disk space
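The gap between a file's logical size and the space it occupies can be observed directly. The following is a minimal sketch, assuming a Linux-like system where st_blocks is reported in 512-byte units; the file name tiny.txt and the helper size_and_allocation are made up for illustration, and the allocated figure depends on the filesystem (some store tiny files inline and report less).

```python
import os
import tempfile


def size_and_allocation(path):
    """Return (logical size in bytes, bytes allocated on disk) for path."""
    info = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as it does on Linux,
    # regardless of the filesystem's own block size.
    return info.st_size, info.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"x")         # a single byte of data
        f.flush()
        os.fsync(f.fileno())  # push the data out so the allocation is visible
    logical, allocated = size_and_allocation(path)
    print(f"logical size: {logical} byte(s), allocated on disk: {allocated} bytes")
```

On a typical disk-backed filesystem the allocated figure is a whole block (often 4096 bytes) even though the logical size is 1.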
A set of files 'stored' together, sharing a common hierarchical root, is a filesystem. Why quote 'stored'?
- Because there is no storage for virtual filesystems (such as /proc and /sys)
- when you first start a UNIX system, you start with the ‘root’ filesystem (this has nothing to do with the root user).
- other filesystems can be put on top of it by "mount"-ing them.
Demonstration
Run mount; its output includes
/dev/vda1 on / type ext4 (rw,relatime,data=ordered)
- /dev/vda1 is the device that holds the root filesystem
- / is the top of the hierarchy
- ‘rw’ means read-write
- /dev/vda1 is a special file; the files in /dev are special files.
(Special files are actual files; historically they might have been stored on disk, but nowadays they are not.)
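To make the fields of a mount line explicit, here is a small sketch that splits one line of mount output into its parts. The function name parse_mount_line and the regular expression are illustrative assumptions, not part of any standard tool, and the pattern only handles the simple "SOURCE on TARGET type FSTYPE (OPTIONS)" layout shown in the lecture.

```python
import re

# Assumed layout: "<source> on <target> type <fstype> (<opt>,<opt>,...)"
MOUNT_LINE = re.compile(
    r"^(?P<source>\S+) on (?P<target>\S+) type (?P<fstype>\S+) \((?P<options>[^)]*)\)$"
)


def parse_mount_line(line):
    """Split one line of `mount` output into a dictionary of fields."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised mount line: {line!r}")
    fields = match.groupdict()
    # Options are comma-separated; tolerate stray spaces around them.
    fields["options"] = [opt.strip() for opt in fields["options"].split(",")]
    return fields


entry = parse_mount_line("/dev/vda1 on / type ext4 (rw,relatime,data=ordered)")
print(entry["source"], "is mounted at", entry["target"], "as", entry["fstype"])
```

Here source is the special file for the device, target is the mount point in the hierarchy, and options carries flags such as rw.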
Run df . (df displays the amount of available disk space on the filesystems to which the invoking user has appropriate read access.)
- There is a filesystem called ‘udev’, a virtual filesystem with zero space used, since it is created automatically at run time
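The numbers df prints come from per-filesystem counters that a program can query itself. As a rough sketch (the helper name disk_usage is made up here, and this is not how df itself is implemented), os.statvfs exposes block counts that can be turned into byte totals:

```python
import os


def disk_usage(path="."):
    """Return total, used and available bytes for the filesystem holding path."""
    vfs = os.statvfs(path)
    total = vfs.f_blocks * vfs.f_frsize      # size of the whole filesystem
    free = vfs.f_bfree * vfs.f_frsize        # free blocks, including reserved ones
    available = vfs.f_bavail * vfs.f_frsize  # free blocks usable by ordinary users
    return {"total": total, "used": total - free, "available": available}


usage = disk_usage(".")
for name, value in usage.items():
    print(f"{name:>9}: {value / 1024:.0f} KiB")
```

Pointing the same function at a virtual filesystem such as /proc would typically report a total of zero, which matches the "zero space used" observation above.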
Run who in another terminal
- it displays soma tty7 2017-09-28 12:59 (:0)
Run ls /dev
- On this machine it shows sda instead of vda. The ‘v’ in vda stands for ‘virtual’. The ‘s’ in sda stands for SCSI (the Small Computer System Interface).
- Any device you want to access on a UNIX system has an entry in /dev
Run ls -la in /dev
- Looking at the permissions (e.g. ‘crw-rw----’), the first character is ‘-’ for a regular file, ‘d’ for a directory, ‘c’ for a character device, and ‘b’ for a block device
- Character device examples: keyboard, mouse, /dev/random
- -read/write a single byte (character) at a time
- What makes block devices different? Caching.
- -Hard drives and optical drives are examples of block devices
- Character and block devices are accessed via special files, denoted by 'c' and 'b' in the file metadata
- Block devices are for block storage, which lets results be cached
- -It is faster to read or write large chunks of data (e.g. 4KiB) than it would be one byte at a time
- -Traditional filesystems are stored on block devices
- However, a filesystem can be anything that provides a filesystem interface.
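The leading letter that ls prints is derived from the file's mode bits. A minimal sketch of the same classification, assuming a UNIX-like system where /dev/null exists (the helper name describe is made up for illustration):

```python
import os
import stat


def describe(path):
    """Return (ls-style mode string, human-readable file kind) for path."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symbolic links
    if stat.S_ISCHR(mode):
        kind = "character device"
    elif stat.S_ISBLK(mode):
        kind = "block device"
    elif stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISLNK(mode):
        kind = "symbolic link"
    elif stat.S_ISREG(mode):
        kind = "regular file"
    else:
        kind = "other"
    return stat.filemode(mode), kind


for path in ("/dev/null", "/"):
    print(path, *describe(path))
```

Running it on a disk such as /dev/sda (where present) should report a block device, while /dev/null is a character device.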
file interface basics:
- open, read, write, seek, close
- open directory, read/write directory, close directory
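These calls map almost one-to-one onto functions in Python's os module, which are thin wrappers around the system calls. The sketch below walks through open, write, seek, read and close on a scratch file, then reads the containing directory; the file name demo.txt is made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)  # open (creating the file)
    os.write(fd, b"hello, filesystem")                  # write
    os.lseek(fd, 7, os.SEEK_SET)                        # seek to byte offset 7
    tail = os.read(fd, 10)                              # read up to 10 bytes
    os.close(fd)                                        # close

    # Directories have their own open/read/close cycle; scandir wraps it.
    with os.scandir(tmp) as entries:
        names = [entry.name for entry in entries]

print(tail, names)
```

Note that there is no direct "write directory" call for user programs; directory contents change as a side effect of creating, renaming, or unlinking files.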
Block size issues:
- larger blocks, less overhead
- -there is a fixed cost for each block access
- -large files span many blocks
- smaller blocks, more efficient space usage
- -fewer files are smaller than one block, so less space is wasted in partially filled blocks
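The trade-off can be made concrete with a little arithmetic. This sketch assumes the simplest possible model, in which a file occupies a whole number of blocks and nothing else (no inline data, no metadata overhead); the function names are illustrative.

```python
def blocks_needed(file_size, block_size):
    """Number of whole blocks needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // block_size)


def wasted_bytes(file_size, block_size):
    """Bytes allocated but unused in the file's last block."""
    return blocks_needed(file_size, block_size) * block_size - file_size


file_size = 10_000  # an arbitrary example size in bytes
for block_size in (1024, 4096, 8192):
    print(
        f"block size {block_size:>5}: "
        f"{blocks_needed(file_size, block_size):>2} blocks, "
        f"{wasted_bytes(file_size, block_size):>5} bytes wasted"
    )
```

For this 10,000-byte file, 1K blocks mean ten block accesses and 240 wasted bytes, while 8K blocks mean only two accesses but 6,384 wasted bytes.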
filesystem interface basics:
- mount and unmount
filesystem block size is a multiple of disk block (sometimes called sector or cluster) size
e.g. an SSD with 512-byte sectors holding an ext3 filesystem with 8,192-byte (8K) blocks, so each filesystem block spans 16 sectors
Inodes
How do you structure a filesystem? First, you have to understand inodes.
Demonstration
1. Run cp lec07.txt test.txt
- - The two files have identical contents
2. Run ln test.txt duplicate.txt
- - duplicate.txt and test.txt are now two names for the same file. In the ls -l output, a number for test.txt changed from ‘1’ to ‘2’. That number is the inode's link count
- - ln stands for link
3. Run ln duplicate.txt alsodup.txt
- - The link count changed from ‘2’ to ‘3’
4. Run ln -s duplicate.txt another.txt
- - There is no difference between the contents of another.txt and lec07.txt
5. Run ls -la; this time there is a difference
- - It shows another.txt -> duplicate.txt (a symbolic link pointing from one to the other)
6. Run rm duplicate.txt
- - The arrow is still pointing to duplicate.txt, but the link count goes down (from ‘3’ to ‘2’)
7. Run cat another.txt
- - It reports that there is no such file, because the file the symbolic link points to no longer exists
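The same sequence can be replayed programmatically. The sketch below mirrors the steps above in a temporary directory, using os.link for ln, os.symlink for ln -s and os.unlink for rm; it starts from a freshly written test.txt rather than copying lec07.txt, and the helper p is just a shorthand introduced here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    def p(name):
        """Build a path inside the scratch directory."""
        return os.path.join(tmp, name)

    with open(p("test.txt"), "w") as f:
        f.write("lecture notes\n")

    counts = [os.stat(p("test.txt")).st_nlink]       # one name so far

    os.link(p("test.txt"), p("duplicate.txt"))       # ln test.txt duplicate.txt
    counts.append(os.stat(p("test.txt")).st_nlink)

    os.link(p("duplicate.txt"), p("alsodup.txt"))    # ln duplicate.txt alsodup.txt
    counts.append(os.stat(p("test.txt")).st_nlink)

    os.symlink("duplicate.txt", p("another.txt"))    # ln -s duplicate.txt another.txt
    same_inode = os.stat(p("test.txt")).st_ino == os.stat(p("duplicate.txt")).st_ino
    link_target = os.readlink(p("another.txt"))

    os.unlink(p("duplicate.txt"))                    # rm duplicate.txt
    counts.append(os.stat(p("test.txt")).st_nlink)

    # The symbolic link still exists but now points at a missing name.
    dangling = os.path.islink(p("another.txt")) and not os.path.exists(p("another.txt"))

print(counts, same_inode, link_target, dangling)
```

The symbolic link does not contribute to the link count: only hard links (extra names for the same inode) do, which is why the count never reaches 4.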
- ‘ln’ creates hard links to an item by default.
- A file name is a pointer to an inode.
- ‘rm’ does not really remove a file; it just unlinks a name, which reduces the link count of the inode.
- When the link count of an inode drops to 0, so that no name refers to it any more, the system removes the file and frees its storage.
- When a user opens a file, the inode's reference count is incremented.
- A filesystem doesn’t have to be on a block device. It can store its data anywhere, as long as you provide the API.
- A filesystem can be made inside a file (the basis of virtual machines)
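One consequence of separating names from inodes is that an open file survives the removal of its last name. The following sketch, assuming ordinary UNIX semantics on a local filesystem (network filesystems may behave differently), unlinks a file while it is still open and then reads it; scratch.txt is a made-up name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scratch.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("still here")

    reader = open(path)       # holding the file open keeps a reference to the inode
    os.unlink(path)           # remove the only name

    name_gone = not os.path.exists(path)
    links_left = os.fstat(reader.fileno()).st_nlink  # link count as seen via the open file
    contents = reader.read()  # the data is still reachable through the descriptor
    reader.close()            # last reference dropped; storage can now be freed

print(name_gone, links_left, contents)
```

The storage is only released once both the link count and the number of open references have reached zero.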
One other file API: mmap
Inodes store permissions
mmap, munmap: map or unmap files or devices into memory. This is the piece that unifies the filesystem with process memory.
What ‘read’ does: it reads file data directly into RAM
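To illustrate how mmap makes a file look like ordinary memory, here is a minimal sketch using Python's mmap module, which wraps the mmap and munmap system calls. The file name mapped.txt is made up; the mapping uses the module's defaults on UNIX (shared, readable and writable).

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mapped.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"hello mmap")

    with open(path, "r+b") as f:
        # Length 0 maps the whole file into the process's address space.
        with mmap.mmap(f.fileno(), 0) as view:
            first = bytes(view[:5])  # reading memory reads the file
            view[0:5] = b"HELLO"     # writing memory writes the file
            view.flush()             # ask the kernel to write changes back

    with open(path, "rb") as f:
        after = f.read()

print(first, after)
```

No explicit read or write call touches the file contents inside the mapping; the kernel pages data in and out on demand.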
Run top:
- -There are ‘VIRT’, ‘RES’ and ‘SHR’:
VIRT stands for the virtual size of a process.
RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming.
SHR indicates how much of the VIRT size is actually sharable (memory or libraries).
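On Linux, comparable figures are exposed per process through the /proc virtual filesystem. The sketch below is Linux-specific and assumes /proc/PID/statm is available; it reads the first three fields (total, resident and shared pages) and converts them to bytes. The function name memory_summary is made up, and the mapping onto top's column names is an approximation rather than a description of how top is implemented.

```python
import os


def memory_summary(pid="self"):
    """Return approximate VIRT, RES and SHR sizes in bytes for a process (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as f:
        # First three fields of statm: total, resident, shared (all in pages).
        size, resident, shared = (int(field) for field in f.read().split()[:3])
    return {
        "VIRT": size * page_size,
        "RES": resident * page_size,
        "SHR": shared * page_size,
    }


summary = memory_summary()
for name, value in summary.items():
    print(f"{name}: {value // 1024} KiB")
```

For a typical process VIRT is much larger than RES, because mapped files and libraries only consume physical memory for the pages actually touched.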