Operating Systems 2014F Lecture 14

The audio from the lecture given on November 5, 2014 [http://homeostasis.scs.carleton.ca/~soma/os-2014f/lectures/comp3000-2014f-lec14-05Nov2014.mp3 is now available].

How does an operating system know when a process has accessed memory it doesn't have access to? A lot of you said segments.

Filesystems: normally, operating system mechanisms are about mediating access to hardware, and filesystems are the abstraction for persistent storage - storage that maintains its state when it loses power. There are a couple of challenges with persistent storage. What's weird about storing things in persistent storage? It's slow, and it has to provide durability and persistence.

We're going to make errors - we should be able to recover from them. Maybe not fix everything, but preserve most of the data. This is a huge burden on filesystems, and filesystem development tends to be very slow in practice: whatever code does this has to do it right. Older filesystems tend to be more trustworthy, though when hardware changes, bugs you didn't know were in the filesystem may come up.


What do we have today? Indexed filesystems.

There is typically a minimum storage allocation given to every file. That's the minimum size of a file: even a one-byte file takes up 4 KB / 8 KB on disk. This is not strictly true for all filesystems; there was a filesystem (ReiserFS) that allowed arbitrarily sized files.
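
You can see this minimum allocation directly with the POSIX stat(2) interface - a minimal sketch, where st_blocks is counted in the conventional POSIX unit of 512 bytes:

<syntaxhighlight lang="c">
/* Compare a file's logical size (st_size) with the space actually
 * allocated for it (st_blocks, in 512-byte units per POSIX). A
 * one-byte file will typically show a full filesystem block
 * (e.g., 4096 bytes) allocated. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    struct stat st;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    if (stat(argv[1], &st) != 0) {
        perror("stat");
        return 1;
    }

    printf("logical size:    %lld bytes\n", (long long)st.st_size);
    printf("allocated space: %lld bytes\n", (long long)st.st_blocks * 512);
    return 0;
}
</syntaxhighlight>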

Unifying a key-value store for smaller and larger files wasn't considered a priority.

Modern filesystems focus on what filesystems do well rather than trying to optimize the storage of small files; it's not so much file size that is the issue.

Floppy disks vs. hard disks

What is fast and what is slow?

Fast: reading what is under the drive head at any given time. As long as you keep the head there, you can read the entire concentric circle (the track) really fast. That's the fastest operation you are going to get.

What's slow? Moving the head from one part of the disk to another - that's a slow operation.

Intuitively, why is it that slow? You have to move the head with extreme precision.

Moving the head is hard. Seek time: the time it takes to move the drive head from one area to another on the disk.

The coordinate system goes by Cylinder, Head, Sector (CHS) - the geometry of the disk. If there is data I want to access in parallel, I can optimize by putting the data on different platters.

An address says which cylinder, which head (there is one per platter surface), and which sector, which is a count around the track.

The classic IBM PC BIOS expects cylinders, heads, and sectors. It's a lie; modern systems get rid of that completely. Now they use LBA - logical block addressing. What does that mean? You address a block by its count from the start of the disk. When I talk about hard disk blocks, how does that compare to RAM? A block is the smallest addressable unit of storage. What is the smallest addressable unit in RAM? A byte.
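
For concreteness, here is the classic conversion from a CHS triple to an LBA block number - a sketch, where the geometry constants are made-up examples (real drives report their own geometry, and modern drives fake it anyway):

<syntaxhighlight lang="c">
/* CHS-to-LBA conversion. Note that sectors are numbered from 1,
 * not 0, which is why the formula subtracts 1. */
#include <stdio.h>

#define HEADS_PER_CYLINDER 16   /* hypothetical geometry */
#define SECTORS_PER_TRACK  63

unsigned long chs_to_lba(unsigned long c, unsigned long h, unsigned long s)
{
    return (c * HEADS_PER_CYLINDER + h) * SECTORS_PER_TRACK + (s - 1);
}

int main(void)
{
    /* Cylinder 0, head 0, sector 1 is the very first block: LBA 0. */
    printf("LBA = %lu\n", chs_to_lba(0, 0, 1));
    printf("LBA = %lu\n", chs_to_lba(2, 3, 4));
    return 0;
}
</syntaxhighlight>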

Older blocks could be 512 bytes; typical block sizes are 4 KB / 8 KB. When you do a transfer, you don't read individual bytes - you read in chunks, by blocks. On Windows, a common key-value store is the registry: a hierarchical key-value store for small key-values. It's like a filesystem, but every file is really small, and it's stored in a file.
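
To make "reading by blocks" concrete, a sketch that reads one 4 KiB chunk at a block-aligned offset with pread(2) - the path and block size here are made-up examples:

<syntaxhighlight lang="c">
/* Block-granularity I/O: read a whole block at an offset that is a
 * multiple of the block size, rather than reading byte by byte. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

int main(void)
{
    char buf[BLOCK_SIZE];
    int fd = open("/tmp/example.dat", O_RDONLY);  /* hypothetical file */

    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Read the third block: offset = block number * block size. */
    ssize_t n = pread(fd, buf, BLOCK_SIZE, 2L * BLOCK_SIZE);
    printf("read %zd bytes\n", n);

    close(fd);
    return 0;
}
</syntaxhighlight>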

Hard disks aren't perfect - they have bad blocks. They can't actually store clean 1s and 0s: to encode 4 bits of data, you have to store 7 bits, because of weird issues in the physics of storing the data. The signals they are trying to read off the hard disk are kind of messy.

Error-correcting codes
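
Four data bits carried in seven stored bits is exactly the shape of the classic Hamming(7,4) code - that's my inference, since the lecture doesn't name a specific code (the 4-to-7 expansion could also allude to physical run-length encodings). A sketch of the encoder:

<syntaxhighlight lang="c">
/* Hamming(7,4) encoding: 4 data bits (d1..d4) become 7 stored bits
 * by adding 3 parity bits (p1, p2, p3). Any single-bit error in the
 * 7 stored bits can then be located and corrected.
 * Bit layout, position 1 first: p1 p2 d1 p3 d2 d3 d4. */
#include <stdio.h>

unsigned hamming74_encode(unsigned d)
{
    unsigned d1 = (d >> 3) & 1, d2 = (d >> 2) & 1,
             d3 = (d >> 1) & 1, d4 = d & 1;

    unsigned p1 = d1 ^ d2 ^ d4;  /* covers positions 1,3,5,7 */
    unsigned p2 = d1 ^ d3 ^ d4;  /* covers positions 2,3,6,7 */
    unsigned p3 = d2 ^ d3 ^ d4;  /* covers positions 4,5,6,7 */

    return (p1 << 6) | (p2 << 5) | (d1 << 4) |
           (p3 << 3) | (d2 << 2) | (d3 << 1) | d4;
}

int main(void)
{
    for (unsigned d = 0; d < 16; d++)
        printf("data %x -> codeword %02x\n", d, hamming74_encode(d));
    return 0;
}
</syntaxhighlight>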

How hard disks handle this internally is proprietary information of the hardware manufacturers.

Keys (filenames) are mapped to values (the file contents).

The keys really live in directories - directory data structures that store the names of the files.
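
A directory entry, loosely in the style of classic Unix filesystems like ext2 (a sketch with illustrative field sizes, not any real on-disk format), pairs a name with the index of the file's metadata:

<syntaxhighlight lang="c">
/* A directory is conceptually just a table of these entries:
 * names (the keys) mapped to inode numbers, which in turn locate
 * the file's metadata and data blocks. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 255

struct dir_entry {
    uint32_t inode;                  /* index of the file's metadata */
    uint8_t  name_len;               /* length of the name in bytes */
    char     name[NAME_MAX_LEN + 1]; /* the key: the file's name */
};

int main(void)
{
    struct dir_entry e = { .inode = 42, .name_len = 9 };
    strcpy(e.name, "notes.txt");

    printf("%s -> inode %u\n", e.name, e.inode);
    return 0;
}
</syntaxhighlight>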

Random access for every block works well in RAM but is horrible on a hard drive. You don't want to associate a list of blocks with every directory entry - that's a bad strategy, so you have to do something better. What if you did something kind of like a segment: a range of blocks? Modern systems normally keep a list of extents instead of a list of blocks - an extent is a contiguous range of blocks, and a file is divided into one or more extents. The larger the file, the more blocks it's stored in. (This is exactly what you do not want to do in RAM.) One simple way to keep extents contiguous? Have free space - extra space on your hard disk. A lot of filesystems will say they want to reserve a certain amount of space.
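
A sketch of extent-based allocation - the struct layout is illustrative, not any real filesystem's format:

<syntaxhighlight lang="c">
/* Instead of listing every block a file occupies, store
 * (start, length) ranges: far fewer entries, and each range can
 * be read with one long sequential transfer. */
#include <stdint.h>
#include <stdio.h>

struct extent {
    uint64_t start_block;  /* first block of the contiguous run */
    uint64_t num_blocks;   /* how many blocks the run covers */
};

int main(void)
{
    /* A 1000-block file stored as 2 extents instead of 1000 entries. */
    struct extent file_extents[] = {
        { .start_block = 5000, .num_blocks = 800 },
        { .start_block = 9200, .num_blocks = 200 },
    };
    size_t n = sizeof(file_extents) / sizeof(file_extents[0]);

    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += file_extents[i].num_blocks;

    printf("file occupies %llu blocks in %zu extents\n",
           (unsigned long long)total, n);
    return 0;
}
</syntaxhighlight>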

The filesystem wants that extra space so it can make sure it gets long runs of contiguous blocks. If files all become fragmented, what happens to filesystem performance? It goes down. The filesystem is trying to resist fragmentation.