COMP3000 Operating Systems F23: Tutorial 5
Latest revision as of 01:12, 25 October 2023
In this tutorial you will learn about files and filesystems by experimenting with and extending 3000test.c (https://people.scs.carleton.ca/~abdou/comp3000/f23/tut5/3000test.c), and by creating and manipulating local filesystems. WARNING: Several of the commands here can lead to system corruption and data loss if not properly used. You have been warned. Please use a VM and make backups when necessary.
Files and inodes
In UNIX/Linux filesystems, a filename does not directly refer to the contents of a file. Instead, a filename refers to an inode (as specified in a directory entry, or dentry, of its parent directory), and the inode then refers to the data. An inode is where file metadata is stored. File ownership, timestamps, and permissions are all stored in a file's inode. The names of regular files are just hard links, each connecting a filename to an inode.
In addition to regular files, we also have symbolic links, directories, block devices, character devices, and pipes. Each is its own type of inode.
Note that while you can find out a file's inode, you cannot go from an inode to a pathname or otherwise manipulate an inode directly from userspace - you always have to go through the pathname. The kernel, however, can access individual inodes directly (indeed it has to in order to get to the contents of a file when given a filename).
3000test.c uses stat() to give information on a given inode.
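The relationship between filenames and inodes can also be observed from the shell. The following is a minimal sketch, assuming the GNU coreutils version of stat(1) as shipped with Ubuntu; the file names original.txt and alias.txt are made up for illustration. It creates a second hard link to a file and prints the metadata stored in the shared inode.

```shell
# Work in a throwaway directory so nothing important is touched.
workdir=$(mktemp -d)
echo "hello" > "$workdir/original.txt"

# A hard link is a second directory entry that refers to the same inode.
ln "$workdir/original.txt" "$workdir/alias.txt"

# %i = inode number, %h = hard link count, %A = permissions, %U = owner, %n = name
stat -c 'inode=%i links=%h perms=%A owner=%U name=%n' \
    "$workdir/original.txt" "$workdir/alias.txt"

# Keep the inode numbers around so they can be compared.
inode_original=$(stat -c %i "$workdir/original.txt")
inode_alias=$(stat -c %i "$workdir/alias.txt")
```

Both names should report the same inode number and a link count of 2, since neither name is more "original" than the other as far as the filesystem is concerned.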
Tasks/Questions (Part A)
1. Compile and run 3000test.c. It takes a filename as an argument and reports information on the file. Try giving it the following and see what it reports:
   - a regular text file that exists
   - a directory
   - a symbolic link
   - a device file (character or block)
2. Change 3000test to use lstat() rather than stat(). How does its behavior change?
3. Modify 3000test so that when it is given a symbolic link it reports the name of the target. Use readlink(2) (this notation describes how to access the man page, e.g., man 2 readlink).
4. Are there files or directories that you cannot run 3000test on? Can you configure file/directory permissions so as to make something inaccessible to 3000test? Note that this is twofold: 1) whether it is completely inaccessible to 3000test (nothing can be displayed, except the error message); 2) whether it can be accessed for the search (no access to file content).
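Before changing the C code, the difference between examining a link and following it can be previewed from the shell. This is only a sketch, assuming GNU coreutils: stat(1) without -L behaves like lstat(2), stat(1) with -L behaves like stat(2), and readlink(1) prints what readlink(2) returns. The names target.txt and link are hypothetical.

```shell
workdir=$(mktemp -d)
echo "some data" > "$workdir/target.txt"
ln -s target.txt "$workdir/link"

# Without -L, stat(1) reports on the link itself, as lstat(2) would.
type_of_link=$(stat -c %F "$workdir/link")
# With -L, stat(1) follows the link, as stat(2) would.
type_of_target=$(stat -L -c %F "$workdir/link")
# readlink(1) prints the path stored inside the symbolic link.
link_target=$(readlink "$workdir/link")

echo "lstat-like view: $type_of_link"
echo "stat-like view:  $type_of_target"
echo "link points to:  $link_target"
```

Note that the stored target is printed exactly as it was given to ln -s (here a relative path), which is also what a readlink(2) call in 3000test would see.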
Creating, Mounting, and Unmounting Filesystems
In UNIX/Linux systems we always have a single hierarchical namespace for files and directories. Pathnames that start with a / are absolute, starting at the top of the namespace, while ones that do not start with a / are relative to the current directory.
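As a small illustration (a sketch only, with made-up directory and file names), the same file can be reached through a relative pathname from inside its directory and through an absolute pathname from anywhere. realpath(1) from GNU coreutils prints the absolute form either way.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/sub"
touch "$workdir/sub/notes.txt"

# Relative pathname: resolved starting from the current directory.
# The subshell keeps the cd from affecting the rest of the script.
relative_view=$(cd "$workdir/sub" && realpath notes.txt)

# Absolute pathname: resolved starting from / wherever we happen to be.
absolute_view=$(cd / && realpath "$workdir/sub/notes.txt")

echo "via relative path: $relative_view"
echo "via absolute path: $absolute_view"
```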
The files in this single namespace come from many sources, each of which can have very different semantics. These sources are known as filesystems.
When first started, a system must have at least one filesystem. This first filesystem is known as the root filesystem and has the contents of /. We can then add other filesystems to this initial set of files by mounting each filesystem on an existing directory.
On Ubuntu 22.04, the root filesystem is ext4 by default and many others are mounted on top of it. For example, a proc filesystem is mounted on /proc, and a sysfs is mounted on /sys. If you run "mount" with no arguments, it lists all the currently mounted filesystems. Note that most of these are virtual filesystems, in that they do not correspond to any persistent storage (like a disk).
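The same information that mount prints is available to any program through /proc/mounts. The sketch below assumes a Linux system with /proc mounted; fstype_of is a made-up helper, not a standard command. It looks up the filesystem type for a given mountpoint.

```shell
# Print the type of the filesystem mounted on a given mountpoint.
# /proc/mounts fields: device, mountpoint, type, options, ...
# If a mountpoint appears more than once, the last (topmost) entry wins.
fstype_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { print t }' /proc/mounts
}

echo "/     is a $(fstype_of /) filesystem"
echo "/proc is a $(fstype_of /proc) filesystem"
echo "/sys  is a $(fstype_of /sys) filesystem"
```

On a default Ubuntu 22.04 installation the first line would be expected to report ext4, but inside a container or VM image the root filesystem may well be something else.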
If you insert a USB stick on a laptop or desktop running Linux, it must be mounted before you can access its contents. The current convention is for it to be mounted in /media/<mounting user>/<name of USB stick filesystem>. This usually happens automatically nowadays when the USB stick is plugged in.
To create an ext4 filesystem, you just run
mkfs.ext4 <writable block device> #Do not try it with any /dev/*
If you were to run this on the root filesystem of your current system, it would reformat your entire disk if it were allowed to complete. In all likelihood the system will prevent you from doing this, as the root filesystem (and any mounted filesystem) gets special protections.
To mount a filesystem, you specify the filesystem to be mounted (by its type or by the device) and the mountpoint:
mount /dev/sdb1 /mnt #Do not try it unless you have that block device and want to do it
This would mount the first partition of the second disk on /mnt.
Filesystems can be stored on block devices, which are devices whose contents are accessed by specifying a block index/address. The other main type of device file, the character device, is used for devices where the data is accessed a single byte (character) at a time. Terminals, modems, and printers, for example, are generally represented as character devices.
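The two kinds of device file can be told apart with the shell's -b and -c file tests. The following is a minimal sketch; device_kind is a made-up helper, and the list of paths is only a guess at what might exist on a given machine (missing ones are skipped).

```shell
# Classify a path as a block device, a character device, or neither.
device_kind() {
    if [ -b "$1" ]; then
        echo "block"
    elif [ -c "$1" ]; then
        echo "character"
    else
        echo "not a device"
    fi
}

for path in /dev/null /dev/zero /dev/tty /dev/sda /dev/loop0; do
    if [ -e "$path" ]; then
        echo "$path: $(device_kind "$path")"
    fi
done
```

The first character of each line in the output of ls -l /dev gives the same information: b for block devices and c for character devices.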
Note that mount here can identify the filesystem on /dev/sdb1 by looking at its superblock. The superblock is usually either in block 0 or 1 of a device. The superblock holds metadata of a filesystem, allowing it to be identified and specified. Without the superblock you cannot do anything with a filesystem. Because of its importance, there are backup superblocks in addition to the primary one. The blocks of a filesystem can be classified as either being superblocks, inode blocks, or data blocks.
Sometimes we want to play with a new filesystem but we don't have a physical device to format. If we have free space for a file in our current filesystem, however, we can put a filesystem in a file by using a loopback block device. A loopback block device is a block device whose data is stored in a regular file on another filesystem (rather than on a separate device). If you run filesystem commands on a regular file, the system will transparently associate the file with an available loopback block device (e.g., /dev/loop1).
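Because a filesystem can live in a regular file, the superblock layout described above can be inspected without root access and without a spare disk. This is a sketch under several assumptions: that mkfs.ext4 and dumpe2fs from e2fsprogs are installed and on the PATH (the example does nothing if they are not), and that a 256 MiB image is large enough for mkfs.ext4 to create more than one block group. The name disk.img is hypothetical.

```shell
workdir=$(mktemp -d)
image="$workdir/disk.img"

# Create a 256 MiB file to act as the "disk"; truncate only sets its size.
truncate -s 256M "$image"

# Only proceed if the e2fsprogs tools are available on this machine.
if command -v mkfs.ext4 >/dev/null 2>&1 && command -v dumpe2fs >/dev/null 2>&1; then
    # Formatting a regular file you own needs no root access.
    # -F skips the "not a block device" confirmation, -q reduces output.
    if mkfs.ext4 -q -F "$image"; then
        # Show where the primary and backup superblocks were placed.
        dumpe2fs "$image" 2>/dev/null | grep -i 'superblock at' || true
    fi
fi
```

Mounting the image is the step that needs root and a loopback device; creating and inspecting it does not.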
Note that inode numbers are filesystem-specific. Thus, the contents of a file are uniquely specified by its filesystem and inode. Only the kernel has to worry about this level of detail; in userspace, a file's full path contains all the necessary information.
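One way to see this from userspace is to print the device ID alongside the inode number, since only the pair identifies a file. The sketch below assumes GNU stat(1); file_identity is a made-up helper.

```shell
# %d = ID of the device (filesystem) holding the file, %i = inode number.
stat -c 'device=%d inode=%i  %n' / /proc /dev/null

# Join the two values into a single key that identifies a file system-wide.
file_identity() {
    stat -c '%d:%i' "$1"
}

echo "identity of /:     $(file_identity /)"
echo "identity of /proc: $(file_identity /proc)"
```

Two files on different filesystems may happen to share an inode number, which is why the device ID has to be part of the comparison.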
Filesystems on persistent storage are always at risk of corruption. File system checker programs (fsck) can detect and repair filesystem errors.
Tasks/Questions (Part B)
1. Run dd if=/dev/zero of=foo bs=8192 count=32K and use the ls command to check it. What is the logical size of the file foo? How much space does it consume on disk? (Hint: look at the size option to ls.) For comparison, create another file with dd if=/dev/zero of=foo2 bs=8192 seek=31K count=1K. Any observations?
2. Run mkfs.ext4 foo. Does foo consume more space or less? Do the same (mkfs.ext4 foo2) to the other file and answer the same question.
3. Create any file in /mnt (e.g., sudo touch /mnt/test.txt) and run sudo mount foo /mnt. Do you still see the file you just created?
4. Run df. What device is mounted on /mnt? What is this device?
5. Run sudo umount /mnt. What has gone away and what is back?
6. Run dd if=/dev/zero of=foo conv=notrunc count=10 bs=512. How does "conv=notrunc" change dd's behavior (versus the command in question 1)?
7. Run sudo mount foo /mnt. What error do you get?
8. What command can you run to make foo mountable again? What characteristic of the filesystem enables this command to work? (Hint: revisit the instructions of this tutorial if you have no idea.)
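If you want to experiment with the dd options on a smaller scale before running the commands above, the following sketch may help. It assumes GNU dd and stat, uses 1 MiB files with made-up names in a temporary directory, and is not a substitute for answering the questions with foo and foo2. The number of allocated blocks it reports depends on the underlying filesystem.

```shell
workdir=$(mktemp -d)

# Write 256 blocks of zeros: every block is explicitly written.
dd if=/dev/zero of="$workdir/dense" bs=4096 count=256 status=none
# Skip 255 output blocks with seek, then write one block after the gap.
dd if=/dev/zero of="$workdir/sparse" bs=4096 seek=255 count=1 status=none

# %s = logical size in bytes, %b = allocated blocks, %B = bytes per such block
stat -c '%n: size=%s bytes, allocated=%b blocks of %B bytes' \
    "$workdir/dense" "$workdir/sparse"

# Overwrite the first 512 bytes of a copy using dd's default behavior.
cp "$workdir/dense" "$workdir/default"
dd if=/dev/zero of="$workdir/default" bs=512 count=1 status=none
# Do the same to another copy, this time with conv=notrunc.
cp "$workdir/dense" "$workdir/notrunc"
dd if=/dev/zero of="$workdir/notrunc" bs=512 count=1 conv=notrunc status=none

stat -c '%n: size=%s bytes' "$workdir/default" "$workdir/notrunc"
```

Compare the logical sizes and allocated block counts in the output and relate them to what you see for foo and foo2.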