Operating Systems 2022F: Tutorial 5
In this tutorial you will be learning about files and filesystems by experimenting with and extending 3000test.c (listed below), creating and manipulating local filesystems, and mounting remote files using sshfs.
The following instructions assume you are using an openstack virtual machine. Some questions may require changes if you are using another environment.
WARNING: Several of the commands here can lead to system corruption and data loss. Please use a VM and make backups (e.g., with scs-backup)!
A: Files and inodes
In UNIX filesystems, a filename does not directly refer to the contents of a file. Instead, a filename refers to an inode, and the inode then refers to the data. An inode is where file metadata is stored. File ownership, timestamps, and permissions are all stored in a file's inode. Regular files are just hard links connecting a file name to an inode.
In addition to regular files, we also have symbolic links, directories, block devices, character devices, and pipes. Each is its own type of inode.
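The type of an inode is recorded in the st_mode field returned by stat(2) and lstat(2). As a rough sketch (the helper name file_type is made up for this example and is not part of 3000test.c), the standard S_IS* macros can turn that field into a readable name:

```c
#include <sys/stat.h>

/* Hypothetical helper: name the kind of inode that path refers to.
   lstat() is used so that a symbolic link is reported as a link
   rather than as whatever it points to. */
const char *file_type(const char *path)
{
        struct stat sb;

        if (lstat(path, &sb) == -1)
                return "inaccessible";
        if (S_ISREG(sb.st_mode))  return "regular file";
        if (S_ISDIR(sb.st_mode))  return "directory";
        if (S_ISLNK(sb.st_mode))  return "symbolic link";
        if (S_ISBLK(sb.st_mode))  return "block device";
        if (S_ISCHR(sb.st_mode))  return "character device";
        if (S_ISFIFO(sb.st_mode)) return "pipe";
        return "other";   /* for example, a socket */
}
```

For instance, file_type("/") should return "directory" on any UNIX system.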
Note that while you can find out a file's inode, you cannot go from an inode to a pathname or otherwise manipulate an inode directly from userspace - you always have to go through the pathname. The kernel, however, can access individual inodes directly (indeed it has to in order to get to the contents of a file when given a filename).
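To make the relationship between names and inodes concrete, here is a minimal sketch that gives one temporary file a second name with link(2) and checks that both names report the same inode number. The function name and the /tmp file names are placeholders chosen for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>

/* Create a temporary file, add a second hard link to it, and check
   that both names lead to the same inode. Returns the link count seen
   through the second name, or -1 on error. */
long hard_link_demo(void)
{
        char first[] = "/tmp/3000link-XXXXXX";
        char second[64];
        struct stat a, b;
        long result = -1;
        int fd = mkstemp(first);

        if (fd == -1)
                return -1;
        close(fd);
        snprintf(second, sizeof(second), "%s.hard", first);

        if (link(first, second) == 0) {
                if (stat(first, &a) == 0 && stat(second, &b) == 0 &&
                    a.st_ino == b.st_ino) {
                        printf("both names refer to inode %lu\n",
                               (unsigned long) a.st_ino);
                        result = (long) b.st_nlink;
                }
                unlink(second);   /* remove the second name */
        }
        unlink(first);            /* remove the first name */
        return result;
}
```

With two names pointing at the inode, the returned link count should be 2. The file's data is only released once both names have been unlinked.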
3000test.c uses stat to give information on a given inode.
Tasks (Part A)
- Compile and run 3000test.c. It takes a filename as an argument and reports information on the file. Try giving it the following and see what it reports:
- a regular text file that exists
- a directory
- a symbolic link
- a device file (character or block)
- Change 3000test to use lstat rather than stat. How does its behavior change?
- Modify 3000test so when it is given a symbolic link it reports the name of the target. Use readlink(2).
- Are there files or directories that you cannot run 3000test on? Can you configure file/directory permissions so as to make something inaccessible to 3000test?
- How does the memory use of 3000test change as it runs? You may want to add calls to sleep(3) so you can observe its memory usage. You can create a 1 GB file of random data with the command dd if=/dev/urandom of=test bs=1024 count=1000000.
- (Optional) Create a program 3000compare.c based on 3000test that compares the contents of two files and says whether or not they differ.
- If given symbolic links it should report on where they point and only say they are equal if they refer to the same file.
- If given two device files, it should say that they are equal if they are both the same kind of device file and have the same major and minor numbers.
- If given two hard links to the same file, it should say that the files are identical because they refer to the same inode.
- Other kinds of files (directories, pipes) should not be compared. Instead, it should report on the type of each file.
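As a hint for the readlink(2) task, note that readlink does not add a terminating '\0' to the buffer it fills. The wrapper below is only a sketch (link_target is a made-up name) showing one way to deal with that:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

/* Copy the target of the symbolic link at path into buf (size bytes).
   readlink(2) does not terminate the string, so a '\0' is added here.
   Returns the target length, or -1 on error (for example, when path
   is not a symbolic link). A target longer than size - 1 bytes is
   silently truncated. */
ssize_t link_target(const char *path, char *buf, size_t size)
{
        ssize_t n;

        if (size == 0)
                return -1;
        n = readlink(path, buf, size - 1);
        if (n == -1)
                return -1;
        buf[n] = '\0';
        return n;
}
```

The target is returned exactly as stored in the link, so it may be a relative path or may name a file that does not exist.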
B: Creating, Mounting, and Unmounting Filesystems
In UNIX systems we always have a single hierarchical namespace for files and directories. Pathnames that start with a / are absolute, starting at the top of the namespace, while ones that do not start with a / are relative to the current directory.
The files in this single namespace come from many sources, each of which can have very different semantics. These sources are known as filesystems.
When first started, a system must have at least one filesystem. This first filesystem is known as the root filesystem and has the contents of /. We can then add other filesystems to this initial set of files by mounting each filesystem on an empty directory.
On Ubuntu 21.04, the root filesystem is ext4 by default and many others are mounted on top of it. For example, a proc filesystem is mounted on /proc, a sysfs is mounted on /sys. If you run "mount" with no arguments it lists all of the currently mounted filesystems. Note that most of these are virtual filesystems, in that they do not correspond to any persistent storage (like a disk).
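A program can obtain the same list that "mount" prints. The sketch below assumes a Linux system with glibc, where the kernel exposes the list in /proc/self/mounts and <mntent.h> provides routines for parsing it; the function name list_mounts is made up for this example.

```c
#include <stdio.h>
#include <mntent.h>

/* Print one line per mounted filesystem, read from the kernel's list
   in /proc/self/mounts. Returns the number of entries, or -1 if the
   list cannot be opened. */
int list_mounts(void)
{
        FILE *fp = setmntent("/proc/self/mounts", "r");
        struct mntent *m;
        int n = 0;

        if (fp == NULL)
                return -1;
        while ((m = getmntent(fp)) != NULL) {
                printf("%s on %s type %s\n",
                       m->mnt_fsname, m->mnt_dir, m->mnt_type);
                n++;
        }
        endmntent(fp);
        return n;
}
```

Entries whose type is proc, sysfs, or tmpfs are examples of the virtual filesystems mentioned above.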
If you insert a usb stick on a laptop or desktop running Linux, it must be mounted before you can access its contents. The current convention is for it to be mounted in /media/<mounting user>/<name of USB stick filesystem>.
To create an ext4 filesystem, you just run
mkfs.ext4 <writable block device>
If you were to run this on the root filesystem of your current system, it would reformat your entire disk if it was allowed to complete. In all likelihood the system should prevent you from doing this as the root filesystem (and any mounted filesystem) gets special protections.
To mount a filesystem, you specify the filesystem to be mounted (by its type or by the device) and the mountpoint:
mount /dev/sdb1 /mnt
This would mount the first partition of the second disk on /mnt.
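Underneath, the mount command ends up calling mount(2). The following is a minimal sketch of that call, assuming Linux; try_mount is a made-up name. Unlike the mount command, the system call does not work out the filesystem type for a block device on its own, so the type is passed explicitly.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Ask the kernel to mount the filesystem of type fstype found on
   device at mountpoint, with no special flags or options.
   Returns 0 on success or the errno value on failure. */
int try_mount(const char *device, const char *mountpoint,
              const char *fstype)
{
        if (mount(device, mountpoint, fstype, 0, NULL) == -1) {
                int err = errno;

                fprintf(stderr, "mount %s on %s: %s\n",
                        device, mountpoint, strerror(err));
                return err;
        }
        return 0;
}
```

Calling try_mount("/dev/sdb1", "/mnt", "ext4") as an unprivileged user is expected to fail with EPERM, since mounting is a privileged operation.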
Filesystems can be stored on block devices - devices whose contents are accessed by specifying a block index. The other main type of device file, character devices, are used for devices where the data is accessed a single byte (character) at a time. Terminals, modems, printers - basically anything that isn't used for mass storage will generally be represented as a character device.
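Whether a device file is a block or character device, and which device it stands for, can be read from its inode. This sketch assumes Linux with glibc, where major() and minor() come from <sys/sysmacros.h>; device_kind is a made-up name.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/* Report whether path is a block or character device and print its
   major and minor numbers. Returns 'b', 'c', or 0 if path is not a
   device file or cannot be examined. */
char device_kind(const char *path)
{
        struct stat sb;
        char kind;

        if (stat(path, &sb) == -1)
                return 0;
        if (S_ISBLK(sb.st_mode))
                kind = 'b';
        else if (S_ISCHR(sb.st_mode))
                kind = 'c';
        else
                return 0;
        /* st_rdev identifies the device; st_dev is where the inode lives */
        printf("%s: %c %u,%u\n", path, kind,
               major(sb.st_rdev), minor(sb.st_rdev));
        return kind;
}
```

For example, device_kind("/dev/null") should return 'c', while a disk partition such as /dev/sdb1 would be reported as 'b'.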
Note that mount here can identify the filesystem on /dev/sdb1 by looking at its superblock. The superblock is either in block 0 or 1 of a device. The superblock holds metadata on a filesystem, allowing it to be identified and specified. Without the superblock you cannot do anything with a filesystem. The blocks of a filesystem can be classified as either being superblocks, inode blocks, or data blocks.
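As an illustration of identifying a filesystem from its superblock, the sketch below checks for the ext2/3/4 magic number. It relies on two layout facts for these filesystems: the primary superblock starts 1024 bytes into the device, and the 16-bit magic value 0xEF53 is stored little-endian at byte 56 of the superblock. The function name is made up, and real tools check far more than this one field.

```c
#include <stdio.h>

#define EXT_SB_OFFSET 1024L   /* start of the primary superblock */
#define EXT_MAGIC_AT  56L     /* offset of the magic within it */
#define EXT_MAGIC     0xEF53

/* Returns 1 if path looks like an ext2/3/4 filesystem, 0 if it does
   not, and -1 if it cannot be opened. */
int looks_like_ext(const char *path)
{
        unsigned char raw[2];
        FILE *fp = fopen(path, "rb");
        int ok;

        if (fp == NULL)
                return -1;
        ok = fseek(fp, EXT_SB_OFFSET + EXT_MAGIC_AT, SEEK_SET) == 0 &&
             fread(raw, 1, 2, fp) == 2;
        fclose(fp);
        if (!ok)
                return 0;   /* too short to hold a superblock */
        /* assemble the little-endian 16-bit value */
        return (raw[0] | (raw[1] << 8)) == EXT_MAGIC;
}
```

This works on a regular file holding a filesystem image as well as on a block device, provided you have permission to read it.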
Sometimes we want to play with a new filesystem but we don't have a physical device to format. If we have free space for a file in our current filesystem, however, we can put a filesystem in a file by using a loopback block device. A loopback block device is a block device whose data is stored in a regular file on another filesystem (rather than on a separate device). If you run filesystem commands on a regular file, they will transparently associate the file with an available loopback block device.
Note that inode numbers are filesystem specific. Thus, the contents of a file are uniquely specified by its filesystem and inode. Only the kernel has to worry about this level of detail; in userspace, a file's full path contains all the necessary information.
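In stat's terms, the filesystem is identified by st_dev and the inode by st_ino, so both must match for two paths to name the same file. A minimal sketch (same_file is a made-up name):

```c
#include <sys/stat.h>

/* Returns 1 if both paths lead to the same inode on the same
   filesystem, 0 if they do not, and -1 if either cannot be examined. */
int same_file(const char *a, const char *b)
{
        struct stat sa, sb;

        if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
                return -1;
        return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Comparing st_ino alone is not enough: two unrelated files on different filesystems can happen to have the same inode number.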
Filesystems on persistent storage are always at risk of corruption. File system checker programs (fsck) can detect and repair filesystem errors.
Tasks (Part B)
- Run ls -lai (by itself or for a specific directory). What are the numbers appearing in the left column?
- Run dd if=/dev/zero of=foo bs=8192 count=32K. What is the logical size of the file? How much space does it consume on disk? (Hint: Look at the size option to ls.)
- Run mkfs.ext4 foo. (If asked, say "yes" to operating on a regular file.) Does foo consume any more space?
- Run dumpe2fs foo. What does the output of this command mean?
- What command do you run to check the filesystem in foo for errors?
- Run sudo mount foo /mnt. How does this command change what files are accessible?
- Run df. What device is mounted on /mnt? What is this device?
- Run chown student:student /mnt. (If you are on your own Linux system, substitute your current user for student.)
- Run rsync -a -v /etc /mnt. What does this command do? Explain the arguments as well. Did you get errors copying any files?
- Run sudo umount /mnt. What files can you still access, and what have gone away?
- Run dd if=/dev/zero of=foo conv=notrunc count=10 bs=512. How does the "conv=notrunc" change dd's behavior (versus the command in question 2)?
- Run sudo mount foo /mnt. What error do you get?
- What command can you run to make foo mountable again? What characteristic of the file system enables this command to work?
- Run the command truncate -s 1G bar. What is the logical size of bar, and how much space does it consume on disk? How does this compare with foo?
- How does the logical size of bar change when you create an ext4 filesystem in it? What about the space consumed on disk?
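Several of the questions above contrast a file's logical size with the space it consumes on disk. Both values are available from stat(2): st_size is the logical size in bytes, and st_blocks counts allocated 512-byte units. The sketch below (report_usage is a made-up name) prints the two side by side:

```c
#include <stdio.h>
#include <sys/stat.h>

/* Print the logical size of path next to the space allocated for it
   on disk. Returns 0 on success, -1 on error. */
int report_usage(const char *path)
{
        struct stat sb;

        if (stat(path, &sb) == -1)
                return -1;
        printf("%s: logical %lld bytes, on disk %lld bytes\n", path,
               (long long) sb.st_size,
               (long long) sb.st_blocks * 512);
        return 0;
}
```

Try calling it on foo and bar before and after creating filesystems in them.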
C: SSHFS
Here you will be learning about sshfs, a network filesystem built on FUSE. Note that this part requires that you already understand and have set up ssh (as explained in the previous tutorial).
Remote filesystems using sshfs
To mount the other user's files in a directory called "otherfiles", do the following (as user student, ubuntu, or your personal account):
mkdir otherfiles
sshfs other@localhost: otherfiles
To unmount the filesystem:
fusermount -u otherfiles
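Because sshfs is built on FUSE, the kernel should report a mounted sshfs directory as a FUSE filesystem. The sketch below assumes Linux and uses statfs(2) to ask which kind of filesystem a path lives on. The type numbers are the ones listed in the statfs(2) manual page; the macro and function names are made up for this example.

```c
#include <stdio.h>
#include <sys/vfs.h>

/* Filesystem type numbers as listed in statfs(2) on Linux. */
#define MAGIC_EXT4 0xEF53
#define MAGIC_FUSE 0x65735546
#define MAGIC_PROC 0x9FA0

/* Name the kind of filesystem that path lives on. Only a few types
   are recognized; everything else is reported as "other". */
const char *fs_kind(const char *path)
{
        struct statfs sf;

        if (statfs(path, &sf) == -1)
                return "unknown";
        switch ((unsigned long) sf.f_type) {
        case MAGIC_EXT4: return "ext2/3/4";
        case MAGIC_FUSE: return "fuse";
        case MAGIC_PROC: return "proc";
        default:         return "other";
        }
}
```

While the sshfs mount is active, fs_kind("otherfiles") would be expected to return "fuse"; after unmounting, it should report whatever filesystem holds the empty directory.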
Tasks (Part C)
- Look at the hard link counts of files locally and compare those to the link counts over sshfs. How do they compare?
- Try running some of the sudo'd commands above (e.g., "sudo mount foo /mnt") without the sudo. Why do they fail? Try running them under strace and see how their system calls change. Do they figure out that they aren't running as root and abort, or do they try doing a privileged operation and it fails?
- Can you access sshfs mounted files as root? (You can become root by typing "sudo su -".) What happens?
- Look at inode numbers in local and remote filesystems (as reported by ls -i). How do they compare?
- dd a large file to a local drive. Do the same thing over sshfs. Which is faster? (What is a large file in this context?)
- Can you sshfs to the SCS systems (e.g., access.scs.carleton.ca)?
- How can you use the mount command to unmount a sshfs-mounted filesystem (rather than fusermount)?
Code
/* 3000test.c */
/* v1 Oct. 1, 2017 */
/* Licenced under the GPLv3, copyright Anil Somayaji */
/* You really shouldn't be incorporating parts of this in any other code,
it is meant for teaching, not production */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
void report_error(char *error)
{
        fprintf(stderr, "Error: %s\n", error);
        exit(-1);
}

int main(int argc, char *argv[])
{
        struct stat statbuf;
        char *fn;
        int fd;
        size_t len, i, count;
        char *data;

        /* Print the usage message first: report_error() does not return. */
        if (argc < 2) {
                if (argc < 1) {
                        /* argv[0] is not available here */
                        fprintf(stderr, "Usage: 3000test <file>\n");
                        report_error("no command line");
                } else {
                        fprintf(stderr, "Usage: %s <file>\n", argv[0]);
                        report_error("Not enough arguments");
                }
        }

        fn = argv[1];

        if (stat(fn, &statbuf)) {
                report_error(strerror(errno));
        }

        len = statbuf.st_size;
        printf("File %s: \n", fn);
        printf("   inode %lu\n", (unsigned long) statbuf.st_ino);
        printf("  length %zu\n", len);

        if (S_ISREG(statbuf.st_mode)) {
                fd = open(fn, O_RDONLY);
                if (fd == -1) {
                        report_error(strerror(errno));
                }

                /* Map the whole file into memory, read-only */
                data = (char *) mmap(NULL, len,
                                     PROT_READ, MAP_SHARED, fd, 0);
                if (data == MAP_FAILED) {
                        report_error(strerror(errno));
                }

                /* Count the occurrences of 'a' in the file */
                count = 0;
                for (i=0; i<len; i++) {
                        if (data[i] == 'a') {
                                count++;
                        }
                }

                printf(" a count %zu\n", count);

                if (munmap(data, len) == -1) {
                        report_error(strerror(errno));
                }
                close(fd);
        }

        return 0;
}