COMP 3000 2022F Assignment 3 Solutions

1. [2] Download and inspect [https://homeostasis.scs.carleton.ca/~soma/os-2022f/code/3000contain.sh 3000contain.sh]. Is there a risk of data loss from running this script? Specifically, how much of a risk is there from each rm command? Be specific.

A: As written, there is no real risk of data loss from the rm commands. There are three rm commands; we examine each below.

The first, on line 24, deletes the image file that the script itself created. Data could only be lost if something important was stored in the image file, or if the user had created another file called 3000fsimage and placed it in the current directory.

The second, on line 31, only runs if 3000fs exists and is a directory. If the user of the script placed anything in 3000fs before the image was mounted there, it could be deleted by this command; otherwise, this rm just removes an empty directory that was created by a previous run of the script.

The third, on line 64, deletes 3000setupfs.sh, a file that is created by this script. Again, unless the user created a file with the same name, there's no risk of data loss.

2. [1] Run 3000contain.sh. After 3000contain.sh runs, you're put in a new shell where / is now the contents of 3000fs, and you can't see anything that wasn't in 3000fs. Exiting the shell gets you back to where you were. After exiting, how do you get back to the contained environment?

A: In the 3000fs directory, run the command

    unshare --root=. -f -p --mount-proc

This will get you back into the contained environment. (Well, technically, this makes a new confined environment using the same filesystem as before. To get back into the "same" environment we'd need to have run a different command that saved the sharing context so it could be given to this command as an argument.)

3. [2] How does the output of ps differ when run inside the contained environment versus outside? What part of 3000contain.sh caused this difference?
A: Inside the contained environment, ps shows only the processes of that environment, nothing else. So if you run ps you'll see just bash and ps, with bash being PID 1! This is caused by the unshare command: the -p option unshares the PID namespace and --mount-proc mounts /proc with the PID restrictions.

4. [2] What does line 58 of 3000contain.sh do? When does it run? Be sure to explain all of its effects.

A: Line 58 is the following:

    echo '/usr/bin/busybox --install' >> $SETUP

On its own, it just appends this line to the end of $SETUP, which is 3000setupfs.sh. This line and the lines around it generate a shell script. This script is then run on line 63 via a chroot command. By running it under chroot, it will run with / being the 3000fs directory, the directory that contains the mounted filesystem in 3000fsimage.

This line causes busybox to run and create the hard links to all the commands it supports. Before this command, the new filesystem has just a few files. After this command, it has all the basic files needed for a little Linux environment, including ls, ps, vi, and many more. (They are all minimal versions, but they work!)

5. [2] What is the largest file we can create in the confined environment (once initialized by 3000contain.sh)? What determines this limit?

A: The largest file that can be created immediately after running the script is 425783296 bytes (approximately 406 MiB), assuming we are running as root. This size is determined first by the dd command, which creates a file of 491520000 bytes (8192 block size * 60000 blocks, 468.75 MiB). We then lose space to filesystem overhead, ending up with only 422.7 MiB total. Of this we lose 7.3 MiB to files: 422.7 - 7.3 = 415.4. We have about 9 MiB unaccounted for, but remember that file metadata is significant for larger files, so this is likely due to how ext4 is implemented.

(For full credit, you just have to get the space roughly right by doing an experiment and then 1.
note how space is initially reserved with dd and 2. how we lose space to filesystem overhead.)

6. [2] If you fill up the disk in the host system, how will it change the amount of data that can be stored in the confined environment? Does this depend on what has been previously stored in the confined environment?

A: Filling up space in the host filesystem can reduce the space in the confined environment...if it is done before the confined environment's filesystem is fully allocated. When the filesystem is created in 3000fsimage, mkfs.ext4 discards blocks that should contain all zero bytes. These blocks will need to be allocated as the filesystem fills up, but if the host's filesystem is full these blocks can't be allocated.

Interestingly enough, you can sometimes still create files in the confined system. But instead of storing the contents, you'll just get null blocks (because they can't be allocated). This failure is in effect a "storage medium" failure and so doesn't show up as a normal error; instead, you'll see errors in the kernel log about I/O errors when writing to /dev/loop3 or similar. In effect we have a virtual disk with bad blocks!

But if we had previously created a large file in the confined environment, all the necessary blocks will have been allocated to 3000fsimage, and so we can then fill up the host disk without any consequences for the confined environment.

(And yes, you can avoid this issue by adding the -E nodiscard option to mkfs.ext4, as then it will fully allocate the filesystem at filesystem creation and won't throw away null blocks.)

7. [2] Many files in our confined environment refer to the same inode. What was the original name of this inode? How do you know?

A: The original name is /usr/bin/busybox, as this is the program that created all of the other hard links to this file by running it with the --install option (added to the script on line 58, run on line 63).
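As a quick illustration of the mechanism, here is a minimal sketch (the file names here are hypothetical, not from the assignment) showing that hard links made with ln share a single inode, which is exactly what busybox --install does for each applet name:

```shell
# Two names, one inode: the second name is a hard link, not a copy.
touch original           # create a file; this allocates an inode
ln original alias        # add a second name for the same inode
stat -c '%i' original    # print the inode number of the first name
stat -c '%i' alias       # prints the identical inode number
stat -c '%h' original    # hard-link count is now 2
rm original alias
```

Inside the confined environment you can see the same thing at scale: ls -li /bin should show the same inode number next to every busybox applet name.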
We can tell this by removing lines selectively from the SETUP script (lines 57-60) and noticing that when line 58 is removed, the confined filesystem has hardly any files but busybox is there, while when it is included, we have lots of hard links (all to busybox).

8. [1] Copy and make nano work in the new environment. What files did you have to copy to get it to work? How did you know to copy them?

A: You have to copy /lib/x86_64-linux-gnu/libncursesw.so.6 to /lib in the confined environment. You can see this by copying nano into the environment and trying to run it; it reports that it can't find this file. You could also use the command

    sudo cp `ldd /usr/bin/nano | awk '{print $3}'` 3000fs

but this will copy all of the dependencies, and only libncursesw.so.6 is new. (Note this answer is from Fall 2021.)

9. [2] How can you add a user "contain" to 3000fs using useradd (and nothing else)? Make sure the user also is in a new group "contain" and has a home directory /home/contain (in 3000fs). This user should only be visible when you're in the confined environment. How did you confirm that your answer works?

A: The command

    useradd -m -U -R `pwd` contain

added to 3000contain.sh and run before the end (tested on line 52, after copying the passwd and related files) does what is required. The -m option makes the home directory, -U creates a group named the same as the user, -R `pwd` does a chroot into the current directory, and contain is the user & group to be created.

I confirmed it by adding this line to the script, running the script to get into the confined environment, and then checking that I could run "su - contain" to become the user; I also checked that the user had a home directory with the right permissions. (If you set a password for contain you could also use login to log in as the user contain.)

10. [2] How can you mount the main root filesystem inside of the confined environment? What part of 3000contain.sh made this possible?
A: First, in the main root filesystem run df to find out its device. You'll get output something like this:

    student@comp3000:~$ df .
    Filesystem            1K-blocks    Used Available Use% Mounted on
    /dev/mapper/vg0-lv--0   8187320 4471888   3279824  58% /

In the chroot'd environment we can then mount it as follows:

    bash-5.1# mkdir /mainfs
    bash-5.1# mount /dev/mapper/vg0-lv--0 /mainfs

After this, /mainfs will contain all of the system's files.

(A weird consequence of this is how Linux avoids duplicate views of the confined files. For example, if we ran 3000makefs.sh in /home/student/Documents/A3, we'd see the following when just logging in:

    student@comp3000:~$ ls /home/student/Documents/A3
    3000fs  3000fsimage  3000makefs.sh
    student@comp3000:~$ ls /home/student/Documents/A3/3000fs
    bin  etc   lib    linuxrc     mainfs  root  sbin  tmp  var
    dev  home  lib64  lost+found  proc    run   sys   usr

However, if we look at the same paths after running chroot, things look a bit different than we might expect:

    bash-5.1# ls /
    bin  home   linuxrc     proc  sbin  usr
    dev  lib    lost+found  root  sys   var
    etc  lib64  mainfs      run   tmp
    bash-5.1# ls /mainfs/home/student/Documents/A3/3000fs
    bash-5.1# ls /mainfs/home/student/Documents/A3
    3000fs  3000fsimage  3000makefs.sh

Note how 3000fs appears to be empty while A3 shows the files we would expect.)

This is all made possible by line 60, the mounting of /dev in the setup script. Note that this command doesn't have to be run in the script; it also works after the unshare command. In fact, there isn't a straightforward way to prevent this with just the unshare command; we need other security mechanisms to prevent this. (Much of this answer, but not the last part, comes from last year's solutions.)

11. [2] How can you change the hostname in the confined environment to "mycontainer" without changing the hostname of the host system? (Note that the "hostname" command can be used to check and set a system's hostname.)
Is this change persistent, i.e., will the hostname stay the same when you exit and re-enter the confined environment?

A: We just have to add the -u option to the unshare command to unshare the UTS namespace, which is what holds the hostname. So we change the last line of the script to be

    unshare --root=. -f -p --mount-proc -u

After this, running "hostname mycontainer" inside the confined environment won't change the hostname of the host.

This change is not persistent, as it just changes some kernel state. Hostnames are only persistent because startup scripts set the hostname on boot from a file (generally /etc/hostname). To make it persistent we would have to change our script to set the hostname every time we create the container.
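You can see the UTS-namespace isolation without touching the script at all. A minimal sketch follows, assuming unprivileged user namespaces are enabled on the host (the extra -r maps us to root in a new user namespace so no sudo is needed; inside the actual confined environment you'd already be root and only -u would be required):

```shell
# Set the hostname inside a new UTS namespace, print it, and then
# show that the host's own hostname is untouched.
# -r: new user namespace with us mapped to root (for sethostname)
# -u: new UTS namespace, as in the modified unshare line above
unshare -r -u sh -c 'hostname mycontainer; hostname'
hostname   # still prints the host's original hostname
```

The first command prints mycontainer; the second prints the unchanged host name, since the hostname lives in the UTS namespace that disappeared when the inner shell exited.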