COMP 3000 Fall 2019 Assignment 4 solutions

1. [1] Can you load a Linux kernel module more than once? Explain briefly.
A: You can't load a module more than once, because modules are linked into the kernel. Loading a module more than once is the same as linking the same library more than once into a program. Doing this means the same global symbols are potentially defined multiple times, leading to serious problems (such as, which one is the valid definition?).

2. [1] For the modules we have created in class, does the code run continuously, or does it run in response to certain events? Explain, and be specific.
A: The modules we have created have functions that run when the module is loaded and unloaded, and in response to specific events for which the initialization function has registered handlers. For example, when registering a character device, we specify a struct with pointers to file operations. When a process accesses the device file, these specified functions are called. Similarly, the rootkit module changes the system call table so its functions are called rather than the standard system call handlers. Outside of these specific contexts, the module code doesn't run.

3. [1] When the kernel does a printk(), does it write directly to /var/log/kern.log? Explain.
A: No. printk() either writes directly to a console or it outputs to a kernel log buffer that can be read by a logging daemon (such as klogd or systemd). The file /var/log/kern.log is written to by syslogd/klogd/systemd. We don't want the kernel writing directly to a file because different systems may want logs to go to different places. So, a userspace process has to request the logs. If nobody is requesting the logs we're probably just booting up, so they go to a console.

4. [1] The makefile for a Linux kernel module is generally very simple; however, building a module seems to be a bit complicated, generating lots of files. Where is the module build process getting instructions for doing all this work?
A: The module build process gets its instructions from the makefiles in the kernel source directory specified in the module's makefile (and passed to make with the -C option). The kernel source contains many makefiles, generally one per directory; these are called recursively as needed. By building the module as if it were in the kernel source directory, we can use the kernel's full build facilities.

5. [1] If you change line 22 in the ones module to define CLASS_NAME to be "OSclass", will this change the observable behavior of the module? Explain briefly.
A: Yes. It will change the directory created in /sys/class from being comp3000 to being OSclass. This happens because /sys/class lists every class of device driver in the system (and does so by accessing kernel data structures directly).

6. [1] When doing the work for a system call, how does the kernel keep track of which process made that system call (so it does the work on behalf of that process)?
A: The kernel maintains a "current" pointer that points to the task_struct that represents the currently running process or thread. So most of the kernel just has to use "current" in order to access the appropriate code and data.

7. [2] How does the format of data returned by getdents(2) differ from that returned by readdir(3) (at a high level)? What is a key motivation for this difference?
A: getdents returns multiple directory entries, while readdir returns one directory entry. The key reason for this difference is that we naturally get multiple directory entries at once, as a single block read can hold multiple entries, and it makes sense to send all of these to userspace at once because it reduces the number of system calls made. The C library can then return each individual directory entry via readdir.

8. [2] How can you change the magic_prefix for the rootkit without changing the code of the module? How is this information passed to the kernel at runtime?
A: You can pass arguments to a module using arguments to insmod.
These arguments are then passed to the kernel using the init_module or finit_module system calls (which take a character string of parameters as an argument). A module can make special declarations (such as module_param()) for variables that will be filled in by the supplied parameters (if any).

9. [2] When the kernel allocates memory for its own use, does it refer to that memory using virtual or physical addresses? How does the remember module show this?
A: The kernel refers to its own memory using virtual addresses. The remember module shows this, as the pointer used to access the buffer of saved data is a virtual address (we get a physical address and turn it into the corresponding virtual address).

10. [2] What is a significant reason why the kernel uses functions such as copy_to_user() when accessing process memory? Why not just access this memory directly?
A: The kernel shouldn't dereference a userspace pointer directly, as the kernel has its own address space. If you don't translate between the process's address space and the kernel's address space, the process's pointer might refer to a part of kernel memory that a process shouldn't be able to access. Thus on every memory access the userspace pointer has to be checked and potentially translated (depending on the architecture).

11. [2] Change the ones module so that it will allow writes, and the first character of whatever is written will become the character that is repeatedly output when reading (instead of '1'). What changes do you need to make?
A: You basically make the ones module like the remember module, except that we only have to store one character in a global variable rather than dynamically allocating a buffer. In particular we have to:
* define a write function in the operations struct,
* change the device file to have write permissions,
* make the write function store the first character of the buffer passed to it,
* and change the read function to use the stored character rather than '1'.

12. [4] How could you make a "spooky rootkit" (based on 3000rootkit) that would randomly (with a .01 probability on each call to getdents) insert a file "BOO!" with an inode of 9999 into the stream of returned files? Note that you can get random bytes using the get_random_bytes() function in the kernel.
A: A spooky rootkit would modify the getdents call so that it does the following:
* Call get_random_bytes() to produce an unsigned int, take it mod 100, and check whether the result is zero. If it is zero, do the rest.
* Make an empty buffer using kmalloc.
* Make a "BOO!" entry at the start using statically declared data.
* Copy in as many of the remaining entries as will fit.
* Copy this modified data to the userspace buffer.
* (If you want to be fancy, copy entries one at a time and insert "BOO!" at a random place.)
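
The buffer manipulation in the answer to question 12 can be sketched in userspace C. This is only an illustration, not the 3000rootkit code: struct sim_dirent is a hypothetical stand-in whose fields follow the linux_dirent64 layout described in getdents64(2), and sim_reclen(), sim_put_entry(), spooky_insert(), and spooky_should_fire() are made-up helper names. In an actual module the output buffer would come from kmalloc, the entries would be moved with copy_from_user()/copy_to_user(), and the random value would come from get_random_bytes().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct linux_dirent64 (see getdents64(2)). */
struct sim_dirent {
    uint64_t       d_ino;     /* inode number */
    int64_t        d_off;     /* offset to next entry (unused here) */
    unsigned short d_reclen;  /* total length of this record */
    unsigned char  d_type;    /* file type (left as 0 here) */
    char           d_name[];  /* NUL-terminated name */
};

#define SIM_HDR_LEN offsetof(struct sim_dirent, d_name)

/* Record length for a name, rounded up to a multiple of 8 bytes. */
size_t sim_reclen(const char *name)
{
    size_t raw = SIM_HDR_LEN + strlen(name) + 1;
    return (raw + 7) & ~(size_t)7;
}

/* Write one record at buf; return its length, or 0 if it does not fit. */
size_t sim_put_entry(char *buf, size_t cap, uint64_t ino, const char *name)
{
    size_t reclen = sim_reclen(name);
    struct sim_dirent hdr = {0};

    if (reclen > cap)
        return 0;
    hdr.d_ino = ino;
    hdr.d_reclen = (unsigned short)reclen;
    memset(buf, 0, reclen);                  /* zero the padding bytes */
    memcpy(buf, &hdr, SIM_HDR_LEN);          /* fixed-size header */
    memcpy(buf + SIM_HDR_LEN, name, strlen(name) + 1);
    return reclen;
}

/* The 1-in-100 check; in the kernel, r would be filled by get_random_bytes(). */
int spooky_should_fire(unsigned int r)
{
    return r % 100 == 0;
}

/*
 * Build the spooky buffer in out: a "BOO!" record (inode 9999) first,
 * followed by as many whole records from in as will fit.
 * Returns the number of bytes used in out (0 if even "BOO!" does not fit).
 */
size_t spooky_insert(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t used = sim_put_entry(out, out_cap, 9999, "BOO!");
    size_t pos = 0;

    if (used == 0)
        return 0;
    while (pos + SIM_HDR_LEN <= in_len) {
        struct sim_dirent hdr;

        memcpy(&hdr, in + pos, SIM_HDR_LEN);
        if (hdr.d_reclen == 0 || pos + hdr.d_reclen > in_len)
            break;                           /* malformed input, stop */
        if (used + hdr.d_reclen > out_cap)
            break;                           /* no room for this record */
        memcpy(out + used, in + pos, hdr.d_reclen);
        used += hdr.d_reclen;
        pos += hdr.d_reclen;
    }
    assert(used <= out_cap);
    return used;
}
```

Note that this sketch simply drops any original entries that no longer fit after "BOO!" is added, as the answer describes; a more careful version would have to account for the entries the caller never sees.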