COMP 3000 Fall 2019
Assignment 4 solutions

1.  [1] Can you load a Linux kernel module more than once?  Explain
briefly.

    A: You can't load a module more than once, because modules are
       linked into the running kernel.  Loading one more than once is
       the same as linking the same library more than once into a program.
       Doing this means the same global symbols are potentially
       defined multiple times, leading to serious problems (such as,
       which one is the valid definition?).

2.  [1] For the modules we have created in class, does the code run
continuously, or does it run in response to certain events?  Explain,
and be specific.

    A: The modules we have created have functions that run when the
       module is loaded and unloaded and in response to specific
       events for which the initialization function has registered
       handlers.  For example, when registering a character device, we
       specify a struct with pointers to file operations.  When a
       process accesses the device file, these specified functions are
       called.  Similarly the rootkit module changes the system call
       table so its functions are called rather than the standard
       system call handlers.  Outside of these specific contexts, the
       module code doesn't run.
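
       The registration pattern can be sketched in userspace C.  This
       is only an illustration, not kernel code: struct demo_ops,
       demo_register() and demo_dispatch_read() are hypothetical
       stand-ins for struct file_operations, device registration, and
       the kernel dispatching a read on the device file.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct file_operations:
 * a table of handlers that is only consulted when an event occurs. */
struct demo_ops {
    long (*read)(char *buf, size_t len);
    long (*write)(const char *buf, size_t len);
};

/* Handler in the style of the ones module: fill the buffer with '1'. */
static long ones_read(char *buf, size_t len)
{
    memset(buf, '1', len);
    return (long)len;
}

/* All that "registration" leaves behind is a pointer to the table. */
static const struct demo_ops *registered;

static void demo_register(const struct demo_ops *ops)
{
    registered = ops;
}

/* Stand-in for the kernel handling a read on the device file.
 * The handler runs only here, in response to the event. */
static long demo_dispatch_read(char *buf, size_t len)
{
    if (registered == NULL || registered->read == NULL)
        return -1;              /* nothing registered: no module code runs */
    return registered->read(buf, len);
}

/* The table an init function would hand over at load time. */
static const struct demo_ops ones_ops = { .read = ones_read, .write = NULL };
```

       Calling demo_dispatch_read() before demo_register(&ones_ops)
       returns -1; afterwards it fills the buffer with '1' characters.
       Between dispatches, none of the handler code executes.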

3.  [1] When the kernel does a printk(), does it write directly to
/var/log/kern.log?  Explain.

    A: No.  printk() places the message in a kernel log buffer (and,
       depending on the log level, may also print it on a console).
       A logging daemon (such as klogd, syslogd, or systemd's journal)
       reads that buffer, and it is the daemon that writes
       /var/log/kern.log.

       We don't want the kernel writing directly to a file because
       different systems may want logs to go to different places.  So,
       a userspace process has to request the logs.  If nobody is
       requesting the logs we're probably just booting up, so they go
       to a console.

4.  [1] The makefile for a Linux kernel module is generally very
simple; however, building a module seems to be a bit complicated,
generating lots of files.  Where is the module build process getting
instructions for doing all this work?

    A: The module build process gets its instructions from the kernel
       source directory specified in the makefile (and passed to make
       with the -C option).  The kernel source contains many makefiles,
       generally one per directory; these are called recursively as
       needed.  By building the module as if it were in the kernel
       source directory we can use the kernel's full build
       infrastructure.

5.  [1] If you change line 22 in the ones module to define CLASS_NAME
to be "OSclass", will this change the observable behavior of the
module?  Explain briefly.

    A: It will change the directory created in /sys/class from being
       comp3000 to being OSclass.  This happens because /sys/class
       lists every class of device driver in the system (and does so
       by accessing kernel data structures directly).

6.  [1] When doing the work for a system call, how does the kernel
keep track of which process made that system call (so it does the work
on behalf of that process)?

    A: The kernel maintains a "current" pointer that points to the
       task_struct that represents the currently running process or
       thread.  So most of the kernel just has to use "current" in
       order to access the appropriate code and data.

7.  [2] How does the format of data returned by getdents(2) differ
from that returned by readdir(3) (at a high level)?  What is a key
motivation for this difference?

    A: getdents returns multiple directory entries packed into one
       buffer, while readdir returns one directory entry per call.
       The key reason for this difference is that we naturally get
       multiple directory entries at once, as a single block read can
       hold multiple entries, and it makes sense to send all of these
       to userspace at once as it reduces the number of system calls
       made.  The C library can then return each individual directory
       entry via readdir.
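
       A minimal userspace sketch of the idea follows.  The record
       layout (struct demo_hdr followed by the name) is a simplified,
       hypothetical stand-in for the real getdents record, and
       demo_pack() and demo_readdir() are illustrative names, not the
       C library's implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified record header (assumption: the real getdents record has
 * more fields).  The NUL-terminated name follows the header directly. */
struct demo_hdr {
    uint64_t d_ino;     /* inode number */
    uint16_t d_reclen;  /* total size of this record in bytes */
};

/* Append one variable-length record at buf + used.
 * Returns the new fill level, or `used` unchanged if it does not fit. */
static size_t demo_pack(char *buf, size_t cap, size_t used,
                        uint64_t ino, const char *name)
{
    struct demo_hdr hdr;
    size_t reclen = (sizeof(hdr) + strlen(name) + 1 + 7) & ~(size_t)7;

    if (reclen > cap - used)
        return used;
    memset(buf + used, 0, reclen);
    memset(&hdr, 0, sizeof(hdr));
    hdr.d_ino = ino;
    hdr.d_reclen = (uint16_t)reclen;
    memcpy(buf + used, &hdr, sizeof(hdr));
    memcpy(buf + used + sizeof(hdr), name, strlen(name) + 1);
    return used + reclen;
}

/* Cursor over one buffer that was filled by a single bulk call. */
struct demo_dir {
    const char *buf;
    size_t len;
    size_t pos;
};

/* readdir-style accessor: hand out one entry per call from the buffer,
 * advancing by d_reclen.  Returns the name, or NULL when exhausted. */
static const char *demo_readdir(struct demo_dir *dir, uint64_t *ino)
{
    struct demo_hdr hdr;
    const char *name;

    if (dir->pos + sizeof(hdr) > dir->len)
        return NULL;
    memcpy(&hdr, dir->buf + dir->pos, sizeof(hdr));
    name = dir->buf + dir->pos + sizeof(hdr);
    *ino = hdr.d_ino;
    dir->pos += hdr.d_reclen;
    return name;
}
```

       One bulk fill of the buffer serves many demo_readdir() calls,
       which is the saving in system calls that the answer describes.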

8.  [2] How can you change the magic_prefix for the rootkit without
changing the code of the module? How is this information passed to the
kernel at runtime?

    A: You can pass parameters to a module as arguments to insmod
       (name=value pairs after the module file name).  These are then
       passed to the kernel by the init_module or finit_module system
       calls (which take a character string of parameters as an
       argument).  A module can make special declarations which will
       be filled in by the supplied parameters (if any).
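
       In a module, such a declaration is typically made with the
       module_param() macro.  The userspace sketch below only
       illustrates how a name=value pair can be pulled out of the
       parameter string; demo_get_param() is a hypothetical helper,
       not the kernel's parser, and it ignores quoting.

```c
#include <stddef.h>
#include <string.h>

/* Look up "name=value" in a space-separated parameter string and copy
 * the value into out.  Returns 1 if found, 0 otherwise; on 0 the
 * caller's default in `out` is left untouched. */
static int demo_get_param(const char *args, const char *name,
                          char *out, size_t cap)
{
    size_t nlen = strlen(name);
    const char *p = args;

    while (*p != '\0') {
        size_t tok = strcspn(p, " ");   /* length of the current token */

        if (tok > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = tok - nlen - 1;

            if (vlen >= cap)
                return 0;               /* value too long for the buffer */
            memcpy(out, p + nlen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        p += tok;
        p += strspn(p, " ");            /* skip separators */
    }
    return 0;
}
```

       For example, with the string "debug=1 magic_prefix=hide_" a
       lookup of "magic_prefix" yields "hide_" (both values here are
       made up for illustration); if the name is absent, the default
       stays in place.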

9.  [2] When the kernel allocates memory for its own use, does it
refer to that memory using virtual or physical addresses?  How does
the remember module show this?

    A: The kernel refers to its own memory using virtual addresses.
       The remember module shows this as the pointer used to access
       the buffer of saved data is a virtual address (we get a
       physical address and turn it into the corresponding virtual
       address).

10. [2] What is a significant reason why the kernel uses functions
such as copy_to_user() when accessing process memory?  Why not just
access this memory directly?

    A: The kernel can't safely access a userspace pointer directly,
       as the kernel has its own address space.  If you don't check
       and translate between the process's address space and the
       kernel's address space, the process's pointer might refer to a
       part of kernel memory that a process shouldn't be able to
       access.  Thus on every such access the userspace pointer has to
       be checked and potentially translated (depending on the
       architecture).
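
       The checking half of this can be sketched in userspace C.
       Everything here is a simulation under stated assumptions:
       demo_mem is a toy "memory", USER_LIMIT is a made-up boundary
       between user and kernel addresses, and demo_access_ok() and
       demo_copy_to_user() only mimic the roles of the kernel's
       access_ok() and copy_to_user().

```c
#include <stddef.h>
#include <string.h>

/* Toy memory: offsets [0, USER_LIMIT) play the role of user addresses,
 * everything above plays the role of kernel-only memory. */
#define MEM_SIZE   64u
#define USER_LIMIT 32u

static unsigned char demo_mem[MEM_SIZE];

/* Range check: is [addr, addr + len) entirely inside the user range?
 * Written to avoid overflow in addr + len. */
static int demo_access_ok(size_t addr, size_t len)
{
    return addr <= USER_LIMIT && len <= USER_LIMIT - addr;
}

/* Like copy_to_user(), returns the number of bytes NOT copied,
 * so 0 means success. */
static size_t demo_copy_to_user(size_t uaddr, const void *src, size_t len)
{
    if (!demo_access_ok(uaddr, len))
        return len;             /* refuse: target is not user memory */
    memcpy(demo_mem + uaddr, src, len);
    return 0;
}
```

       A copy to offset 0 succeeds, while a copy to offset 40, or one
       that starts at 31 and runs past the limit, is refused and
       leaves the "kernel" part of the toy memory untouched.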

11. [2] Change the ones module so that it will allow writes, and the
first character of whatever is written will become the character that
is repeatedly output when reading (instead of '1').  What changes do
you need to make?

    A: You make the ones module like the remember module, except
       that you only have to store one character in a global
       variable rather than dynamically allocating a buffer.  In
       particular, you have to:

         * define a write function in the operations struct,
         * change the device file to have write permissions,
         * make the write function store the first character of the
           buffer passed to it,
         * and change the read function to use the stored character
           rather than '1'.
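
       The logic of the two handlers can be sketched in userspace C.
       This is an illustration only: demo_write() and demo_read() are
       hypothetical names, and in the real module the buffers are
       userspace pointers, so the byte would have to be fetched and
       stored with the kernel's user-access helpers rather than by
       direct access as done here.

```c
#include <stddef.h>
#include <string.h>

/* The single stored character; starts as '1' so that the device
 * behaves like the original ones module until something is written. */
static char fill_char = '1';

/* Write handler: remember the first byte, ignore the rest.
 * Returns len so the caller sees the whole write as consumed. */
static long demo_write(const char *buf, size_t len)
{
    if (len > 0)
        fill_char = buf[0];
    return (long)len;
}

/* Read handler: fill the caller's buffer with the stored character. */
static long demo_read(char *buf, size_t len)
{
    memset(buf, fill_char, len);
    return (long)len;
}
```

       Reading before any write yields '1' characters; after
       demo_write("xyz", 3), reads yield 'x'.  An empty write leaves
       the stored character unchanged.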

12. [4] How could you make a "spooky rootkit" (based on 3000rootkit)
that would randomly (with a .01 probability on each call to getdents)
insert a file "BOO!" with an inode of 9999 into the stream of returned
files?  Note that you can get random bytes using the
get_random_bytes() function in the kernel.

    A: A spooky rootkit would modify the getdents call so that it:

       * uses get_random_bytes() to produce an unsigned int, takes it
         mod 100, and checks whether the result is zero.  If it is
         zero, it does the rest:
       * makes an empty buffer using kmalloc,
       * makes a "BOO!" entry (with inode 9999) at the start using
         statically declared data,
       * copies in as many of the remaining entries as will fit,
       * and copies this modified data to the userspace buffer.
       * (If you want to be fancy, copy entries one at a time and
         insert "BOO!" at a random place.)
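
       The buffer manipulation in these steps can be sketched in
       userspace C.  This is not the rootkit's code: struct spooky_hdr
       is a simplified, hypothetical record layout, ordinary buffers
       stand in for kmalloc'd and userspace memory, and the random
       value is passed in by the caller instead of coming from
       get_random_bytes().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified record header (assumption); the name follows it. */
struct spooky_hdr {
    uint64_t d_ino;
    uint16_t d_reclen;  /* total record size in bytes */
};

/* Fire on roughly 1 call in 100, given an unsigned int of random bits. */
static int spooky_should_fire(unsigned int r)
{
    return r % 100 == 0;
}

/* Write one record at out; returns its length, or 0 if it does not fit. */
static size_t spooky_put(char *out, size_t cap, uint64_t ino,
                         const char *name)
{
    struct spooky_hdr hdr;
    size_t reclen = (sizeof(hdr) + strlen(name) + 1 + 7) & ~(size_t)7;

    if (reclen > cap)
        return 0;
    memset(out, 0, reclen);
    memset(&hdr, 0, sizeof(hdr));
    hdr.d_ino = ino;
    hdr.d_reclen = (uint16_t)reclen;
    memcpy(out, &hdr, sizeof(hdr));
    memcpy(out + sizeof(hdr), name, strlen(name) + 1);
    return reclen;
}

/* Build the modified stream: "BOO!" (inode 9999) first, then as many
 * whole original records as still fit.  Returns the bytes used in out. */
static size_t spooky_insert(const char *in, size_t in_len,
                            char *out, size_t out_cap)
{
    size_t used = spooky_put(out, out_cap, 9999, "BOO!");
    size_t pos = 0;

    if (used == 0)
        return 0;                       /* no room even for BOO! */
    while (pos + sizeof(struct spooky_hdr) <= in_len) {
        struct spooky_hdr hdr;

        memcpy(&hdr, in + pos, sizeof(hdr));
        if (hdr.d_reclen < sizeof(hdr) || hdr.d_reclen > in_len - pos)
            break;                      /* malformed record: stop */
        if (hdr.d_reclen > out_cap - used)
            break;                      /* next record would not fit */
        memcpy(out + used, in + pos, hdr.d_reclen);
        used += hdr.d_reclen;
        pos += hdr.d_reclen;
    }
    return used;
}
```

       Entries that no longer fit are simply dropped from this batch
       in the sketch.  Taking the value mod 100 is very slightly
       biased, which is harmless for this purpose.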