Operating Systems 2019W: Assignment 4

From Soma-notes

For this assignment please work on the questions below. While you will be able to submit your answers, by default they will not be graded. Instead, an approximately 20 question multiple choice quiz (based on the questions below) will be made available by April 8 and will be due by 4 PM on April 10th. The grade for this quiz will substitute for Assignment 4.

In class on April 10th solutions for the assignment questions below and the Assignment 4 Quiz will be given and discussed.

If you feel your answers to the quiz do not demonstrate what you actually learned, you can make an appointment with Prof. Somayaji to grade your submitted Assignment 4 answers (assuming you submitted them by April 10th) before April 19th.

Questions

  1. [2] When you hit Ctrl-C while running the bc command, does the Ctrl-C generate a signal? How do you know?
  2. [2] Does bashreadline report commands given to bash via shell scripts? How do you know?
  3. [2] How can you modify 3000pc so the producer stops on the third time it fills its queue? Give your code as a diff -c versus the original 3000pc.
  4. [2] Under what conditions would 3000pc never send a signal (from the producer to the consumer or vice versa)? How could you modify 3000pc so this occurance was more likely?
  5. [2] Implement and critique your own version of sem_wait() and sem_post() in pure C with no system calls. Do you think your version is free of race conditions? Explain.
  6. [2] Change 3000pc so that uses a shared memory mapped file rather than simply shared memory to communicate between the producer and consumer. How big is the file? Does its size change?
  7. [2] How difficult is it for a userspace process to generate truly random numbers on its own, without the assistance of the kernel? Explain.
  8. [2] Does the ssh command use random numbers? How can you demonstrate this?
  9. [2] Can a setuid root process give up its privileges and run as a regular user? Is this change necessarily permanant?
  10. [2] What are the main factors that influence whether the attacker wins the race with 3000log-write? Explain what it means for the attacker to win.

Multiple choice questions

  1. What Linux distribution does Anil use?
    1. CentOS
    2. Arch
    3. Ubuntu
    4. Debian
    5. Gentoo
  2. Which of the following operations are permitted for an unprivileged user?
    1. Making a hard link to a file owned by root
    2. Making a symbolic link to a file owned by root
    3. Making a character device file (with mknod)
    4. Mounting a USB stick on /mnt
    5. None of the above
  3. Suppose you are running bc in a Linux terminal and press CTRL-C during its execution. Which of the following is TRUE?
    1. The kernel sends SIGSTOP to the terminal and causes bc to pause.
    2. Highlighted text gets copied to the clipboard on Linux.
    3. bc sends SIGTERM to itself and exits.
    4. bc sends SIGINT to itself and exits.
    5. None of the above
  4. Bashreadline can be used to observe
    1. commands given to bash when entered in a terminal
    2. commands given to bash from sourced scripts
    3. commands given to any program on the system via a terminal
    4. A and B
    5. A, B, and C
  5. The call b.attach_uretprobe() in bashreadline needs root access in order to succeed. Why does it need such privileges?
    1. It is directly modifying files owned by root.
    2. It is binding a privileged UNIX domain socket.
    3. It is accessing memory in an unsafe manner.
    4. It is loading code into the kernel.
    5. All of the above
  6. When the producer receives a SIGUSR1 signal in 3000pc, it immediately responds by
    1. printing a message to standard out
    2. changes the prod_waiting flag
    3. wakes up the producer
    4. All of the above
    5. None of the above
  7. Increasing the queue size in 3000pc will
    1. make the consumer send more signals
    2. make the producer sleep less frequently
    3. make the producer produce faster
    4. make the consumer consume faster
    5. All of the above
  8. An implementation of a semaphore tests the value of the semaphore and changes it if the semaphore is available. What makes it hard to write a good semaphore implementation?
    1. semaphore operations should be fast
    2. semaphores work on shared data
    3. testing and setting a variable are normally separate instructions
    4. CPU cores have separate caches
    5. All of the above
  9. Which of the following is true about "shared" in 3000pc?
    1. It is fixed sized
    2. Its size varies depending upon how many words are stored in it
    3. It is allocated in the standard process heap
    4. It is never cached
    5. All of the above
  10. Which of the following could be used to generate truly random numbers?
    1. Seeding rand() in C with the current time
    2. Gathering input from a webcam
    3. Using an algorithm that generates numbers using a cryptographic hash
    4. Calculating digits of a transcendental number such as pi
    5. None of the above
  11. The ssh command on Linux opens all of these files EXCEPT
    1. /etc/ld.so.cache
    2. /etc/services
    3. /dev/random
    4. /dev/urandom
    5. /etc/passwd
  12. With what uid and euid values is it possible for a process to (successfully) call seteuid? Note that seteuid is a privileged system call that regular user processes cannot invoke.
    1. A process is running with uid=0 and euid=0
    2. A process is running with uid=1000 and euid=0
    3. A process is running with uid=1000 and euid=1000
    4. A process is running with uid=0 and euid=1000
    5. All of the above
  13. A process's execution can be paused, allowing another process to run on the CPU, when it
    1. makes a system call
    2. copies a string
    3. reads a file
    4. All of the above
    5. None of the above
  14. Increasing a process's priority (lowering its niceness value)
    1. increases the likelihood of it running when it needs CPU time
    2. gives it a guaranteed share of CPU time
    3. makes it send signal faster
    4. All of the above
    5. None of the above
  15. Assuming that the attacker has already set up the TOCTTOU attack in advance, when must the attacker's code run relative to the code in safe_write() in 3000log-write?
    1. Before the call to access()
    2. After the call to access(), before the open()
    3. After the call to open(), before the write()
    4. After the call to write()
    5. None of the above

Solutions

COMP 3000 Winter 2019
Assignment 4 Solutions

1. [2] When you hit Ctrl-C while running the bc command, does the
   Ctrl-C generate a signal?  How do you know?

   A: It does generate a signal that bc sends to itself.  You can tell
   this by running killsnoop (or potentially strace) and seeing the
   source and destination of the signal 2 (SIGINT).


2. [2] Does bashreadline report commands given to bash via shell
   scripts?  How do you know?

   A: bashreadline does not report commands given to bash via shell
   scripts.  You can confirm this by creating a small shell script
   with a few commands in it (such as whoami and id).  With
   bashreadline it will show you running the script but it won't show
   the commands in the script.  This behavior makes sense as the
   readline function is only used when bash reads from a terminal, not
   when it reads from a file.
 

3. [2] How can you modify 3000pc so the producer stops on the third
   time it fills its queue?  Give your code as a diff -c versus the
   original 3000pc.

   A: Your code should maintain a counter in the producer that is
   incremented every time it sleeps in wait_for_consumer().  When this
   counter reaches 3 then the producer could go into an infinite sleep
   (i.e., change the while loop with the call to nanosleep to never
   exit).

   [FIX: GIVE IMPLEMENTATION, but this is pretty straightforward]


4. [2] Under what conditions would 3000pc never send a signal (from
   the producer to the consumer or vice versa)?  How could you modify
   3000pc so this occurance was more likely?

   A: The producer only sends a signal when the consumer is sleeping,
   and the consumer only sends a signal when the producer is sleeping.
   They go to sleep when the queue is empty or full, respectively.
   Thus to prevent this from happening you want to make it so the
   queue is never full or empty.  While increasing the size of the
   queue can make it less likely for the queue to get full, if the
   producer produces faster than the consumer consumes the producer
   will eventually fill it.  Similarly, if the consumer consumes
   faster than the producer produces, the queue will eventually be
   emptied by the consumer.  To reduce the probability of either of
   these events, the producer and consumer must run at the same rate.
   To determine this you could benchmark each.  Then you could add
   sleeps or busy loops to the faster until they run at the same rate.


5. [2] Implement and critique your own version of sem_wait() and
   sem_post() in pure C with no system calls.  Do you think your
   version is free of race conditions?  Explain.

   A: You can implement sem_wait and sem_post as simple functions that
   test a condition variable and change its value, with sem_wait()
   doing a loop (busy waiting) until the semaphore was available.  No
   matter what you do, however, these versions will have race
   conditions because it is not possible to both test a value and
   change it atomically.  To combine them in an atomic operation we
   need special CPU instructions for which we'd need inline assembly
   or other non-standard mechanism.

   [FIX: ADD SAMPLE IMPLEMENTATIONS]


6. [2] Change 3000pc so that uses a shared memory mapped file rather
   than simply shared memory to communicate between the producer and
   consumer.  How big is the file?  Does its size change?

   A: You simply have to change the mmap to drop the MAP_ANONYMOUS
   flag and change the -1 to a valid file descriptor of a file you've
   previously opened for reading and writing.  The size of the file
   should be the size of the shared struct (or larger), rounded up to
   the next page size (so divisible by 4096).  The file size won't
   change because the size of shared is fixed.

   [FIX: IMPLEMENT TO CHECK]


7. [2] How difficult is it for a userspace process to generate truly
   random numbers on its own, without the assistance of the kernel?
   Explain.

   A: It is nearly impossible for a process to generate random numbers
   on its own because randomness cannot be produced algorithmically,
   it must come from a physical process.  If a CPU has special
   (non-privileged) instructions for generating randomness or if a
   program can observe CPU registers (such as timing registers) that
   may have a stochastic component, then it would be possible.
   Otherwise programs must interact with input devices (keyboard,
   network, spinning hard drives, webcams) that do not behave
   completely deterministically, and to interact with those devices a
   process must interact with the kernel through system calls.


8. [2] Does the ssh command use random numbers?  How can you
   demonstrate this?

   A: The ssh command does use random numbers.  You can tell because
   it opens and reads from /dev/urandom.  You can demonstate this by
   looking through the output of strace.


9. [2] Can a setuid root process give up its privileges and run as a
   regular user?  Is this change necessarily permanant?

   A: A setuid root process can change its effective user ID (euid) to
   another user ID using seteuid, limiting its privileges to that of
   the new euid.  However, that process can later call seteuid again
   and change back to root (uid=0), as the old euid is saved.  To make
   the change permanent the process should do an execve while
   privileges are dropped.


10. [2] What are the main factors that influence whether the attacker
   wins the race with 3000run-write?  Explain what it means for the
   attacker to win.

   A: For the attacker to win the race, they must be able to switch
   between the safe file and the victim file after the call to access
   but before the call to open.  When this happens depends completely
   on when the kernel decides to schedule the fast-swap process versus
   the log-write process.  Scheduling decisions depend in general on
   the relative priority of the processes, how much CPU and I/O they
   have done, what, and whether they are waiting for an event (such a
   data to be read from disk).  You can thus influnence the likelihood
   of the attacker winning by changing the resources and behavior of
   the fast-swap process and other processes on the system.

   (In developing the assignment I tried creating a script that would
   generate a lot of processes that would then do dd or find
   operations.  They managed to generate a moderate amount of load;
   however, I managed to get the race to work without any additional
   resource consumption.)