Operating Systems 2019W: Assignment 4: Difference between revisions
No edit summary |
No edit summary |
||
(9 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
For this assignment please work on the questions below. While you will be able to submit your answers, by | For this assignment please work on the questions below. While you will be able to submit your answers, by default '''they will not be graded.''' Instead, an approximately 20 question multiple choice quiz (based on the questions below) will be made available by April 8 and will be due by 4 PM on April 10th. The grade for this quiz will substitute for Assignment 4. | ||
In class on April 10th solutions for the assignment questions below and the Assignment 4 Quiz will be given and discussed. | In class on April 10th solutions for the assignment questions below and the Assignment 4 Quiz will be given and discussed. | ||
If you feel your answers to the quiz do not demonstrate what you actually learned, you can make an appointment with Prof. Somayaji to grade your submitted Assignment 4 answers (assuming you submitted them by April 10th). | If you feel your answers to the quiz do not demonstrate what you actually learned, you can make an appointment with Prof. Somayaji to grade your submitted Assignment 4 answers (assuming you submitted them by April 10th) before April 19th. | ||
==Questions== | ==Questions== | ||
Line 9: | Line 9: | ||
# [2] When you hit Ctrl-C while running the bc command, does the Ctrl-C generate a signal? How do you know? | # [2] When you hit Ctrl-C while running the bc command, does the Ctrl-C generate a signal? How do you know? | ||
# [2] Does bashreadline report commands given to bash via shell scripts? How do you know? | # [2] Does bashreadline report commands given to bash via shell scripts? How do you know? | ||
# [2] How can you modify 3000pc so the producer stops on the third time it fills its queue? Give your code as a diff -c versus the original 3000pc | # [2] How can you modify 3000pc so the producer stops on the third time it fills its queue? Give your code as a diff -c versus the original 3000pc. | ||
# [2] Under what conditions would 3000pc never send a signal (from the producer to the consumer or vice versa)? How could you modify 3000pc so this occurance was more likely? | # [2] Under what conditions would 3000pc never send a signal (from the producer to the consumer or vice versa)? How could you modify 3000pc so this occurance was more likely? | ||
# [2] Implement and critique your own version of sem_wait() and sem_post() in pure C with no system calls. Do you think your version is free of race conditions? Explain. | # [2] Implement and critique your own version of sem_wait() and sem_post() in pure C with no system calls. Do you think your version is free of race conditions? Explain. | ||
# [2] Change | # [2] Change 3000pc so that uses a shared memory mapped file rather than simply shared memory to communicate between the producer and consumer. How big is the file? Does its size change? | ||
# [2] How difficult is it for a userspace process to generate truly random numbers on its own, without the assistance of the kernel? Explain. | # [2] How difficult is it for a userspace process to generate truly random numbers on its own, without the assistance of the kernel? Explain. | ||
# [2] Does the ssh command use random numbers? How can you demonstrate this? | # [2] Does the ssh command use random numbers? How can you demonstrate this? | ||
# [2] Can a setuid root process give up its privileges and run as a regular user? Is this change necessarily permanant? | # [2] Can a setuid root process give up its privileges and run as a regular user? Is this change necessarily permanant? | ||
# [2] What are the main factors that influence whether the attacker wins the race with 3000run-write? Explain what it means for the attacker to win. | # [2] What are the main factors that influence whether the attacker wins the race with 3000log-write? Explain what it means for the attacker to win. | ||
==Multiple choice questions== | |||
<ol> | |||
<li>What Linux distribution does Anil use? | |||
<ol type="A"> | |||
<li>CentOS</li> | |||
<li>Arch</li> | |||
<li> Ubuntu</li> | |||
<li> Debian</li> | |||
<li> Gentoo</li> | |||
</ol> | |||
</li> | |||
<li>Which of the following operations are permitted for an unprivileged user? | |||
<ol type="A"> | |||
<li> Making a hard link to a file owned by root</li> | |||
<li> Making a symbolic link to a file owned by root</li> | |||
<li> Making a character device file (with mknod)</li> | |||
<li> Mounting a USB stick on /mnt</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
<li> Suppose you are running bc in a Linux terminal and press CTRL-C during its execution. Which of the following is TRUE? | |||
<ol type="A"> | |||
<li> The kernel sends SIGSTOP to the terminal and causes bc to pause.</li> | |||
<li> Highlighted text gets copied to the clipboard on Linux.</li> | |||
<li> bc sends SIGTERM to itself and exits.</li> | |||
<li> bc sends SIGINT to itself and exits.</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
<li> Bashreadline can be used to observe | |||
<ol type="A"> | |||
<li> commands given to bash when entered in a terminal</li> | |||
<li> commands given to bash from sourced scripts </li> | |||
<li> commands given to any program on the system via a terminal</li> | |||
<li> A and B</li> | |||
<li> A, B, and C</li> | |||
</ol> | |||
</li> | |||
<li> The call b.attach_uretprobe() in bashreadline needs root access in order to succeed. Why does it need such privileges? | |||
<ol type="A"> | |||
<li> It is directly modifying files owned by root.</li> | |||
<li> It is binding a privileged UNIX domain socket.</li> | |||
<li> It is accessing memory in an unsafe manner.</li> | |||
<li> It is loading code into the kernel.</li> | |||
<li> All of the above</li> | |||
</ol> | |||
</li> | |||
<li> When the producer receives a SIGUSR1 signal in 3000pc, it immediately responds by | |||
<ol type="A"> | |||
<li> printing a message to standard out</li> | |||
<li> changes the prod_waiting flag</li> | |||
<li> wakes up the producer </li> | |||
<li> All of the above</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
<li> Increasing the queue size in 3000pc will | |||
<ol type="A"> | |||
<li> make the consumer send more signals</li> | |||
<li> make the producer sleep less frequently</li> | |||
<li> make the producer produce faster</li> | |||
<li> make the consumer consume faster</li> | |||
<li> All of the above</li> | |||
</ol> | |||
</li> | |||
<li> An implementation of a semaphore tests the value of the semaphore and changes it if the semaphore is available. What makes it hard to write a good semaphore implementation? | |||
<ol type="A"> | |||
<li> semaphore operations should be fast</li> | |||
<li> semaphores work on shared data</li> | |||
<li> testing and setting a variable are normally separate instructions</li> | |||
<li> CPU cores have separate caches</li> | |||
<li> All of the above</li> | |||
</ol> | |||
</li> | |||
<li> Which of the following is true about "shared" in 3000pc? | |||
<ol type="A"> | |||
<li> It is fixed sized</li> | |||
<li> Its size varies depending upon how many words are stored in it</li> | |||
<li> It is allocated in the standard process heap</li> | |||
<li> It is never cached</li> | |||
<li> All of the above</li> | |||
</ol> | |||
</li> | |||
<li> Which of the following could be used to generate truly random numbers? | |||
<ol type="A"> | |||
<li> Seeding rand() in C with the current time</li> | |||
<li> Gathering input from a webcam</li> | |||
<li> Using an algorithm that generates numbers using a cryptographic hash</li> | |||
<li> Calculating digits of a transcendental number such as pi</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
<li> The ssh command on Linux opens all of these files EXCEPT | |||
<ol type="A"> | |||
<li> /etc/ld.so.cache</li> | |||
<li> /etc/services</li> | |||
<li> /dev/random</li> | |||
<li> /dev/urandom</li> | |||
<li> /etc/passwd</li> | |||
</ol> | |||
</li> | |||
<li> With what uid and euid values is it '''possible''' for a process to (successfully) call seteuid? Note that seteuid is a privileged system call that regular user processes cannot invoke. | |||
<ol type="A"> | |||
<li> A process is running with uid=0 and euid=0</li> | |||
<li> A process is running with uid=1000 and euid=0</li> | |||
<li> A process is running with uid=1000 and euid=1000</li> | |||
<li> A process is running with uid=0 and euid=1000</li> | |||
<li> All of the above</li> | |||
</ol> | |||
</li> | |||
<li> A process's execution can be paused, allowing another process to run on the CPU, when it | |||
<ol type="A"> | |||
<li> makes a system call</li> | |||
<li> copies a string</li> | |||
<li> reads a file</li> | |||
<li> All of the above</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
<li> Increasing a process's priority (lowering its niceness value) | |||
<ol type="A"> | |||
<li> increases the likelihood of it running when it needs CPU time</li> | |||
<li> gives it a guaranteed share of CPU time</li> | |||
<li> makes it send signal faster</li> | |||
<li> All of the above</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
<li> Assuming that the attacker has already set up the TOCTTOU attack in advance, when must the attacker's code run relative to the code in safe_write() in 3000log-write? | |||
<ol type="A"> | |||
<li> Before the call to access()</li> | |||
<li> After the call to access(), before the open()</li> | |||
<li> After the call to open(), before the write()</li> | |||
<li> After the call to write()</li> | |||
<li> None of the above</li> | |||
</ol> | |||
</li> | |||
</ol> | |||
==Solutions== | |||
<pre> | |||
COMP 3000 Winter 2019 | |||
Assignment 4 Solutions | |||
1. [2] When you hit Ctrl-C while running the bc command, does the | |||
Ctrl-C generate a signal? How do you know? | |||
A: It does generate a signal that bc sends to itself. You can tell | |||
this by running killsnoop (or potentially strace) and seeing the | |||
source and destination of the signal 2 (SIGINT). | |||
2. [2] Does bashreadline report commands given to bash via shell | |||
scripts? How do you know? | |||
A: bashreadline does not report commands given to bash via shell | |||
scripts. You can confirm this by creating a small shell script | |||
with a few commands in it (such as whoami and id). With | |||
bashreadline it will show you running the script but it won't show | |||
the commands in the script. This behavior makes sense as the | |||
readline function is only used when bash reads from a terminal, not | |||
when it reads from a file. | |||
3. [2] How can you modify 3000pc so the producer stops on the third | |||
time it fills its queue? Give your code as a diff -c versus the | |||
original 3000pc. | |||
A: Your code should maintain a counter in the producer that is | |||
incremented every time it sleeps in wait_for_consumer(). When this | |||
counter reaches 3 then the producer could go into an infinite sleep | |||
(i.e., change the while loop with the call to nanosleep to never | |||
exit). | |||
[FIX: GIVE IMPLEMENTATION, but this is pretty straightforward] | |||
4. [2] Under what conditions would 3000pc never send a signal (from | |||
the producer to the consumer or vice versa)? How could you modify | |||
3000pc so this occurance was more likely? | |||
A: The producer only sends a signal when the consumer is sleeping, | |||
and the consumer only sends a signal when the producer is sleeping. | |||
They go to sleep when the queue is empty or full, respectively. | |||
Thus to prevent this from happening you want to make it so the | |||
queue is never full or empty. While increasing the size of the | |||
queue can make it less likely for the queue to get full, if the | |||
producer produces faster than the consumer consumes the producer | |||
will eventually fill it. Similarly, if the consumer consumes | |||
faster than the producer produces, the queue will eventually be | |||
emptied by the consumer. To reduce the probability of either of | |||
these events, the producer and consumer must run at the same rate. | |||
To determine this you could benchmark each. Then you could add | |||
sleeps or busy loops to the faster until they run at the same rate. | |||
5. [2] Implement and critique your own version of sem_wait() and | |||
sem_post() in pure C with no system calls. Do you think your | |||
version is free of race conditions? Explain. | |||
A: You can implement sem_wait and sem_post as simple functions that | |||
test a condition variable and change its value, with sem_wait() | |||
doing a loop (busy waiting) until the semaphore was available. No | |||
matter what you do, however, these versions will have race | |||
conditions because it is not possible to both test a value and | |||
change it atomically. To combine them in an atomic operation we | |||
need special CPU instructions for which we'd need inline assembly | |||
or other non-standard mechanism. | |||
[FIX: ADD SAMPLE IMPLEMENTATIONS] | |||
6. [2] Change 3000pc so that uses a shared memory mapped file rather | |||
than simply shared memory to communicate between the producer and | |||
consumer. How big is the file? Does its size change? | |||
A: You simply have to change the mmap to drop the MAP_ANONYMOUS | |||
flag and change the -1 to a valid file descriptor of a file you've | |||
previously opened for reading and writing. The size of the file | |||
should be the size of the shared struct (or larger), rounded up to | |||
the next page size (so divisible by 4096). The file size won't | |||
change because the size of shared is fixed. | |||
[FIX: IMPLEMENT TO CHECK] | |||
7. [2] How difficult is it for a userspace process to generate truly | |||
random numbers on its own, without the assistance of the kernel? | |||
Explain. | |||
A: It is nearly impossible for a process to generate random numbers | |||
on its own because randomness cannot be produced algorithmically, | |||
it must come from a physical process. If a CPU has special | |||
(non-privileged) instructions for generating randomness or if a | |||
program can observe CPU registers (such as timing registers) that | |||
may have a stochastic component, then it would be possible. | |||
Otherwise programs must interact with input devices (keyboard, | |||
network, spinning hard drives, webcams) that do not behave | |||
completely deterministically, and to interact with those devices a | |||
process must interact with the kernel through system calls. | |||
8. [2] Does the ssh command use random numbers? How can you | |||
demonstrate this? | |||
A: The ssh command does use random numbers. You can tell because | |||
it opens and reads from /dev/urandom. You can demonstate this by | |||
looking through the output of strace. | |||
9. [2] Can a setuid root process give up its privileges and run as a | |||
regular user? Is this change necessarily permanant? | |||
A: A setuid root process can change its effective user ID (euid) to | |||
another user ID using seteuid, limiting its privileges to that of | |||
the new euid. However, that process can later call seteuid again | |||
and change back to root (uid=0), as the old euid is saved. To make | |||
the change permanent the process should do an execve while | |||
privileges are dropped. | |||
10. [2] What are the main factors that influence whether the attacker | |||
wins the race with 3000run-write? Explain what it means for the | |||
attacker to win. | |||
A: For the attacker to win the race, they must be able to switch | |||
between the safe file and the victim file after the call to access | |||
but before the call to open. When this happens depends completely | |||
on when the kernel decides to schedule the fast-swap process versus | |||
the log-write process. Scheduling decisions depend in general on | |||
the relative priority of the processes, how much CPU and I/O they | |||
have done, what, and whether they are waiting for an event (such a | |||
data to be read from disk). You can thus influnence the likelihood | |||
of the attacker winning by changing the resources and behavior of | |||
the fast-swap process and other processes on the system. | |||
(In developing the assignment I tried creating a script that would | |||
generate a lot of processes that would then do dd or find | |||
operations. They managed to generate a moderate amount of load; | |||
however, I managed to get the race to work without any additional | |||
resource consumption.) | |||
</pre> |
Latest revision as of 20:34, 10 April 2019
For this assignment please work on the questions below. While you will be able to submit your answers, by default they will not be graded. Instead, an approximately 20 question multiple choice quiz (based on the questions below) will be made available by April 8 and will be due by 4 PM on April 10th. The grade for this quiz will substitute for Assignment 4.
In class on April 10th solutions for the assignment questions below and the Assignment 4 Quiz will be given and discussed.
If you feel your answers to the quiz do not demonstrate what you actually learned, you can make an appointment with Prof. Somayaji to grade your submitted Assignment 4 answers (assuming you submitted them by April 10th) before April 19th.
Questions
- [2] When you hit Ctrl-C while running the bc command, does the Ctrl-C generate a signal? How do you know?
- [2] Does bashreadline report commands given to bash via shell scripts? How do you know?
- [2] How can you modify 3000pc so the producer stops on the third time it fills its queue? Give your code as a diff -c versus the original 3000pc.
- [2] Under what conditions would 3000pc never send a signal (from the producer to the consumer or vice versa)? How could you modify 3000pc so this occurance was more likely?
- [2] Implement and critique your own version of sem_wait() and sem_post() in pure C with no system calls. Do you think your version is free of race conditions? Explain.
- [2] Change 3000pc so that uses a shared memory mapped file rather than simply shared memory to communicate between the producer and consumer. How big is the file? Does its size change?
- [2] How difficult is it for a userspace process to generate truly random numbers on its own, without the assistance of the kernel? Explain.
- [2] Does the ssh command use random numbers? How can you demonstrate this?
- [2] Can a setuid root process give up its privileges and run as a regular user? Is this change necessarily permanant?
- [2] What are the main factors that influence whether the attacker wins the race with 3000log-write? Explain what it means for the attacker to win.
Multiple choice questions
- What Linux distribution does Anil use?
- CentOS
- Arch
- Ubuntu
- Debian
- Gentoo
- Which of the following operations are permitted for an unprivileged user?
- Making a hard link to a file owned by root
- Making a symbolic link to a file owned by root
- Making a character device file (with mknod)
- Mounting a USB stick on /mnt
- None of the above
- Suppose you are running bc in a Linux terminal and press CTRL-C during its execution. Which of the following is TRUE?
- The kernel sends SIGSTOP to the terminal and causes bc to pause.
- Highlighted text gets copied to the clipboard on Linux.
- bc sends SIGTERM to itself and exits.
- bc sends SIGINT to itself and exits.
- None of the above
- Bashreadline can be used to observe
- commands given to bash when entered in a terminal
- commands given to bash from sourced scripts
- commands given to any program on the system via a terminal
- A and B
- A, B, and C
- The call b.attach_uretprobe() in bashreadline needs root access in order to succeed. Why does it need such privileges?
- It is directly modifying files owned by root.
- It is binding a privileged UNIX domain socket.
- It is accessing memory in an unsafe manner.
- It is loading code into the kernel.
- All of the above
- When the producer receives a SIGUSR1 signal in 3000pc, it immediately responds by
- printing a message to standard out
- changes the prod_waiting flag
- wakes up the producer
- All of the above
- None of the above
- Increasing the queue size in 3000pc will
- make the consumer send more signals
- make the producer sleep less frequently
- make the producer produce faster
- make the consumer consume faster
- All of the above
- An implementation of a semaphore tests the value of the semaphore and changes it if the semaphore is available. What makes it hard to write a good semaphore implementation?
- semaphore operations should be fast
- semaphores work on shared data
- testing and setting a variable are normally separate instructions
- CPU cores have separate caches
- All of the above
- Which of the following is true about "shared" in 3000pc?
- It is fixed sized
- Its size varies depending upon how many words are stored in it
- It is allocated in the standard process heap
- It is never cached
- All of the above
- Which of the following could be used to generate truly random numbers?
- Seeding rand() in C with the current time
- Gathering input from a webcam
- Using an algorithm that generates numbers using a cryptographic hash
- Calculating digits of a transcendental number such as pi
- None of the above
- The ssh command on Linux opens all of these files EXCEPT
- /etc/ld.so.cache
- /etc/services
- /dev/random
- /dev/urandom
- /etc/passwd
- With what uid and euid values is it possible for a process to (successfully) call seteuid? Note that seteuid is a privileged system call that regular user processes cannot invoke.
- A process is running with uid=0 and euid=0
- A process is running with uid=1000 and euid=0
- A process is running with uid=1000 and euid=1000
- A process is running with uid=0 and euid=1000
- All of the above
- A process's execution can be paused, allowing another process to run on the CPU, when it
- makes a system call
- copies a string
- reads a file
- All of the above
- None of the above
- Increasing a process's priority (lowering its niceness value)
- increases the likelihood of it running when it needs CPU time
- gives it a guaranteed share of CPU time
- makes it send signal faster
- All of the above
- None of the above
- Assuming that the attacker has already set up the TOCTTOU attack in advance, when must the attacker's code run relative to the code in safe_write() in 3000log-write?
- Before the call to access()
- After the call to access(), before the open()
- After the call to open(), before the write()
- After the call to write()
- None of the above
Solutions
COMP 3000 Winter 2019 Assignment 4 Solutions 1. [2] When you hit Ctrl-C while running the bc command, does the Ctrl-C generate a signal? How do you know? A: It does generate a signal that bc sends to itself. You can tell this by running killsnoop (or potentially strace) and seeing the source and destination of the signal 2 (SIGINT). 2. [2] Does bashreadline report commands given to bash via shell scripts? How do you know? A: bashreadline does not report commands given to bash via shell scripts. You can confirm this by creating a small shell script with a few commands in it (such as whoami and id). With bashreadline it will show you running the script but it won't show the commands in the script. This behavior makes sense as the readline function is only used when bash reads from a terminal, not when it reads from a file. 3. [2] How can you modify 3000pc so the producer stops on the third time it fills its queue? Give your code as a diff -c versus the original 3000pc. A: Your code should maintain a counter in the producer that is incremented every time it sleeps in wait_for_consumer(). When this counter reaches 3 then the producer could go into an infinite sleep (i.e., change the while loop with the call to nanosleep to never exit). [FIX: GIVE IMPLEMENTATION, but this is pretty straightforward] 4. [2] Under what conditions would 3000pc never send a signal (from the producer to the consumer or vice versa)? How could you modify 3000pc so this occurance was more likely? A: The producer only sends a signal when the consumer is sleeping, and the consumer only sends a signal when the producer is sleeping. They go to sleep when the queue is empty or full, respectively. Thus to prevent this from happening you want to make it so the queue is never full or empty. While increasing the size of the queue can make it less likely for the queue to get full, if the producer produces faster than the consumer consumes the producer will eventually fill it. Similarly, if the consumer consumes faster than the producer produces, the queue will eventually be emptied by the consumer. To reduce the probability of either of these events, the producer and consumer must run at the same rate. To determine this you could benchmark each. Then you could add sleeps or busy loops to the faster until they run at the same rate. 5. [2] Implement and critique your own version of sem_wait() and sem_post() in pure C with no system calls. Do you think your version is free of race conditions? Explain. A: You can implement sem_wait and sem_post as simple functions that test a condition variable and change its value, with sem_wait() doing a loop (busy waiting) until the semaphore was available. No matter what you do, however, these versions will have race conditions because it is not possible to both test a value and change it atomically. To combine them in an atomic operation we need special CPU instructions for which we'd need inline assembly or other non-standard mechanism. [FIX: ADD SAMPLE IMPLEMENTATIONS] 6. [2] Change 3000pc so that uses a shared memory mapped file rather than simply shared memory to communicate between the producer and consumer. How big is the file? Does its size change? A: You simply have to change the mmap to drop the MAP_ANONYMOUS flag and change the -1 to a valid file descriptor of a file you've previously opened for reading and writing. The size of the file should be the size of the shared struct (or larger), rounded up to the next page size (so divisible by 4096). The file size won't change because the size of shared is fixed. [FIX: IMPLEMENT TO CHECK] 7. [2] How difficult is it for a userspace process to generate truly random numbers on its own, without the assistance of the kernel? Explain. A: It is nearly impossible for a process to generate random numbers on its own because randomness cannot be produced algorithmically, it must come from a physical process. If a CPU has special (non-privileged) instructions for generating randomness or if a program can observe CPU registers (such as timing registers) that may have a stochastic component, then it would be possible. Otherwise programs must interact with input devices (keyboard, network, spinning hard drives, webcams) that do not behave completely deterministically, and to interact with those devices a process must interact with the kernel through system calls. 8. [2] Does the ssh command use random numbers? How can you demonstrate this? A: The ssh command does use random numbers. You can tell because it opens and reads from /dev/urandom. You can demonstate this by looking through the output of strace. 9. [2] Can a setuid root process give up its privileges and run as a regular user? Is this change necessarily permanant? A: A setuid root process can change its effective user ID (euid) to another user ID using seteuid, limiting its privileges to that of the new euid. However, that process can later call seteuid again and change back to root (uid=0), as the old euid is saved. To make the change permanent the process should do an execve while privileges are dropped. 10. [2] What are the main factors that influence whether the attacker wins the race with 3000run-write? Explain what it means for the attacker to win. A: For the attacker to win the race, they must be able to switch between the safe file and the victim file after the call to access but before the call to open. When this happens depends completely on when the kernel decides to schedule the fast-swap process versus the log-write process. Scheduling decisions depend in general on the relative priority of the processes, how much CPU and I/O they have done, what, and whether they are waiting for an event (such a data to be read from disk). You can thus influnence the likelihood of the attacker winning by changing the resources and behavior of the fast-swap process and other processes on the system. (In developing the assignment I tried creating a script that would generate a lot of processes that would then do dd or find operations. They managed to generate a moderate amount of load; however, I managed to get the race to work without any additional resource consumption.)