Operating Systems 2018F: Assignment 1

Please submit the answers to the following questions via CULearn by 2:30 PM on September 26, 2018. There are 20 points in 12 questions.

Submit your answers as a single text file named "<username>-comp3000-assign1.txt" (where username is your MyCarletonOne username). The first four lines of this file should be "COMP 3000 Assignment 1", your name, student number, and the date of submission. You may wish to format your answers in Markdown to improve their appearance.

No other formats will be accepted. Submitting in another format will likely result in your assignment not being graded and you receiving no marks for this assignment. In particular, do not submit an MS Word or OpenOffice file as your answers document!

Don't forget to include what outside resources you used to complete each of your answers, including other students, man pages, and web resources. You do not need to list help from the instructor, TA, or information found in the textbook.

Questions

[1] What does it mean to run a program "in the background"? Specifically, what is the key difference between running a program in the foreground and in the background?
[1] Are system calls used to receive signals? Explain briefly. (Hint: strace a program and send it a signal. What happens?)
[1] The system calls 3000shell uses to search for a program to run (in one of the PATH directories) are not the same as those used by standard shells such as bash. What's the difference?
[1] How could you change 3000shell so that it generates zombie processes?
[2] How would the behaviour of 3000shell change if line 286 (pid = fork();) were removed? Why?
[2] In 3000shell, when are lines 299 and 300 executed? Why?
[2] Does using getenv() generate any additional library calls (as reported by ltrace)? Does it generate any additional system calls (as reported by strace)? Why?
[2] Does parse_args() allocate any memory on the heap (i.e., any memory that stays allocated after the function returns)? How do you know? Give a brief argument.
[2] execve overwrites all of a process's memory with that of a new executable. Does execve also close all open file descriptors? How do you know (from your experience with 3000shell)?
[2] What happens to an in-progress system call when a process receives a signal? (An example is a program waiting for input from a terminal with a blocking read call.) What does "restarting" a system call have to do with this?
[2] If you changed plist() so it took proc_prefix as an argument (rather than accessing it as a global variable), what output would it produce when given an argument other than "/proc"? Explain briefly.
[2] Describe how you could add output redirection for external programs to 3000shell.

Solutions

When a program runs in the foreground, the shell waits for it to terminate (using wait or waitpid) before prompting again. When a program runs in the background, the shell does not wait; it immediately gives another prompt so the user can enter more programs to run (or commands for the shell to execute itself).
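To make the distinction concrete, here is a minimal sketch (not the actual 3000shell code). The helper name run_command and its background flag are made up for illustration; the only difference between the two modes is whether the parent calls waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (searched for in PATH by execvp).
 * Foreground: wait for the child and return its exit status.
 * Background: return 0 immediately without waiting.
 * Returns -1 on error or if the child did not exit normally. */
int run_command(char *const argv[], int background)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);              /* only reached if execvp failed */
    }

    if (background)
        return 0;                /* do not wait: the shell can prompt again */

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that a background child started this way is never waited for, so something else (such as a SIGCHLD handler) has to collect it later.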
System calls are not used to receive signals. Instead, a process registers signal handlers with the kernel (registration itself is done with a system call such as sigaction), and the kernel later calls these functions directly by manipulating the process's instruction pointer and stack.
3000shell makes stat() calls to see if a file exists before calling execve. bash makes similar stat() calls, but it also checks the permissions on the file (using access) to see if it is executable before calling execve. (You can also say bash just calls execve directly rather than calling stat first, but I did not see that behavior in later tests.)
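The two kinds of check can be sketched as follows. The function names are invented for this example, and neither function is taken from 3000shell or bash; they only contrast "does the file exist" (stat) with "may I execute it" (access with X_OK).

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Build "dir/name" in buf. Returns 1 on success, 0 if it does not fit. */
static int build_path(char *buf, size_t size, const char *dir, const char *name)
{
    int n = snprintf(buf, size, "%s/%s", dir, name);
    return n >= 0 && (size_t)n < size;
}

/* Existence check only: 1 if stat succeeds on dir/name, else 0. */
int exists_in_dir(const char *dir, const char *name)
{
    char path[4096];
    struct stat sb;

    if (!build_path(path, sizeof(path), dir, name))
        return 0;
    return stat(path, &sb) == 0;
}

/* Existence plus permission check: 1 if dir/name may be executed, else 0. */
int executable_in_dir(const char *dir, const char *name)
{
    char path[4096];

    if (!build_path(path, sizeof(path), dir, name))
        return 0;
    return access(path, X_OK) == 0;
}
```

A file such as /etc/passwd passes the first check but not the second, which is the case where the two strategies behave differently.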
You can get 3000shell to generate zombie processes by removing the calls to sigaction that register a signal handler. If 3000shell doesn't handle SIGCHLD (and thus never calls wait on background processes), it will create zombie processes that will only go away when 3000shell terminates.
If you remove the call to fork, no child process is created and the pid variable is never assigned. The single remaining process therefore executes only one of the branches of the if statement. Which branch is executed depends on the value of pid; since it was never initialized, its value is indeterminate and thus could be anything. (1 point for one process, 1 for explaining the uninitialized variable)
Lines 299 and 300 only execute if the execve call fails. If it succeeds, the code of the process is replaced with the code of the specified program binary; lines 299 and 300 are then no longer loaded and cannot run. In other words, execve only returns when it fails (with a return value of -1).
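This can be checked with a small experiment. The sketch below uses an invented helper, exit_status_after_exec, and an arbitrary marker value of 42; it is not the 3000shell code. The child only reaches the _exit(42) line when execve has failed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that tries to execve the given path with no arguments
 * and an empty environment. Returns the child's exit status, or -1 on
 * error. A status of 42 means the code after execve actually ran. */
int exit_status_after_exec(const char *path)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };

        execve(path, argv, envp);
        /* Reached only if execve failed and returned -1. */
        _exit(42);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with a path that does not exist yields 42; calling it with a real program yields that program's own exit status instead.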
getenv() is a dynamically linked library function, so it shows up when doing an ltrace. It makes no system calls (getenv() is just a convenience function for accessing data in envp), so strace reports no additional system calls generated by getenv().
parse_args() does not allocate any memory on the heap as it does not call malloc() or any other function that allocates heap memory. The only functions it calls are strsep() and strlen(), and their man pages make no mention of allocating memory on the heap. (In fact, very few standard C library functions dynamically allocate memory on the heap, and when they do, it is clearly documented, as such allocation can be problematic in the context of many C programs.)
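execve does not close open file descriptors (apart from any that have been explicitly marked close-on-exec). If it did, then it would not be possible to redirect file descriptors for standard in, standard out, and standard error (file descriptors 0, 1, and 2), as these are assumed to already be open when a program starts. This matches what you see with 3000shell: the programs it runs can read from and write to the shell's terminal without opening it themselves.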
When a process receives a signal, any currently executing system call is interrupted. If the system call is set to be restarted, then the system call is resumed once the signal handler returns; otherwise, the interrupted system call returns with an error (EINTR) after the signal handler has finished executing. This is particularly important when the read system call is interrupted, as it can mean that data that would normally be read is not read (and an additional read system call must be executed to get the rest of the data). In the case of 3000shell, an interrupted read can make the shell think that the "end of the file" has been reached (the terminal has been closed), thus causing it to terminate unless we tell the system to allow system calls to be restarted (line 243).
If given an arbitrary directory other than /proc, plist() would list all the entries whose names start with a digit, attempt to treat each as a directory, and look inside it for a "comm" file in order to get the contents of the second column. Since most directories have few entries starting with a digit, and few of those are directories with a "comm" file, plist() won't output much for most directories.
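A simplified stand-in for this behaviour is sketched below. list_numbered is a hypothetical function, not the real plist(); in particular, it is an assumption of this sketch that an entry without a readable "comm" file is skipped rather than printed with an empty second column.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/* For each entry under prefix whose name starts with a digit, print the
 * name and the first line of prefix/name/comm. Returns the number of
 * rows printed, or -1 if prefix cannot be opened as a directory. */
int list_numbered(const char *prefix)
{
    DIR *dir = opendir(prefix);
    struct dirent *entry;
    int rows = 0;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        char comm[256];
        FILE *f;

        if (!isdigit((unsigned char)entry->d_name[0]))
            continue;            /* not a process-like name */

        snprintf(path, sizeof(path), "%s/%s/comm", prefix, entry->d_name);
        f = fopen(path, "r");
        if (f == NULL)
            continue;            /* no comm file: nothing to show */

        if (fgets(comm, sizeof(comm), f) != NULL) {
            printf("%s\t%s", entry->d_name, comm);
            rows++;
        }
        fclose(f);
    }
    closedir(dir);
    return rows;
}
```

Pointed at an ordinary directory, this typically prints nothing; pointed at a directory that happens to contain something like 123/comm, it prints one row for it.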
To add output redirection to 3000shell, add code along the following lines somewhere between lines 266 and 286 (so that, once it has run, the execve proceeds as usual):
- Add in code that looks for arguments starting with ">" followed by a filename (that may or may not be separated by a space from ">"). Make sure to remove the ">" and the specified file from the argument list, and generate an error if no filename is specified.
- Next, add in code that opens the specified file for writing. Call the returned file descriptor fd.
- Call dup2(fd, 1) to make standard out go to this file.
- Call close(fd), because we don't want the specified file to be left open on a random file descriptor.
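The open, dup2, and close steps can be sketched as follows. This is one possible arrangement rather than the required one: run_redirected is a made-up helper, argument parsing for ">" is left out, and the sketch performs the redirection in the child after fork (a choice made here so that the parent's own standard out is left untouched).

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with standard out sent to outfile.
 * Returns the child's exit status, or -1 on error. */
int run_redirected(char *const argv[], const char *outfile)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Open the file for writing, creating or truncating it. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1)
            _exit(126);

        if (fd != STDOUT_FILENO) {
            /* Make file descriptor 1 refer to the file... */
            if (dup2(fd, STDOUT_FILENO) == -1)
                _exit(126);
            /* ...and drop the extra descriptor so the program does not
             * inherit the file on a random descriptor as well. */
            close(fd);
        }

        execvp(argv[0], argv);   /* descriptor 1 stays open across the exec */
        _exit(127);              /* only reached if execvp failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, running echo hello through this helper with an output file of /tmp/out.txt leaves "hello" in that file and prints nothing on the terminal.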