COMP3000 Operating Systems F23: Tutorial 3

In this tutorial, you will be experimenting with and extending 3000shell.c, a proof-of-concept program to show you how a Linux shell works. Also, you will be learning to read and modify C code, which prepares you for subsequent tutorials and assignments.

Background

Processes

Each application running on a system is assigned a unique process identifier (PID). The ps command shows the process identifiers for running processes. Each process running on the system is kept separated from other processes by the operating system.

When you enter a command at a shell prompt, most of the time you are creating a new process which runs the program you specified.

Controlling Processes

On Linux, once processes have been created, you can control them by sending them signals. At a terminal, typing certain key sequences sends a signal to the foreground process: Control-C sends INT (interrupt) and Control-Z sends TSTP (terminal stop, a variant of STOP that a process is allowed to catch or ignore).

You can also send a signal to a process using the kill command:

kill -<signal> <process ID>

So, to stop process 4542, type:

kill -STOP 4542

By default, kill sends the TERM signal.

Each signal has both a string name (e.g., STOP) and a numeric form (e.g., 19 for STOP on x86 Linux; the numbers can differ on other architectures). See the full list of signals with kill -l.

Standard input/output, Shell and Terminal

Recall that the shell is a command interpreter (a program). In comparison, a terminal is a device used to enter data into, and display or print data from, a computer. Terminals used to be physical (teletypes, or "TTYs"), but today we mostly use virtual terminals, which remove the need for the physical device while providing the same interface (another loose example of virtualization). There is a special virtual terminal mechanism in Linux/UNIX referred to as pts (pseudo teletypes), e.g., /dev/pts/1 (check out man ptmx if you are curious, though this is not needed in this tutorial).

The standard input (e.g., /dev/stdin), output (/dev/stdout) and error (/dev/stderr) are just symbolic endpoints in any program (of which the shell is an example): where input comes from, where output goes, and where errors are reported. Each of them can be redirected anywhere. By default, they refer to file descriptors 0, 1, and 2 of the current process.

So, in summary, you connect to a terminal to interact with a computer. A shell runs on that terminal to receive your commands, and its standard in/out/err are connected to the terminal.

As you progress in this course, you will gradually find out how useful stdin/stdout/stderr can be, especially when combined with pipelines and redirection. For instance:

ls | less        #to paginate long output from ls

ps -ef | grep 3000shell    #to look for the 3000shell process from the process list

gcc -O2 -o csimpleshell csimpleshell.c 2> err.txt    #to redirect error messages to a file err.txt (no space between 2 and >)

Later in the term if you find your disk space filled up, you can use the following command to pinpoint the top 20 largest files (no need to understand the specifics unless you are curious):

sudo find / -type f -exec du -h {} 2>&- + | sort -rh | head -20

Tasks/Questions

The purpose of the following questions and tasks is to help you understand how 3000shell (a simple implementation of the shell, but still more complex than the previous csimpleshell) works. Eventually, you should have an understanding of every function and every line of the code. If you understand 3000shell, then you understand the basics of all Linux/UNIX shells. Your understanding of the code will be tested later, so use this opportunity to dive deep into the code.

Compile and run 3000shell.c.
Compared to csimpleshell from Tutorial 1, what functionality improvements has 3000shell introduced? List at least two improvements (functional differences). Did you find them by reading the source code or trying both shells?
Pay attention to lines 229-235. How can you run a program in the background? Note: “in the background” means that the shell starts a program without waiting for it to finish, so the shell returns immediately and is ready to accept the next command.
Observe the behavior when running different programs in the background. What happens to the input and output of the program? Try this for different types of programs, such as ls and bc, or nano and top, or others of your choice. Write down your observations.
You may have trouble interacting with the shell after running programs in the background. How can you recover from such a situation? Note: you can do it in a harsh way for this question. We will see a smarter way next.
3000shell implements a simple form of output redirection (i.e., the standard output mentioned earlier). What syntax should you use to redirect standard output to a file?
Make the shell output "Ouch!" when you send it a SIGUSR1 signal.
If you delete line 324 (SA_RESTART), how does the behavior of 3000shell change?
Replace the use of find_env() with getenv(). How do their interfaces differ?
Make plist output the parent process id for each process, e.g., "5123 ls (5122)". Pay attention to the stat and status files in the per-process directories in /proc.
If the shell gets stuck as a result of running a program in the background, how can you recover elegantly, resuming the shell by sending just one signal? Which signal, and why?