COMP3000 Operating Systems W22: Tutorial 3

Latest revision as of 18:23, 22 January 2022

In this tutorial, you will be experimenting with and extending 3000shell.c, a proof-of-concept program to show you how a Linux shell works. Also, you will be learning to read and modify C code, which prepares you for subsequent tutorials and assignments.

Make sure you use the original code from 3000shell for each question/task.

Tutorials are graded based on participation and effort (so no need to try to have the “correct” answers — what matters is the process), but you should still turn in your work. Submit your answers on Brightspace as a single text file named "<username>-comp3000-t3.txt" (where username is your MyCarletonOne username). The first four lines of this file should be "COMP 3000 Tutorial 3", your name, student number, and the date of submission.

The deadline is usually four days after the tutorial date (see the actual deadline on the submission entry). Note that the deadline is enforced by the system, so you may lose the effort marks even if you submit only one minute late.

You should also check in with your assigned TA online (by responding to the poll in the Teams channel tutorials-public or the private channel). Your TA will be your first point of contact when you have questions or encounter any issues during the tutorial session.

You get 1.5 marks for submitting answers that show your effort and 0.5 for checking in, making this tutorial worth 2 marks in total.

Background

Processes

Each application running on a system is assigned a unique process identifier (PID). The ps command shows the process identifiers for running processes. Each process running on the system is kept separated from other processes by the operating system.

When you enter a command at a shell prompt, most of the time you are creating a new process which runs the program you specified.
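As a small illustration of the point above, the following sketch runs a command from a shell and prints the PID of the resulting process next to the PID of the shell itself. The variable names and the temporary file are only for this example.

```shell
echo "this shell's PID: $$"

tmp=$(mktemp)
# sh -c starts a new process. Inside it, $$ expands to the new process's PID
# and $PPID to the PID of its parent.
sh -c 'echo "$$ $PPID"' >"$tmp"
read -r child_pid child_ppid <"$tmp"
rm -f "$tmp"

echo "child PID: $child_pid, its parent: $child_ppid"
```

The child's PID differs from the shell's, and the parent PID it reports is normally that of the shell that launched it.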

Controlling Processes

On Linux, once processes have been created, you can control them by sending them signals. Typing certain key sequences at a terminal sends a signal to the foreground process: Control-C sends INT (interrupt), and Control-Z sends TSTP (terminal stop, which suspends the process like STOP does but, unlike STOP, can be caught).

You can send a signal to a process using the kill command:

kill -<signal> <process ID>

So, to stop process 4542, type:

kill -STOP 4542

By default, kill sends the TERM signal.

Each signal has both a string name (e.g., STOP) and a numeric form (e.g., 19 for STOP on most Linux systems). See the full list of signals with kill -l.
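The kill commands above can be tried safely on a throwaway process. This is a minimal sketch, assuming a POSIX shell such as bash; the sleep process and the variable names are only for illustration.

```shell
# Start a long-running process in the background; $! holds its PID.
sleep 300 &
pid=$!

kill -STOP "$pid"   # suspend it (STOP cannot be caught or ignored)
kill -CONT "$pid"   # let it continue
kill "$pid"         # no signal named, so TERM is sent

# For a process killed by a signal, wait reports 128 + the signal number.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

Since TERM is signal 15, the reported status is 143; running kill -l 143 translates that status back to the signal name.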

Standard input/output, Shell and Terminal

Recall that the shell is a command interpreter (a program). In comparison, a terminal is a device used to enter data into, and display or print data from, a computer. Terminals used to be physical devices (teletypes, or "TTYs"), so today's virtual ones are called pseudoterminals. In this context, we refer to them as pts (pseudo teletypes), e.g., /dev/pts/1. These are just the slave side of a pseudoterminal pair; those who are curious can refer to man ptmx, but this is not needed for this tutorial.

Standard input (e.g., /dev/stdin), standard output (e.g., /dev/stdout), and standard error (/dev/stderr) are symbolic endpoints that every program has (the shell being one example): where input comes from, where output goes, and where errors are reported. Each of them can be redirected anywhere. By default, they refer to file descriptors 0, 1, and 2 of the current process.

So, in summary: you connect to a terminal to interact with a computer, and you run a shell on it to receive your commands. The shell's standard input, output, and error are connected to that terminal.

As you progress in this course, you will gradually find how stdin, stdout, and stderr can be useful, especially when combined with pipelines and redirection. For instance:

ls | less        #to paginate long output from ls
ps -ef | grep 3000shell    #to look for the 3000shell process from the process list
gcc -O2 -o csimpleshell csimpleshell.c 2>err.txt    #to redirect error messages to a file err.txt
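To see file descriptors 0, 1, and 2 being redirected independently, here is a minimal sketch that can be run in any POSIX shell. The scratch directory and file names are made up for this example.

```shell
tmp=$(mktemp -d)   # scratch directory for this example

# One line goes to standard output (fd 1) and one to standard error (fd 2);
# each descriptor is redirected to its own file.
{ echo "to stdout"; echo "to stderr" >&2; } >"$tmp/out.txt" 2>"$tmp/err.txt"
out=$(cat "$tmp/out.txt")
err=$(cat "$tmp/err.txt")

# Standard input (fd 0) can be redirected too: wc reads the file, not the terminal.
lines=$(wc -l <"$tmp/out.txt")

# 2>&1 makes fd 2 point wherever fd 1 currently points, so both lines land in one file.
{ echo "to stdout"; echo "to stderr" >&2; } >"$tmp/both.txt" 2>&1
both=$(wc -l <"$tmp/both.txt")

echo "out=$out err=$err lines=$lines both=$both"
rm -rf "$tmp"
```

The order matters in the last case: 2>&1 has to come after the redirection of fd 1, because it copies whatever fd 1 refers to at that moment.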

Later in the term, if you find your disk space filled up, you can use the following command to pinpoint the 20 largest files (no need to understand the specifics unless you are curious):

sudo find / -type f -exec du -h {} 2>&- + | sort -rh | head -20

Tasks/Questions

The purpose of the following questions and tasks is to help you understand how 3000shell (a simple implementation of the shell, but still more complex than the previous csimpleshell) works. Eventually, you should have an understanding of every function and every line of the code. If you understand 3000shell, then you understand the basics of all UNIX shells. Your understanding of the code will be tested later, so use this opportunity to dive deep into the code.

You do not need to submit a separate file for source code changes. Just describe your changes or paste the code snippet (usually a few lines at the most).

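  1. Compile and run 3000shell.c (https://people.scs.carleton.ca/~lianyingzhao/comp3000/w22/tut3/3000shell.c).
  2. Compared to csimpleshell (https://people.scs.carleton.ca/~lianyingzhao/comp3000/w22/tut1/csimpleshell.c) from Tutorial 1, what functionality improvements has 3000shell introduced? List at least two improvements (functional differences). You can mention whether you found them by reading the source code or trying both shells.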
  3. Pay attention to lines 229-235. How can you run a program in the background? Note: “in the background” means the shell starting a program without waiting for it to finish. So, the shell returns immediately to be ready to accept the next command.
  4. Observe the behavior when running different programs in the background. What happens to the input and output of the program? Try this for different types of programs, such as ls and bc, or nano and top, or others of your choice. Write down your observations.
  5. You may have trouble interacting with the shell after running programs in the background. How can you recover from such a situation? Note: you can do it in a harsh way for this question. We will see a smarter way next.
  6. 3000shell implements a simple form of output redirection (i.e., the standard output mentioned earlier). What syntax should you use to redirect standard output to a file?
  7. Make the shell output "Ouch!" when you send it a SIGUSR1 signal.
  8. If you delete line 324 (SA_RESTART), how does the behavior of 3000shell change?
  9. Replace the use of find_env() with getenv(). How do their interfaces differ?
  10. Make plist output the parent process id for each process, e.g., "5123 ls (5122)". Pay attention to the stat and status files in the per-process directories in /proc.
  11. If the shell gets stuck as a result of running a program in the background, how can you recover elegantly (resuming it) by sending just a signal (which signal)? Why?