Operating Systems 2021F: Tutorial 3

From Soma-notes

In this tutorial you will be experimenting with and extending 3000shell.c (listed below).

Make sure you use the original code from 3000shell for each question.

Getting Started

You should download 3000shell.c to your OpenStack instance (or an Ubuntu Linux 21.04 system or similar). Compile it using the command:

gcc -O -g -Wall 3000shell.c -o 3000shell

Standard I/O

Recall that the shell is a command interpreter (a program). In comparison, a terminal is a device used to enter data into and display data from a computer. Terminals used to be physical devices: teletypes, or "TTYs". Today they are virtual: pseudoterminals (pseudo-teletypes), or pts devices. For example, /dev/pts/1 is a pts. By itself, a pts does nothing; however, it allows any program to appear as a virtual teletype to any other program. (To learn more, see man ptmx, although it goes into more detail than you need for this tutorial.)

When you ssh to a system interactively, the remote ssh process uses a pseudo tty to connect network I/O with the programs you run at the command line. In this context you can thus think of ssh (plus a pseudo tty) as an adaptor that converts network traffic into terminal-like I/O.

Standard input (/dev/stdin), standard output (/dev/stdout), and standard error (/dev/stderr) are just references to file descriptors 0, 1, and 2 of the current process. They can refer to basically any file. If a process is run in a terminal, stdin, stdout, and stderr are set to refer to the terminal's I/O channels via a tty or pts.
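One way to see what descriptors 0, 1, and 2 currently refer to is to read the symbolic links in /proc/self/fd. The following is a minimal sketch, assuming a Linux system with /proc mounted; fd_target and show_standard_fds are hypothetical names used only for illustration and are not part of 3000shell.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the name of the file that descriptor fd
   refers to into buf, using the Linux-specific /proc/self/fd links.
   Returns 0 on success and -1 on failure. */
int fd_target(int fd, char *buf, size_t size)
{
        char link[64];
        ssize_t n;

        if (size == 0)
                return -1;

        snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);

        /* readlink() does not add a terminating '\0' itself */
        n = readlink(link, buf, size - 1);
        if (n < 0)
                return -1;

        buf[n] = '\0';
        return 0;
}

/* Print where standard input, output, and error currently point. */
void show_standard_fds(void)
{
        const char *names[] = { "stdin", "stdout", "stderr" };
        char target[4096];
        int fd;

        for (fd = STDIN_FILENO; fd <= STDERR_FILENO; fd++) {
                if (fd_target(fd, target, sizeof(target)) == 0)
                        printf("%s (fd %d) -> %s\n", names[fd], fd, target);
                else
                        printf("%s (fd %d) is not open\n", names[fd], fd);
        }
}
```

Calling show_standard_fds() from a program started in a terminal should report a tty or pts device for all three descriptors; starting the same program with its output redirected to a file should show that file for stdout instead.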

Shells provide easy interfaces for changing these file descriptors. For example:

 ls > ls.log

will redirect ls's standard output to the file ls.log. Similarly,

 bc -l < math.txt

will make bc read its math input from the file math.txt and print the results to the current terminal. We can also use the pipe operator to direct the standard output of one program to the standard input of another:

 ls -l | less

(This is really good if you have lots of files in a directory.)
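Underneath, output redirection amounts to opening a file and making file descriptor 1 refer to it. 3000shell does this in the child process with creat() and dup2() (see run_program in the listing). The sketch below shows the same idea inside a single process, saving the old descriptor so it can be restored afterwards. The names redirect_stdout and restore_stdout are hypothetical and chosen only for this example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: point file descriptor 1 at the file named path,
   creating or truncating it.  Returns a duplicate of the old standard
   output so that it can be restored later, or -1 on failure. */
int redirect_stdout(const char *path)
{
        int saved, fd;

        fflush(stdout);         /* flush text meant for the old target */

        saved = dup(STDOUT_FILENO);
        if (saved < 0)
                return -1;

        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0) {
                close(saved);
                return -1;
        }

        if (dup2(fd, STDOUT_FILENO) < 0) {
                close(fd);
                close(saved);
                return -1;
        }

        close(fd);              /* descriptor 1 now refers to the file */
        return saved;
}

/* Undo redirect_stdout() using the descriptor it returned.
   Returns 0 on success and -1 on failure. */
int restore_stdout(int saved)
{
        int r;

        fflush(stdout);         /* flush text meant for the file */
        r = dup2(saved, STDOUT_FILENO);
        close(saved);
        return r < 0 ? -1 : 0;
}
```

Anything printed with printf() between the two calls ends up in the file rather than on the terminal. The fflush() calls matter because stdio buffers output in user space, and buffered text would otherwise go to whichever file descriptor 1 refers to when the buffer is eventually written.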

Part of the magic of UNIX is that it allows problems to be solved by combining multiple programs together. For example, to get a list of unique words in a file, you can do something like the following:

 tr ' \t' '\n' < file.txt | tr -d ',.!()?-' | sort | uniq | less  
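A pipe between two programs is built from the same pieces: pipe() creates a pair of connected file descriptors, and dup2() installs one end as a standard descriptor before the program is executed. The following sketch shows only the producing half of a pipeline, with the calling process reading the other end directly instead of running a second program. capture_stdout is a hypothetical name, and the use of execvp() here is an assumption made for brevity (3000shell itself uses execve() with its own path search).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its standard output connected
   to a pipe and copy up to size - 1 bytes of what it prints into out.
   Returns the number of bytes captured, or -1 on failure. */
ssize_t capture_stdout(char *const argv[], char *out, size_t size)
{
        int fds[2];     /* fds[0] is the read end, fds[1] the write end */
        pid_t pid;
        ssize_t n = 0, total = 0;

        if (size == 0 || pipe(fds) < 0)
                return -1;

        pid = fork();
        if (pid < 0) {
                close(fds[0]);
                close(fds[1]);
                return -1;
        }

        if (pid == 0) {
                /* child: make the write end its standard output */
                close(fds[0]);
                dup2(fds[1], STDOUT_FILENO);
                close(fds[1]);
                execvp(argv[0], argv);
                _exit(127);     /* only reached if execvp() failed */
        }

        /* parent: close the write end so read() sees EOF once the
           child has exited */
        close(fds[1]);
        while ((size_t) total < size - 1 &&
               (n = read(fds[0], out + total, size - 1 - total)) > 0)
                total += n;
        close(fds[0]);
        out[total] = '\0';

        waitpid(pid, NULL, 0);
        return total;
}
```

A full pipeline such as ls -l | less would fork a second child and use dup2() to make the read end of the pipe that child's standard input, rather than reading it in the parent.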

In summary, shells run in terminals, and the terminal interface is implemented by teletype-like devices, either a "tty" or "pts" device. Standard in, out, and error are abstractions for interactions with such terminals or, with I/O redirection, other arbitrary files.

Tasks/Questions

The purpose of the following questions and tasks is to help you understand how 3000shell works. At the end of this you should have an understanding of every function and every line of the code. If you understand 3000shell, then you understand the basics of all UNIX shells. Your understanding of the code will be tested later, so use this opportunity to dive deep into the code. These tasks and questions should help you build up a mental model of how 3000shell works.

  1. Compile and run 3000shell.c
  2. Try running programs in the background using & after commands entered in 3000shell. What happens to the input and output of the program? Try this for simple programs like ls and bc. Then, try it for more complex interactive programs such as nano and top.
  3. You may have trouble interacting with the shell after running programs in the background. How can you recover from such a situation?
  4. Run 3000shell under gdb and observe all the system calls it makes using catch syscall (after setting a breakpoint on main so you don't see the syscalls when it starts). Where does each system call happen? In what context (source and assembly)? Consider both parent and child processes (by setting follow-fork-mode to parent and child). Compare with the output of strace (e.g., run strace -fqo 3000shell.log ./3000shell).
  5. 3000shell implements a simple form of output redirection. What syntax should you use to redirect standard output to a file?
  6. Why are lines 207-210 there (the check for pid == -1)?
  7. Make find_binary show every attempt to find a binary.
  8. Make the shell output "Ouch!" when you send it a SIGUSR1 signal.
  9. Delete line 324 (SA_RESTART). How does the behavior of 3000shell change?
  10. Replace the use of find_env() with getenv(). How do their interfaces differ?
  11. Make plist output the parent process id for each process, e.g. "5123 ls (5122)". Pay attention to the stat and status files in the per-process directories in /proc.
  12. Implement redirection of standard error.
  13. Implement redirection of standard out for plist() (the same as if it was an external command).
  14. Implement a built-in 3000kill command that works like the standard kill command. (What system call/library call is used to send signals?)
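As a starting point for exploring the per-process files in /proc, the sketch below reads the PPid field from /proc/<pid>/status. It is only one possible approach under the assumption of a Linux /proc filesystem, not the intended solution to any question above; read_ppid and read_own_ppid are hypothetical names.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the parent process id recorded in
   /proc/<pid>/status, or -1 if the file or the field cannot be read. */
pid_t read_ppid(pid_t pid)
{
        char fn[64], line[256];
        FILE *f;
        int ppid = -1;

        snprintf(fn, sizeof(fn), "/proc/%d/status", (int) pid);
        f = fopen(fn, "r");
        if (f == NULL)
                return -1;

        /* the status file has one "Name:<tab>value" field per line */
        while (fgets(line, sizeof(line), f) != NULL) {
                if (sscanf(line, "PPid: %d", &ppid) == 1)
                        break;
        }

        fclose(f);
        return (pid_t) ppid;
}

/* Look up the parent of the calling process through /proc; the result
   can be compared against what getppid() reports. */
pid_t read_own_ppid(void)
{
        return read_ppid(getpid());
}
```

The stat file in the same directory holds similar information on a single line, but its second field (the command name in parentheses) can itself contain spaces, which makes it trickier to parse than status.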

Code

3000shell.c

  1 /* 3000shell.c */
  2 /* v2 Sept. 15, 2019 */
  3 /* v1 Sept. 24, 2017 */
  4 /* based off of csimpleshell.c, Enrico Franchi © 2005
  5       https://web.archive.org/web/20170223203852/
  6       http://rik0.altervista.org/snippets/csimpleshell.html */
  7 /* Original under "BSD" license */
  8 /* This version is under GPLv3, copyright 2017, 2019 Anil Somayaji */
  9 /* You really shouldn't be incorporating parts of this in any other code,
 10    it is meant for teaching, not production */
 11 
 12 #include <stdio.h>
 13 #include <stdlib.h>
 14 #include <unistd.h>
 15 #include <string.h>
 16 #include <errno.h>
 17 #include <sys/types.h>
 18 #include <sys/stat.h>
 19 #include <sys/wait.h>
 20 #include <dirent.h>
 21 #include <ctype.h>
 22 #include <fcntl.h>
 23 #include <signal.h>
 24 
 25 #define BUFFER_SIZE 1<<16
 26 #define ARR_SIZE 1<<16
 27 #define COMM_SIZE 32
 28 
 29 const char *proc_prefix = "/proc";
 30 
 31 void parse_args(char *buffer, char** args, 
 32                 size_t args_size, size_t *nargs)
 33 {
 34         char *buf_args[args_size]; /* You need C99 */
 35         char **cp, *wbuf;
 36         size_t i, j;
 37         
 38         wbuf=buffer;
 39         buf_args[0]=buffer; 
 40         args[0] =buffer;
 41         
 42         for(cp=buf_args; (*cp=strsep(&wbuf, " \n\t")) != NULL ;){
 43                 if ((*cp != NULL) && (++cp >= &buf_args[args_size]))
 44                         break;
 45         }
 46         
 47         for (j=i=0; buf_args[i]!=NULL; i++){
 48                 if (strlen(buf_args[i]) > 0)
 49                         args[j++]=buf_args[i];
 50         }
 51         
 52         *nargs=j;
 53         args[j]=NULL;
 54 }
 55 
 56 /* this is kind of like getenv() */
 57 char *find_env(char *envvar, char *notfound, char *envp[])
 58 {
 59         const int MAXPATTERN = 128;
 60         int i, p;
 61         char c;
 62         char pattern[MAXPATTERN];
 63         char *value = NULL;
 64 
 65         p = 0;
 66         while ((c = envvar[p])) {
 67                 pattern[p] = c;
 68                 p++;
 69                 if (p == (MAXPATTERN - 2)) {
 70                         break;
 71                 }
 72         }
 73         pattern[p] = '=';
 74         p++;
 75         pattern[p] = '\0';
 76         
 77         i = 0;
 78         while (envp[i] != NULL) {
 79                 if (strncmp(pattern, envp[i], p) == 0) {                        
 80                         value = envp[i] + p;
 81                 }
 82                 i++;
 83         }
 84 
 85         if (value == NULL) {
 86                 return notfound;
 87         } else {
 88                 return value;
 89         }
 90 }
 91 
 92 void find_binary(char *name, char *path, char *fn, int fn_size) {
 93         char *n, *p;
 94         int r, stat_return;
 95 
 96         struct stat file_status;
 97 
 98         if (name[0] == '.' || name[0] == '/') {
 99                 strncpy(fn, name, fn_size);
100                 return;
101         }
102         
103         p = path;
104         while (*p != '\0') {       
105                 r = 0;
106                 while (*p != '\0' && *p != ':' && r < fn_size - 1) {
107                         fn[r] = *p;
108                         r++;
109                         p++;
110                 }
111 
112                 fn[r] = '/';
113                 r++;
114                 
115                 n = name;
116                 while (*n != '\0' && r < fn_size) {
117                         fn[r] = *n;
118                         n++;
119                         r++;
120                 }
121                 fn[r] = '\0';
122 
123                 
124                 stat_return = stat(fn, &file_status);
125 
126                 if (stat_return == 0) {
127                         return;
128                 }
129 
130                 if (*p != '\0') {
131                         p++;
132                 }
133         }
134 }
135 
136 void setup_comm_fn(char *pidstr, char *comm_fn)
137 {
138         char *c;
139 
140         strcpy(comm_fn, proc_prefix);
141         c = comm_fn + strlen(comm_fn);
142         *c = '/';
143         c++;
144         strcpy(c, pidstr);
145         c = c + strlen(pidstr);
146         strcpy(c, "/comm");
147 }
148 
149 void plist()
150 {
151         DIR *proc;
152         struct dirent *e;
153         int result;
154         char comm[COMM_SIZE];  /* seems to just need 16 */        
155         char comm_fn[512];
156         int fd, i, n;
157 
158         proc = opendir(proc_prefix);
159 
160         if (proc == NULL) {
161                 fprintf(stderr, "ERROR: Couldn't open /proc.\n");
162         }
163         
164         for (e = readdir(proc); e != NULL; e = readdir(proc)) {
165                 if (isdigit(e->d_name[0])) {
166                         setup_comm_fn(e->d_name, comm_fn);
167                         fd = open(comm_fn, O_RDONLY);
168                         if (fd > -1) {                                
169                                 n = read(fd, comm, COMM_SIZE);
170                                 close(fd);
171                                 for (i=0; i < n; i++) {
172                                         if (comm[i] == '\n') {
173                                                 comm[i] = '\0';
174                                                 break;
175                                         }
176                                 }
177                                 printf("%s: %s\n", e->d_name, comm);
178                         } else {
179                                 printf("%s\n", e->d_name);
180                         }
181                 }
182         }
183         
184         result = closedir(proc);
185         if (result) {
186                 fprintf(stderr, "ERROR: Couldn't close /proc.\n");
187         }
188 }
189 
190 void signal_handler(int the_signal)
191 {
192         int pid, status;
193 
194         if (the_signal == SIGHUP) {
195                 fprintf(stderr, "Received SIGHUP.\n");
196                 return;
197         }
198         
199         if (the_signal != SIGCHLD) {
200                 fprintf(stderr, "Child handler called for signal %d?!\n",
201                         the_signal);
202                 return;
203         }
204 
205         pid = wait(&status);
206 
207         if (pid == -1) {
208                 /* nothing to wait for */
209                 return;
210         }
211         
212         if (WIFEXITED(status)) {
213                 fprintf(stderr, "\nProcess %d exited with status %d.\n",
214                         pid, WEXITSTATUS(status));
215         } else {
216                 fprintf(stderr, "\nProcess %d aborted.\n", pid);
217         }
218 }
219 
220 void run_program(char *args[], int background, char *stdout_fn,
221                  char *path, char *envp[])
222 {
223         pid_t pid;
224         int fd, *ret_status = NULL;
225         char bin_fn[BUFFER_SIZE];
226 
227         pid = fork();
228         if (pid) {
229                 if (background) {
230                         fprintf(stderr,
231                                 "Process %d running in the background.\n",
232                                 pid);
233                 } else {
234                         pid = wait(ret_status);
235                 }
236         } else {
237                 find_binary(args[0], path, bin_fn, BUFFER_SIZE);
238 
239                 if (stdout_fn != NULL) {
240                         fd = creat(stdout_fn, 0666);
241                         dup2(fd, 1);
242                         close(fd);
243                 }
244                 
245                 if (execve(bin_fn, args, envp)) {
246                         puts(strerror(errno));
247                         exit(127);
248                 }
249         }
250 }
251 
252 void prompt_loop(char *username, char *path, char *envp[])
253 {
254         char buffer[BUFFER_SIZE];
255         char *args[ARR_SIZE];
256         
257         int background;
258         size_t nargs;
259         char *s;
260         int i, j;
261         char *stdout_fn;
262         
263         while(1){
264                 printf("%s $ ", username);
265                 s = fgets(buffer, BUFFER_SIZE, stdin);
266                 
267                 if (s == NULL) {
268                         /* we reached EOF */
269                         printf("\n");
270                         exit(0);
271                 }
272                 
273                 parse_args(buffer, args, ARR_SIZE, &nargs); 
274                 
275                 if (nargs==0) continue;
276                 
277                 if (!strcmp(args[0], "exit")) {
278                         exit(0);
279                 }
280                 
281                 if (!strcmp(args[0], "plist")) {
282                         plist();
283                         continue;
284                 }
285 
286                 background = 0;            
287                 if (strcmp(args[nargs-1], "&") == 0) {
288                         background = 1;
289                         nargs--;
290                         args[nargs] = NULL;
291                 }
292 
293                 stdout_fn = NULL;
294                 for (i = 1; i < nargs; i++) {
295                         if (args[i][0] == '>') {
296                                 stdout_fn = args[i];
297                                 stdout_fn++;
298                                 printf("Set stdout to %s\n", stdout_fn);
299                                 for (j = i; j < nargs - 1; j++) {
300                                         args[j] = args[j+1];
301                                 }
302                                 nargs--;
303                                 args[nargs] = NULL;
304                                 break;
305                         }
306                 }
307                 
308                 run_program(args, background, stdout_fn, path, envp);
309         }    
310 }
311 
312 int main(int argc, char *argv[], char *envp[])
313 {
314         struct sigaction signal_handler_struct;
315                 
316         char *username;
317         char *default_username = "UNKNOWN";
318         
319         char *path;
320         char *default_path = "/usr/bin:/bin";
321         
322         memset (&signal_handler_struct, 0, sizeof(signal_handler_struct));
323         signal_handler_struct.sa_handler = signal_handler;
324         signal_handler_struct.sa_flags = SA_RESTART;
325         
326         if (sigaction(SIGCHLD, &signal_handler_struct, NULL)) {
327                 fprintf(stderr, "Couldn't register SIGCHLD handler.\n");
328         }
329         
330         if (sigaction(SIGHUP, &signal_handler_struct, NULL)) {
331                 fprintf(stderr, "Couldn't register SIGHUP handler.\n");
332         }
333         
334         username = find_env("USER", default_username, envp);
335         path = find_env("PATH", default_path, envp);
336 
337         prompt_loop(username, path, envp);
338         
339         return 0;
340 }
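
The tokenizing that parse_args performs in the listing above can be studied in isolation. The following is a simplified sketch, not a replacement for parse_args: it uses strsep() the same way but does the splitting and the filtering of empty tokens in a single loop. split_words is a hypothetical name, and the _DEFAULT_SOURCE definition is an assumption about glibc that is only needed when compiling in a strict standard mode.

```c
#define _DEFAULT_SOURCE         /* for strsep() on glibc */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split line in place on spaces, tabs, and
   newlines, storing up to max - 1 non-empty words in words[] followed
   by a NULL entry.  Returns the number of words stored. */
size_t split_words(char *line, char *words[], size_t max)
{
        char *rest = line, *tok;
        size_t n = 0;

        if (max == 0)
                return 0;

        while (n < max - 1 && (tok = strsep(&rest, " \t\n")) != NULL) {
                /* strsep() returns "" between adjacent delimiters */
                if (*tok != '\0')
                        words[n++] = tok;
        }

        words[n] = NULL;
        return n;
}
```

Because strsep() overwrites each delimiter with '\0', the words stored in words[] point into the original buffer rather than into copies. The trailing NULL entry is what lets the array be passed directly as the argument vector of execve().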