Operating Systems 2022F: Tutorial 3

From Soma-notes

Latest revision as of 11:14, 26 September 2022

In this tutorial you will be experimenting with and extending 3000shell.c (listed below).

Make sure you use the original code from 3000shell for each question.

Getting Started

You should download 3000shell.c to your OpenStack instance (or an Ubuntu Linux 21.04 system or similar). Compile it using the command

gcc -O -g -Wall 3000shell.c -o 3000shell

Standard I/O

Recall that the shell is a command interpreter (a program). In comparison, a terminal is a device used to enter data into and display data from a computer. Terminals used to be physical: teletypes, or "TTYs". Today they are virtual: pseudoterminals, pseudo-teletypes, or "pts" devices. /dev/pts/1 is a pts, for example. By itself, a pts does nothing; however, it allows any program to appear as a virtual teletype to any other program. (To learn more, see man ptmx. However, this goes into more detail than you need for this tutorial.)

When you ssh to a system interactively, the remote ssh process uses a pseudo tty to connect network I/O with the programs you run at the command line. In this context you can thus think of ssh (plus a pseudo tty) as an adaptor that converts network traffic into terminal-like I/O.

Standard input (e.g., /dev/stdin), standard output (e.g., /dev/stdout), and standard error (e.g., /dev/stderr) are just references to file descriptors 0, 1, and 2 for the current process. They can refer to basically any file. If a process is run in a terminal, stdin, stdout, and stderr are set to refer to the terminal's I/O channels via a tty or pts.

Shells provide easy interfaces for changing these file descriptors. For example:

 ls > ls.log

will redirect ls's standard output to the file ls.log. Similarly,

 bc -l < math.txt

will take math input from the file math.txt and write the results to the current terminal. We can also use the pipe operator to direct the standard output of one program to the standard input of another:

 ls -l | less

(This is really good if you have lots of files in a directory.)

Part of the magic of UNIX is that it allows problems to be solved by combining multiple programs together. For example, to get a list of unique words in a file, you can do something like the following:

 tr ' \t' '\n' < file.txt | tr -d ',.!()?-' | sort | uniq | less  

In summary, shells run in terminals, and the terminal interface is implemented by teletype-like devices, either a "tty" or "pts" device. Standard in, out, and error are abstractions for interactions with such terminals or, with I/O redirection, other arbitrary files.

Tasks/Questions

The purpose of the following questions and tasks is to help you understand how 3000shell works. At the end of this you should have an understanding of every function and every line of the code. If you understand 3000shell, then you understand the basics of all UNIX shells. Your understanding of the code will be tested later, so use this opportunity to dive deep into the code. These tasks and questions should help you build up a mental model of how 3000shell works.

  1. Compile and run 3000shell.c.
  2. Try running programs in the background using & after commands entered in 3000shell. What happens to the input and output of the program? Try this for simple programs like ls and bc. Then, try it for more complex interactive programs such as nano and top.
  3. You may have trouble interacting with the shell after running programs in the background. How can you recover from such a situation?
  4. Run 3000shell under gdb and observe all the system calls it makes using catch syscall (after setting a breakpoint on main so you don't see the syscalls when it starts). Where does each system call happen? In what context (source and assembly)? Consider both parent and child processes (by setting follow-fork-mode to parent and child). Compare with the output of strace (e.g., run strace -fqo 3000shell.log ./3000shell). For info on gdb, see the GDB quick start.
  5. 3000shell implements a simple form of output redirection. What syntax should you use to redirect standard output to a file?
  6. Why are lines 207-210 there (the check for pid == -1)?
  7. Make find_binary show every attempt to find a binary.
  8. Make the shell output "Ouch!" when you send it a SIGUSR1 signal.
  9. Delete line 324 (SA_RESTART). How does the behavior of 3000shell change?
  10. Replace the use of find_env() with getenv(). How do their interfaces differ?
  11. Make plist output the parent process id for each process, e.g. "5123 ls (5122)". Pay attention to the stat and status files in the per-process directories in /proc.
  12. Implement redirection of standard error.
  13. Implement redirection of standard output for plist() (the same as if it were an external command).
  14. Implement a built-in 3000kill command that works like the standard kill command. (What system call/library call is used to send signals?)

Code

3000shell.c

/* 3000shell.c */
/* v2 Sept. 15, 2019 */
/* v1 Sept. 24, 2017 */
/* based off of csimpleshell.c, Enrico Franchi © 2005
      https://web.archive.org/web/20170223203852/
      http://rik0.altervista.org/snippets/csimpleshell.html */
/* Original under "BSD" license */
/* This version is under GPLv3, copyright 2017, 2019 Anil Somayaji */
/* You really shouldn't be incorporating parts of this in any other code,
   it is meant for teaching, not production */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <dirent.h>
#include <ctype.h>
#include <fcntl.h>
#include <signal.h>

#define BUFFER_SIZE 1<<16
#define ARR_SIZE 1<<16
#define COMM_SIZE 32

const char *proc_prefix = "/proc";

void parse_args(char *buffer, char** args, 
                size_t args_size, size_t *nargs)
{
        char *buf_args[args_size]; /* You need C99 */
        char **cp, *wbuf;
        size_t i, j;
        
        wbuf=buffer;
        buf_args[0]=buffer; 
        args[0] =buffer;
        
        for(cp=buf_args; (*cp=strsep(&wbuf, " \n\t")) != NULL ;){
                if ((*cp != NULL) && (++cp >= &buf_args[args_size]))
                        break;
        }
        
        for (j=i=0; buf_args[i]!=NULL; i++){
                if (strlen(buf_args[i]) > 0)
                        args[j++]=buf_args[i];
        }
        
        *nargs=j;
        args[j]=NULL;
}

/* this is kind of like getenv() */
char *find_env(char *envvar, char *notfound, char *envp[])
{
        const int MAXPATTERN = 128;
        int i, p;
        char c;
        char pattern[MAXPATTERN];
        char *value = NULL;

        p = 0;
        while ((c = envvar[p])) {
                pattern[p] = c;
                p++;
                if (p == (MAXPATTERN - 2)) {
                        break;
                }
        }
        pattern[p] = '=';
        p++;
        pattern[p] = '\0';
        
        i = 0;
        while (envp[i] != NULL) {
                if (strncmp(pattern, envp[i], p) == 0) {                        
                        value = envp[i] + p;
                }
                i++;
        }

        if (value == NULL) {
                return notfound;
        } else {
                return value;
        }
}

void find_binary(char *name, char *path, char *fn, int fn_size) {
        char *n, *p;
        int r, stat_return;

        struct stat file_status;

        if (name[0] == '.' || name[0] == '/') {
                strncpy(fn, name, fn_size);
                return;
        }
        
        p = path;
        while (*p != '\0') {       
                r = 0;
                while (*p != '\0' && *p != ':' && r < fn_size - 1) {
                        fn[r] = *p;
                        r++;
                        p++;
                }

                fn[r] = '/';
                r++;
                
                n = name;
                while (*n != '\0' && r < fn_size) {
                        fn[r] = *n;
                        n++;
                        r++;
                }
                fn[r] = '\0';

                
                stat_return = stat(fn, &file_status);

                if (stat_return == 0) {
                        return;
                }

                if (*p != '\0') {
                        p++;
                }
        }
}

void setup_comm_fn(char *pidstr, char *comm_fn)
{
        char *c;

        strcpy(comm_fn, proc_prefix);
        c = comm_fn + strlen(comm_fn);
        *c = '/';
        c++;
        strcpy(c, pidstr);
        c = c + strlen(pidstr);
        strcpy(c, "/comm");
}

void plist()
{
        DIR *proc;
        struct dirent *e;
        int result;
        char comm[COMM_SIZE];  /* seems to just need 16 */        
        char comm_fn[512];
        int fd, i, n;

        proc = opendir(proc_prefix);

        if (proc == NULL) {
                fprintf(stderr, "ERROR: Couldn't open /proc.\n");
        }
        
        for (e = readdir(proc); e != NULL; e = readdir(proc)) {
                if (isdigit(e->d_name[0])) {
                        setup_comm_fn(e->d_name, comm_fn);
                        fd = open(comm_fn, O_RDONLY);
                        if (fd > -1) {                                
                                n = read(fd, comm, COMM_SIZE);
                                close(fd);
                                for (i=0; i < n; i++) {
                                        if (comm[i] == '\n') {
                                                comm[i] = '\0';
                                                break;
                                        }
                                }
                                printf("%s: %s\n", e->d_name, comm);
                        } else {
                                printf("%s\n", e->d_name);
                        }
                }
        }
        
        result = closedir(proc);
        if (result) {
                fprintf(stderr, "ERROR: Couldn't close /proc.\n");
        }
}

void signal_handler(int the_signal)
{
        int pid, status;

        if (the_signal == SIGHUP) {
                fprintf(stderr, "Received SIGHUP.\n");
                return;
        }
        
        if (the_signal != SIGCHLD) {
                fprintf(stderr, "Child handler called for signal %d?!\n",
                        the_signal);
                return;
        }

        pid = wait(&status);

        if (pid == -1) {
                /* nothing to wait for */
                return;
        }
        
        if (WIFEXITED(status)) {
                fprintf(stderr, "\nProcess %d exited with status %d.\n",
                        pid, WEXITSTATUS(status));
        } else {
                fprintf(stderr, "\nProcess %d aborted.\n", pid);
        }
}

void run_program(char *args[], int background, char *stdout_fn,
                 char *path, char *envp[])
{
        pid_t pid;
        int fd, *ret_status = NULL;
        char bin_fn[BUFFER_SIZE];

        pid = fork();
        if (pid) {
                if (background) {
                        fprintf(stderr,
                                "Process %d running in the background.\n",
                                pid);
                } else {
                        pid = wait(ret_status);
                }
        } else {
                find_binary(args[0], path, bin_fn, BUFFER_SIZE);

                if (stdout_fn != NULL) {
                        fd = creat(stdout_fn, 0666);
                        dup2(fd, 1);
                        close(fd);
                }
                
                if (execve(bin_fn, args, envp)) {
                        puts(strerror(errno));
                        exit(127);
                }
        }
}

void prompt_loop(char *username, char *path, char *envp[])
{
        char buffer[BUFFER_SIZE];
        char *args[ARR_SIZE];
        
        int background;
        size_t nargs;
        char *s;
        int i, j;
        char *stdout_fn;
        
        while(1){
                printf("%s $ ", username);
                s = fgets(buffer, BUFFER_SIZE, stdin);
                
                if (s == NULL) {
                        /* we reached EOF */
                        printf("\n");
                        exit(0);
                }
                
                parse_args(buffer, args, ARR_SIZE, &nargs); 
                
                if (nargs==0) continue;
                
                if (!strcmp(args[0], "exit")) {
                        exit(0);
                }
                
                if (!strcmp(args[0], "plist")) {
                        plist();
                        continue;
                }

                background = 0;            
                if (strcmp(args[nargs-1], "&") == 0) {
                        background = 1;
                        nargs--;
                        args[nargs] = NULL;
                }

                stdout_fn = NULL;
                for (i = 1; i < nargs; i++) {
                        if (args[i][0] == '>') {
                                stdout_fn = args[i];
                                stdout_fn++;
                                printf("Set stdout to %s\n", stdout_fn);
                                for (j = i; j < nargs - 1; j++) {
                                        args[j] = args[j+1];
                                }
                                nargs--;
                                args[nargs] = NULL;
                                break;
                        }
                }
                
                run_program(args, background, stdout_fn, path, envp);
        }    
}

int main(int argc, char *argv[], char *envp[])
{
        struct sigaction signal_handler_struct;
                
        char *username;
        char *default_username = "UNKNOWN";
        
        char *path;
        char *default_path = "/usr/bin:/bin";
        
        memset (&signal_handler_struct, 0, sizeof(signal_handler_struct));
        signal_handler_struct.sa_handler = signal_handler;
        signal_handler_struct.sa_flags = SA_RESTART;
        
        if (sigaction(SIGCHLD, &signal_handler_struct, NULL)) {
                fprintf(stderr, "Couldn't register SIGCHLD handler.\n");
        }
        
        if (sigaction(SIGHUP, &signal_handler_struct, NULL)) {
                fprintf(stderr, "Couldn't register SIGHUP handler.\n");
        }
        
        username = find_env("USER", default_username, envp);
        path = find_env("PATH", default_path, envp);

        prompt_loop(username, path, envp);
        
        return 0;
}