COMP 3000 2022F Assignment 1 Solutions

1. [2] There are three functions defined in 3000menu.c. Which lines in 3000menu.s implement each of those functions? How do you know those are the correct lines?

A: The definition of each function starts with the function's static data, followed by a declaration of the address where the function's code begins. The first two functions start with a .section directive; main, however, starts with a .globl definition of the symbol main, likely because there is no static data inside the main function. Each function ends with a .size directive that contains the function name; the last instruction, however, is normally a ret (return). Thus the three functions are on the following lines:

   run_program: 3-60
   choose_program: 61-161
   main: 162-186

2. [1] Why is environ referenced but not defined in 3000menu.s?

A: environ is declared as extern, meaning that another file defines it. In assembly there is no requirement that all symbols be defined; if you use a symbol that isn't defined, it is assumed to come from outside of the file. Thus environ doesn't have to be declared in the assembly language version of the code.

3. [2] What lines in 3000menu.s define the menu array of character arrays (the declaration on line 10 of 3000menu.c)? What lines define the constant string arrays it points to?

A: Lines 208-213 define the actual array of pointers that is menu. The pointers in menu refer to the strings defined on lines 196-203. The whole menu data structure and its contents, along with declarations, are defined on lines 194-213.

4. [2] Which lines of 3000menu.s correspond to lines 72-76 of 3000menu.c, inside of main()? What does each of these lines do?

A: Lines 171-173 and lines 179-183 in 3000menu.s correspond to lines 72-76 in 3000menu.c.
They operate as follows:

   171, call: invokes choose_program().
   172, cmpl: compares the return value of choose_program() (returned in register %eax) with the QUIT global variable, using instruction-pointer-relative addressing so the code is position independent.
   173, je: if the two values are equal, jump to .L22 on line 179; otherwise continue executing (i.e., move on to line 77 of the C code).
   179: just the label .L22:
   180, movl: puts a 0 in %eax (this is the zero in "return 0").
   181, addq: deallocates the local variable by adding 8 to %rsp.
   182: a directive that updates the recorded canonical frame address (CFA), now that we changed the stack pointer.
   183, ret: returns from the function (the return statement).

5. [2] The call to getline on line 56 of 3000menu.c does an implicit malloc() call, requiring a call to free() on line 58. Do these calls result in additional system calls? How do you know?

A: They do not result in additional system calls. If you look at the library call level you'll see calls to getline() and free(); however, there are no sbrk or mmap system calls at this point in program execution. We see things like this:

   12537 write(1, "Your choice? ", 13) = 13
   12537 read(0, "1\n", 1024) = 2
   12537 write(1, "Running /usr/bin/ls\n", 20) = 20

The first write comes from line 54, the printf(). The read comes from getline() on line 56. The second write comes from line 25 in run_program(), which is executed by the call on line 79 in main() once the choose_program() call on line 72 has returned. Thus there are no system calls that can be attributed to memory allocation or deallocation in this part of the trace. (We see this because these allocations are small: when the program starts, the C library's dynamic memory allocator sets aside some default amount of space, and our program never needs more than this. If we never freed the space allocated by getline(), eventually we'd expect to see an sbrk() call somewhere increasing the heap size.)

6. [2] If we replace line 27 of 3000menu.c with "pid = 0;" how does the behavior of the program change? Why?

A: With this change, the program terminates after we run one command; we never get to select another. If we replace pid=fork() with pid=0, we're not doing a fork and we're forcing the code that should run in the child process to run in the original process. This means that we have only one process, not two, and thus the execve replaces the code of 3000menu with the program being run (such as ls). When that program terminates there is no 3000menu to return to, so we don't get another menu printed. (Note that wait() can never be called with this change.)

7. [2] In line 30 of 3000menu.c, we replace one of the string pointers in menu with NULL.

   a. [1] What is the purpose of this line?

   A: This line is used to create the argv[] array for the execve call. We need to at least include the program name in argv, and argv is an array of pointers to strings with the last pointer being NULL. With this change we in effect get a two-element array: the first element is a pointer to the command we are running, and the second is the NULL that terminates the command line argument array.

   b. [1] This doesn't change the menu that is displayed to the user. Why not?

   A: Because we are changing the array in the child process, not the parent. Changes to the memory of the child process don't affect that of the parent, so the parent process's behavior (which prints the menu) is unaffected.

8. [2] How does 3000menu.c's behavior change if you delete line 34, the call to wait? Why?

A: If you delete the call to wait(), the menu will be printed immediately, before the selected program finishes executing, and in fact both can run at the same time.
In the case of ls and whoami, the change is minor (at most messing up screen formatting) because these programs terminate quickly; with top, however, both top and 3000menu try to get user input from the terminal at the same time, interfering with each other.

9. [5] Replace line 29 in 3000menu.s (the assembly code version) with the following two lines:

   mov $57,%eax
   syscall

Create a new binary by compiling the modified assembly language code. It should run the same as before.

   a. [1] Is 3000menu-q2 statically or dynamically linked?

   A: It is dynamically linked, as we didn't specify -static and gcc does dynamic linking by default.

   b. [1] How does the library call behavior of 3000menu-q2 differ from 3000menu? Why?

   A: The library call behavior is different because 3000menu-q2 doesn't call the fork() library function. This is because we removed this call ("call fork@PLT"). The code we added doesn't make any new library calls, so there is one fewer call. (Note that ltrace doesn't work properly past the first program that is run; strace with the -f option, however, works properly.)

   c. [1] How does the system call behavior of 3000menu-q2 differ from 3000menu? Why?

   A: The system call behavior is different because 3000menu-q2 makes a fork system call while 3000menu makes a clone system call. In the GNU C library, the fork library call is implemented using the more general clone system call on Linux. (Note: there may be some other system calls that are made before clone, but clone is the one that implements the fork functionality and creates a new child process.)

   d. [2] Could you modify 3000menu.c so that it compiled to essentially identical assembly as 3000menu-q2? Why or why not? (You only need to consider standard C functionality.)

   A: You could not modify 3000menu.c to produce the same assembly code as 3000menu-q2.s because you cannot write standard C code that causes a "syscall" assembly language instruction to be generated.
C itself has no concept of invoking kernel functionality; it only knows how to call functions. Instead, we'd have to include inline assembly, use a compiler-specific directive to generate the syscall instruction, or call a library function that does this work for us.