COMP 3000 2021F Assignment 1 Solutions
[Note that this does not follow the template!]

1. [2] By default, what answers will give you a perfect score in 3000quiz? Why?

A: By default, it is impossible to get a perfect score in 3000quiz. You can get 3 out of 4 correct if you answer Ottawa, 5280, and Bob to the 1st, 3rd, and 4th questions. The correct answer for the second is "ENV: DOGNAME", but you can't type that because scanf stops scanning when it detects a space. (1 point for the 3 answers that you can get, 1 for explaining why you can't get a perfect score.)

2. [2] How is the second question in 3000quiz represented in assembly language? How does this compare to the string on line 57, in main()?

A: If you compile it with the gcc in the openstack image (from Ubuntu 21.04), it is represented as something like this:

    .LC10:
            .string "What is the name of Anil's dog?"

The .string means that the following should be treated as a literal string and should be encoded in the binary according to its character-by-character values. The .LC10 (or similar) serves as a label so that other parts of the program can refer to it, specifically the array of pointers to these questions:

    the_questions:
            .quad .LC9
            .quad .LC10
            .quad .LC11
            .quad .LC12
            .quad 0

The .quad entries are 64-bit values that here are pointers to strings (with the last one being NULL, i.e., 0).

Line 57 has a literal string that is given as the first argument to printf(). It is represented exactly the same way:

    .LC4:
            .string "\nYour score is %d out of %d.\n"

The .LC4 is referred to here:

            leaq .LC4(%rip), %rsi

This is part of setting up the arguments for "call __printf_chk@PLT", which is how the printf() statement gets translated into assembly code.

Note that on other systems this output may differ slightly or significantly in its details, but overall the same basic patterns should hold.

3. [2] Where are the_questions, the_answers, and response (user input) stored: the stack, heap, or elsewhere?
Can you verify this by looking at their relative addresses? Explain for each.

A: the_questions and the_answers are stored in the .rodata (read-only data) section, as shown by "objdump -s". This is neither the stack nor the heap, as those parts of memory can be modified. The response is stored separately: it is on the stack with the other local variables. If you take the address of these values as shown in Tutorial 2, you should get output something like this:

    the_questions: 559be2186060
    the_answers:   559be2186020
    &response:     7ffe3046b970
    &score:        7ffe3046b96c

The first two addresses are close to each other and far from the last two. The last two are adjacent and sit near the top of the user address space, which is where the stack is placed.

4. [2] How can you remove the constant numQuestions and replace it with a runtime calculated count of the number of questions? Does anything in the answers data structure help with this calculation? Explain briefly.

A: We can calculate the length of the array by looping through it looking for an entry that is NULL. The following code, for example, works:

    char *q;
    int numQuestions = 0;

    do {
            q = the_answers[numQuestions];
            numQuestions++;
    } while (q != NULL);
    numQuestions--;  /* the ending NULL isn't a question */

Note that this wouldn't be possible if the last value weren't NULL.

5. [2] How can you change 3000quiz so that if you run "./3000quiz Sally" the answer to question 4 is Sally? Only change the data passed to askQuestions(), don't change the code in askQuestions(). Can you do this without copying the string? Explain briefly.

A: We can make the name the first command line argument (if supplied) by adding this code to main():

    if (argc >= 2) {
            the_answers[3] = argv[1];
    }

Note that this code doesn't copy the command line string; instead, it sets the appropriate pointer in the_answers to point directly to the appropriate command line argument. (Just pointers are being changed, no strings are copied.)

6. [2] How can you change 3000quiz so that if you set the environment variable DOGNAME to Anil's dog's name the answer to question 2 will be correct?
(Again, just change the data passed to askQuestions(), not the code of askQuestions().) Can you do this without copying the string? Explain briefly.

A: We can use getenv() to get the value of DOGNAME:

    char *dogName = getenv("DOGNAME");

    if (dogName) {
            the_answers[1] = dogName;
    }

getenv() just returns a pointer to the environment variable string; it doesn't make a copy. If we check the address stored in dogName, we'll see it is in the range for command line arguments and environment variables.

7. [2] Run 3000quiz using setarch -R: setarch -R ./3000quiz. How does this change the execution of 3000quiz? How can you tell?

A: If you do this, the addresses aren't randomized on every run (the -R option disables address space randomization). If we print out any pointer values, we'll see that they are exactly the same between runs. Otherwise we can't tell that anything is different.

8. [2] Does adding environment variables change where things are stored in memory? How do you know?

A: Yes, it does. If we use setarch -R to run 3000quiz and print out the value of dogName, its value will remain constant between runs as long as we don't change the environment variables. However, if we add any new environment variables or change the value of existing ones, the value of dogName will change.

9. [2] Are there system calls (i.e., machine language instructions for making system calls) in the 3000quiz binary? How do you know? Why or why not?

A: There are no system calls in the 3000quiz binary when compiled by default. We can verify this by running objdump -d on it and looking for the "syscall" string. This is because the system call instructions are in library code; our code is just making function calls into the C library.

10. [2] How can you create a static binary of 3000quiz (i.e., one that loads no libraries at runtime)? Does this version have system calls in its binary? How do you know? Why or why not?

A: Passing -static to gcc when compiling produces a statically linked binary. Yes, this version has system calls. Again, if you search the output of objdump -d on the statically linked binary for "syscall", you'll see many occurrences.
(The machine code for them should be "0f 05".) The system calls are still in the library code, except now the library code has been added directly to the binary (rather than it being loaded at runtime).
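
As a combined illustration of the answers to questions 4 to 6, the sketch below collects the three changes into two helper functions. The names count_entries and apply_overrides are hypothetical and do not appear in 3000quiz; the sketch assumes the answers table is a writable, NULL-terminated array of char pointers, as the solutions above imply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of 3000quiz): count the entries in a
 * NULL-terminated array of strings, as in the answer to question 4. */
size_t count_entries(char *const table[])
{
        size_t n = 0;

        assert(table != NULL);
        while (table[n] != NULL)   /* stop at the terminating NULL */
                n++;
        return n;                  /* the NULL itself is not counted */
}

/* Hypothetical helper (not part of 3000quiz): redirect entries of the
 * answers table without copying any strings.
 * - answers[3] points at argv[1] if one was given (question 5).
 * - answers[1] points at the value of DOGNAME if it is set (question 6).
 * Assumes the table has at least four entries before its NULL. */
void apply_overrides(char *answers[], int argc, char *argv[])
{
        char *dog_name;

        if (argc >= 2)
                answers[3] = argv[1];      /* pointer assignment only */

        dog_name = getenv("DOGNAME");      /* points into the environment */
        if (dog_name != NULL)
                answers[1] = dog_name;
}
```

In main(), one would call apply_overrides(the_answers, argc, argv) before askQuestions() and use count_entries(the_answers) in place of the numQuestions constant. Only pointers in the table are changed; the strings stay where the kernel and loader placed them.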