COMP 3000 2020W Assignment 4 Solutions

Tutorial 8 Questions
--------------------

1. Can a process change where data and code is stored in virtual memory? What about in physical memory?

A: A process has complete control over where things are stored in its virtual memory. It has no control over where its code and data are stored in physical memory. Physical memory is managed by the kernel, which can allocate and de-allocate parts of it at will. A process can even find that parts of its address space are not available; in such a situation, it will be paused when that memory is accessed until the kernel makes it available.

2. If two processes mmap the same library, will that library (necessarily) have the same virtual addresses for both processes? What about the same physical addresses?

A: That library will have the same physical addresses, as only one copy will be loaded. However, its location in virtual memory can vary from one process to the next - each can map the library (which is just a file) to different virtual addresses.

3. What system calls does 3000memview2 use to get physical addresses? Are any of these new (ones we haven't previously seen in class)? Why?

A: 3000memview2 uses ioctl system calls to interact with the /dev/physicalview character device. We haven't seen this system call before because it is only used to interact with device files; regular files and symbolic links don't support ioctl calls.

4. Who has access to /dev/physicalview? How do you know (from the code)?

A: Everyone has access, because in physicalview_devnode() on line 161 the mode for the device is set to 0666, which means that the file owner, group, and everyone else have read and write access to the file.

5. List all of the page table lookups that get_physical() does in 3000physicalview.c. Why are there so many lookups?
A: pgd_offset(), p4d_offset(), pud_offset(), pmd_offset(), and pte_offset_map() - five lookups. Some x86-64 chips implement 5-level page tables, so the Linux kernel always uses 5-level lookups; architectures that have fewer levels turn the missing ones into no-ops.

6. Can you do an ioctl call on regular files? Why or why not?

A: You cannot do ioctl calls on regular files. They don't support it, and there is no good reason to support it. The whole point of ioctl calls is to allow access to device-specific functionality defined by a device's driver. Regular files have no such functionality to expose.

7. What are the values of PAGE_SHIFT and PAGE_SIZE? Where are they defined? What do they represent?

A: They are defined in the arch/x86/include/asm/page_types.h file (and in other architecture-specific directories). PAGE_SHIFT represents the number of bits required to represent an offset in a page, 12 in this case. PAGE_SIZE is the number of bytes in a page, which is 2^12 or 4096 bytes. (In the source this is encoded as _AC(1,UL) << PAGE_SHIFT, which is a 1 shifted left by 12 bits, so 2^12. The extra stuff is so the constant works in both assembly language and C.)

Tutorial 9 Questions
--------------------

1. Where is FILTER_PID defined? Where is it used?

A: It is defined on line 20 of 3000shellwatch.py, as part of the string incorporating the PID passed in as an argument. It is then used on line 58 of bpfprogram.c to determine whether the filter function returns 0 or 1. Note that in bpfprogram.c FILTER_PID is a compile-time constant that is passed in on the compilation command line using a -D flag (when called by 3000shellwatch.py).

2. How could you make 3000shellwatch.py watch for events in any process, not just a specific one? What events would it then report?

A: To watch for all processes, change the filter() function in bpfprogram.c to return 0 always.
When you do this, you'll get the events for all processes on the system (including all raw system calls, the messages showing what the user wrote (calls to fgets), received signals, and the distribution of read lengths) that would otherwise just be reported for 3000shell. The number of events is quite large even on the class VMs, as there are a number of programs running constantly in the background. (You still have to pass in a PID on the command line, but it will be ignored. You could change the required flag on line 13 of 3000shellwatch.py to be zero and add a default value if you wanted to not have to specify the PID argument.)

3. Make 3000shellwatch.py monitor all instances of 3000shell by checking a process's comm property. Be sure to remove the PID argument. (Hint: see bashreadline)

A: Replace filter() in bpfprogram.c with the following:

    static int filter()
    {
        char comm[TASK_COMM_LEN] = {};

        /* Copy the current task's command name into comm. */
        bpf_get_current_comm(&comm, sizeof(comm));

        /* Compare byte by byte; returning 0 means "do not filter out". */
        if (comm[0] == '3' && comm[1] == '0' && comm[2] == '0' &&
            comm[3] == '0' && comm[4] == 's' && comm[5] == 'h' &&
            comm[6] == 'e' && comm[7] == 'l' && comm[8] == 'l') {
            return 0;
        }

        return 1;
    }

In 3000shellwatch.py, delete lines 12-15 and line 20, and change line 45 to be something like:

    print('Tracing 3000shell events, ctrl-c to exit...', file=sys.stderr)

(You could also change how probes are attached to processes. There are potentially multiple ways to solve this problem, but this is the most straightforward.)

4. What code of 3000shellwatch runs in userspace? What runs in kernel space?

A: 3000shellwatch.py and utils.py run in userspace (in user mode on the CPU). The code in bpfprogram.c (after being compiled and loaded into the kernel) runs in kernel space (in supervisor mode on the CPU).

5. Why does 3000shellwatch require root privileges to run? Give an example of a small change you could make to 3000shellwatch that would give an unprivileged user the ability to see or do something that they normally can't.
A: 3000shellwatch requires root privileges because it allows observation of any process on the system. As question 3 shows, we can already monitor every process running the 3000shell executable. It is trivial to make this match any executable, or indeed any process. Thus you could observe any data being written or read, including passwords or confidential information. This is way beyond the capabilities of a regular unprivileged user.

6. How are a uprobe and a uretprobe similar? How are they different?

A: A uprobe and a uretprobe both attach probes (code that will be run on a specific event) to userspace functions. A uprobe runs when the function is called; a uretprobe runs when the function returns.

7. What is the signals dictionary used for?

A: The signals dictionary (in utils.py) is used to translate the signal numbers returned by signal events into names; see line 34 in 3000shellwatch.py. The kernel just works with signal numbers, so userspace has to associate them with names.

8. As presented in the tutorial, does 3000shellwatch have to use eBPF to achieve its goals? Could it instead have used ptrace? Argue for or against, based on the level of access you've seen gdb and strace have to processes using the ptrace system call.

A: The original version of 3000shellwatch could have been implemented with ptrace. gdb has shown that ptrace gives access to individual functions (you can set breakpoints), and both gdb and strace track system calls. So long as we are only interested in one process, 3000shellwatch can be implemented using ptrace; however, we can't (easily) use ptrace if we want to monitor more than one process. (It turns out ptrace can change program behavior and is not good to use in production. eBPF is specifically designed to allow monitoring in production environments.)

9. On line 69 of bpfprogram.c, does sys_exit refer to the exit system call? Explain.

A: This event refers to system call exit, not the exit system call.
This probe runs every time a system call exits (returns) and passes that information to userspace unless the filter function excludes it. It also records statistics for every read system call (again, only for calls that aren't filtered out).
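
The behaviour described in the answer to question 9 can be modelled in ordinary Python. This is only a userspace sketch of the logic, not the code in bpfprogram.c: the names on_sys_exit, filter_event, and the sample events are hypothetical, the PID 1234 is made up, and bucketing read sizes by powers of two is an assumption about how the "distribution of read lengths" might be kept. The only outside fact used is that read is system call number 0 on x86-64.

```python
from collections import Counter

SYS_READ = 0       # x86-64 system call number for read
FILTER_PID = 1234  # hypothetical PID being watched


def filter_event(pid):
    """Same convention as filter() in the answers: 0 = keep, 1 = skip."""
    return 0 if pid == FILTER_PID else 1


def on_sys_exit(events):
    """Model a handler that runs whenever ANY system call returns.

    events: iterable of (pid, syscall_number, return_value) tuples.
    Returns (reported, read_sizes): the events that would be passed to
    userspace, and a histogram of successful read return values.
    """
    reported = []
    read_sizes = Counter()
    for pid, nr, ret in events:
        if filter_event(pid):
            continue  # excluded process: nothing reported, nothing counted
        reported.append((nr, ret))
        if nr == SYS_READ and ret > 0:
            # Bucket by floor(log2(ret)); the bucketing is an assumption.
            read_sizes[ret.bit_length() - 1] += 1
    return reported, read_sizes


events = [
    (1234, 0, 10),    # read returning 10 bytes
    (1234, 1, 10),    # a different system call returning 10
    (9999, 0, 4096),  # read in another process, filtered out
    (1234, 0, 4096),  # read returning 4096 bytes
]
reported, read_sizes = on_sys_exit(events)
```

Every system call made by the watched process shows up in reported, regardless of which call it was, while read_sizes only counts the read calls. Nothing in the handler is specific to the exit system call, which is the point of the answer above.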