COMP3000 Operating Systems F23: Tutorial 8


Latest revision as of 02:07, 25 October 2023

In this tutorial, you’ll learn how virtual addresses are mapped to physical addresses (address translation) and continue using kernel modules to extract information that only the kernel has access to. In particular, the kernel module performs a 5-level page table walk to find the physical address corresponding to a userspace virtual address. In addition to what was discussed in class, you can also read more about 5-level paging if interested.

Introduction

WARNING: The commands and programs in this tutorial are potentially extremely dangerous and may result in crashes or loss of data. Additionally, questions may not work as expected on a machine other than the course VM. For that reason, you are strongly encouraged to do this tutorial on the provided OpenStack virtual machine.

To get started, we will first examine the source code for 3000physicalview.c and 3000memview2.c, both of which are in 3000physicalview.tar.gz (https://people.scs.carleton.ca/~abdou/comp3000/f23/tut8/3000physicalview.tar.gz).

Tasks part A: Getting Started

Compile 3000physicalview and 3000memview2 using the provided Makefile (i.e., by running make).

Insert 3000physicalview by running make insert. Confirm that the module is inserted using lsmod.

  1. Examine the calls to copy_from_user() and copy_to_user() on lines 120 and 132 of 3000physicalview.c (as also discussed in the lecture). Consider the following:
    • How are these functions different from get_user()/put_user() that we have seen in the previous tutorial?
    • Why are these functions necessary? Couldn't we just access the userspace address directly? What could happen if we did?
  2. 3000physicalview exposes its API to userspace in the form of an ioctl(2) call. Consider the following:
    • What is an ioctl? How is it different from a read or write system call? Hint: check man 2 ioctl.
    • How does 3000physicalview implement its ioctl? What arguments does it take?
    • How does 3000memview2 call the ioctl? What arguments does it pass to the ioctl?
  3. Which function does the virtual-to-physical translation? What type is current->mm? How does it give you the address of the pgd? Recall the page table walk explained in the lecture and see how it’s reflected in this function (writing it down is optional).
  4. Once you’ve got the page frame number (pfn), how is the physical address (phys) calculated? Describe in words.

Tasks part B: Examining Physical Memory Mappings

  1. With 3000physicalview inserted, run 3000memview2 and examine the output. Note that it presents virtual memory addresses on the left, and physical addresses on the right. Are these mappings consistent with what you expected (e.g., in terms of patterns)?
  2. Compare 3000memview2 with 3000memview from Tutorial 2. What is similar about their code, and what is different? How similar is their output?
  3. Do you notice a pattern in the virtual addresses of buf[i]? Is this same pattern present in the physical addresses? Why or why not?
  4. Run 3000memview2 a few more times and consider the following:
    • Are the virtual addresses the same or different between runs? How about physical addresses?
    • Some physical addresses don't seem to be changing between runs. Which ones? Why do you think this might be the case?
  5. Force the kernel to drop the virtual memory cache using sync && echo 3 | sudo tee /proc/sys/vm/drop_caches. Run 3000memview2 one more time and note that the physical addresses that stayed the same previously have now changed. What do you think just happened?

Tasks part C: Using bpftrace to Monitor 3000physicalview and 3000memview2

Now recall what you tried with eBPF in Tutorial 7 (where you applied the fix if applicable). We can also do something similar to watch the interaction between 3000physicalview and 3000memview2. Note: No in-depth understanding of eBPF is expected in this course. You only need to understand what is involved and discussed. Feel free to read more if interested.

  1. Use the following "one-liner" (as it is just a one-line string within the single quotation marks) in a new terminal session first, and then run ./3000memview2 in the original one as above:

    sudo bpftrace -e 'tracepoint:syscalls:sys_enter_ioctl { printf("%s: fd=%d; cmd=%d; arg=%ld \n", comm, args->fd, args->cmd, args->arg); }' | grep 3000memview2

    What does this one-liner do? As always, you can check man bpftrace if needed.
  2. (Optional in the submission but you should do it) It seems the command above does not show useful information as the third argument is actually a pointer. So, to be able to show the two arguments virt and phys defined in 3000physicalview.h, you will need to create a *.bt file. Refer to /usr/sbin/*.bt and convert the one-liner above into something like snoop3000physicalview.bt, so that when it runs it prints the virt passed in and the phys that is returned. The code will be posted and explained.
    Note: if you choose to directly include the header file you need to create a copy and remove the MODULE_* macros. Otherwise, you can simply include the struct definition.