Operating Systems 2018F Lecture 12

From Soma-notes
==Video==
Video from the lecture given on October 17, 2018 [https://homeostasis.scs.carleton.ca/~soma/os-2018f/lectures/comp3000-2018f-lec12-20181017.m4a is now available].
 


==Notes==
{{Please leave this line alone and write below (this is the coloured heading)}}
== Today: Kernel modules ==
=== First, Anil's ssh setup: ===
<code>student@anil-vm:</code> // openstack
* why is it named anil-vm?
** the VM's hostname was set by editing /etc/hosts and /etc/hostname
<code>anilclass@sutherland:</code> // local machine
* edit /etc/hosts
** add the IP of your VM and give it a name
** DNS does the same thing for hostnames on the internet, as a distributed database
* login without password
** ssh key pair, private and public
** files in ~/.ssh/
Public keys?
* sharing your public key won't compromise security; only the private key has to stay secret
Running X-windows programs on vm:
* <code>ssh -X student@anil-vm</code> // works natively in linux
* on macOS, you can run an X server separately
* on Windows, use x2go
== What's a kernel module? ==
* kernel code that you load into the running kernel
* same privileges as other kernel code
* dangerous! once a kernel module is loaded, anything can happen
** so only the superuser (root) can load one
* a module is NOT a system call
Kernel oops:
* the kernel catches the bad memory access, reports it, and tries to keep running
Kernel crash:
* the kernel didn't catch it
Build the kernel from scratch? Not usually: most drivers and optional features are in modules now
What modules are loaded?
<code>lsmod</code>
* gets info from <code>/proc/modules</code>
* reading <code>/proc/modules</code> reads a kernel data structure (it is not a file on disk)
* <code>strace lsmod</code>:
** shows that it starts by opening <code>/proc/modules</code>
** but then goes into <code>/sys/module/</code>
* <code>/proc</code> : human readable
* <code>/sys</code> : machine readable
** you won't find a single list of modules here, but rather a deep directory structure
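As a rough illustration of what <code>lsmod</code> does with <code>/proc/modules</code>, here is a minimal userspace sketch in C. It assumes each line starts with the module name, its size in bytes, and its use count, which is the usual layout on a Linux system. The names <code>mod_entry</code>, <code>parse_modules_line</code> and <code>list_modules</code> are made up for this example; this is not how <code>lsmod</code> itself is written.

```c
#include <stdio.h>

/* One entry of /proc/modules. Only the first three whitespace-separated
 * fields are kept: module name, size in bytes, use count.
 * (Struct and function names are hypothetical, for illustration only.) */
struct mod_entry {
    char name[64];
    unsigned long size;
    int refcount;
};

/* Parse a single line in /proc/modules format.
 * Returns 0 on success, -1 if the three leading fields are not present. */
int parse_modules_line(const char *line, struct mod_entry *out)
{
    if (sscanf(line, "%63s %lu %d", out->name, &out->size, &out->refcount) != 3)
        return -1;
    return 0;
}

/* Print name, size and use count for every module listed in the file.
 * Returns the number of modules printed, or -1 if the file cannot be opened. */
int list_modules(const char *path)
{
    char line[1024];
    struct mod_entry m;
    int count = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_modules_line(line, &m) == 0) {
            printf("%-24s %10lu %d\n", m.name, m.size, m.refcount);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Calling <code>list_modules("/proc/modules")</code> from a <code>main()</code> on a Linux machine should print one line per loaded module, roughly like the first three columns of <code>lsmod</code>.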
Where are kernel modules stored?
* <code>/lib/modules/<kernel version>/</code>
* why so many kernel versions?
** ex. 4.15.0-33
*** 4.15.0 is the kernel version
*** -33 is the package release (patches etc.)
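To make the naming concrete, the following sketch splits a release string into the parts described above and builds the matching module directory. It assumes the <code>version-release-suffix</code> layout of the example in these notes (ex. <code>4.15.0-33-generic</code>); other distributions may name their kernels differently. The struct, its field names, and both functions are hypothetical.

```c
#include <stdio.h>

/* Pieces of a release string such as "4.15.0-33-generic".
 * (Names are made up for this sketch.) */
struct release {
    int major, minor, patch;   /* kernel version: 4.15.0 */
    int package;               /* package release: 33 */
    char flavour[32];          /* variant suffix, e.g. "generic"; may be empty */
};

/* Returns 0 if at least "a.b.c-n" was recognised, -1 otherwise. */
int parse_release(const char *s, struct release *r)
{
    int n;

    r->flavour[0] = '\0';
    n = sscanf(s, "%d.%d.%d-%d-%31s",
               &r->major, &r->minor, &r->patch, &r->package, r->flavour);
    return n >= 4 ? 0 : -1;
}

/* Build the directory that holds the modules for a given release string.
 * Returns 0 on success, -1 if the result does not fit in out. */
int modules_dir(const char *release, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "/lib/modules/%s/", release);

    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}
```

On a running system the release string is what <code>uname -r</code> prints, so the directory for the current kernel can be built from that.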
Why so many modules? Why not build them in?
* originally, everything was built in; now we basically never do that
** some may conflict
** many are unnecessary
** memory used by the kernel always occupies physical RAM; it is never swapped out
*** so we minimize this by only loading the kernel modules that are needed
<code>.ko</code>: kernel object
<code>.so</code>: shared object (ex. C libraries)
Where is the kernel itself?
<code>/boot/vmlinuz-4.15.0-33-generic</code>
* the naming is traditional; the 'z' is for compressed
* variants may have a different suffix, ex. 'rt' for realtime, instead of 'generic'
* the kernel image is around 8 MB
== Review tutorial ==
* <code>sudo insmod simple.ko</code> // load kernel module
* <code>dmesg | tail</code> // shows the module's log message, confirming that the module was loaded
* <code>sudo rmmod simple.ko</code> // removes kernel module
=== simple.c ===
* Why <code>#include <linux/ ...></code>?
* No headers that are familiar, ex. <code>stdlib</code>, etc.
** You can't use these in kernel modules. Why?
*** The C library is userspace code; you can't use any library routine that makes a system call
** What if you wanted to <code>printf</code>?
*** <code>printk()</code> is a reimplementation, because
**** <code>printf</code> would "bottom out" by making a <code>write()</code> system call
**** <code>printk</code> is used to generate logs
**** how does it do that?
***** it keeps messages in a buffer until they can be printed somewhere
***** bootup messages? produced by <code>printk</code>
*** the kernel has its own weird world of standard libraries
* Some special syntax around declaring the init and exit functions
* How does the kernel actually load a module? (i.e. what happens behind the <code>module_init()</code> call in simple.c)
** don't worry about it. there is always something you don't understand...
** and this is one of those things few people need to understand!
** you just need to know how to use it.
* Why doesn't <code>simple_exit</code> return anything? Nobody cares what it returns!
** vs. if init failed... we would care, so init returns a status
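The idea that <code>printk</code> "keeps a buffer until messages can be printed somewhere" can be modelled in userspace with a small ring buffer. This is only a toy model of the idea, not the kernel's implementation: the size, the struct layout, and the names <code>log_ring</code>, <code>log_write</code> and <code>log_drain</code> are all assumptions made for the sketch.

```c
#include <stddef.h>

#define LOG_SIZE 64            /* tiny on purpose, so wrap-around is easy to see */

/* Toy log buffer (hypothetical names; not the kernel's data structure). */
struct log_ring {
    char data[LOG_SIZE];
    size_t head;               /* index where the next byte is written */
    size_t used;               /* number of valid bytes, at most LOG_SIZE */
};

/* Append a message. When the buffer is full the oldest bytes are
 * overwritten, so the writer never has to wait for a reader. */
void log_write(struct log_ring *r, const char *msg)
{
    for (; *msg != '\0'; msg++) {
        r->data[r->head] = *msg;
        r->head = (r->head + 1) % LOG_SIZE;
        if (r->used < LOG_SIZE)
            r->used++;
    }
}

/* Copy the buffered text, oldest byte first, into out (NUL-terminated)
 * and empty the ring. Bytes that do not fit in out are discarded.
 * Returns the number of bytes copied. */
size_t log_drain(struct log_ring *r, char *out, size_t outlen)
{
    size_t start = (r->head + LOG_SIZE - r->used) % LOG_SIZE;
    size_t n = 0;

    if (outlen == 0)
        return 0;
    while (n < r->used && n + 1 < outlen) {
        out[n] = r->data[(start + n) % LOG_SIZE];
        n++;
    }
    out[n] = '\0';
    r->used = 0;
    return n;
}
```

In this model <code>log_write</code> plays the role of <code>printk</code> and <code>log_drain</code> plays the role of whatever later prints the messages, such as <code>dmesg</code>. A writer that can always append without blocking is the property that matters for code that cannot make a system call.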
Rule of kernel programming: do as little as possible
* nicer to write in userspace
=== ones.c ===
* kernel module that implements a device driver for <code>/dev/ones</code>
** from the Linux kernel's perspective, it's the same as any other device
** a module might be a device driver, but might not be
** and a driver might be baked into the kernel (not a module)
** but usually, device drivers are implemented as modules
* makes a character device: <code>register_chrdev()</code>
** every device has a major and a minor number; <code>register_chrdev()</code> returns the major number when asked to allocate one dynamically
* <code><strong>ones_fops { }</strong></code>: important. A struct containing function pointers.
** we never call these functions ourselves in ones.c
** these implement the file operations on <code>/dev/ones</code>
*** after the normal syscall dispatcher stuff, the kernel needs to know HOW to open the device, read from it, etc.
*** what if we wanted to do a write?
**** well, <code>/dev/ones</code> is read-only by default. let's <code>chmod a+w /dev/ones</code>.
**** <code>echo 5 > /dev/ones</code>  // write error
**** we only have the operations that we've implemented
** who calls them? the kernel does, when a process makes the matching system call on <code>/dev/ones</code>
* <code>class_create()</code>?
** can't just create a device from the major number
** need a class first before running <code>device_create()</code>
** a class is a grouping of all devices that use the same driver
** see <code>/sys/class/comp3000</code>
* <code>/sys/devices/virtual/comp3000/ones</code>
** what's this about? metadata.
** thankfully we didn't have to create this ourselves; <code>device_create()</code> took care of all that
* <code>failed_devreg</code>, etc:
** handling failures is really important in kernel programming.
** if something fails and you don't manually deregister / deallocate what you set up, it never gets "cleaned up"
*** it stays in physical memory until you reboot
* How did we come up with this code, generally?
** no manpages
** the linux kernel source is the ultimate reference for how this works
** best way to figure it out is to look at existing device drivers, modify to your purpose, trial & error...
* <code>ones_read()</code>:
** similar to <code>read()</code> in userspace
** in userspace you pass a file descriptor; in the kernel you get a pointer to a file struct
** <code>buf, len</code> just as in <code>read()</code>
** offset? the current position within the file
** the file is always going to be <code>/dev/ones</code> in this case (since it is the device driver for it)
** why <code>put_user('1', buf++)</code> instead of <code>*buf++ = '1'</code>?
*** The 1's are going out into userspace: <code>buf</code> is a userspace pointer, and the kernel must not dereference it directly. This is what <code>put_user()</code> is for.
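The "struct containing function pointers" idea, and why the write fails, can be sketched in plain userspace C. This is a simplified model, not the code from ones.c: every name starting with <code>demo_</code> is hypothetical, the struct has only two slots, and returning <code>-EINVAL</code> for a missing operation is a choice made for the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Userspace stand-in for a table of file operations: one function
 * pointer per operation, NULL when the driver does not implement it. */
struct demo_fops {
    ssize_t (*read)(char *buf, size_t len);
    ssize_t (*write)(const char *buf, size_t len);
};

/* Model of a read handler for a device of ones: fill the caller's buffer
 * with the character '1' (ASCII 49), not the byte value 1.
 * A real driver would use put_user() here because buf would be a
 * userspace pointer; in this userspace model a plain store is fine. */
static ssize_t demo_ones_read(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = '1';
    return (ssize_t)len;
}

/* Only read is provided, mirroring a driver that implements no write. */
const struct demo_fops demo_ones_fops = {
    .read = demo_ones_read,
    .write = NULL,
};

/* What a dispatcher does once a read or write request arrives:
 * look up the function pointer and call it, or fail if it is missing. */
ssize_t demo_dispatch_read(const struct demo_fops *f, char *buf, size_t len)
{
    return f->read ? f->read(buf, len) : -EINVAL;
}

ssize_t demo_dispatch_write(const struct demo_fops *f, const char *buf, size_t len)
{
    return f->write ? f->write(buf, len) : -EINVAL;
}
```

The driver never calls <code>demo_ones_read</code> directly; it only fills in the table, and the dispatcher decides when to call it. Changing permissions with <code>chmod</code> cannot make a write succeed in this model, because there is still no function behind the <code>write</code> slot.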
aside: <code>find . -name "ones"</code>
* will find files named "ones" in the current directory (.) and everything below it
Look at newgetpid.c

Revision as of 20:39, 21 November 2018
