COMP 3000 2012 Week 4 Notes

From Soma-notes

Package Management

  • Three common ones
  1. dpkg ---> Debian
  2. rpm ---> Red Hat
  3. portage --> Gentoo. Compiles packages from source
  • Historically, packages were distributed as compressed archives (filename.tar.gz)
    • tar --> bundles files into one archive, uncompressed
    • .gz, .bz2 --> compression
  • This method does not take dependencies or pre- and post-install scripts into consideration

Package management's true innovation was that it handled all this.
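
The two-step tarball format described above (tar to bundle, gzip to compress) can be sketched with Python's standard tarfile and gzip modules. This is only an illustration: the file names and contents are made up, and everything happens in memory rather than on disk.

```python
import gzip
import io
import tarfile


def make_tar(files):
    """Bundle {name: bytes} into an uncompressed tar archive, returned as bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical package contents, invented for this example.
files = {
    "hello/bin/hello": b"#!/bin/sh\necho hello\n",
    "hello/README": b"hello " * 2000,
}

tar_bytes = make_tar(files)            # step 1: archive only, no compression
tgz_bytes = gzip.compress(tar_bytes)   # step 2: compress, as in hello.tar.gz

print("tar size:", len(tar_bytes), "tar.gz size:", len(tgz_bytes))

# Unpacking reverses the steps: undo gzip, then read the tar members.
with tarfile.open(fileobj=io.BytesIO(gzip.decompress(tgz_bytes)), mode="r") as tar:
    names = tar.getnames()
print(names)
```

Note that nothing in the archive records what else must be installed first, or what to run before and after unpacking. That is the gap package managers fill.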

Yum, apt-get, and aptitude are wrappers built around rpm and dpkg. They also maintain lists of repos and do a lot of other things. "They have a lot of smarts." The basic functionality is still done by dpkg/rpm, though. Use yum and apt-get for system upgrades.
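
One of the "smarts" is working out which missing dependencies to install, and in what order. A minimal sketch of that idea, assuming a made-up dependency table and using a topological sort from Python's graphlib (Python 3.9+), might look like this. It is not how yum or apt are actually implemented; real tools also handle versions and conflicts.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: package -> packages it depends on.
DEPENDS = {
    "editor": {"libgui", "libc"},
    "libgui": {"libc", "libpng"},
    "libpng": {"libc"},
    "libc": set(),
}


def install_order(target, depends, installed=frozenset()):
    """Return the packages to install for `target`, dependencies first."""
    needed = {}
    stack = [target]
    while stack:
        pkg = stack.pop()
        if pkg in needed or pkg in installed:
            continue
        # Only dependencies that are not already installed matter.
        needed[pkg] = depends[pkg] - set(installed)
        stack.extend(needed[pkg])
    # TopologicalSorter takes node -> predecessors and yields predecessors first.
    return list(TopologicalSorter(needed).static_order())


print(install_order("editor", DEPENDS))
print(install_order("editor", DEPENDS, installed={"libc"}))
```

The low-level tool (dpkg or rpm) would then be asked to install each package in the returned order.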

Package management lets you find out all the dependencies and files needed for a binary. It gives you absolute control over the binaries on your system.

  • you can strip down linux to the base essentials
  • helps you debug programs that don't start

Dpkg (.deb) files are basically archives: an ar archive wrapping tarballs, formats that date back to the 1970s. They contain all the scripts and binaries needed to install the package.

  • on Mac, package management is less of a problem because "applications" are directories containing all the files needed by the program
  • before package managers, Linux software was distributed as tarballs (X.tar.gz)
    • tar
      • tape archiver
      • combines files
      • no compression
    • cpio
      • copy in, copy out
      • similar to tar
    • gz
      • gnu zip
    • bz2
      • bzip2
      • competes with gzip
  • package manager
    • encodes dependencies
    • pre/post installation scripts
    • the scripts are mostly written in shell
    • NB: can uninstall the shell (BAD IDEA)
  • "metapackage managers"
    • built on top of package managers
    • installs missing dependencies
    • resolves conflicts
    • examples
      • yum
      • apt
      • aptitude
  • kernel
    • runs on the hardware
  • regular applications
    • run as processes on the kernel
    • each has its own virtual address space
  • virtual address space
    • maps a virtual address like 2000 to a physical address in memory
    • elaborate virtual memory mechanism (we'll talk about this later)
  • libraries (shared, dynamically linked)
    • one copy loaded into memory, shared between running processes
    • breaks the notion of process separation
  • statically linked library
    • the library code is copied into the program's executable, so it is loaded into memory with the program
    • allows complete process separation
    • wasteful because many copies of the library will be loaded into memory
  • system calls
    • calls from a process to the kernel
    • can watch this to determine how a process is interacting with the outside world through the kernel
    • tools
      • strace
        • watch system calls made by a process
      • ltrace
        • watch library calls made by a process
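
To make the system call idea from the list above concrete, here is a small Python sketch. The functions in Python's os module used here (os.pipe, os.write, os.read, os.close, os.getpid) are thin wrappers around the corresponding kernel calls, so each one crosses from the process into the kernel. The helper name echo_through_pipe is invented for this example.

```python
import os


def echo_through_pipe(message: bytes) -> bytes:
    """Send bytes through a kernel pipe and read them back."""
    read_fd, write_fd = os.pipe()      # ask the kernel for a pipe
    try:
        os.write(write_fd, message)    # write system call
        return os.read(read_fd, len(message))  # read system call
    finally:
        os.close(read_fd)              # close system calls
        os.close(write_fd)


# Even asking for our own process ID goes through the kernel.
print("pid:", os.getpid())
print(echo_through_pipe(b"hello kernel\n"))
```

Running a script like this under strace (for example `strace -e trace=read,write python3 demo.py`, where demo.py is a hypothetical file name) should show these write and read calls among the interpreter's own startup activity.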

System Calls

  • strace
  • CPU has two modes
    • user mode
      • processes run in this mode
    • supervisor mode
      • kernel runs in this mode
  • A process can be in one of several states, such as:
    • running : on CPU and executing
    • ready : ready to run
    • blocked : waiting on I/O
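
The three states above can be sketched as a small state machine. The transitions and event names below follow the usual textbook model and are assumptions added for illustration; they are not spelled out in these notes.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"   # on CPU and executing
    READY = "ready"       # ready to run
    BLOCKED = "blocked"   # waiting on I/O


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.READY, "dispatch"): State.RUNNING,      # scheduler picks the process
    (State.RUNNING, "preempt"): State.READY,       # scheduler takes the CPU away
    (State.RUNNING, "io_request"): State.BLOCKED,  # process waits on I/O
    (State.BLOCKED, "io_done"): State.READY,       # I/O finished
}


def step(state, event):
    """Return the next state, or raise ValueError for an invalid transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


state = State.READY
for event in ["dispatch", "io_request", "io_done", "dispatch"]:
    state = step(state, event)
    print(event, "->", state.value)
```

In this model a blocked process never goes straight back to running: it first becomes ready and then waits for the scheduler to pick it.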


  • Scheduler decides what happens next.
  • Disk, network, user input devices, clock, etc. can cause interrupts.
  • Making a system call generates a software interrupt.
  • On interrupt:
    • save CPU state (registers) to the process structure
    • switch to supervisor mode
    • run interrupt handler (system call dispatcher)
    • return to user space (scheduler will restore the CPU state)
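
The interrupt steps above can be mimicked with a toy simulation. This is a heavily simplified sketch: the CPU and Process classes, the register names, and the fake_getpid handler are all invented for illustration, and the "scheduler" here always resumes the same process, which a real one need not do.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int
    saved_registers: dict = field(default_factory=dict)


@dataclass
class CPU:
    mode: str = "user"
    registers: dict = field(default_factory=dict)


def handle_interrupt(cpu, current, handler, log):
    # 1. Save CPU state (registers) into the process structure.
    current.saved_registers = dict(cpu.registers)
    log.append("save state")
    # 2. Switch to supervisor mode.
    cpu.mode = "supervisor"
    log.append("supervisor mode")
    # 3. Run the interrupt handler (system call dispatcher).
    result = handler(cpu)
    log.append("run handler")
    # 4. Return to user space, restoring the saved state.
    cpu.registers = dict(current.saved_registers)
    cpu.mode = "user"
    log.append("return to user")
    return result


cpu = CPU(registers={"pc": 2000, "sp": 8000})
proc = Process(pid=1)
log = []
modes_seen = []


def fake_getpid(cpu):
    """Hypothetical handler: runs in the kernel and clobbers a register."""
    modes_seen.append(cpu.mode)
    cpu.registers["pc"] = 0
    return proc.pid


result = handle_interrupt(cpu, proc, fake_getpid, log)
print(result, cpu.mode, cpu.registers)
print(log)
```

The handler overwrites a register on purpose, to show why the state has to be saved first: after the return to user mode, the process sees exactly the registers it had before the interrupt.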