Operating Systems 2021F Lecture 3


Video

Video from the lecture given on September 16, 2021 is now available:

Video is also available through Brightspace (Resources->Class zoom meetings->Cloud Recordings tab)

Notes

Lecture 3
---------

Today we're talking about shells

And I'm going to try making a drawing here


      +--------------------------+
      | shell | vi | firefox     |   <--- running programs (processes)
      +--------------------------+
      | kernel (Linux)           |
      |                          |   <--- drivers go in here
      +--------------------------+
      | hardware                 |   <--- firmware goes here, but loaded by kernel
      +--------------------------+        (more to this story)


The shell is just another running program (process)

Drivers are code that is added to the kernel so it can access specific devices

Today we're focused on the shell, because that's how you're going to interact with the kernel (to start with), and because it gives you a nice view of what the kernel is doing

(sorry things have been confusing so far, will get to firmer ground soon)

Openstack only allows ssh connections that come from the Carleton network.  So before you can ssh to openstack, you have to get onto the Carleton network.

Two ways to do this:
 - VPN to Carleton
 - ssh to Carleton (e.g., access.scs.carleton.ca)

If you ssh to Carleton, you can use that ssh connection to "tunnel" other
connections.  That's what the -L option is for.

If you VPN to Carleton, ALL of your network traffic goes through the Carleton network.
 - not so good if you're wanting to stream video at the same time,
   VPNs aren't the fastest

SSH method
----------

So to connect via access, first run
  ssh -L <local port>:<openstack floating IP address>:22 \
     <mycarletonone username>@access.scs.carleton.ca

type in your password, then, in a second terminal on your own machine, run

  ssh -p <local port> student@localhost

and you're on your openstack instance


VPN method
----------

* connect to the Carleton VPN using Cisco Anyconnect or
  openconnect (linux)

* then,

  ssh student@<openstack floating IP address>


Just remember ALL your network traffic is going through Carleton while you are VPN'd.  So don't watch so much youtube :-)  (They log things)

So why does this all matter?
 - this is actually how things used to work
 - you'd login to one machine, then remote shell to other machines to do
   work
 - ssh stands for "secure shell", and is a direct descendant of
   rsh (remote shell)
     - note the repeated uses of the word "shell"
     - this is the original interface for UNIX-like systems
     - GUIs are a MUCH later development

What is SSH actually doing?
 - it connects to a program on the remote system (another ssh process,
   the server side, called sshd)
   - the two ssh processes set up a bidirectional network stream
 - the remote ssh process starts a shell
 - then the local ssh forwards keyboard input to the remote ssh,
   which passes it to the shell
 - the remote shell sends its output to the remote ssh,
   which forwards it to the local ssh to display

You can generally use any local port above 1023; ports 1023 and below are reserved (only root can listen on them)
 - I can talk more about networking later, but just trying to get the basics for now
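
To tie the port rule back to the SSH method above, here is a minimal Python sketch that only builds the two command lines and checks the local port. The helper name tunnel_commands and the example values (port 2222, address 192.0.2.10, user jdoe) are made up for illustration; it does not run ssh for you.

```python
def tunnel_commands(local_port, floating_ip, username,
                    gateway="access.scs.carleton.ca"):
    """Return the two ssh command lines (as argument lists) for the tunnel.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    # Ports 1023 and below are reserved, so insist on an unprivileged port.
    if not 1024 <= local_port <= 65535:
        raise ValueError("pick a local port between 1024 and 65535")

    # First command: log in to the gateway and forward local_port to
    # port 22 (ssh) on the openstack instance.
    first = ["ssh", "-L", f"{local_port}:{floating_ip}:22",
             f"{username}@{gateway}"]

    # Second command (run in another local terminal): connect to the local
    # end of the tunnel, which lands on the openstack instance.
    second = ["ssh", "-p", str(local_port), "student@localhost"]
    return first, second


if __name__ == "__main__":
    # Example values are placeholders, not real accounts or addresses.
    for cmd in tunnel_commands(2222, "192.0.2.10", "jdoe"):
        print(" ".join(cmd))
```

Asking for a reserved port such as 22 raises ValueError, which mirrors the "use something above 1023" advice.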

Note that port 22 is reserved for ssh, so if you ssh localhost or ssh -p 22 localhost, it is the same thing, and you're logging in to your own computer

I'm going to stop talking about SSH now, but I'm happy to discuss further in the forums or in office hours (or later lectures)

What is interesting about all of this is that it shows what a shell is
 - character stream in
 - character stream out
 - does something in between

In general, this is known as a "read eval print" loop
 - comes from the LISP world
 - but is really the same thing here
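
A read eval print loop is short enough to sketch in Python. This is only an illustration of the idea, not how bash is written: the names repl and shout are made up, and the "keyboard" and "screen" are in-memory streams so you can see exactly which characters go in and come out.

```python
import io


def repl(source, sink, evaluate, prompt="> "):
    """Read lines from source, evaluate each one, print results to sink."""
    while True:
        sink.write(prompt)
        line = source.readline()          # READ: character stream in
        if line == "":                    # empty string means end of input
            break
        result = evaluate(line.strip())   # EVAL: do something in between
        sink.write(f"{result}\n")         # PRINT: character stream out
        # LOOP: go around again and wait for the next line


def shout(text):
    """Toy evaluator, purely for illustration."""
    return text.upper()


if __name__ == "__main__":
    fake_keyboard = io.StringIO("hello\nworld\n")
    fake_screen = io.StringIO()
    repl(fake_keyboard, fake_screen, shout)
    print(fake_screen.getvalue())
```

Swap shout for a function that runs a program and you are most of the way to a very basic shell.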

LISP is a programming language
 - should be covered in 3007
 - (+ 2 2) <-- this is what LISP looks like (e.g., Scheme)

(If you want to learn more about LISP, look up LISP machines
and Symbolics.  BTW symbolics.com was the first .com domain ever registered)

Going back to a REPL
 - primitive way of interacting with a computer
 - but MUCH better than what came before

Before you'd do things as a batch
 - give lots of input (often as a deck of punch cards)
 - WAIT
 - get lots of output
 ... ponder the output ...
 repeat

REPLs are interactive
 - you type something, and generally something happens in return almost
   immediately

REPLs are synchronous
 - you have to wait for one input to be evaluated before giving new input

but there are ways to do asynchronous things in a REPL

The shell IS a REPL, but one designed for interacting with the operating system
 - its purpose is to run other programs!

The purpose of the default prompt of the shell is so you know you're talking to the shell
 - you can run all kinds of other programs
 - often not clear who you are talking to!

To help you manage running other programs, standard shells give special meaning to certain control key sequences, and most programs don't override them (but they can)
 - Ctrl-C: interrupt
 - Ctrl-Z: suspend/stop
 - Ctrl-D: end of file

If you enable emacs bindings (which is default on the class vm), you can also do some line editing
 - Ctrl-A: start of line
 - Ctrl-E: end of line
 - Ctrl-K: kill (delete to kill buffer)   <-- cut
 - Ctrl-Y: Yank (paste from kill buffer)  <-- paste

Also, on modern shells you have access to command history via cursor keys (up and down)

They also support incremental search
 - Ctrl-S: forward search
 - Ctrl-R: reverse search
 - these are also emacs commands
   - to cancel, hit Ctrl-G

All of these features are supported by the GNU readline library
 - used by bash and many other programs
 - so all have the same interface
 - and also use the same customization file (~/.inputrc), so if you have
   a way you like to interact with command lines, create your config file
   and many programs will respect it

If we go back in time, often you'd want to pause output coming your way
 - especially if it was an out of control command
 - Ctrl-S pauses, Ctrl-Q unpauses
   - this is "flow control"
 - and yes, Ctrl-S also is for search, depends on context
   - basically, are you editing a command line or are you interacting
     with another program

Why am I telling you all this?
 - REPLs like the shell offer a sophisticated interface with many
   nice features to make them efficient to use
 - but, these features aren't very discoverable
    - you have to see them in use or read about them
 - Not great for newbies, but as you gain experience with them,
   it pays off in productivity gains

If your terminal gets messed up, you can type "clear" to clear it
 - and if things are acting really weird, type "reset" to get it back to a sane state

Shells run in terminals
 - emulate the first interfaces
 - provide a text "screen" and a "keyboard"
    - used to be a physical device, now virtual generally in a GUI

You may need to reset your terminal if you've directly displayed a file which isn't pure text (e.g., with cat)

Because, how does cursor positioning work in a terminal?
  - special character sequences

If something isn't pure text, its bytes can happen to contain special character
 sequences (random ones), and the terminal acts on them
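
To make "special character sequences" concrete, here is a small Python sketch, assuming a terminal that understands the common ANSI/VT100 escape sequences (most terminal emulators do). The function names move_cursor and clear_screen are made up for illustration.

```python
ESC = "\x1b"  # the escape character that starts a control sequence


def move_cursor(row, col):
    """ANSI/VT100 sequence to move the cursor to (row, col), 1-based."""
    return f"{ESC}[{row};{col}H"


def clear_screen():
    """Erase the whole display, then put the cursor at the top left."""
    return f"{ESC}[2J{ESC}[H"


if __name__ == "__main__":
    # Written to a terminal, these characters move the cursor instead of
    # being displayed as text.
    print(clear_screen() + move_cursor(5, 10) + "hello from row 5, column 10")
```

A binary file that happens to contain bytes like these has the same effect on your terminal, just without anyone intending it.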

Your history of commands is stored in .bash_history (for bash)
 - erase it if you want to delete your command history

(there are lots of other ways of interacting with the shell, I'm going to stop here)

Lots of interesting files in your main home directory.  By default ls doesn't show you them all.
 - "ls -a" to see all files
 - by default, any file that starts with a period is considered a hidden
   file and isn't shown
    - this is just by convention, no special permissions/restrictions
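
The "hidden by convention" rule is simple enough to sketch in Python. The function list_dir is made up for illustration and only imitates the filtering; one difference from the real ls -a is that os.listdir never reports the "." and ".." entries.

```python
import os
import tempfile


def list_dir(path, show_all=False):
    """Return sorted names in path, skipping dot files unless show_all."""
    names = sorted(os.listdir(path))
    if show_all:
        return names
    # The only thing that makes a file "hidden" is its first character.
    return [name for name in names if not name.startswith(".")]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for name in (".bashrc", ".bash_history", "notes.txt"):
            open(os.path.join(scratch, name), "w").close()
        print(list_dir(scratch))                  # like "ls"
        print(list_dir(scratch, show_all=True))   # like "ls -a"
```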

When we talk about dot files, we are talking about files that start with a period.  They are often used to customize how things work
 - if you want to change how things work when you log in,
   you have to mess with the dot files in your home directory

By default, every user on UNIX has a home directory
 - where they are put when they log in
 - represented by "~"
 
pwd tells me the directory I'm currently in

note that the shell is always in a directory
 - that's the other job of the shell - interact with and manipulate the filesystem
    - but it does so mostly by running programs
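
Every process has its own current directory, not just the shell. The Python sketch below (the function name child_changes_directory is made up) starts a child process that changes directory and shows that the parent stays where it was. This is one reason a command like cd has to be handled by the shell itself and can't be a separate program.

```python
import os
import subprocess
import sys


def child_changes_directory(target):
    """Run a child process that changes its own directory to target.

    Returns (directory the child ended up in, our directory afterwards).
    """
    code = "import os, sys; os.chdir(sys.argv[1]); print(os.getcwd())"
    result = subprocess.run([sys.executable, "-c", code, target],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip(), os.getcwd()


if __name__ == "__main__":
    before = os.getcwd()
    child_dir, after = child_changes_directory(os.sep)
    print("child moved to: ", child_dir)
    print("parent is in:   ", after)
    print("parent unchanged:", after == before)
```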

If you look in ~/.bash_aliases  (/home/student/.bash_aliases) you'll see
the code that defines the scs-backup command
 - yes the shell is generally programmable
 - so you can modify/extend the interface as you wish

When you type a command at a shell, it can either be an internal or external command
 - internal command: bash understands it
 - external command: bash runs another program

ls is an external command, as are less, top, etc

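which tells you where a command lives
 - if which says nothing, it is probably built in (or it is an alias or
   function, or it doesn't exist)
 - "type <command>" in bash tells you for sure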
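
Here is a rough Python sketch of the internal/external decision. It is an approximation, not how bash does it: the set SOME_BASH_BUILTINS is a small hand-picked subset, the function classify is made up, and shutil.which plays the role of the which command by searching a PATH-style list of directories.

```python
import os
import shutil
import tempfile

# A few commands bash handles itself; an illustrative subset, not the full list.
SOME_BASH_BUILTINS = {"cd", "exit", "export", "alias"}


def classify(command, search_path=None):
    """Roughly classify a command name as internal, external, or unknown.

    search_path is a PATH-style string; None means use the real PATH.
    """
    if command in SOME_BASH_BUILTINS:
        return "internal", None
    location = shutil.which(command, path=search_path)  # like "which"
    if location is not None:
        return "external", location
    return "unknown", None


if __name__ == "__main__":
    print(classify("cd"))
    # Put a made-up executable in a scratch directory and search only there.
    with tempfile.TemporaryDirectory() as scratch:
        tool = os.path.join(scratch, "hello-tool")
        with open(tool, "w") as handle:
            handle.write("#!/bin/sh\necho hi\n")
        os.chmod(tool, 0o755)
        print(classify("hello-tool", search_path=scratch))
        print(classify("no-such-tool", search_path=scratch))
```

Calling classify("ls") with no search_path looks through your real PATH, the same way which ls does.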

So you see the shell interface is really one for running other programs
 - but it is a programming language as well, just one optimized for
   running other programs, e.g. "shell scripts"

If you are curious about a command/program, try
 - man <command>  <-- full docs
 - whatis <command> <-- quick summary
 - <command> --help or -h
    returns program's help info

You're not going to learn everything about the shell in this course, or UNIX
 - SO rich
 - I learn new commands all the time (just learned about whatis today!)
 - It will be clear what I want you to know for this class
   - but learning more will make your life better
   - learn what you need to make things easier for yourself

I don't teach to tests
 - very limiting
 - and not so effective

I cover a lot, and then expect you to know a subset
 - that subset is what is on the assignments

If you teach me something on the test and it works, that's great!
 - but mostly you'll be explaining things, so that won't come up so much

So what you should be asking yourself is this
 - if I'm running programs from the shell, how are those programs related to or connected to each other?
 - which one is in control?  Can more than one be running?
 - are they all running at the same time?
 - can they mess with each other?

At least, those are the questions I want you to focus on :-)

A1 will come out shortly after T2, so by early next week

The really cool part about the shell is that you don't have to just run one program at a time
 - you can combine many programs together to accomplish a task


The shell is just another program running in a process
 - see csimpleshell.c, it is a basic shell

consider this command
  du | sort -n -r | head

These are three separate programs
 - How is the output of one being passed to the other?
 - When are they all running?  one at a time, all at the same time?
   - if at the same time, how do they coordinate their behavior?
 - yes the answer is | is a pipe, but what is a pipe actually doing?
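
If you want to poke at these questions yourself, here is a sketch of one way to build the same kind of pipeline by hand in Python with the standard subprocess module. The function name run_pipeline is made up, and this is not necessarily how bash sets things up internally. Notice that every program is started before any output is collected.

```python
import subprocess


def run_pipeline(commands):
    """Run commands connected like cmd1 | cmd2 | ...; return final output.

    Every program is started right away, so they all run at the same
    time, each reading what the previous one writes.
    """
    processes = []
    previous_stdout = None
    for argv in commands:
        proc = subprocess.Popen(argv, stdin=previous_stdout,
                                stdout=subprocess.PIPE)
        if previous_stdout is not None:
            # The child now has its own copy of the pipe's read end; close
            # ours so the upstream program notices if the downstream one
            # exits early (as head does).
            previous_stdout.close()
        previous_stdout = proc.stdout
        processes.append(proc)

    # Read everything the last program writes, then wait for the others.
    output = processes[-1].communicate()[0]
    for proc in processes[:-1]:
        proc.wait()
    return output
```

For example, run_pipeline([["du"], ["sort", "-n", "-r"], ["head"]]) returns (as bytes) what the command above prints.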

We'll discuss this next time.