COMP 3000 Lecture #3
September 15, 2005
--------------------

Amusing fact: google for "comp 3000 carleton" and you'll find the class web
page as the third link. It doesn't seem to appear if you search for
"operating systems carleton", though.

Now, the paper:

Basic idea: a literature review of a sub-topic of the operating systems
literature, or of some closely related literature.

It is a full paper, roughly 10-15 pages, but it is not graded purely on
length - it is graded on how well it is put together (coverage,
organization, style, etc.).

The full paper outline is due on October 11th. It consists of:
 - an abstract
 - a detailed outline (listing sections and important paragraphs)
 - a list of references

Note that your abstract cannot just discuss the general topic of your
paper; it must also explain your basic argument.

Simple example of topic vs. thesis statement:
 Topic:  squirrels & power lines
 Thesis: Squirrels like to run across power lines because they become
         addicted to the "stimulation" provided by the small shocks they
         receive from them.

In addition, you need at least 10 references: full bibliographic citations
of published papers and/or books (in conferences, journals, or technical
reports).
 - You may cite simple web pages, but they don't count!
 - Good places to look:
   - IEEE, ACM, and USENIX conference proceedings and journals
   - citeseer.com, Google Scholar
 - Once you find one or two papers, use their citations to find others:
   who they cite, and who cites those papers.
 - Also, textbooks (yours and those in the library) have references on many
   topics. What is good about textbooks is that they tend to reference the
   important works. (Important - see below.)

You will be graded in this first part on 1) the quality of your abstract
and outline and 2) the quality and relevance of your references.

You should try to find not just any papers, but the ones that are most
significant (i.e., the ones written by the people who came up with the
original ideas). Note that such papers tend to be cited by others!

***This project will take a lot of work to do right. Do not wait until the
last minute - start working today!***

To meet this deadline, you should probably choose a topic by next Thursday.
While you are not required to, I encourage you to approach the TAs or me
with your ideas for topics (via email or in office hours).

You probably won't be able to understand all of the papers you find today,
but you should be able to get enough out of them to develop a thesis and an
outline.
 - If you need to change your thesis later, after learning more about your
   topic, that's ok - this is _an_ outline, not _the_ outline for your
   paper.

Another suggestion: let the topic come from the papers themselves. Start
looking for papers on topics that you find interesting, and once you find a
few in the same area, use that as your topic.

DO NOT PLAGIARIZE!  This means that:
 - Your paper should be original, written by you alone, and not submitted
   for credit in any other class.
 - While you may get outside help with writing, and you may discuss ideas
   with others, the final work should be essentially yours.
 - You should reference any ideas that are not your own and that are not
   "common knowledge". (If the basic idea is in Tanenbaum in the
   non-research sections, you don't have to cite it - but if you can cite
   the original source, that is always good.)
 - The words _and organization_ must be original; you can't just paraphrase
   paragraphs or entire sections from another work.
 - Quotes should be one sentence long or less, and each should be
   accompanied by at least two sentences of original text.
This is a guideline, not a rule - but if you are breaking it, you are
probably doing something wrong.

Here are two good sources that explain more about how to avoid plagiarism:
 http://owl.english.purdue.edu/handouts/research/r_plagiar.html
 http://www.utoronto.ca/writing/plagsep.html

Discuss ideas for paper topics.

Back to Computer Architecture
-----------------------------

The stored program architecture:
 - the Central Arithmetical unit (CA)
 - the Central Control unit (CC)
 - the Memory (M)
 - the Input/Output devices (I/O)

                           ____________
                          |            |
                          |     M      |
                          |____________|
                                ^
    ____________________________|___________
   |     ____________      _____v______     |     ____________       ____________
   |    |            |    |            |    |    |            |     |            |
   |    |     CA     |<-->|     CC     | <--|--> |    I/O     | <-> |   human    |
   |    |____________|    |____________|    |    |____________|     |____________|
   |________________________________________|

 (From http://www.salem.mass.edu/~tevans/VonNeuma.htm, by Barney J.
 Cabrera.)

Note there are two bottlenecks here: between the CPU and I/O, and between
the CPU and memory. (There is also a bottleneck between I/O and the human,
but that's another story!)

Solutions to the bottlenecks:
 - memory bottleneck: the memory hierarchy
 - I/O bottleneck: connect I/O and memory directly (bypassing the CPU)

The Memory Hierarchy
--------------------

Based on two observations:
 - code locality
 - data locality

So, small amounts of fast memory are used to "cache" the "working set" of
code and/or data so the processor can access it as quickly as possible.

Example hierarchy, from large/slow at the top to small/fast at the bottom:
 Tape backup
 Disk (100G): filesystem and paging/swap
 DRAM (1G)
 motherboard cache (L3) (1M)
 on-chip L2 cache (128K)
 on-chip L1 cache (16K)
 registers (<1K)

Basic storage management strategy: caching
 - keep stuff that's frequently used in close proximity
 - on a cache miss, you generally get worse-than-no-cache latency
 - but on a hit, you do much better
 - you can pre-fetch multiple items, in the hope that they'll be used soon

Caching works particularly well when code and data are local:
 - code: in a loop
 - data: frequently changing local variables
 - note that streaming data doesn't cache well - think multimedia
 (A small C sketch contrasting cache-friendly and cache-unfriendly array
 traversals appears at the end of these notes.)

Hardware manages everything from DRAM up, but the OS is responsible for
disk and tape.

Note that the differing speeds require different memory management
strategies:
 - virtual memory (paging to disk) can use more complex policies because
   the CPU has plenty of time during a disk access
 - the caches closest to the CPU must use simpler circuitry, hence things
   like direct-mapped caches (each address can go in exactly one cache
   line, so a lookup is a single comparison) vs. associative caches (an
   address may be placed in any of several lines, requiring several
   comparisons in parallel)
 (An address-to-cache-line mapping sketch also appears at the end of these
 notes.)

Consider the alternative: why not let the programmer directly decide where
things get stored, instead of having the hardware/OS choose what to put
where (and keep multiple copies of data)?
 - too complex to program, lots of bookkeeping
 - note that this is what the PS3 will require, though...
 - hope they have good compilers...

Note that copying between levels, while speeding up repeated accesses to
the same data, slows down the first access.
 - caching both increases and decreases latency

Interesting fact: mainframes have very little cache - it would just slow
down I/O, since they generally don't access the same data multiple times in
a short time frame.

You can add bandwidth easily, but you must design for low latency.
 - consider a station wagon full of DVDs: high bandwidth, but also high
   latency!

Techniques for masking latency:
 - caching, for repeated use
 - multitasking while waiting
   => a major early motivation for multiprogramming

Modern CPUs are actually very parallel; they have multiple copies of just
about everything.
 - they execute instructions in "parallel" via pipelining
   - like an assembly line: break the job up into small pieces
   - requires sophisticated branch prediction
     - real code isn't a simple assembly line
 (A small branch-prediction sketch appears at the end of these notes.)
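
A minimal C sketch of the data-locality point above (my own illustration,
not from the lecture; the file name, array size, and timing approach are
all assumptions). Summing a large 2D array row-by-row touches memory
sequentially, so every byte of each cache line the hardware fetches gets
used; summing it column-by-column jumps far between accesses, so most of
each fetched line is wasted and the loop is typically several times slower.

  /* locality.c - compile with something like: gcc -O2 locality.c */
  #include <stdio.h>
  #include <time.h>

  #define N 4096                      /* 4096 x 4096 doubles = 128 MB */

  static double a[N][N];

  static double seconds_since(clock_t start)
  {
      return (double)(clock() - start) / CLOCKS_PER_SEC;
  }

  int main(void)
  {
      double sum;
      clock_t start;
      int i, j;

      /* Touch every element once so the memory is actually allocated. */
      for (i = 0; i < N; i++)
          for (j = 0; j < N; j++)
              a[i][j] = 1.0;

      /* Row-major traversal: sequential access, cache-friendly. */
      sum = 0.0;
      start = clock();
      for (i = 0; i < N; i++)
          for (j = 0; j < N; j++)
              sum += a[i][j];
      printf("row-major:    sum=%g  %.2f s\n", sum, seconds_since(start));

      /* Column-major traversal: large strides, cache-unfriendly. */
      sum = 0.0;
      start = clock();
      for (j = 0; j < N; j++)
          for (i = 0; i < N; i++)
              sum += a[i][j];
      printf("column-major: sum=%g  %.2f s\n", sum, seconds_since(start));

      return 0;
  }

The exact ratio depends on the machine, but both loops do exactly the same
arithmetic; only the memory access pattern differs.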
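
A second sketch (again my own, with a made-up line size and line count) of
why a direct-mapped cache can use such simple circuitry: the line an
address belongs to is just a slice of the address bits, so a lookup is one
index computation plus one tag comparison, while an N-way associative cache
must compare N tags at once.

  /* cachemap.c - split addresses the way a direct-mapped cache would. */
  #include <stdio.h>
  #include <stdint.h>

  #define LINE_SIZE 64     /* bytes per cache line (assumed) */
  #define NUM_LINES 256    /* lines in the cache   (assumed) */

  static void map_address(uintptr_t addr)
  {
      uintptr_t offset = addr % LINE_SIZE;               /* byte within the line */
      uintptr_t index  = (addr / LINE_SIZE) % NUM_LINES; /* which cache line     */
      uintptr_t tag    = addr / (LINE_SIZE * NUM_LINES); /* identifies the block */

      printf("addr 0x%06lx -> line %3lu, offset %2lu, tag 0x%lx\n",
             (unsigned long)addr, (unsigned long)index,
             (unsigned long)offset, (unsigned long)tag);
  }

  int main(void)
  {
      /* Two addresses exactly LINE_SIZE * NUM_LINES bytes apart map to the
       * same line, so in a direct-mapped cache they evict each other even
       * if the rest of the cache is empty; an associative cache could keep
       * both resident at once. */
      map_address(0x1040);
      map_address(0x1040 + LINE_SIZE * NUM_LINES);
      return 0;
  }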
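
Finally, a sketch of why "real code isn't a simple assembly line" (again my
own example; the size of the effect depends on the compiler and CPU). The
same branchy loop is timed twice, once over unsorted data and once over
sorted data. When the data is sorted, the branch predictor guesses the
if-statement correctly almost every time and the pipeline stays full, so
the sorted pass is usually much faster.

  /* branch.c - compile with something like: gcc -O1 branch.c
   * (at higher optimization levels the compiler may replace the branch
   * with a conditional move, hiding the effect) */
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  #define N (1 << 20)

  static int data[N];

  static int cmp_int(const void *a, const void *b)
  {
      return *(const int *)a - *(const int *)b;
  }

  static long long count_big(void)
  {
      long long sum = 0;
      int pass, i;
      for (pass = 0; pass < 100; pass++)
          for (i = 0; i < N; i++)
              if (data[i] >= 128)     /* this is the branch being predicted */
                  sum += data[i];
      return sum;
  }

  int main(void)
  {
      clock_t start;
      int i;

      for (i = 0; i < N; i++)
          data[i] = rand() % 256;     /* random values: branch is unpredictable */

      start = clock();
      printf("unsorted: sum=%lld  %.2f s\n", count_big(),
             (double)(clock() - start) / CLOCKS_PER_SEC);

      qsort(data, N, sizeof(int), cmp_int);   /* sorted: branch is predictable */

      start = clock();
      printf("sorted:   sum=%lld  %.2f s\n", count_big(),
             (double)(clock() - start) / CLOCKS_PER_SEC);

      return 0;
  }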