Talk:COMP 3000 Essay 1 2010 Question 7: Difference between revisions

Revision as of 20:33, 11 October 2010

The Question

Original:
How is it possible for systems to support millions of threads or more within a single process? What are the key design choices that make such systems work - and how do those choices affect the utility of such massively scalable thread implementations?

For Paul:
Please rephrase, you're just restating the question.

Rannath:
The question seems to be about the number and scalability of threads, not the gross mechanics.

To be clearer: we can narrow our focus from thread implementations in general to thread scalability, and ignore whatever is required for all threads unless it is specifically required for many threads. (I didn't find any implementations that required hardware.)

I would also argue that since OSs have to run on multiple hardware platforms, one cannot guarantee that unique/rare hardware features will be there. While we can talk about hardware, we should limit it to a mention at most. Or we could mention prospective hardware that could help but is not yet standard. It depends on whether we want to do "as it is" or "as it might be".

On "utility of such massively scalable thread implementations": I took this to mean what functionality (of single threads) one has to give up to make threads scalable.

Paul:
Our topic has three parts, from what I see:

How is it possible for systems to support millions of threads or more within a single process?
What are the key design choices that make such systems work -
and how do those choices affect the utility of such massively scalable thread implementations?

We need to find a way to split it between 5 people so everyone focuses primarily on one aspect. If you guys don't mind, I'd like to discuss the format of our essay. Paul

Gautam:
I think the hardware is as relevant as the software. Not everything can be done in software, and hardware support is an important factor in most of the solutions to the problems that OSs face. My take.

Henry:
Since the question is about the system as a whole, I think the answer should include both software and hardware support for large numbers of threads. The question revolves around how a system can handle millions of threads and what the major factors are that allow the system to do it. Also, the last part of the question seems to ask what this number of threads allows a process to do.

Group 7

Let us start by listing our names and email IDs (preferred).

Gautam Akiwate <gautam.akiwate@gmail.com>

Patrick Young(rannath) <rannath@gmail.com>

vG Vivek <support.tamiltreasure@gmail.com>

Henry Irving <sens.henry@gmail.com>

Paul Raubic <paul_raubic@hotmail.com>

Guidelines

Raw info should have some indication of where you got it for citation.

Claim your info so we don't need to dig for who got what when we need clarification.

Feel free to provide info for or edit someone else's info, just keep their signature so we can discuss changes

Sign changes (once), preferably without time stamps. Ex: --Rannath

Please maintain a log of your activities in the Log Section so that we can keep track of the evolution of the essay. --Gautam

Log

Moved around some info for clarity

Everyone should post their interpretation of the question in the simplest possible English so we're on the same page (as someone, maybe me, seems to have the wrong idea about what we're trying to talk about)

More moving for clarity

added an essay outline at bottom (feel free to change)

filled in the outline somewhat

added questions to the outline for everyone to think on.--Rannath

Facts We have

Start by placing the info here so we can sort through it. I'm going to go into full research/essay writing mode on Sunday if there isn't enough here.

So far we have: Three design choices I've seen:

Smallest possible footprint per-thread (being extremely light weight) - from everywhere
least number (none if at all possible) of context switches per-thread - 5
use of a "thread pool" - 3

The idea is to reduce processor time and storage needed per-thread so you can have more in the same amount of space. --Rannath
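
To put rough numbers on the "smallest possible footprint" point, here is a back-of-the-envelope sketch of how much memory a million thread stacks would need. The three stack sizes are illustrative assumptions rather than measurements of any particular implementation, and a large default stack is usually reserved address space rather than memory actually touched.

```python
# Back-of-the-envelope: memory needed for thread stacks alone.
# All stack sizes below are illustrative assumptions, not measurements.

def stack_memory_gib(num_threads: int, stack_bytes: int) -> float:
    """Return the total stack memory, in GiB, for num_threads threads."""
    return num_threads * stack_bytes / 2**30

ONE_MILLION = 1_000_000

# Hypothetical per-thread stack sizes to compare.
sizes = {
    "8 MiB (assumed default for a kernel-backed thread)": 8 * 2**20,
    "64 KiB (assumed trimmed-down stack)": 64 * 2**10,
    "4 KiB (assumed very light user-mode thread)": 4 * 2**10,
}

for label, size in sizes.items():
    print(f"{label}: {stack_memory_gib(ONE_MILLION, size):,.1f} GiB")
```

The gap between the first and last line of output is the whole argument for light-weight threads: the per-thread footprint, multiplied by a million, decides whether the thread count is feasible at all.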

These are all related ideas.

Ok, since we are discussing design choices maybe we could also elaborate on the two major types of threads. Here, I already wrote a few lines, source can be found in citation section:

Fibers (user-mode threads) switch very quickly and efficiently because no system call is needed and the kernel is oblivious to the switch, which allows for millions of user-mode threads. ISSUES: a blocking system call made by one fiber blocks every other fiber running on the same kernel thread. On the other hand, managing threads through the kernel requires a context switch (between user and kernel mode) whenever a thread is created or removed, so programs with a prodigious number of threads would suffer huge performance hits.--Praubic 18:05, 10 October 2010 (UTC)
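
A minimal sketch of the fiber idea, using Python generators as stand-ins for user-mode threads. The names `fiber` and `run_round_robin` are made up for illustration; this is a toy cooperative scheduler under those assumptions, not how any of the cited implementations work.

```python
from collections import deque

def fiber(name, steps, log):
    """A user-mode 'thread': does one step of work, then yields voluntarily."""
    for i in range(steps):
        log.append((name, i))
        yield  # cooperative switch point; no system call is involved

def run_round_robin(fibers):
    """Toy user-space scheduler: rotate through the fibers until all finish."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the fiber until its next yield
            ready.append(current)  # still runnable, so requeue it
        except StopIteration:
            pass                   # fiber finished, drop it

log = []
run_round_robin([fiber("a", 2, log), fiber("b", 3, log)])
print(log)
```

Each fiber here costs one small generator object, so creating very large numbers of them is cheap. The sketch also shows the issue mentioned above: if any fiber made a blocking call (say `time.sleep`) instead of yielding, the scheduler loop and every other fiber would stall with it.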

User-mode scheduling (UMS) is a light-weight mechanism that applications can use to schedule their own threads. The ability to switch between threads in user mode makes UMS more efficient than thread pools for short-duration work items that require few system calls. Paul

One implementation of UMS is a combination of N:N and N:M, where the N:N relationship exposes N virtual processors to user space so the user can handle scheduling on their own. 5 -Rannath

I would scrap the first two below, at most mention them...

time-division multiplexing
threads vs processes
I/O Scheduling -vG

Splitting this off because I don't think it's technically part of the answer
Multithreading generally occurs by time-division multiplexing. The processor switches between different threads so quickly that the user perceives them as running at the same time. User:vG

Things that we need to cover in the essay:--Gautam 19:35, 7 October 2010 (UTC)
(A)Design Decisions

  1. Type of threading (1:1 1:N M:N)
  2. Signal handling - we might be able to leave this out as it seems some "light weight" threads use no signals
  3. Synchronisation
  4. Memory Handling
  5. Scheduling Priorities (context switching and how it affects the CPU threading process) Paul
  6. Examples of systems that require many threads such as mainframe servers or banking client processing.--Praubic 17:34, 11 October 2010 (UTC)

Things we might want also to cover in the essay (non-essentials here): --Rannath 04:43, 10 October 2010 (UTC)
(A)Design Decisions

  1. Brief History of threading
  2. examples of attempts at getting absurd numbers of threads (failures)
  3. other types of threading, including heavy weight and processes

(B)Kernel

  1. Program Thread manipulation through system calls --Hirving 20:05, 7 October 2010 (UTC)

(C)Hardware --Hirving 19:55, 7 October 2010 (UTC)

  1. Simultaneous Multithreading
  2. Multi-core processors

Essay Outline

Thesis is an answer to the question so... that's the first step, or the last step, we can always present our info and make our thesis match the info.
List all questions and points we have about the topic

Questions:

What makes threads non-scalable? List the problems
What utility do some scalable implementations lack? Why?
Just how scalable does a full utility implementation get?

Answers:

Signals and (maybe) portability both add overhead, which would slow down threads

Intro (fill in info)

Thesis
main topics

Body (made of many main points)

Main Point 1 -Rannath
- efficient thread creation/destruction is more scalable
-- NPTL's improvements over LinuxThreads- primarily due to lower overhead of creation/destruction 1

Main Point 2 -Rannath
- UMS & user-space threads are more scalable
-- context switches are costly (from class)
-- blocking locks have lower latency when twinned with a user space scheduler 8

It is a good idea to avoid creating one (or even N) threads per client request. That approach is classically non-scalable and will eventually cause problems with memory usage or context switching. A thread pool, where incoming requests are treated as tasks for any thread in the pool to handle, is preferable. The scalability of this approach is then limited by the ideal number of threads in the pool, which is usually related to the number of CPU cores. We want each thread to use 100% of the CPU on a single core, so in the ideal case we would have one thread per core, which would reduce context switching to nearly zero. Depending on the nature of the tasks this might not be possible: threads may have to wait for external data or read from disk, so the number of threads may need to be increased by some scaling factor. --Praubic 18:03, 11 October 2010 (UTC)
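
A rough sketch of the thread-pool approach described above, using Python's standard `concurrent.futures`. The `handle_request` and `serve` functions are hypothetical stand-ins for real request handling, and sizing the pool to `os.cpu_count()` is the "one thread per core" assumption from the comment, not a tuned value. Note that in CPython the threads of a CPU-bound pool do not actually run in parallel, so treat this as an illustration of the structure only.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def handle_request(request_id: int) -> int:
    """Hypothetical stand-in for the work done per client request."""
    return request_id * request_id

def serve(requests, workers=None):
    """Run every request as a task on a fixed-size pool of worker threads."""
    # Assumption: one worker per core, falling back to 1 if the count is unknown.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands tasks to whichever worker is free and keeps input order.
        return list(pool.map(handle_request, requests))

results = serve(range(10_000))
print(len(results), results[:5])
```

Ten thousand requests are handled here by a handful of threads, so the number of threads stays fixed no matter how many requests arrive. For tasks that wait on disk or network, the `workers` argument is where the scaling factor mentioned above would go.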

The question is how to support arbitrarily large numbers of threads, not if that's a good idea :P -Rannath

Link the Main points to the thesis

Conclusion

restate info
affirmation of thesis

Sources

Short history of threads in Linux and new implementation of them. NPTL: The New Implementation of Threads for Linux Gautam 22:18, 5 October 2010 (UTC)
This paper discusses the design choices Native POSIX Threads Gautam 22:11, 5 October 2010 (UTC)
lightweight threads vs kernel threads PicoThreads: Lightweight Threads in Java --Rannath 00:23, 6 October 2010 (UTC)
Eigenclass Comparing lightweight threads --Rannath 00:23, 6 October 2010 (UTC)
A lightweight thread implementation for Unix Implementing light weight threads --Rannath 00:49, 6 October 2010 (UTC) Gbint 19:50, 5 October 2010 (UTC)
Not in this group, but I thought that this paper was excellent: Qthreads: An API for Programming with Millions of Lightweight Threads
Difference between single and multi threading [1] vG
Implementation of Scalable Blocking Locks using an Adaptative Thread Scheduler --Gautam 19:35, 7 October 2010 (UTC)
Research Group working on Simultaneous Multithreading Simultaneous Multithreading --Hirving 19:58, 7 October 2010 (UTC)
This site provides in-depth info about threads, threads-pooling, scheduling: http://msdn.microsoft.com/en-us/library/ms684841(VS.85).aspx Paul
Here is another site that outlines THREAD designs and techniques: http://people.csail.mit.edu/rinard/osnotes/h2.html Paul
Interesting presentation: really worth checking out Paul
KERNEL vs USERMODE http://www.wordiq.com/definition/Thread_(computer_science) --Praubic 18:06, 10 October 2010 (UTC)
