Talk:COMP 3000 Essay 1 2010 Question 7

The Question

Original:
How is it possible for systems to support millions of threads or more within a single process? What are the key design choices that make such systems work - and how do those choices affect the utility of such massively scalable thread implementations?

Rannath:
The question seems to be about the number and scalability of threads, not the gross mechanics.

To be more clear: we can limit ourselves from thread implementations in general to thread scalability... ignore the stuff that is required for all threads, unless it's specifically required for many threads. (I didn't find any implementations that required hardware.)

I would also argue that, since OSs have to run on many kinds of hardware, one cannot guarantee that unique/rare hardware features will be there. While we can talk about hardware, we should limit it to a mention at most. OR we could mention prospective hardware that could help out but is not yet standard. It depends on whether we want to do "as it is" or "as it might be".

"Utility of such massively scalable thread implementations": I took this as: what functionality (of single strings) does one have to give up to make threads scalable?

Gautam:
I think the hardware is as relevant as the software. Not everything can be done in software, and hardware support is an important factor in the solutions to many of the problems that OSs face. My take.

Henry:
Since the question is about the system as a whole, I think the answer should include both software and hardware support for large numbers of threads. The question revolves around how a system can handle millions of threads and what the major factors are that allow the system to do it. Also, the last part of the question seems to ask what this number of threads allows a process to do.

Shane:
In response to the idea above about the last part of the question, I would argue that it would enable fast execution, because whenever a thread stalls on a cache miss the work would be picked up by the other threads, so long as there were enough resources. Also, the use of more threads would help synchronize the cache (through sharing) so that it would miss less often. Of course, this would only hold if the threads were assigned to the same task; you cannot sync threads running different applications, it just wouldn't make sense. The only issue with this idea is that the software must support this number of threads.

Group 7

Let us start out by listing our names and email IDs (preferred).

Gautam Akiwate <gautam.akiwate@gmail.com>

Patrick Young(rannath) <rannath@gmail.com>

vG Vivek <support.tamiltreasure@gmail.com>

Shane Panke <shanepanke@msn.com>

Henry Irving <sens.henry@gmail.com>

Paul Raubic <paul_raubic@hotmail.com>

Guidelines

Raw info should have some indication of where you got it for citation.

Claim your info so we don't need to dig for who got what when we need clarification.

Feel free to provide info for or edit someone else's info; just keep their signature so we can discuss changes.

Sign changes (once), preferably without time stamps. Ex: --Rannath

Please maintain a log of your activities in the Log Section. So that we can keep track of the evolution of the essay. --Gautam

Log


Moved around some info for clarity

Everyone should post your interpretation of the question in the simplest possible English so we're on the same page (as someone, maybe me, seems to have the wrong idea about what we're trying to talk about)

More moving for clarity. Added an essay outline at the bottom (feel free to change), filled in the outline somewhat, and added questions to the outline for everyone to think on. --Rannath

First Draft for essay. Please modify and add on. --Gautam 02:46, 13 October 2010 (UTC)

Facts We have

Start by placing the info here so we can sort through it. I'm going to go into full research/essay writing mode on Sunday if there isn't enough here.

So far we have: Three design choices I've seen:

Smallest possible footprint per thread (being extremely lightweight) - from everywhere
Fewest possible context switches per thread (none, if at all possible) - 5
Use of a "thread pool" - 3

The idea is to reduce processor time and storage needed per-thread so you can have more in the same amount of space. --Rannath
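To put rough numbers on the storage side of this, here is a back-of-the-envelope sketch. The two stack sizes (8 MiB, a common default on Linux, and 4 KiB, a single page) are assumptions chosen for illustration, not measurements from any of the sources, and the helper names are made up for this example.

```python
# Back-of-the-envelope stack memory needed for N threads.
# The stack sizes used below are illustrative assumptions, not measurements.

KIB = 1024
MIB = 1024 * 1024


def total_stack_bytes(num_threads, stack_bytes_per_thread):
    """Address space reserved for stacks alone, ignoring all other per-thread state."""
    return num_threads * stack_bytes_per_thread


def threads_that_fit(budget_bytes, stack_bytes_per_thread):
    """How many thread stacks of the given size fit in a memory budget."""
    return budget_bytes // stack_bytes_per_thread


if __name__ == "__main__":
    million = 1_000_000
    heavy = total_stack_bytes(million, 8 * MIB)  # assumed heavyweight stack
    light = total_stack_bytes(million, 4 * KIB)  # assumed lightweight stack
    print(f"1M threads with 8 MiB stacks: {heavy / 2**40:.2f} TiB")
    print(f"1M threads with 4 KiB stacks: {light / 2**30:.2f} GiB")
```

Under these assumptions a million heavyweight stacks need terabytes of address space while a million one-page stacks need a few gigabytes, which is why the per-thread footprint shows up in every source.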

These are all related ideas.

Ok, since we are discussing design choices maybe we could also elaborate on the two major types of threads. Here, I already wrote a few lines, source can be found in citation section:

Fibers (user-mode threads) provide very quick and efficient switching because there is no need for a system call and the kernel is oblivious to the switch, which allows for millions of user-mode threads. ISSUES: a blocking system call made by one fiber blocks all the other fibers running on the same kernel thread. On the other hand, managing threads through the kernel requires a context switch (between user and kernel mode) on creation and removal of a thread, so programs with a prodigious number of threads would suffer huge performance hits. --Praubic 18:05, 10 October 2010 (UTC)
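A minimal sketch of the fiber idea, using Python generators as stand-ins for fibers. This is not how any real fiber library is implemented; `fiber` and `run_round_robin` are hypothetical names made up for this example. The point is only that every switch happens inside the process, with no kernel involvement.

```python
from collections import deque


def fiber(name, steps, log):
    """A hypothetical fiber: does one unit of work, then yields voluntarily."""
    for i in range(steps):
        log.append((name, i))
        yield  # cooperative switch point; no system call is involved


def run_round_robin(fibers):
    """Run all fibers to completion in round-robin order, entirely in user space.

    Returns the number of switches back into the scheduler.
    """
    ready = deque(fibers)
    switches = 0
    while ready:
        current = ready.popleft()
        try:
            next(current)          # resume the fiber until its next yield
            ready.append(current)  # still has work left, so requeue it
            switches += 1
        except StopIteration:
            pass                   # fiber finished; drop it
    return switches


if __name__ == "__main__":
    log = []
    count = run_round_robin([fiber("a", 2, log), fiber("b", 2, log)])
    print(log, count)
```

The sketch also shows the issue listed above: if one of these fibers made a blocking call (for example `time.sleep`), the whole scheduler loop would stall, because all of the fibers share one kernel thread.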

User-mode scheduling (UMS) is a light-weight mechanism that applications can use to schedule their own threads. The ability to switch between threads in user mode makes UMS more efficient than thread pools for short-duration work items that require few system calls. Paul
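For comparison, a thread pool in its simplest form might look like the sketch below. It uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the helper `run_in_pool` and the toy work item are made up for this example, and this says nothing about how the Windows thread pool or UMS are implemented.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_in_pool(num_items, num_workers):
    """Run many short work items on a small, fixed set of worker threads.

    Returns the results and the set of thread ids that actually did the work.
    """
    used_threads = set()
    lock = threading.Lock()

    def work(i):
        with lock:
            used_threads.add(threading.get_ident())
        return i * i  # stand-in for a short-duration work item

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(work, range(num_items)))
    return results, used_threads


if __name__ == "__main__":
    results, used = run_in_pool(1000, 4)
    print(len(results), "items ran on", len(used), "threads")
```

The design choice is that the number of kernel threads stays fixed no matter how many work items are queued, so thread creation and destruction costs are paid only once per worker.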

One implementation of UMS is a combination of N:N and N:M, where the N:N relationship exposes N virtual ("false") processors to user space so that the user can deal with scheduling on their own. 5 -Rannath

I would scrap the first two below, at most mention them...

time-division multiplexing
threads vs processes
I/O Scheduling -vG

Splitting this off because I don't think it's technically part of the answer
Multithreading generally occurs by time-division multiplexing. The processor switches between the different threads, but it happens so fast that the user perceives them as running at the same time. User:vG
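A toy illustration of time-division multiplexing, assuming a single processor and a fixed time quantum. The function name `time_slice` and the task lengths are made up for this example; a real scheduler is preemptive and far more involved.

```python
from collections import deque


def time_slice(tasks, quantum):
    """Simulate round-robin time slicing on one processor.

    tasks maps a task name to the units of work it still needs.
    Returns the timeline as a list of (task, units_run) slices.
    """
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))  # not finished, back of the line
    return timeline


if __name__ == "__main__":
    print(time_slice({"A": 3, "B": 5}, quantum=2))
```

With a small enough quantum the slices of A and B interleave so finely that both appear to make progress at once, even though only one runs at any instant.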

Things that we need to cover in the essay:--Gautam 19:35, 7 October 2010 (UTC)
This section is needed; 4 below is not needed
(A)Design Decisions

  1. Type of threading (1:1 1:N M:N)
  2. Signal handling - we might be able to leave this out as it seems some "light weight" threads use no signals
  3. Synchronisation
  4. Memory Handling
  5. Scheduling Priorities (context switching and how it affects the CPU threading process) Paul

Things we might want also to cover in the essay (non-essentials here): --Rannath 04:43, 10 October 2010 (UTC)
(A)Design Decisions

  1. Brief History of threading
  2. examples of attempts at getting absurd numbers of threads (failures)
  3. other types of threading, including heavy weight and processes
  4. Examples of systems that require many threads such as mainframe servers or banking client processing.--Praubic 17:34, 11 October 2010 (UTC)

Here is an example of a design: (the topic asks for key design choices here is one)

Capriccio is a specific design for scalable user-level threads. It is distinct from most designs in being independent of event-based mechanisms as well as kernel thread models. It is a very good choice for internet servers, and this implementation could easily support 100,000 threads. It is characterized by high scalability, efficient stack management and scheduling based on resource usage; however, its performance is not comparable to event-based systems. --Praubic 13:32, 12 October 2010 (UTC)

(B)Kernel

  1. Program Thread manipulation through system calls --Hirving 20:05, 7 October 2010 (UTC)

(C)Hardware --Hirving 19:55, 7 October 2010 (UTC)

  1. Simultaneous Multithreading
  2. Multi-core processors

Essay Outline

Thesis is an answer to the question so... that's the first step, or the last step, we can always present our info and make our thesis match the info.
List all questions and points we have about the topic

Questions:

What makes threads non-scalable? List the problems
What utility do some scalable implementations lack? Why?
Just how scalable does a full utility implementation get?

Answers:

Signals and portability (maybe) both add overhead, which would slow down threads

Intro (fill in info)

Thesis
main topics

Body (made of many main points)

Main Point 1 -Rannath
- efficient thread creation/destruction is more scalable
-- NPTL's improvements over LinuxThreads- primarily due to lower overhead of creation/destruction 1

Main Point 2 -Rannath
- UMS & user-space threads are more scalable - maybe
-- context switches are costly From class
-- blocking locks have lower latency when twinned with a user space scheduler 8

Main Point 3
- Certain bottlenecks appear in scaled implementations; removing these improves scalability.
-- "False cache-line sharing" 14
-- xtime lock to a lockless lock 14

Main Point 3.5
Fine-grain over coarse-grain
-- "Big Kernel Lock" 14
-- dcache_lock 14
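To make the fine-grain versus coarse-grain point concrete, here is a small sketch of lock striping. It is not taken from the Linux kernel; `StripedCounter` and `hammer` are hypothetical names made up for this example. Instead of one big lock around the whole table, each stripe has its own lock, so threads touching different stripes do not contend. Note that in CPython the global interpreter lock limits real parallelism, so this shows the structure of the idea rather than a measurable speedup.

```python
import threading


class StripedCounter:
    """Hypothetical counter table guarded by one lock per stripe, not one global lock."""

    def __init__(self, num_stripes=16):
        self._locks = [threading.Lock() for _ in range(num_stripes)]
        self._tables = [dict() for _ in range(num_stripes)]

    def _stripe(self, key):
        return hash(key) % len(self._locks)

    def increment(self, key):
        i = self._stripe(key)
        with self._locks[i]:  # only threads using this stripe contend here
            self._tables[i][key] = self._tables[i].get(key, 0) + 1

    def get(self, key):
        i = self._stripe(key)
        with self._locks[i]:
            return self._tables[i].get(key, 0)


def hammer(counter, keys, repeats, num_threads):
    """Increment every key `repeats` times from each of `num_threads` threads."""
    def worker():
        for _ in range(repeats):
            for key in keys:
                counter.increment(key)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    counter = StripedCounter(num_stripes=4)
    hammer(counter, ["a", "b", "c"], repeats=200, num_threads=4)
    print(counter.get("a"))
```

Setting `num_stripes=1` turns this back into the coarse-grain case, with a single lock serializing every update in the same way a "Big Kernel Lock" serializes unrelated work.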

Link the Main points to the thesis

Conclusion

restate info
affirmation of thesis

Here is the first paragraph that I attempted. Please feel free to change or even delete it from here.

A thread is an independent task that executes in the same address space as other threads within a single process while sharing data synchronously. Threads require fewer system resources than concurrent cooperating processes and are much easier to start, so millions of them may exist in a single process. The two major types of threads are kernel and user-mode. Kernel threads are usually considered heavier, and designs that involve them are not very scalable. User threads, on the other hand, are mapped to kernel threads by a threads library such as libpthreads, and there are a few designs that build on them, mainly Fibers and UMS (User Mode Scheduling), which allow for very high scalability. UMS threads have their own context and resources; however, the ability to switch in user mode makes them more efficient (depending on the application) than thread pools, which are yet another mechanism that allows for high scalability. --Praubic 19:04, 12 October 2010 (UTC)

I suggest that we start filling out the main points of the essay. We can discuss the intricacies as we go along. --Gautam 02:46, 13 October 2010 (UTC)

Sources

Short history of threads in Linux and new implementation of them. NPTL: The New Implementation of Threads for Linux Gautam 22:18, 5 October 2010 (UTC)
This paper discusses the design choices Native POSIX Threads Gautam 22:11, 5 October 2010 (UTC)
lightweight threads vs kernel threads PicoThreads: Lightweight Threads in Java --Rannath 00:23, 6 October 2010 (UTC)
Eigenclass Comparing lightweight threads --Rannath 00:23, 6 October 2010 (UTC)
A lightweight thread implementation for Unix Implementing light weight threads --Rannath 00:49, 6 October 2010 (UTC) Gbint 19:50, 5 October 2010 (UTC)
Not in this group, but I thought that this paper was excellent: Qthreads: An API for Programming with Millions of Lightweight Threads
Difference between single and multi threading [1] vG
Implementation of Scalable Blocking Locks using an Adaptative Thread Scheduler --Gautam 19:35, 7 October 2010 (UTC)
Research Group working on Simultaneous Multithreading Simultaneous Multithreading --Hirving 19:58, 7 October 2010 (UTC)
This site provides in-depth info about threads, threads-pooling, scheduling: http://msdn.microsoft.com/en-us/library/ms684841(VS.85).aspx Paul
Here is another site that outlines THREAD designs and techniques: http://people.csail.mit.edu/rinard/osnotes/h2.html Paul
Interesting presentation: really worth checking out Paul
KERNEL vs USERMODE http://www.wordiq.com/definition/Thread_(computer_science) --Praubic 18:06, 10 October 2010 (UTC)
Scalability in linux
This has something to do with our question...
Linux Scheduling Priorities Explained 11 October 2005