Talk:COMP 3000 Essay 2 2010 Question 6: Difference between revisions

From Soma-notes
Achamney (talk | contribs)

Revision as of 02:28, 2 December 2010

Actual group members

- Nicholas Shires nshires@connect.carleton.ca

- Andrew Zemancik andy.zemancik@gmail.com

- Austin Bondio abondio2@connect.carleton.ca

- David Krutsko dkrutsko at connect.carleton.ca

If everyone could just post their names and contact information. --Azemanci 02:57, 15 November 2010 (UTC)

IMPORTANT
THINGS WE NEED TO DEFINE:

  • Happens-before reasoning
  • Lock-set based reasoning
  • Hardware Breakpoints

The prof seemed to be very focused on hardware breakpoints, so it is very important to define them well and talk about them often. Hardware breakpoints look like the one thing that's setting DataCollider apart from other race detectors, so let's focus on them!
IMPORTANT




Who's Doing What

Research Problem

I'll do 'Research Problem' and help out with the 'Critique' section; the professor said that part was pretty big. Nshires 20:45, 21 November 2010 (UTC)

The research problem addressed by this paper is the detection of erroneous data races without creating much overhead. This problem occurs because read and write instructions are not always atomic, so two reads or writes to the same memory may happen simultaneously.

The research team’s program, DataCollider, needs to detect errors between the hardware and the kernel, as well as errors in thread synchronization within the kernel, which must synchronize between user-mode processes, interrupts, and deferred procedure calls. As shown in the Background Concepts section, these errors can create unwanted problems in kernel modules. DataCollider sets breakpoints on memory accesses to check whether two threads are accessing the same piece of memory at the same time. Many solutions to this problem have been proposed in the past, and there are many other ways of detecting these data race errors.

One solution that some past detectors have used is the “happens-before” method. It checks whether one access happened before the other; if neither access is ordered before the other, the two accesses are treated as simultaneous. This method reports only true data race errors but is very hard to implement.
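A minimal sketch of the happens-before check, assuming each access carries a vector clock (one counter per thread). The dictionary layout and the function names are made up for illustration. Real detectors also have to update the clocks on every synchronization operation, which is left out here.

```python
def happens_before(a, b):
    """True if vector clock a is ordered strictly before vector clock b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def is_race(first, second):
    """Two accesses race if they touch the same address from different
    threads, at least one is a write, and neither is ordered first."""
    if first["addr"] != second["addr"]:
        return False
    if first["thread"] == second["thread"]:
        return False
    if not (first["is_write"] or second["is_write"]):
        return False
    c1, c2 = first["clock"], second["clock"]
    return not happens_before(c1, c2) and not happens_before(c2, c1)


# Thread 0 writes, then thread 1 writes without any synchronization.
w0 = {"thread": 0, "addr": 0x10, "is_write": True, "clock": (1, 0)}
w1_unsynced = {"thread": 1, "addr": 0x10, "is_write": True, "clock": (0, 1)}
# Same write by thread 1, but after it synchronized with thread 0.
w1_synced = {"thread": 1, "addr": 0x10, "is_write": True, "clock": (1, 1)}
```

Here `is_race(w0, w1_unsynced)` is true because the clocks are incomparable, while `is_race(w0, w1_synced)` is false because thread 1's clock already includes thread 0's write.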

Another method used is the “lock-set” approach. This method tracks the locks held by a thread at each access to a shared variable, and if the accesses do not all have at least one lock in common, it sends a warning. This method raises many false alarms, since many variables nowadays are shared using mechanisms other than locks or are protected by very complex locking systems that lock-set analysis cannot understand.
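The lock-set idea can be sketched as a running intersection per variable. This is a simplified illustration: the `lockset_check` name and the trace format are assumptions, and refinements that real lock-set tools use to avoid warning during initialization are left out.

```python
def lockset_check(accesses):
    """Return the variables for which no single lock was held on every access.

    `accesses` is a sequence of (variable, locks_held) pairs.
    """
    candidates = {}   # variable -> locks held on every access seen so far
    warnings = set()
    for var, locks_held in accesses:
        held = set(locks_held)
        if var not in candidates:
            candidates[var] = held
        else:
            candidates[var] &= held
        if not candidates[var]:
            warnings.add(var)
    return warnings


trace = [
    ("counter", {"mu"}),
    ("counter", {"mu", "other"}),   # "mu" is still common
    ("flag", {"mu"}),
    ("flag", {"other"}),            # no lock in common any more
]
flagged = lockset_check(trace)      # {"flag"}
```

Note that `flag` is reported even if the two accesses could never overlap in time, which is where the false alarms come from.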

This is what I have so far, suggestions welcomed! Nshires 22:38, 30 November 2010 (UTC) http://www.hpcaconf.org/hpca13/papers/014-zhou.pdf

Contribution

What are the research contribution(s) of this work? Specifically, what are the key research results, and what do they mean? (What was implemented? Why is it any better than what came before?)

I'll do Contribution: Achamney 03:50, 22 November 2010 (UTC)


Proving that DataCollider is better:
The key part of the contribution of this essay is its competition. The research team for DataCollider looked at several other implementations of race condition testers to find ways of improving their own program, or to look for different ways of solving the same problem.

Some of the programs that were referenced were:

  • Eraser: A Dynamic Data Race Detector for Multithreaded Programs
  • RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking
  • PACER: Proportional Detection of Data Races
  • LiteRace: Effective Sampling for Lightweight Data-Race Detection
  • FastTrack: Efficient and Precise Dynamic Race Detection
  • MultiRace: Efficient on-the-fly data race detection in multithreaded C++ programs
  • RacerX: Effective, Static Detection of Race Conditions and Deadlocks


Eraser: A Dynamic Data Race Detector for Multithreaded Programs
lock-set based reasoning

Eraser, a data race detector written in 1997, was one of the earlier data race detectors on the market. It may have been a useful and revolutionary program in its time; however, it uses very low-level techniques compared to most data race detectors today. One of the reasons it falls short is that it only checks whether memory accesses use proper locking techniques. If a memory access is found that does not use a lock, then Eraser will report a data race. In many cases, leaving out a lock is a conscious decision by the programmer, so Eraser will report many false positives. It also does not take into account benign races, such as those on date-of-access variables. DataCollider used this source as an example of a lock-set based program, and of why such programs are a poor choice for a race condition debugger.


PACER: Proportional Detection of Data Races
happens-before reasoning

Pacer, a happens-before data race detector, uses the FastTrack algorithm to detect data races. FastTrack uses vector clocks to keep track of two threads and find whether or not they are conflicting in any way. Pacer samples a small percentage of memory accesses (from 1 to 3 percent) and runs the FastTrack happens-before algorithm on each thread that accesses that part of memory. DataCollider used this source as an example of the implementation of sampling. Similar to Pacer, DataCollider samples some memory accesses, but instead of using vector clocks to catch the second thread, it uses hardware breakpoints. Hardware breakpoints are considerably faster, so DataCollider runs much faster than Pacer.

LiteRace: Effective Sampling for Lightweight Data-Race Detection
happens-before reasoning

LiteRace, similar to Pacer, samples a percentage of memory accesses from a program. Where it differs is in which parts of memory it samples the most. The "hot spot" regions of memory are the ones that are accessed most by the program. Since they are accessed the most, chances are that they have already been successfully debugged, or if there are data races there, they are benign. LiteRace detects these areas in memory as hot spots and samples them at a much lower rate. This improves LiteRace's chances of capturing a valid data race at a much lower sampling rate. DataCollider bests LiteRace in how it is installed. LiteRace needs to be recompiled into the software it is trying to debug, whereas DataCollider's breakpoints do not require any code changes to the program. This is a major success for DataCollider because third-party testers often do not have the source code for a program.
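A small sketch of hot-spot-aware sampling, assuming the sampling rate for a region halves every few executions until it reaches a floor. The `sampling_rate` function, the `AdaptiveSampler` class, and every parameter value are hypothetical choices for illustration, not LiteRace's actual settings.

```python
import random
from collections import Counter


def sampling_rate(executions, start=1.0, floor=0.001, decay=0.5, step=10):
    """Rate drops by `decay` every `step` executions, never below `floor`."""
    return max(start * decay ** (executions // step), floor)


class AdaptiveSampler:
    """Samples cold regions almost always and hot regions rarely."""

    def __init__(self, seed=0):
        self.counts = Counter()          # region -> executions seen so far
        self.rng = random.Random(seed)   # seeded so runs are repeatable

    def should_sample(self, region):
        rate = sampling_rate(self.counts[region])
        self.counts[region] += 1
        return self.rng.random() < rate


sampler = AdaptiveSampler(seed=1)
hot_hits = sum(sampler.should_sample("hot") for _ in range(10000))
cold_hit = sampler.should_sample("cold")   # first execution, rate is 1.0
```

The first execution of a region is always sampled, while a region executed thousands of times ends up sampled only a small fraction of the time.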


RaceTrack: Efficient Detection of Data Race Conditions via Adaptive Tracking
combo of lock-set and happens-before reasoning

HIGH OVERHEAD[1]

MultiRace: Efficient on-the-fly data race detection in multithreaded C++ programs
combo of lock-set and happens-before reasoning






I've noticed a couple of things for controversy, even though it's not my topic. The biggest thing I saw was that DataCollider reports non-erroneous operations 90% of the time. This means the user has to sift through all of the reports to separate the problems from the benign races. Achamney 17:18, 22 November 2010 (UTC)

Background Concepts

Hey guys, sorry I'm late to the party. I'll get started with Background Concepts. - Austin Bondio 15:33, 23 November 2010 (UTC)

Critique

I'll work on the critique, which will probably need more than one person, and I'll also fill out the paper information section. --Azemanci 18:42, 23 November 2010 (UTC)