S. Narayanasamy, Z. Wang, J. Tigani, A. Edwards, B. Calder UCSD and Microsoft PLDI 2007.


Automatically Classifying Benign and Harmful Data Races Using Replay Analysis

S. Narayanasamy, Z. Wang, J. Tigani, A. Edwards, B. Calder

UCSD and Microsoft

PLDI 2007

Page 2: Motivation

Data races are hard to debug
◦ Difficult to detect
◦ Even more difficult to reproduce

Data race detectors help with detection
◦ LockSet, Happens-Before and Atomicity Violation

But they tend to overdo it
◦ Up to 90% false alarms, especially with LockSet

We need a tool that detects and reliably classifies all harmful data races

Page 3: Algorithm Overview

Offline Dynamic Happens-Before Data Race Detection
◦ Step 1: Trace Capturing
◦ Step 2: Offline Happens-Before Analysis
◦ Step 3: Replay Critical Segments
◦ Step 4: Auto Classify harmful vs. benign races

Page 4: 1. Trace Capturing & Replaying

iDNA captures the execution of an application

It simply records the initial state,
◦ Registers and PC

load values,
◦ Only those that are absolutely needed
◦ 1st load after a store, DMA etc…

and a global clock (sequencers)
◦ Inserted in the thread's replay log for
  synchronization events
  system calls
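
The idea of logging only the loads that replay cannot predict can be sketched as follows. This is a toy model under stated assumptions, not iDNA's implementation: the `LoadLogger` class, its per-thread `shadow` map and the layout of the log entries are hypothetical names chosen for illustration.

```python
_MISSING = object()  # sentinel for "this thread has never touched the address"


class LoadLogger:
    """Toy per-thread recorder (hypothetical): logs a load only when replay
    could not predict its value from the thread's own earlier accesses."""

    def __init__(self):
        self.shadow = {}  # address -> value this thread last loaded or stored
        self.log = []     # (instruction_index, address, value) entries

    def store(self, addr, value):
        # The thread's own stores are reproduced by replay, so nothing is logged.
        self.shadow[addr] = value

    def load(self, index, addr, value):
        # A first load, or a load after another thread or DMA changed the
        # location, sees a value the shadow map does not predict: log it.
        if self.shadow.get(addr, _MISSING) != value:
            self.log.append((index, addr, value))
        self.shadow[addr] = value


logger = LoadLogger()
logger.load(0, 0x10, 5)   # first load of the address: logged
logger.load(1, 0x10, 5)   # same value again: predictable, not logged
logger.store(0x10, 7)
logger.load(2, 0x10, 7)   # reads back its own store: not logged
logger.load(3, 0x10, 9)   # changed behind the thread's back: logged
```

Only two of the four loads end up in `logger.log`, which is the kind of saving that keeps the trace small.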

Page 5: 2. Offline Happens-Before Analysis

Good old Happens-Before
◦ Two conflicting accesses
  At least one write
  Not ordered

Detects only the data races that happened
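
A minimal sketch of the check, assuming each access in the trace is tagged with the pair of sequencer values that bracket it in its own thread. The `Access` record and the `region` encoding are assumptions made for this example, not the data structures used by the authors.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Access:
    thread: int
    addr: int
    is_write: bool
    region: tuple  # (sequencer before, sequencer after); hypothetical encoding


def ordered(a, b):
    # Two regions are ordered when one ends no later than the other begins.
    return a.region[1] <= b.region[0] or b.region[1] <= a.region[0]


def find_races(accesses):
    """Return pairs of conflicting accesses that are not ordered."""
    races = []
    for a, b in combinations(accesses, 2):
        if (a.thread != b.thread and a.addr == b.addr
                and (a.is_write or b.is_write) and not ordered(a, b)):
            races.append((a, b))
    return races


write = Access(thread=1, addr=0x10, is_write=True, region=(1, 4))
overlapping_read = Access(thread=2, addr=0x10, is_write=False, region=(2, 5))
later_read = Access(thread=2, addr=0x10, is_write=False, region=(4, 6))
races = find_races([write, overlapping_read, later_read])
```

Here only the write and the overlapping read are reported. The later read is ordered after the write by sequencer 4, so it is not a race in this execution, which illustrates why the analysis reports only races that actually happened.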

Page 6: 3. Replay Critical Segments

When a data race is detected, replay the affected segments twice
◦ 1st with the actual order
  Given by the load values
◦ 2nd with the racing accesses reversed

Store the replay result
◦ No-State-Change: If all live-outs are the same
◦ State-Change: If at least 1 live-out changed
◦ Replay Failure: If the replay cannot continue
  Load from null or from an address not encountered in the recording
  Branch somewhere else
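
The two replays and the three outcomes can be sketched as below. This is an illustration under simplifying assumptions: operations are plain Python functions on a dictionary, the whole dictionary stands in for the live-outs, and `ReplayFailure`, `replay` and `classify_instance` are hypothetical names.

```python
import copy


class ReplayFailure(Exception):
    """Raised when the reversed order leaves the recorded behavior (hypothetical)."""


def replay(state, ops):
    # Work on a copy so both replays start from the same initial state.
    state = copy.deepcopy(state)
    for op in ops:
        op(state)
    return state


def classify_instance(state, first, second, rest=()):
    """Replay one race instance in the recorded and the reversed order."""
    original = replay(state, [first, second, *rest])
    try:
        swapped = replay(state, [second, first, *rest])
    except ReplayFailure:
        return "Replay Failure"
    return "No-State-Change" if swapped == original else "State-Change"


def write_one(s):
    s["x"] = 1


def copy_x_to_y(s):
    s["y"] = s["x"]


def init_ptr(s):
    s["p"] = "buf"


def deref_ptr(s):
    if s["p"] is None:
        raise ReplayFailure("load through a null pointer")
    s["y"] = len(s["p"])


redundant = classify_instance({"x": 0, "y": 0}, write_one, write_one)
changed = classify_instance({"x": 0, "y": 0}, write_one, copy_x_to_y)
failed = classify_instance({"p": None, "y": 0}, init_ptr, deref_ptr)
```

The three calls exercise each outcome in turn: two identical writes leave the state unchanged, a write racing with a read changes what the reader sees, and dereferencing the pointer before it is initialized cannot be replayed at all.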

Page 7: 3. Replay Critical Segments cont.

Replay Failure → Potentially Harmful Data Race

Page 8: 4. Automatic Classification

Repeat step 3 for each instance of a data race

Potentially Benign Data Race
◦ Every replay results in No-State-Change

Potentially Harmful Data Race
◦ ≥1 replay results in State-Change or Replay Failure
  State-Change shows that something would be different if things took the other path
  Replay Failure indicates that the program changed so much that we cannot simulate the other state
◦ Concrete proof that something definitely changed
  Easier for the programmers to accept it
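
The aggregation rule over all instances of one race fits in a few lines. This is a sketch: the outcome strings follow the slide, while the function name `classify_race` is an assumption.

```python
HARMFUL_OUTCOMES = {"State-Change", "Replay Failure"}


def classify_race(instance_results):
    """Combine the replay outcomes of every instance of one data race."""
    results = list(instance_results)
    if not results:
        raise ValueError("a race needs at least one replayed instance")
    # A single State-Change or Replay Failure is enough to flag the race.
    if any(r in HARMFUL_OUTCOMES for r in results):
        return "Potentially Harmful"
    return "Potentially Benign"


benign = classify_race(["No-State-Change"] * 3)
harmful = classify_race(["No-State-Change", "Replay Failure", "No-State-Change"])
```

The rule is deliberately asymmetric: one bad instance outweighs any number of clean ones, so the benign verdict gains confidence only as more instances are replayed.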

Page 9: Evaluation

18 different executions of various services in Windows Vista and Internet Explorer

Happens-Before returns 16,642 data races
◦ 68 unique

Trace capture
◦ 0.8 bits per instruction
  96 MB per 1,000,000,000 instructions
  Only 1st loads and synchronizers captured
◦ 0.3 bits per instruction if compressed with zip
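
The trace-size figure can be checked with a short calculation. The helper name `trace_log_bytes` is an assumption, and the 0.3 rate is read here as bits per instruction.

```python
def trace_log_bytes(bits_per_instruction, instructions):
    """Size of the replay log in bytes for a given logging rate."""
    return bits_per_instruction * instructions / 8


raw = trace_log_bytes(0.8, 1_000_000_000)
zipped = trace_log_bytes(0.3, 1_000_000_000)
print(f"raw: {raw / 1e6:.1f} MB ({raw / 2**20:.1f} MiB)")
print(f"zip: {zipped / 1e6:.1f} MB")
```

At 0.8 bits per instruction, one billion instructions come to 100 MB, or about 95.4 MiB, which is close to the 96 MB quoted on the slide. The small gap is presumably due to rounding in the 0.8 figure.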

Page 10: Slowdowns

Results for Internet Explorer
◦ P4 Xeon 2.2 GHz, 1 GB of RAM

Start adding…
◦ 6x for capturing
◦ 10x for replaying (unnecessary)
◦ 45x offline Happens-Before Data Race Detection
◦ 280x replay analysis

2,196 dynamic data races
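
Taking "start adding" literally, the per-stage factors can be summed into a rough end-to-end estimate. This assumes the stages run one after another and that each factor is measured against native execution time. Neither assumption is confirmed by the slides, so the totals are derived estimates, not reported results.

```python
# Slowdown factors from the slide, relative to native execution (assumed).
SLOWDOWNS = {
    "capture": 6,
    "replay": 10,
    "happens_before": 45,
    "replay_analysis": 280,
}


def total_slowdown(stages, skip=()):
    """Sum the slowdown factors of all stages that are not skipped."""
    return sum(factor for name, factor in stages.items() if name not in skip)


with_replay = total_slowdown(SLOWDOWNS)
without_replay = total_slowdown(SLOWDOWNS, skip=("replay",))
```

Under these assumptions the pipeline costs roughly 341x with the plain replay pass and 331x without it, dominated by the replay analysis.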

Page 11: Data Race Classifications

                   Potentially Benign           Potentially Harmful          Total
                   Real Benign   Real Harmful   Real Benign   Real Harmful
No-State-Change    32            0              (impossible state)           32
State-Change       (impossible state)           15            2              17
Replay Failure     (impossible state)           14            5              19
Total              32            0              29            7              68

Potentially Benign / Potentially Harmful: automatically classified
Real Benign / Real Harmful: manually classified

All harmful races identified correctly: 0 false negatives

Half of the benign races identified correctly; half still persist

Page 12: True Negatives

32 Real Benign races classified as such
◦ Every instance must return No-State-Change
◦ The more instances, the more confidence in the classification

Page 13: True Positives

7 real bugs, correctly identified

At least 1 State-Change or Replay Failure required

Dangerous Zone

Page 14: False Positives

29 benign races incorrectly classified as harmful
◦ Approximate Computation (23/29)
  Statistics etc.
◦ Replayer Limitation (6/29)
  At least 1 instance caused replay failure
  The final outcome is the same

Page 15: False Positives (cont.)

User Constructed Synchronization
◦ Garbage collector does not use locks

Double Checks
◦ if (a) { lock(…); if (a) {…} }

Both Values Valid
◦ Use cache? High Perf?

Redundant Writes
◦ Rewrite the same value

Disjoint bit manipulation
◦ Modify different bits in same variable

                                    # Races
User Constructed Synchronization    8
Double Checks                       3
Both Values Valid                   5
Redundant Writes                    13
Disjoint bit manipulation           9
Approx. Computation                 23

23 false positives that were not caused by replay failure
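
The "Double Checks" pattern is the easiest of these categories to show in code. The sketch below is a generic double-checked initialization, not code from the evaluated programs, and `LazyValue` and its `loader` argument are hypothetical names.

```python
import threading


class LazyValue:
    """Initialize a value once, checking it both outside and inside the lock."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._value = None

    def get(self):
        if self._value is None:          # first check, no lock held: the racy read
            with self._lock:
                if self._value is None:  # second check, under the lock
                    self._value = self._loader()
        return self._value


calls = []
lazy = LazyValue(lambda: calls.append(1) or "ready")
threads = [threading.Thread(target=lazy.get) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The unlocked first check races with the write made under the lock, so a happens-before detector reports it. Whichever value the racy read sees, the second check under the lock makes the result the same, which is why such a race counts as benign.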

Page 16: Conclusion

Interesting approach to identify benign races

It would be interesting to apply it to LockSet
◦ LockSet has far more false positives
◦ But it can detect bugs that did not happen in production runs

A grand total overhead is missing

Page 17: Questions?

Thank You!!!