Cormac Flanagan and Stephen Freund PLDI 2009 Slides by Michelle Goodstein 07/26/10.

FastTrack: Efficient and Precise Dynamic Race Detection

Eraser/LockSet algorithms are fast but imprecise
  Suffer from false positives
Vector-clock (VC) data race detectors are precise but slow
Want: fast and precise
  FastTrack

Motivation

Don’t always need the full power of VCs
Majority of data:
  Thread local
  Lock protected
  Read shared
Common cases: O(1) fast path
General case: no loss of precision/correctness

Intuition

Epoch: (VC, t), t ∈ Tid, x ∈ Var, m ∈ Lock
Operations: rd(t,x), wr(t,x), aq(t,m), rel(t,m), fork(t,t’), join(t,t’)
Trace of operations: α
  Sequence of ops performed by threads
Happens-before <α: smallest transitively closed relation such that a <α b holds when a occurs before b in α because of:
  Program order
  Locking
  Fork-join

Notation

V1 ≤ V2 if V1(t) ≤ V2(t) for all t (happens-before relation)
Each thread t has its own vector clock Ct; each lock m has its own vector clock Cm
Increment own clock on lock release
Lock VCs are updated on acquire/release ops
  Thread t releases lock m: copy Ct to Cm
  Thread t’ acquires lock m: Ct’(u) = max(Ct’(u), Cm(u)) for all threads u
Each location x gets 2 VCs: 1 for reads, 1 for writes
  Rx(t), Wx(t) record the last read, write to x by thread t
Read from x by thread t is race-free if Wx ≤ Ct
  All writes happen-before the current read

Review of vector clocks (DJIT+ alg)
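The vector-clock bookkeeping reviewed above can be sketched in a few lines of Python. This is only an illustration, not the DJIT+ implementation: the class name VectorClock, the helpers release, acquire and read_is_race_free, and the choice to treat missing entries as 0 are all assumptions made for the example.

```python
class VectorClock:
    """Hypothetical helper: maps thread id -> clock; missing entries count as 0."""

    def __init__(self, entries=None):
        self.c = dict(entries or {})

    def get(self, t):
        return self.c.get(t, 0)

    def leq(self, other):
        # V1 <= V2 iff V1(t) <= V2(t) for all t
        return all(v <= other.get(t) for t, v in self.c.items())

    def join(self, other):
        # Pointwise maximum, stored in place
        for t, v in other.c.items():
            self.c[t] = max(self.get(t), v)

    def copy(self):
        return VectorClock(self.c)

    def inc(self, t):
        self.c[t] = self.get(t) + 1


def release(C, L, t, m):
    """Thread t releases lock m: copy Ct to Cm, then increment t's own clock."""
    L[m] = C[t].copy()
    C[t].inc(t)


def acquire(C, L, t, m):
    """Thread t acquires lock m: Ct becomes the pointwise max of Ct and Cm."""
    if m in L:
        C[t].join(L[m])


def read_is_race_free(W_x, C_t):
    """A read by t is race-free if every recorded write happens-before it."""
    return W_x.leq(C_t)


# Two threads, no synchronization yet (clocks assumed to start at 1).
C = {0: VectorClock({0: 1}), 1: VectorClock({1: 1})}
L = {}
W_x = VectorClock({0: C[0].get(0)})     # thread 0 writes x at clock 1
racy = not read_is_race_free(W_x, C[1])  # thread 1 has not heard of that write
release(C, L, 0, "m")
acquire(C, L, 1, "m")
ordered = read_is_race_free(W_x, C[1])   # the lock hand-off orders the write first
```

Every check here walks a whole vector clock, which is the O(n) cost that the epoch representation on the following slides avoids in the common cases.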

Observation: writes are totally ordered until a data race is detected
  Only guaranteed to detect the first data race on location x
Detect: Write-Write race
  No race detected on x so far? Only need clock c and thread t of the last writer
  Epoch c@t: O(1) space for epochs
  c@t ≤ V iff c ≤ V(t): O(1) comparison time
Detect: Write-Read race
  Check Wx ≤ Ct

Intuition: Detecting races
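The O(1) epoch comparison c@t ≤ V iff c ≤ V(t) is small enough to show directly. The sketch below assumes a vector clock is a plain dict from thread id to clock, with missing entries read as 0; the names Epoch and epoch_leq_vc are made up for illustration.

```python
from typing import Dict, NamedTuple


class Epoch(NamedTuple):
    """Hypothetical epoch c@t: clock c of thread t."""
    clock: int
    tid: int


def epoch_leq_vc(e: Epoch, vc: Dict[int, int]) -> bool:
    # c@t <= V iff c <= V(t): a single lookup, independent of thread count
    return e.clock <= vc.get(e.tid, 0)


W_x = Epoch(clock=3, tid=0)            # last write to x: thread 0 at clock 3

C_1_stale = {0: 2, 1: 5}               # thread 1 has only seen thread 0 up to clock 2
C_1_fresh = {0: 4, 1: 5}               # thread 1 has synchronized past clock 3

write_is_concurrent = not epoch_leq_vc(W_x, C_1_stale)
write_is_ordered = epoch_leq_vc(W_x, C_1_fresh)
```

The same check serves both the write-write and the write-read case, because in each only the last write epoch Wx is compared against the accessing thread's clock Ct.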

FastTrack

Detect: Read-Write race
  Reads not guaranteed to be ordered in a race-free program
  In practice: thread-local, lock-protected reads are ordered
  Typically, reads are only unordered when read-shared
Adaptive representation:
  If read is ordered, record epoch of last read
  If read is not ordered, store entire VC
  Can still perform epoch-VC comparison in O(1) time

Intuition: Detecting Races

Read-Same-Epoch: x already read this epoch
  No work
Read-Exclusive: current read happens-after prior read epoch
  Update read epoch
Read-Share: current read may be concurrent with prior read
  Allocate VC to record epochs of both reads
Read-Shared: x already shared (already tracking VC)
  Update VC

Read Rules

Read-Same-Epoch: (63.4% of reads) x already read this epoch
  No work
Read-Exclusive: (15.7% of reads) current read happens-after prior read epoch
  Update read epoch
Read-Share: (0.1% of reads) current read may be concurrent with prior read
  Allocate VC to record epochs of both reads
Read-Shared: (20.8% of reads) x already shared (already tracking VC)
  Update VC

Read Rules (82.3% of ops)
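The four read rules can be sketched as a single function that returns the name of the rule it applied. This is a minimal illustration under stated assumptions, not the RoadRunner code: vector clocks are dicts from thread id to clock, thread clocks start at 1, the "empty" epoch is modelled as clock 0, and the names VarState, RaceError and read are hypothetical.

```python
from typing import NamedTuple


class Epoch(NamedTuple):
    clock: int
    tid: int


EMPTY = Epoch(0, 0)  # assumed "bottom" epoch: ordered before every access


class RaceError(Exception):
    pass


class VarState:
    """Per-variable state: W is always an epoch, R is an epoch or a dict VC."""

    def __init__(self):
        self.W = EMPTY
        self.R = EMPTY


def leq(e, vc):
    return e.clock <= vc.get(e.tid, 0)


def read(x, t, C_t):
    now = Epoch(C_t.get(t, 0), t)
    if x.R == now:
        return "Read-Same-Epoch"          # no work
    if not leq(x.W, C_t):
        raise RaceError("write-read race")
    if isinstance(x.R, dict):
        x.R[t] = now.clock                # already tracking a VC: update it
        return "Read-Shared"
    if leq(x.R, C_t):
        x.R = now                         # prior read is ordered before this one
        return "Read-Exclusive"
    # Prior read may be concurrent: keep both epochs in a freshly allocated VC
    x.R = {x.R.tid: x.R.clock, t: now.clock}
    return "Read-Share"


x = VarState()
rules = [
    read(x, 0, {0: 1}),   # first read by thread 0
    read(x, 0, {0: 1}),   # same thread, same epoch
    read(x, 1, {1: 1}),   # thread 1 has not synchronized with thread 0
    read(x, 1, {1: 2}),   # x is now read-shared
]
```

Only the Read-Share transition allocates a vector clock, which matches the slide's figure of 0.1% of reads taking that path.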

Write-Same-Epoch: x already written this epoch
  No change
Write-Exclusive: if Rx is an epoch and Rx, Wx ≤ Ct
  Update write epoch
Write-Shared: if Rx is a VC and Rx, Wx ≤ Ct
  Update write epoch, set Rx to “empty” epoch

Write Rules:

Write-Same-Epoch: (71% of writes) x already written this epoch
  No change
Write-Exclusive: (28.9% of writes) if Rx is an epoch and Rx, Wx ≤ Ct
  Update write epoch
Write-Shared: (0.1% of writes) if Rx is a VC and Rx, Wx ≤ Ct
  Update write epoch, set Rx to “empty” epoch

Write Rules: (14.5% of ops)
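The write rules admit a similar sketch. As before, this is an illustration under assumptions rather than the authors' code: vector clocks are dicts, the "empty" epoch is modelled as clock 0, and VarState, RaceError and write are hypothetical names.

```python
from typing import NamedTuple


class Epoch(NamedTuple):
    clock: int
    tid: int


EMPTY = Epoch(0, 0)  # assumed "empty" epoch: ordered before every access


class RaceError(Exception):
    pass


class VarState:
    def __init__(self):
        self.W = EMPTY   # last write epoch
        self.R = EMPTY   # last read: an epoch, or a dict VC when read-shared


def leq(e, vc):
    return e.clock <= vc.get(e.tid, 0)


def write(x, t, C_t):
    now = Epoch(C_t.get(t, 0), t)
    if x.W == now:
        return "Write-Same-Epoch"                    # no change
    if not leq(x.W, C_t):
        raise RaceError("write-write race")
    if isinstance(x.R, dict):
        # Rx is a VC: the one case that needs a full VC comparison
        if not all(c <= C_t.get(u, 0) for u, c in x.R.items()):
            raise RaceError("read-write race")
        x.W = now
        x.R = EMPTY                                  # drop back to the epoch form
        return "Write-Shared"
    if not leq(x.R, C_t):
        raise RaceError("read-write race")
    x.W = now
    return "Write-Exclusive"


x = VarState()
first = write(x, 0, {0: 1})
again = write(x, 0, {0: 1})
x.R = {0: 1, 1: 1}                                   # pretend two threads read x
shared = write(x, 0, {0: 2, 1: 1})                   # writer has seen both reads
```

Resetting Rx to the empty epoch after a Write-Shared step lets later reads return to the cheap epoch representation.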

Acquire, Release, Fork, Join:
  Rare
  Use full VCs
Correct/Precise: FastTrack reports a data race iff it detects concurrent conflicting accesses

Other operations
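The slide does not spell out the fork and join updates, so the sketch below assumes the usual vector-clock treatment: the child absorbs the parent's clock at fork, the joiner absorbs the child's clock at join, and the thread whose state was handed over increments its own entry. Lock acquire and release follow the copy and max rules from the vector clock review. All function names are hypothetical, and vector clocks are plain dicts.

```python
def vc_join(a, b):
    """Pointwise maximum of two dict-based vector clocks."""
    return {t: max(a.get(t, 0), b.get(t, 0)) for t in a.keys() | b.keys()}


def vc_inc(vc, t):
    """Copy of vc with thread t's entry incremented."""
    return {**vc, t: vc.get(t, 0) + 1}


def fork(C, t, u):
    """Thread t forks thread u: u starts out knowing everything t has done."""
    child = C.get(u, {u: 1})          # assumed initial clock for a new thread
    C[u] = vc_join(child, C[t])
    C[t] = vc_inc(C[t], t)


def join(C, t, u):
    """Thread t joins thread u: everything u did now happens-before t."""
    C[t] = vc_join(C[t], C[u])
    C[u] = vc_inc(C[u], u)


C = {0: {0: 1}}
fork(C, 0, 1)
after_fork = {tid: dict(vc) for tid, vc in C.items()}

child_write = (C[1][1], 1)            # child writes at epoch 1@1
join(C, 0, 1)
write_ordered_before_parent = child_write[0] <= C[0].get(child_write[1], 0)
```

These operations cost O(n) in the number of threads, which is acceptable only because, as the slide notes, they are rare compared with reads and writes.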

FastTrack in Action

RoadRunner
  Like DBI but for Java
  Instruments bytecode at runtime
32-bit epochs
  8-bit tid
  24-bit clock
  Could also use 64-bit

Implementation
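Packing an epoch into one 32-bit word is a matter of shifts and masks. The slide gives only the field widths, so the layout below (tid in the top 8 bits, clock in the low 24 bits) is an assumption, as are the function names.

```python
TID_BITS = 8
CLOCK_BITS = 24
MAX_TID = (1 << TID_BITS) - 1          # 255
CLOCK_MASK = (1 << CLOCK_BITS) - 1     # 0xFFFFFF


def make_epoch(tid, clock):
    """Pack (tid, clock) into a 32-bit integer; assumed layout: tid in the high bits."""
    if not 0 <= tid <= MAX_TID:
        raise ValueError("tid does not fit in 8 bits")
    if not 0 <= clock <= CLOCK_MASK:
        raise ValueError("clock does not fit in 24 bits")
    return (tid << CLOCK_BITS) | clock


def epoch_tid(e):
    return e >> CLOCK_BITS


def epoch_clock(e):
    return e & CLOCK_MASK


e = make_epoch(tid=3, clock=5)
same_epoch = e == make_epoch(3, 5)     # Same-Epoch checks become one integer compare
```

With this layout a detector supports at most 256 threads and about 16.7 million clock increments per thread, which is presumably why the slide mentions 64-bit epochs as an alternative.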

7 dynamic analyses:
  Empty: RoadRunner overhead
  FastTrack
  Eraser (+ barrier synch)
  DJIT+
  MultiRace: DJIT+ plus lock set
    Update lock set for location on first access in epoch
    Full VC comparisons after lock set is empty
    Imprecision/unsoundness from Eraser
  Goldilocks: precise race detector
    Tracks synch devices & threads
    Requires tight integration with VM, garbage collector
  BasicVC: read, write VCs per memory location
All implemented on RoadRunner

Evaluation

Benchmarks: 16
Apple Mac Pro
  Dual 3 GHz quad-core Pentium Xeon processors
  12 GB memory
  OS X 10.5.6, Java HotSpot 64-bit Server VM v1.6.0
JVM startup time excluded
Report at most one race per field in a class or array access in program source

Evaluation

Results: Precision and Efficiency

Results: VCs allocated/benchmark

Results: Fine vs Coarse Grain Analysis

Interesting potential lifeguard
  Performance of Eraser, precision of VC-based algorithm
Unclear how it will perform on a non-Java platform

Conclusions

FastTrack Example (figure annotations)
  Init: empty (bottom) epochs
  Write-Exclusive
  Read-Exclusive due to fork
  Read-Share
  Write-Shared happens-after b/c of join
  Read-Exclusive
