CHESS Finding and Reproducing Heisenbugs Tom Ball, Sebastian Burckhardt Madan Musuvathi, Shaz Qadeer Microsoft Research Interns: Gerard Basler (ETH Zurich), Katie Coons (U. T. Austin), Tayfun Elmas (Koc University), P. Arumuga Nainar (U. Wisc. Madison), Iulian Neamtiu (U. Maryland, U.C. Riverside)
  • date post

    20-Dec-2015

Page 1: Title

CHESS: Finding and Reproducing Heisenbugs

Tom Ball, Sebastian Burckhardt, Madan Musuvathi, Shaz Qadeer

Microsoft Research

Interns: Gerard Basler (ETH Zurich), Katie Coons (U. T. Austin), Tayfun Elmas (Koc University), P. Arumuga Nainar (U. Wisc. Madison), Iulian Neamtiu (U. Maryland, U.C. Riverside)

Page 2: Concurrency is HARD

• Rare thread interleavings can result in bugs
• These bugs are hard to find, reproduce, and debug
  • Heisenbugs: observing the bug can “fix” it!
• A huge productivity problem
  • Developers and testers can spend weeks chasing a single Heisenbug

Page 3: Demo

Let’s find a simple concurrency bug

Page 4: CHESS motivation

Today: concurrency testing == stress testing

Stress increases the interleaving variety, but it is
• Not predictable → Heisenbugs
• Not systematic → poor coverage of interleavings

Page 5: Don’t stress, use CHESS

• Basic primitive: drive a program along an interleaving of choice
  • Interleaving can be decided by a program or a user
  • Doing this today is surprisingly hard
• Use model checking techniques to systematically enumerate thread interleavings

Page 6: CHESS architecture

[Figure: architecture diagram. The CHESS scheduler sits between the program under test (an unmanaged program on Windows, or a managed program on the .NET CLR) and the tools built on it. Tool labels: Testing, Data races, Memory model bugs, Monitors, Coverage, Repro, Debugging, Visualization.]

The CHESS scheduler can:
• Record the interleaving executed
• Drive the program along an interleaving

Page 7: Talk outline

• Introduction
• Preemption bounding [PLDI ‘07]
  • Tackling state space explosion
• Fair stateless model checking [PLDI ‘08]
  • Handling cycles in state spaces
• CHESS architecture details [OSDI ‘08]

Page 8: Enumerating thread interleavings

Thread 1:
    x = 1;
    y = 1;

Thread 2:
    x = 2;
    y = 2;

[Figure: tree of all interleavings of the two threads. The root is the initial state 0,0; each node is labeled with the (x, y) values reached and each edge with the statement executed (x = 1, y = 1, x = 2, or y = 2).]
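The tree on this slide can be reproduced with a few lines of code. The following is a minimal sketch, not part of the talk: it assumes each assignment is one atomic step and that both variables start at 0, and the names `THREADS`, `interleavings`, and `run` are made up for illustration.

```python
from itertools import permutations

# Each thread is a list of (variable, value) assignments, as on the slide.
THREADS = {
    1: [("x", 1), ("y", 1)],
    2: [("x", 2), ("y", 2)],
}

def interleavings(threads):
    """Return every distinct sequence of thread ids; each thread runs its steps in program order."""
    ids = [tid for tid, steps in threads.items() for _ in steps]
    return sorted(set(permutations(ids)))

def run(threads, schedule):
    """Execute one interleaving from the initial state x = y = 0 and return (x, y)."""
    memory = {"x": 0, "y": 0}
    pc = {tid: 0 for tid in threads}
    for tid in schedule:
        var, value = threads[tid][pc[tid]]
        memory[var] = value
        pc[tid] += 1
    return memory["x"], memory["y"]

if __name__ == "__main__":
    for schedule in interleavings(THREADS):
        print(schedule, "->", run(THREADS, schedule))
```

For this program there are six interleavings, and they end in four distinct final states, so some final states are reached by more than one path.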

Page 9: Stateless model checking [Verisoft]

• Systematically enumerate all paths in a state-space graph
• Don’t capture program states
  • Capturing states is extremely hard for large programs
  • State = globals, heap, stack, registers, kernel, filesystem, other processes, other machines, …
• Very effective on acyclic state spaces
  • Termination is guaranteed
• Potentially revisits program states
  • Partial-order reduction alleviates redundant exploration

Page 10: State space explosion

Thread 1 … Thread n, each executing:
    x = 1;
    …
    y = k;

• n threads, k steps each
• Number of executions = $O(n^{nk})$
• Exponential in both n and k
  • Typically: n < 10, k > 100
• Limits scalability to large programs

Goal: Scale CHESS to large programs (large k)
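To get a feel for the growth, the count can be computed directly. This is a sketch under an assumption the slide does not state: if every thread is always enabled and every step is atomic, the exact number of interleavings is the multinomial coefficient (nk)! / (k!)^n, which never exceeds n^(nk). The function names are illustrative.

```python
from math import factorial

def exact_executions(n, k):
    """Interleavings of n always-enabled threads with k atomic steps each: (nk)! / (k!)^n."""
    return factorial(n * k) // factorial(k) ** n

def upper_bound(n, k):
    """The bound quoted on the slide, n^(nk)."""
    return n ** (n * k)

if __name__ == "__main__":
    for n, k in [(2, 2), (2, 10), (3, 10), (4, 10)]:
        print(n, k, exact_executions(n, k), upper_bound(n, k))
```

Even two threads of ten steps each already give 184756 executions, which is why exhaustive enumeration does not scale in k.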

Page 11: Preemption bounding

• Prioritize executions with a small number of preemptions
• A preemption is a context switch forced by the scheduler
  • Unexpected by the programmer
  • e.g. time-slice expiration
• Hypothesis: most concurrency bugs result from few preemptions

Thread 1:
    x = 1;
    if (p != 0) {
        x = p->f;
    }

Thread 2:
    p = 0;

[Figure: Thread 1 runs x = 1; if (p != 0) { and is then switched out (labeled "preemption"); Thread 2 runs p = 0; and completes, and control returns to Thread 1 at x = p->f; (labeled "non-preemption").]

Page 12: Polynomial state space

• Terminating program with fixed inputs and deterministic threads
• n threads, k steps each, c preemptions
• Number of executions $\le \binom{nk}{c} \cdot (n+c)! = O\big((n^2 k)^c \cdot n!\big)$
• Exponential in n and c, but not in k

How the bound arises:
• Choose c preemption points
• Permute n+c atomic blocks

[Figure: Thread 1 and Thread 2, each executing x = 1; … y = k;, are cut at the chosen preemption points into atomic blocks, which are then reordered.]
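The effect of the bound can be checked on a toy model. The sketch below is an illustration, not the CHESS search: it assumes non-blocking threads of k atomic steps each, and counts a context switch as a preemption only when the thread being switched out still has steps left. `bounded_schedules` is a hypothetical name.

```python
def bounded_schedules(n, k, c):
    """Yield every schedule of n threads with k steps each that uses at most c preemptions."""
    remaining = [k] * n
    prefix = []

    def explore(current, used):
        if all(r == 0 for r in remaining):
            yield tuple(prefix)
            return
        for t in range(n):
            if remaining[t] == 0:
                continue
            # Switching away from a thread that could still run is a preemption.
            preempts = current is not None and t != current and remaining[current] > 0
            cost = used + (1 if preempts else 0)
            if cost > c:
                continue
            remaining[t] -= 1
            prefix.append(t)
            yield from explore(t, cost)
            prefix.pop()
            remaining[t] += 1

    yield from explore(None, 0)

if __name__ == "__main__":
    for k in (2, 10, 100):
        print(k, sum(1 for _ in bounded_schedules(2, k, 1)))
```

With two threads and at most one preemption the count is 2k, so it grows linearly in k, whereas the unbounded count for k = 10 is already 184756.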

Page 13: Find lots of bugs with 2 preemptions

Program           Lines of code   Bugs
Work Stealing Q   4K              4
CDS               6K              1
CCR               9K              3
ConcRT            16K             4
Dryad             18K             7
APE               19K             4
STM               20K             2
TPL               24K             9
PLINQ             24K             1
Singularity       175K            2
Total                             37

Acknowledgement: testers from PCP team

Page 14: Good coverage metric

• When CHESS completes search with c preemptions, any remaining bug requires c+1 or more preemptions
• Two preemptions were sufficient to reproduce all stress-test failures reported so far

Page 15: Talk outline

• Introduction
• Preemption bounding [PLDI ‘07]
  • Tackling state space explosion
• Fair stateless model checking [PLDI ‘08]
  • Handling cycles in state spaces
• CHESS architecture details [OSDI ‘08]

Page 16: Concurrent programs have cyclic state spaces

• Spinlocks
• Non-blocking algorithms
• Implementations of synchronization primitives
• Periodic timers
• …

Thread 1:
    L1: while (!done) {
    L2:     Sleep();
        }

Thread 2:
    M1: done = 1;

[Figure: state graph over four states: (!done, L1), (!done, L2), (done, L1), (done, L2). While done is false, Thread 1 cycles between L1 and L2.]

Page 17: A demonic scheduler unrolls any cycle ad infinitum

Thread 1:
    while (!done) {
        Sleep();
    }

Thread 2:
    done = 1;

[Figure: unbounded execution tree in which every !done state leads either to a done state or to another !done state.]

Page 18

Depth bounding

• Prune executions beyond a bounded number of steps

[Figure: the same unrolled tree of !done and done states, cut off at a line labeled "Depth bound".]
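The unrolling and the pruning can be seen on a hand-written model of the spin loop. This is a minimal sketch under stated assumptions: the state encoding `(pc1, pc2, done)`, the treatment of `Sleep()` as a step that never blocks, and the function names are all illustrative and not taken from CHESS.

```python
def enabled(state):
    """Threads that have not finished yet."""
    pc1, pc2, _done = state
    return [t for t, pc in ((1, pc1), (2, pc2)) if pc != "end"]

def step(state, thread):
    """Execute one atomic step of the given thread."""
    pc1, pc2, done = state
    if thread == 2:                       # M1: done = 1
        return (pc1, "end", True)
    if pc1 == "L1":                       # L1: while (!done)
        return ("end" if done else "L2", pc2, done)
    return ("L1", pc2, done)              # L2: Sleep(), then back to the loop test

def explore(bound, state=("L1", "M1", False), depth=0, stats=None):
    """Depth-first search over all schedules, pruning executions at the depth bound."""
    stats = {"terminated": 0, "pruned": 0} if stats is None else stats
    runnable = enabled(state)
    if not runnable:
        stats["terminated"] += 1
    elif depth == bound:
        stats["pruned"] += 1
    else:
        for thread in runnable:
            explore(bound, step(state, thread), depth + 1, stats)
    return stats

if __name__ == "__main__":
    for bound in (4, 8, 16):
        print(bound, explore(bound))
```

However large the bound, at least one execution is pruned, because the scheduler can always keep choosing Thread 1 while done is still false.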

Page 19: Problem 1: Ineffective state coverage

• Bound has to be large enough to reach the deepest bug
  • Typically, greater than 100 synchronization operations
• Every unrolling of a cycle redundantly explores reachable state space

[Figure: chain of !done states unrolled down to a line labeled "Depth bound".]

Page 20: Problem 2: Cannot find livelocks

• Livelock: lack of progress in a program

Thread 1:
    temp = done;
    while (!temp) {
        Sleep();
    }

Thread 2:
    done = 1;

Page 21: Fair stateless model checking

• Make stateless model checking effective on cyclic state spaces
  • Effective state coverage
  • Detect livelocks

Page 22: Key idea

• This test terminates only when the scheduler is fair
• Fairness is assumed by programmers
• All cycles in correct programs are unfair
  • A fair cycle is a livelock

Thread 1:
    while (!done) {
        Sleep();
    }

Thread 2:
    done = 1;

[Figure: state graph with two !done states and two done states.]

Page 23: Key idea

• This test terminates only when the scheduler is fair
• Fairness is assumed by programmers
• CHESS should only explore fair schedules

Thread 1:
    while (!done) {
        Sleep();
    }

Thread 2:
    done = 1;

[Figure: state graph with two !done states and two done states.]

Page 24: What notion of fairness?

Page 25: Weak fairness

• $\forall t :: \mathbf{GF}\,(\mathit{enabled}(t) \Rightarrow \mathit{scheduled}(t))$
  • A thread that remains enabled should eventually be scheduled
• A weakly fair scheduler will eventually schedule Thread 2
  • Example: round-robin, FIFO wait queues

Thread 1:
    while (!done) {
        Sleep();
    }

Thread 2:
    done = 1;

Page 26: Weak fairness does not suffice

Thread 1:
    Lock(l);
    while (!done) {
        Unlock(l);
        Sleep();
        Lock(l);
    }
    Unlock(l);

Thread 2:
    Lock(l);
    done = 1;
    Unlock(l);

[Figure: cycle of states, each showing the next operation of both threads and the set of enabled threads.]

Next operation of T1   Next operation of T2   Enabled threads
Sleep()                Lock(l)                en = {T1, T2}
Lock(l)                Lock(l)                en = {T1, T2}
Unlock(l)              Lock(l)                en = {T1}
Sleep()                Lock(l)                en = {T1, T2}

Thread 2 is disabled whenever Thread 1 holds the lock, so it never remains enabled, and a weakly fair scheduler is not obliged to run it.

Page 27: Strong fairness

• $\forall t :: \mathbf{GF}\,\mathit{enabled}(t) \Rightarrow \mathbf{GF}\,\mathit{scheduled}(t)$
  • A thread that is enabled infinitely often is scheduled infinitely often
• Thread 2 is enabled and competes for the lock infinitely often
  • Example: a round-robin scheduler with priorities [Apt & Olderog ‘83]

Thread 1:
    Lock(l);
    while (!done) {
        Unlock(l);
        Sleep();
        Lock(l);
    }
    Unlock(l);

Thread 2:
    Lock(l);
    done = 1;
    Unlock(l);
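The difference between the two notions can be checked mechanically on a schedule that repeats one cycle forever. The sketch below is illustrative only: it assumes the cycle is given as a list of (enabled set, scheduled thread) pairs, and the two example cycles are hand-transcribed from the spin-loop and lock examples in these slides.

```python
def weakly_fair(cycle, threads):
    """Repeating the cycle forever is weakly fair unless some thread is enabled
    at every step of the cycle and never scheduled."""
    for t in threads:
        always_enabled = all(t in en for en, _ in cycle)
        ever_scheduled = any(t == s for _, s in cycle)
        if always_enabled and not ever_scheduled:
            return False
    return True

def strongly_fair(cycle, threads):
    """Repeating the cycle forever is strongly fair unless some thread is enabled
    at some step of the cycle and never scheduled."""
    for t in threads:
        ever_enabled = any(t in en for en, _ in cycle)
        ever_scheduled = any(t == s for _, s in cycle)
        if ever_enabled and not ever_scheduled:
            return False
    return True

THREADS = {"T1", "T2"}

# Spin loop without a lock: T2 is always enabled, only T1 runs.
SPIN = [({"T1", "T2"}, "T1"), ({"T1", "T2"}, "T1")]

# Spin loop with a lock: T1 runs Sleep(), Lock(l), Unlock(l); T2 is disabled while T1 holds l.
LOCKED_SPIN = [({"T1", "T2"}, "T1"), ({"T1", "T2"}, "T1"), ({"T1"}, "T1")]

if __name__ == "__main__":
    print(weakly_fair(SPIN, THREADS), weakly_fair(LOCKED_SPIN, THREADS),
          strongly_fair(LOCKED_SPIN, THREADS))
```

The lock-based cycle passes the weak check but fails the strong one, which is the reason the talk moves on to strong fairness.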

Page 28: Constructing a strongly fair scheduler

• A round-robin scheduler is not strongly fair
  • It is only weakly fair
• Extend a round-robin scheduler with priorities [Apt & Olderog ‘83]

Page 29: CHESS also needs to be demonic

• Cannot generate all fair schedules
  • There are infinitely many, even for simple programs
• It is sufficient to generate enough fair schedules to
  • Explore all states (safety coverage)
  • Explore at least one fair cycle, if any (livelock coverage)
• Do it without capturing the program states

Page 30: Fair stateless model checking

• Given a concurrent program Q and a safety property P
  • Q does not necessarily have an acyclic state space
• Determine whether Q satisfies P and Q is fair-terminating (livelock-free)
• Without capturing program states

Page 31: (Good) Programs indicate lack of progress

• Good Samaritan assumption: $\forall t :: \mathbf{GF}\,\mathit{scheduled}(t) \Rightarrow \mathbf{GF}\,\mathit{yield}(t)$
  • A thread that is scheduled infinitely often yields the processor infinitely often
• Examples of yield:
  • Sleep(), ScheduleThread(), asm {rep nop;}
  • Thread completion

Thread 1:
    while (!done) {
        Sleep();
    }

Thread 2:
    done = 1;

Page 32: Robustness of the Good Samaritan assumption

• A violation of the Good Samaritan assumption is a performance error
• Programs are parsimonious in the use of yields
  • A Sleep() almost always indicates a lack of progress
  • Implies that the thread is stuck in a state-space cycle

Thread 1:
    while (!done) {
        ;
    }

Thread 2:
    done = 1;

Page 33: Fair demonic scheduler (outline)

• Maintain a priority order (a partial order) on threads
  • A < B means that A will not be scheduled in a state where B is enabled
• Threads get a lower priority only when they yield
  • Scheduler is fully demonic on yield-free paths
• A thread loses its priority once it executes
  • Remove all edges t < A when A executes
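The bookkeeping in this outline can be sketched as a small class. This is a simplified illustration, not the algorithm from the PLDI ‘08 paper: the slide does not say which edges a yield adds, so the sketch assumes that a yielding thread is placed below every other thread enabled at that moment. The class and method names are hypothetical.

```python
class FairPriorities:
    """Priority relation on threads; a pair (a, b) means a < b."""

    def __init__(self):
        self.lower = set()

    def schedulable(self, enabled):
        """Enabled threads that are not below some other enabled thread."""
        return {a for a in enabled
                if not any((a, b) in self.lower for b in enabled)}

    def on_execute(self, a):
        """A loses its priority once it executes: drop every edge t < a."""
        self.lower = {(t, b) for (t, b) in self.lower if b != a}

    def on_yield(self, t, enabled):
        """ASSUMPTION: a yielding thread drops below all other enabled threads."""
        self.lower |= {(t, u) for u in enabled if u != t}

if __name__ == "__main__":
    p = FairPriorities()
    enabled = {"T1", "T2"}
    print(p.schedulable(enabled))   # no yields yet: any enabled thread may run
    p.on_execute("T1")              # T1 tests the loop condition, then calls Sleep()
    p.on_yield("T1", enabled)
    print(p.schedulable(enabled))   # only T2 may run now
    p.on_execute("T2")              # T2 sets done and clears the edge T1 < T2
    print(p.schedulable(enabled))
```

Until a thread yields, the relation is empty and the scheduler is free to pick any enabled thread, matching the "fully demonic on yield-free paths" bullet.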

Page 34: Four outcomes of the semi-algorithm

• Terminates without finding any errors
• Terminates with a safety violation
• Diverges with an infinite execution
  • that violates the GS assumption (a performance error)
  • that is strongly fair (a livelock)

In practice: detect infinite executions by a very long execution

Page 35: Coverage

Theorem: The algorithm achieves full coverage if
• every state is reachable by a yield-free path, and
• there exists a fair cycle iff there exists a fair cycle with at most one yield per thread

Page 36: Results: Achieves more coverage faster

Benchmark: work stealing queue with one stealer

                      With       Without fairness, with depth bound
                      fairness   20     30     40     50      60
States explored       1726       871    1505   1726   1307    683
Percentage coverage   100%       50%    87%    100%   76%     40%
Time (secs)           143        97     763    2531   >5000   >5000

Page 37: Livelocks in Singularity (both fixed)

• A thread needlessly burns its CPU quantum in a spin-loop
  • “it's a bug that we think we have seen in practice, but that would have been very difficult to find through normal means” [Dean Tribble]
• An infinite loop in the Promise implementation
  • Manifested as a non-reproducible problem in an existing stress-test
  • CHESS found the bug in a simple test harness with a repeatable error-trace

Page 38: Talk outline

• Introduction
• Preemption bounding [PLDI ‘07]
  • Tackling state space explosion
• Fair stateless model checking [PLDI ‘08]
  • Handling cycles in state spaces
• CHESS architecture details [OSDI ‘08]

Page 39: CHESS architecture recap

[Figure: architecture diagram. The CHESS scheduler sits between the program under test (an unmanaged program on Windows, or a managed program on the .NET CLR) and the tools built on it. Tool labels: Testing, Data races, Memory model bugs, Monitors, Coverage, Repro, Debugging, Visualization.]

The CHESS scheduler can:
• Record the interleaving executed
• Drive the program along an interleaving

Page 40: Capture the ‘happens-before’ graph

• Happens-before graph captures all communication between threads in a concurrent execution
• Abstracts time: for a given input, two executions that result in the same happens-before graph are behaviorally equivalent

[Figure: happens-before graph with the nodes x = 1, setEvent(e), wait(e), and t = x.]

Page 41: Enforce a single-threaded execution

• Happens-before graph is a partial order
  • can be converted to a totally ordered single-threaded execution
• Big performance win
  • Data accesses are automatically ordered by synchronization events
  • Don’t need to instrument data accesses
• Cannot explore non-sequentially consistent executions of a program
  • Resulting from relaxed memory model of the hardware
  • Sober: a tool that detects the presence of such executions [CAV ‘08]
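Turning a partial order into one total order is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) on the four operations from the happens-before figure. The edges are an assumption made for illustration (x = 1 before setEvent(e) in one thread, setEvent(e) before wait(e) across threads, wait(e) before t = x in the other thread), and the `T1:`/`T2:` prefixes and the `linearize` helper are not from the talk.

```python
from graphlib import TopologicalSorter

# Maps each operation to the set of operations that must happen before it.
# ASSUMPTION: these edges are a plausible reading of the slide's figure.
HAPPENS_BEFORE = {
    "T1: x = 1": set(),
    "T1: setEvent(e)": {"T1: x = 1"},
    "T2: wait(e)": {"T1: setEvent(e)"},
    "T2: t = x": {"T2: wait(e)"},
}

def linearize(graph):
    """Return one total order of the operations that respects every happens-before edge."""
    return list(TopologicalSorter(graph).static_order())

if __name__ == "__main__":
    for op in linearize(HAPPENS_BEFORE):
        print(op)
```

Running the operations one at a time in such an order gives a single-threaded execution with the same happens-before graph as the concurrent one.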

Page 42: Directing the execution

• Given a happens-before graph, block the execution of a synchronization operation if it produces an edge not in the graph
• Need to understand the precise semantics of synchronization operations

Page 43: Conclusion

• Message to concurrency programmers
  • Think seriously about interleaving coverage
• Message to system/PL researchers
  • Concurrency APIs should have a clear specification of the nondeterminism exposed
• Don’t stress, use CHESS
  • CHESS binary available for academic use: http://research.microsoft.com/CHESS
  • CHESS will be shipped for commercial use, very soon: http://msdn.microsoft.com/devlabs
• CHESS is extensible
  • Use CHESS scheduler for concurrency tools
  • Plug in new search algorithms

Page 44: Questions

Page 45

A stress test fails…

Page 46

CHESS reproduces the bug in 2 mins