Concurrency Checking with CHESS: Learning from Experience
Tom Ball, Sebastian Burckhardt, Chris Dern, Madan Musuvathi, Shaz Qadeer

  • date post

    27-Mar-2015
  • Slide 1

Concurrency Checking with CHESS: Learning from Experience
Tom Ball, Sebastian Burckhardt, Chris Dern, Madan Musuvathi, Shaz Qadeer

  • Slide 2

Outline
  - What is CHESS?
      - a testing tool, plus a test methodology (concurrency unit tests)
      - a platform for research and teaching
  - CHESS design decisions
  - Learnings from CHESS user forum, champions

  • Slide 3

What is CHESS?
  - CHESS is a user-mode scheduler
  - Controls all scheduling nondeterminism
      - Hijacks scheduling control from the OS
  - Guarantees:
      - Every run takes a different thread schedule
      - Reproduce the schedule for every run

  • Slide 4

Concurrency Unit Tests
  "Generally, in our test environment, we want to test what we call scenarios. A scenario might be a specific feature or API usage. In my case I am trying to test the scenario of a user canceling a command execution on a different thread." (Steve Hale, Microsoft)

  • Slide 5

A Concurrency Unit Test Pattern: Fork-Join

    void ForkJoinTest() {
        var t1 = new Thread(() => { S1 });
        var t2 = new Thread(() => { S2 });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        Debug.Assert(...);
    }

  • Slide 6

Concurrency Unit Tests
  - Small scope hypothesis
      - For most bugs, there exists a short-running scenario with only a few threads that can find it
  - Unit tests provide
      - Better coverage of schedules
      - Easier debugging, regression, etc.

  • Slide 7

CHESS as Research/Teaching Platform
  - http://research.microsoft.com/chess/
  - Source code release: chesstool.codeplex.com
  - Courseware with CHESS: Practical Parallel and Concurrent Programming (coming this fall!)
  - Preemption bounding [PLDI07]
      - speed search for bugs
      - simple counterexamples
  - Fair stateless exploration [PLDI08]
      - scales to large programs
  - Architecture [OSDI08]
      - Tasks and SyncVars
      - API wrappers
  - Store buffer simulation [CAV08]
  - Preemption sealing [TACAS10]
      - orthogonal to preemption bounding
      - where (not) to search for bugs
  - Best-first search [PPoPP10]
  - Automatic linearizability checking [PLDI10]
  - More features
      - Data race detection
      - Partial order reduction
      - More monitors

  • Slide 8

CHESS Design Decisions
  - Stateless state space exploration
  - No change to underlying scheduler
  - Ability to enumerate all/only feasible schedules
  - Schedule points = synchronization points, and use race detection to make up the difference
  - Serialize concurrent behavior
  - Suite of search/reduction strategies
      - preemption bounding, sealing
      - best-first search
  - Monitor API to easily add new checking capability

  • Slide 9

Stateless model checking [Verisoft]
  - Given a program with an acyclic state space
      - Systematically enumerate all paths
  - Don't capture program states
      - Not necessary for termination
      - Precisely capturing states is hard and expensive
  - At the cost of potentially revisiting states
      - Partial-order reduction alleviates redundant exploration

  • Slide 10

CHESS architecture
  [Diagram components: CHESS Exploration Engine, CHESS Scheduler, Unmanaged Program, Win32 Wrappers, Windows, Managed Program, .NET Wrappers, CLR]
  - Capture scheduling nondeterminism
  - Drive the program along an interleaving of choice

  • Slide 11

Running Example

    Thread 1:
        Lock(l); bal += x; Unlock(l);

    Thread 2:
        Lock(l); t = bal; Unlock(l);
        Lock(l); bal = t - y; Unlock(l);

  • Slide 12

Introduce Schedule() points

    Thread 1:
        Schedule(); Lock(l); bal += x;
        Schedule(); Unlock(l);

    Thread 2:
        Schedule(); Lock(l); t = bal;
        Schedule(); Unlock(l);
        Schedule(); Lock(l); bal = t - y;
        Schedule(); Unlock(l);

  - Instrument calls to the CHESS scheduler
  - Each call is a potential preemption point

  • Slide 13

First-cut solution: Random sleeps
  - Introduce random sleep at schedule points
  - Does not introduce new behaviors
      - Sleep models a possible preemption at each location
      - Sleeping for a finite amount guarantees starvation-freedom

    Thread 1:
        Sleep(rand()); Lock(l); bal += x;
        Sleep(rand()); Unlock(l);

    Thread 2:
        Sleep(rand()); Lock(l); t = bal;
        Sleep(rand()); Unlock(l);
        Sleep(rand()); Lock(l); bal = t - y;
        Sleep(rand()); Unlock(l);

  • Slide 14

Improvement 1: Capture the happens-before graph
  (Thread 1 and Thread 2 instrumented with Schedule() calls, as on Slide 12)
  - Delays that result in the same happens-before graph are equivalent
  - Avoid exploring equivalent interleavings
  [Figure: the two critical sections of Thread 2 and a Sleep(5) delay]

  • Slide 15

Improvement 2: Understand synchronization semantics
  (Thread 1 and Thread 2 instrumented with Schedule() calls, as on Slide 12)
  - Avoid exploring delays that are impossible
  - Identify when threads can make progress
  - CHESS maintains a run queue and a wait queue
      - Mimics OS scheduler state
  [Figure: fragments of the instrumented Thread 2 code]

  • Slide 16

Emulate execution on a uniprocessor
  (Thread 1 and Thread 2 instrumented with Schedule() calls, as on Slide 12)
  - Enable only one thread at a time
  - Linearizes a partial-order into a total-order
  - Controls the order of data-races

  • Slide 17

CHESS modes: speed vs coverage
  - Fast-mode
      - Introduce schedule points before synchronizations, volatile accesses, and interlocked operations
      - Finds many bugs in practice
  - Data-race mode
      - Repeat:
          - Find data races
          - Introduce schedule points before racing memory accesses
      - Captures all sequentially consistent (SC) executions

  • Slide 18

Capture all sources of nondeterminism? No.
  - Scheduling nondeterminism? Yes
  - Timing nondeterminism? Yes
      - Controls when and in what order the timers fire
  - Nondeterministic system calls? Mostly
      - CHESS uses precise abstractions for many system calls
  - Input nondeterminism? No
      - Rely on users to provide inputs
      - Program inputs, files read, packets received, ...
      - Good tradeoff in the short term
      - But can't find race-conditions on error handling code

  • Slide 19

CHESS architecture
  [Diagram components: CHESS Exploration Engine, CHESS Scheduler, Unmanaged Program, Win32 Wrappers, Windows, Managed Program, .NET Wrappers, CLR]

  • Slide 20

CHESS wrappers
  - Translate Win32/.NET synchronizations into CHESS scheduler abstractions
      - Tasks: schedulable entities
          - Threads, threadpool work items, async. callbacks, timer functions
      - SyncVars: resources used by tasks
  - Generate happens-before edges during execution
  - Executable specification for complex APIs
  - Most time consuming and error-prone part of CHESS
  - Enables CHESS to handle multiple platforms

  • Slide 21

Learning from Experience: User forum, Champions
  - http://msdn.microsoft.com/en-us/devlabs/cc950526.aspx
  - http://social.msdn.microsoft.com/Forums/en-US/chess/threads/

  • Slide 22

  • Slide 23

"CHESS Doesn't Scale"
  - Hmm, we just ran CHESS on the Singularity operating system (and found bugs in the bootup/shutdown sequence)
  - What they usually mean: CHESS isn't very effective on a long-running test
  - There are a lot of possible schedules!
  - Time for enumerative model checking = (time to execute one test) x (# schedules)

  • Slide 24

Find lots of bugs with 2 preemptions

    Program            Lines of code   Bugs
    Work Stealing Q    4K              4
    CDS                6K              1
    CCR                9K              3
    ConcRT             16K             4
    Dryad              18K             7
    APE                19K             4
    STM                20K             2
    TPL                24K             9
    PLINQ              24K             1
    Singularity        175K            2
    Total                              37

  • Slide 25

"CHESS Isn't Push Button"
  "The more I look at CHESS the more I realize that I could use some general guidance on how to author test code that will actually help CHESS reveal concurrency bugs." (Daniel Stolt)

  • Slide 26

Challenge -> Opportunity: New push-button concurrency tools
  - Cuzz [ASPLOS 2010]: Concurrency Fuzzing
      - Attach to any running executable
      - Find concurrency bugs faster through smart fuzzing
  - Lineup [PLDI 2010]: Automatic Linearizability Checking
      - Generate thread-safety tests for a class automatically
      - Use sequential behavior as oracle for concurrent behavior
      - CHESS underneath

  • Slide 27

"CHESS Doesn't Find This Bug"
  - RTFM is not helpful
  - Instead, generate helpful warning messages
      - "Warning: running CHESS without race detection can miss bugs"
  - Or, turn race detection on for a few executions.

    void ForkJoinTest() {
        int x = 0;
        var t1 = new Thread(() => { x=x+1; });
        var t2 = new Thread(() => { x=x+1; });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        Debug.Assert(x==2);
    }

  • Slide 28

"CHESS Can't Avoid Finding Bugs"
  - "Solution is working and found two bug with CHESS. To get the second bug, I had to fix first bug first"
  - "That liveness bug is such a minor performance problem that I won't fix it."

  • Slide 29

Playing CHESS with George

  • Slide 30

  • Slide 31

  • Slide 32

  • Slide 33

  • Slide 34

    Sealed Methods   Asserts  Timeouts  Livelocks  Deadlocks  Leaks  Pass
                     5340005
    +TryDequeue      6501140
    +WaitForTask     5502140
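
Returning to the ForkJoinTest of Slide 27: its thread bodies contain no synchronization, so fast mode (Slide 17) has no schedule point between the read and the write of x, while data-race mode adds one before the racing access. The following is a minimal Python sketch of that difference. It is a toy simulation, not the CHESS API; the names `increment`, `explore`, and `split_at_races` are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def increment(shared, split_at_races):
    """Thread body for x = x + 1, modeled as a generator.

    Each `yield` is a schedule point where the simulated scheduler
    may switch to the other thread. (Hypothetical model, not CHESS.)
    """
    tmp = shared["x"]          # racy read of x
    if split_at_races:
        yield                  # schedule point before the racy write
    shared["x"] = tmp + 1      # racy write of x


def explore(split_at_races):
    """Run every interleaving of two threads; return the set of final x values."""
    # Without a schedule point inside the body, each thread is one atomic step.
    steps = 2 if split_at_races else 1
    outcomes = set()
    for schedule in set(permutations([0] * steps + [1] * steps)):
        shared = {"x": 0}
        threads = [increment(shared, split_at_races) for _ in range(2)]
        for tid in schedule:
            next(threads[tid], None)   # advance the chosen thread by one step
        outcomes.add(shared["x"])
    return outcomes


if __name__ == "__main__":
    print("no schedule point at the race:", explore(False))
    print("schedule point at the race:   ", explore(True))
```

In this model, the run without the extra schedule point only ever ends with x equal to 2, so the assertion never fails; with the schedule point, one interleaving lets both threads read 0 and the final value 1 exposes the lost update.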