Concurrency Checking with CHESS: Learning from Experience


Page 1: Concurrency Checking with CHESS:  Learning from Experience

Concurrency Checking with CHESS: Learning from Experience

Tom Ball, Sebastian Burckhardt, Chris Dern, Madan Musuvathi, Shaz Qadeer

Page 2: Concurrency Checking with CHESS:  Learning from Experience

Outline

• What is CHESS?
  – a testing tool, plus
  – a test methodology (concurrency unit tests)
  – a platform for research and teaching

• CHESS design decisions

• Learnings from CHESS user forum, champions

Page 3: Concurrency Checking with CHESS:  Learning from Experience

What is CHESS?

• CHESS is a user-mode scheduler

• Controls all scheduling nondeterminism
  – “Hijacks” scheduling control from the OS

• Guarantees:
  – Every run takes a different thread schedule
  – Reproduce the schedule for every run

Page 4: Concurrency Checking with CHESS:  Learning from Experience

Concurrency Unit Tests

“Generally, in our test environment, we want to test what we call scenarios. A scenario might be a specific feature or API usage. In my case I am trying to test the scenario of a user canceling a command execution on a different thread.”

Steve Hale, Microsoft

Page 5: Concurrency Checking with CHESS:  Learning from Experience

A Concurrency Unit Test Pattern: Fork-Join

    void ForkJoinTest() {
        var t1 = new Thread(() => { S1 });
        var t2 = new Thread(() => { S2 });

        t1.Start(); t2.Start();
        t1.Join(); t2.Join();

        Debug.Assert(...);
    }

Page 6: Concurrency Checking with CHESS:  Learning from Experience

Concurrency Unit Tests

• Small scope hypothesis
  – For most bugs, there exists a short-running scenario with only a few threads that can find it

• Unit tests provide
  – Better coverage of schedules
  – Easier debugging, regression, etc.

Page 7: Concurrency Checking with CHESS:  Learning from Experience

CHESS as Research/Teaching Platform
http://research.microsoft.com/chess/

• Source code release – chesstool.codeplex.com
• Courseware with CHESS
  – Practical Parallel and Concurrent Programming
  – coming this fall!
• Preemption bounding [PLDI07]
  – speed search for bugs
  – simple counterexamples
• Fair stateless exploration [PLDI08]
  – scales to large programs
• Architecture [OSDI08]
  – Tasks and SyncVars
  – API wrappers
• Store buffer simulation [CAV08]
• Preemption sealing [TACAS10]
  – orthogonal to preemption bounding
  – where (not) to search for bugs
• Best-first search [PPoPP10]
• Automatic linearizability checking [PLDI10]
• More features
  – Data race detection
  – Partial order reduction
  – More monitors…

Page 8: Concurrency Checking with CHESS:  Learning from Experience

CHESS Design Decisions

• Stateless state space exploration
• No change to underlying scheduler
• Ability to enumerate all/only feasible schedules
• Schedule points = synchronization points and use race detection to make up the difference
• Serialize concurrent behavior
• Suite of search/reduction strategies
  – preemption bounding, sealing
  – best-first search
• Monitor API to easily add new checking capability

Page 9: Concurrency Checking with CHESS:  Learning from Experience

Stateless model checking [Verisoft]

• Given a program with an acyclic state space
  – Systematically enumerate all paths
• Don’t capture program states
  – Not necessary for termination
  – Precisely capturing states is hard and expensive
• At the cost of potentially revisiting states
  – Partial-order reduction alleviates redundant exploration
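The stateless idea above can be sketched in a few lines. This is an illustrative Python model, not CHESS code: `explore`, `choose`, and `toy_program` are hypothetical names, and a "program" here is just a function that asks for a numbered choice at each choice point. The explorer keeps only the sequence of choices made, never a program state, and re-runs the program from the start for every path.

```python
from typing import Callable, List

Chooser = Callable[[int], int]


def explore(program: Callable[[Chooser], None]) -> List[List[int]]:
    """Enumerate every path of `program` by re-running it from the start.

    No program state is stored; only the sequence of choices is kept.
    Assumes the program terminates on every path (acyclic state space).
    """
    paths: List[List[int]] = []
    prefix: List[int] = []  # choices to replay at the start of the next run
    while True:
        trace = []  # (choice taken, number of options) for this run

        def choose(options: int) -> int:
            index = len(trace)
            choice = prefix[index] if index < len(prefix) else 0
            trace.append((choice, options))
            return choice

        program(choose)
        paths.append([choice for choice, _ in trace])

        # Backtrack: discard trailing choice points with no untried option.
        while trace and trace[-1][0] + 1 >= trace[-1][1]:
            trace.pop()
        if not trace:
            return paths
        last_choice, _ = trace.pop()
        prefix = [choice for choice, _ in trace] + [last_choice + 1]


def toy_program(choose: Chooser) -> None:
    """Hypothetical program: the second choice point depends on the first."""
    first = choose(2)
    choose(3 if first == 0 else 2)


all_paths = explore(toy_program)
print(all_paths)
```

Termination follows from the acyclic assumption alone: every run ends, and the backtracking step always moves to a choice sequence that has not been tried yet.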

Page 10: Concurrency Checking with CHESS:  Learning from Experience

CHESS architecture

[Diagram. Components shown: Unmanaged Program, Win32 Wrappers, Windows; Managed Program, .NET Wrappers, CLR; CHESS Scheduler; CHESS Exploration Engine]

• Capture scheduling nondeterminism
• Drive the program along an interleaving of choice

Page 11: Concurrency Checking with CHESS:  Learning from Experience

Running Example

Thread 1:
    Lock(l); bal += x; Unlock(l);

Thread 2:
    Lock(l); t = bal; Unlock(l);
    Lock(l); bal = t - y; Unlock(l);

Page 12: Concurrency Checking with CHESS:  Learning from Experience

Introduce Schedule() points

Thread 1:
    Schedule(); Lock(l); bal += x; Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l); t = bal; Schedule(); Unlock(l);
    Schedule(); Lock(l); bal = t - y; Schedule(); Unlock(l);

• Instrument calls to the CHESS scheduler
• Each call is a potential preemption point
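To see why the preemption points matter for the running example, the sketch below enumerates the orders in which the three critical sections can run. It is a Python model under stated assumptions, not CHESS output: each critical section is treated as one atomic step (the lock `l` makes them mutually exclusive), Thread 1 is the deposit and Thread 2 the read-then-write pair, and the values of `X`, `Y`, and `START` are made up for illustration.

```python
from typing import Callable, Dict, Iterator, List

State = Dict[str, int]
Step = Callable[[State], None]

X, Y, START = 10, 5, 100  # hypothetical x, y, and opening balance


def deposit(s: State) -> None:        # Thread 1: Lock(l); bal += x; Unlock(l);
    s["bal"] += X


def read_balance(s: State) -> None:   # Thread 2: Lock(l); t = bal; Unlock(l);
    s["t"] = s["bal"]


def write_balance(s: State) -> None:  # Thread 2: Lock(l); bal = t - y; Unlock(l);
    s["bal"] = s["t"] - Y


def interleavings(a: List[Step], b: List[Step]) -> Iterator[List[Step]]:
    """Yield every merge of a and b that preserves each thread's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule: List[Step]) -> int:
    """Execute one schedule from a fresh initial state; return the final balance."""
    state = {"bal": START, "t": 0}
    for step in schedule:
        step(state)
    return state["bal"]


thread1 = [deposit]
thread2 = [read_balance, write_balance]
outcomes = [run(s) for s in interleavings(thread1, thread2)]
print(outcomes)
```

Two of the three schedules end with START + X - Y; the one that runs the deposit between Thread 2's read and write loses the deposit.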

Page 13: Concurrency Checking with CHESS:  Learning from Experience

First-cut solution: Random sleeps

• Introduce random sleep at schedule points
• Does not introduce new behaviors
  – Sleep models a possible preemption at each location
  – Sleeping for a finite amount guarantees starvation-freedom

Thread 1:
    Sleep(rand()); Lock(l); bal += x; Sleep(rand()); Unlock(l);

Thread 2:
    Sleep(rand()); Lock(l); t = bal; Sleep(rand()); Unlock(l);
    Sleep(rand()); Lock(l); bal = t - y; Sleep(rand()); Unlock(l);

Page 14: Concurrency Checking with CHESS:  Learning from Experience

Improvement 1: Capture the “happens-before” graph

Thread 1:
    Schedule(); Lock(l); bal += x; Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l); t = bal; Schedule(); Unlock(l);
    Schedule(); Lock(l); bal = t - y; Schedule(); Unlock(l);

• Delays that result in the same “happens-before” graph are equivalent
• Avoid exploring equivalent interleavings

[Diagram: Thread 2's two critical sections redrawn alongside two Sleep(5) delays]
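The equivalence above can be illustrated by grouping interleavings by their happens-before edges. The sketch below is a simplified Python model, not the CHESS implementation: an operation is a hypothetical `(thread, position, sync variable)` triple, and a second lock `m` is introduced purely so that some operations are independent. Program-order edges are identical in every interleaving, so only cross-thread edges on the same sync variable are compared.

```python
from typing import FrozenSet, Iterator, List, Tuple

Op = Tuple[str, int, str]  # (thread name, position in thread, sync variable)


def interleavings(a: List[Op], b: List[Op]) -> Iterator[List[Op]]:
    """Yield every merge of a and b that preserves each thread's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def happens_before(schedule: List[Op]) -> FrozenSet[Tuple[Op, Op]]:
    """Ordered pairs of operations from different threads on the same variable."""
    edges = set()
    for i, first in enumerate(schedule):
        for second in schedule[i + 1:]:
            if first[0] != second[0] and first[2] == second[2]:
                edges.add((first, second))
    return frozenset(edges)


thread1 = [("T1", 0, "l")]
thread2 = [("T2", 0, "m"), ("T2", 1, "l")]  # "m" is a hypothetical second lock

schedules = list(interleavings(thread1, thread2))
classes = {happens_before(s) for s in schedules}
print(len(schedules), "interleavings,", len(classes), "happens-before graphs")
```

Only one representative per happens-before graph needs to be executed; the other interleavings in the same class reorder independent operations and cannot change the result.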

Page 15: Concurrency Checking with CHESS:  Learning from Experience

Improvement 2: Understand synchronization semantics

Thread 1:
    Schedule(); Lock(l); bal += x; Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l); t = bal; Schedule(); Unlock(l);
    Schedule(); Lock(l); bal = t - y; Schedule(); Unlock(l);

• Avoid exploring delays that are impossible
• Identify when threads can make progress
• CHESS maintains a run queue and a wait queue
  – Mimics OS scheduler state

[Diagram: Thread 2 stopped after “Schedule(); Lock(l); t = bal;”, with “Schedule(); Unlock(l);” and its second critical section still to run]
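A small model shows how much a run queue prunes. The Python sketch below is an assumption-laden illustration rather than the CHESS scheduler: programs are lists of hypothetical op names (`"lock"`, `"work"`, `"unlock"`) on a single shared lock, and a thread whose next op is `"lock"` while the lock is held is treated as sitting on the wait queue, so it is never offered as a scheduling choice.

```python
from math import comb
from typing import List, Optional, Tuple

Program = List[str]  # each op is "lock", "work", or "unlock" on one shared lock


def feasible_schedules(programs: List[Program]) -> List[List[Tuple[int, str]]]:
    """Enumerate schedules, choosing only among threads that can make progress."""
    schedules: List[List[Tuple[int, str]]] = []

    def step(pcs: Tuple[int, ...], owner: Optional[int],
             trace: List[Tuple[int, str]]) -> None:
        # Run queue: unfinished threads whose next op would not block.
        run_queue = [
            tid for tid, prog in enumerate(programs)
            if pcs[tid] < len(prog)
            and not (prog[pcs[tid]] == "lock" and owner is not None)
        ]
        if not run_queue:
            if all(pc == len(prog) for pc, prog in zip(pcs, programs)):
                schedules.append(trace)  # every thread ran to completion
            return  # otherwise a deadlock, which this sketch does not record
        for tid in run_queue:
            op = programs[tid][pcs[tid]]
            if op == "lock":
                new_owner: Optional[int] = tid
            elif op == "unlock":
                new_owner = None
            else:
                new_owner = owner
            new_pcs = pcs[:tid] + (pcs[tid] + 1,) + pcs[tid + 1:]
            step(new_pcs, new_owner, trace + [(tid, op)])

    step(tuple(0 for _ in programs), None, [])
    return schedules


thread1 = ["lock", "work", "unlock"]
thread2 = ["lock", "work", "unlock", "lock", "work", "unlock"]

feasible = feasible_schedules([thread1, thread2])
naive = comb(len(thread1) + len(thread2), len(thread1))  # ignores blocking
print(len(feasible), "feasible of", naive, "naive interleavings")
```

For the running example, only the orderings of the three critical sections remain, because a thread is always blocked while the other holds the lock.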

Page 16: Concurrency Checking with CHESS:  Learning from Experience

Emulate execution on a uniprocessor

Thread 1:
    Schedule(); Lock(l); bal += x; Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l); t = bal; Schedule(); Unlock(l);
    Schedule(); Lock(l); bal = t - y; Schedule(); Unlock(l);

• Enable only one thread at a time
• Linearizes a partial-order into a total-order
• Controls the order of data-races

Page 17: Concurrency Checking with CHESS:  Learning from Experience

CHESS modes: speed vs coverage

• Fast-mode
  – Introduce schedule points before synchronizations, volatile accesses, and interlocked operations
  – Finds many bugs in practice

• Data-race mode
  – Repeat
      Find data races
      Introduce schedule points before racing memory accesses
  – Captures all sequentially consistent (SC) executions

Page 18: Concurrency Checking with CHESS:  Learning from Experience

Capture all sources of nondeterminism? No.

• Scheduling nondeterminism? Yes
• Timing nondeterminism? Yes
  – Controls when and in what order the timers fire
• Nondeterministic system calls? Mostly
  – CHESS uses precise abstractions for many system calls
• Input nondeterminism? No
  – Rely on users to provide inputs
      Program inputs, files read, packets received,…
  – Good tradeoff in the short term
      But can’t find race-conditions on error handling code

Page 19: Concurrency Checking with CHESS:  Learning from Experience

CHESS architecture

[Diagram. Components shown: Unmanaged Program, Win32 Wrappers, Windows; Managed Program, .NET Wrappers, CLR; CHESS Scheduler; CHESS Exploration Engine]

Page 20: Concurrency Checking with CHESS:  Learning from Experience

CHESS wrappers

• Translate Win32/.NET synchronizations into CHESS scheduler abstractions
  – Tasks: schedulable entities
      Threads, threadpool work items, async. callbacks, timer functions
  – SyncVars: resources used by tasks
      Generate happens-before edges during execution
• Executable specification for complex APIs
  – Most time consuming and error-prone part of CHESS
• Enables CHESS to handle multiple platforms

Page 21: Concurrency Checking with CHESS:  Learning from Experience

Learning from Experience: User forum, Champions

http://msdn.microsoft.com/en-us/devlabs/cc950526.aspx
http://social.msdn.microsoft.com/Forums/en-US/chess/threads/

Page 22: Concurrency Checking with CHESS:  Learning from Experience
Page 23: Concurrency Checking with CHESS:  Learning from Experience

“CHESS Doesn’t Scale”

• Hmm… we just ran CHESS on the Singularity operating system (and found bugs in the bootup/shutdown sequence)
• What they usually mean:
  – “CHESS isn’t very effective on a long-running test”
  – “There are a lot of possible schedules!”
• Time for enumerative model checking
  – (Time to execute one test) x (# schedules)
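The product above grows quickly with test length. As a rough worked example in Python, assume the simplest possible model, which is not how CHESS counts schedules: every step of every thread is a schedule point and no thread ever blocks. Then `threads` threads of `steps` steps each have (threads × steps)! / (steps!)^threads interleavings. The function names are hypothetical.

```python
from math import factorial


def interleaving_count(threads: int, steps: int) -> int:
    """Interleavings of `threads` threads with `steps` non-blocking steps each."""
    return factorial(threads * steps) // factorial(steps) ** threads


def exploration_seconds(seconds_per_test: float, threads: int, steps: int) -> float:
    """(Time to execute one test) x (# schedules) under the model above."""
    return seconds_per_test * interleaving_count(threads, steps)


for steps in (3, 5, 10):
    count = interleaving_count(2, steps)
    print(f"2 threads x {steps} steps: {count} schedules, "
          f"{exploration_seconds(0.5, 2, steps):.0f} s at 0.5 s per test")
```

Under this model a short test stays cheap while a longer one does not, which is the arithmetic behind preferring short-running unit tests and bounded search.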

Page 24: Concurrency Checking with CHESS:  Learning from Experience

Find lots of bugs with 2 preemptions

Program            Lines of code   Bugs
Work Stealing Q    4K              4
CDS                6K              1
CCR                9K              3
ConcRT             16K             4
Dryad              18K             7
APE                19K             4
STM                20K             2
TPL                24K             9
PLINQ              24K             1
Singularity        175K            2
Total                              37
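Preemption bounding itself can be sketched as a depth-first search with a budget. The Python below is a simplified model, not the CHESS search: threads are reduced to a count of non-blocking steps, and a preemption is charged whenever the scheduler switches away from a thread that still has steps left. `bounded_schedules` is a hypothetical name.

```python
from typing import List, Optional


def bounded_schedules(steps_per_thread: List[int], bound: int) -> List[List[int]]:
    """Enumerate schedules that use at most `bound` preemptions."""
    schedules: List[List[int]] = []

    def dfs(remaining: List[int], current: Optional[int],
            preemptions: int, trace: List[int]) -> None:
        if not any(remaining):
            schedules.append(trace)
            return
        for tid, left in enumerate(remaining):
            if left == 0:
                continue
            # Switching away from a thread that could still run is a preemption;
            # switching after a thread has finished is free.
            switched_early = (current is not None and tid != current
                              and remaining[current] > 0)
            cost = 1 if switched_early else 0
            if preemptions + cost > bound:
                continue
            after = list(remaining)
            after[tid] -= 1
            dfs(after, tid, preemptions + cost, trace + [tid])

    dfs(list(steps_per_thread), None, 0, [])
    return schedules


for bound in (0, 1, 2, 4):
    print(bound, "preemptions:", len(bounded_schedules([3, 3], bound)), "schedules")
```

With two threads of three steps each, a bound of 2 already covers 14 of the 20 interleavings in this model, and the schedules it finds are short to explain because each contains at most two forced context switches.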

Page 25: Concurrency Checking with CHESS:  Learning from Experience

“CHESS Isn’t Push Button”

“The more I look at CHESS the more I realize that I could use some general guidance on how to author test code that will actually help CHESS reveal concurrency bugs.”

Daniel Stolt

Page 26: Concurrency Checking with CHESS:  Learning from Experience

Challenge -> Opportunity: New “Push button” concurrency tools

• Cuzz [ASPLOS 2010]: Concurrency Fuzzing
  – Attach to any running executable
  – Find concurrency bugs faster through smart fuzzing

• Lineup [PLDI 2010]: Automatic Linearizability Checking
  – Generate “thread-safety” tests for a class automatically
  – Use sequential behavior as oracle for concurrent behavior
  – CHESS underneath

Page 27: Concurrency Checking with CHESS:  Learning from Experience

“CHESS Doesn’t Find This Bug”

• RTFM is not helpful
• Instead, generate helpful warning messages
  – “Warning: running CHESS without race detection can miss bugs”
  – Or, turn race detection on for a few executions.

    void ForkJoinTest() {
        int x = 0;
        var t1 = new Thread(() => { x=x+1; });
        var t2 = new Thread(() => { x=x+1; });

        t1.Start(); t2.Start();
        t1.Join(); t2.Join();

        Debug.Assert(x==2);
    }
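Why the bug is missed without race detection can be shown with a small simulation. This is a Python model of the test above, not CHESS itself: with no synchronization inside the thread bodies, each `x=x+1` runs as one uninterrupted step when schedule points sit only at synchronization operations, whereas placing schedule points at the racing accesses splits it into a read and a write. The helper names are hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Set

State = Dict[str, object]
Step = Callable[[State], None]


def interleavings(a: List[Step], b: List[Step]) -> Iterator[List[Step]]:
    """Yield every merge of a and b that preserves each thread's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def thread_steps(tid: int, split_increment: bool) -> List[Step]:
    """Steps for one thread running x = x + 1."""
    def read(s: State) -> None:
        s["reg"][tid] = s["x"]          # load x into a thread-local register

    def write(s: State) -> None:
        s["x"] = s["reg"][tid] + 1      # store the incremented value

    def increment(s: State) -> None:    # no schedule point inside the statement
        read(s)
        write(s)

    return [read, write] if split_increment else [increment]


def final_values(split_increment: bool) -> Set[int]:
    """All final values of x over every interleaving at the chosen granularity."""
    results = set()
    schedules = interleavings(thread_steps(0, split_increment),
                              thread_steps(1, split_increment))
    for schedule in schedules:
        state: State = {"x": 0, "reg": [0, 0]}
        for step in schedule:
            step(state)
        results.add(state["x"])
    return results


print("schedule points at sync ops only:", final_values(False))
print("schedule points at racing accesses:", final_values(True))
```

Only the finer granularity exposes the schedule in which both threads read 0, so `Debug.Assert(x==2)` can fail only when the racing accesses become schedule points.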

Page 28: Concurrency Checking with CHESS:  Learning from Experience

“CHESS Can’t Avoid Finding Bugs”

“Solution is working and found two bug with CHESS . To get the second bug, I had to fix first bug first”

“That liveness bug is such a minor performance problem that I won’t fix it.”

Page 29: Concurrency Checking with CHESS:  Learning from Experience

Playing CHESS with George

Page 30: Concurrency Checking with CHESS:  Learning from Experience
Page 31: Concurrency Checking with CHESS:  Learning from Experience
Page 32: Concurrency Checking with CHESS:  Learning from Experience
Page 33: Concurrency Checking with CHESS:  Learning from Experience
Page 34: Concurrency Checking with CHESS:  Learning from Experience

Sealed Methods              Asserts   Timeouts   Livelocks   Deadlocks   Leaks   Pass
(none)                      5         3          40          0           0       5
+TryDequeue                 6         5          0           1           1       40
+WaitForTask                5         5          0           2           1       40
+Reg.Recv.+PostInternal     5         5          0           0           0       43

Page 35: Concurrency Checking with CHESS:  Learning from Experience

“CHESS is Confusing Me”

Page 36: Concurrency Checking with CHESS:  Learning from Experience

The Nondeterminism Saga: static data, lazily initialized

[Diagram: a schedule prefix p with two possible continuations, E and F]

• If replay of p.E fails, yielding p.F, then try again and see if p.F replays
• Report lost coverage

Page 37: Concurrency Checking with CHESS:  Learning from Experience

Nondeterminism Junkie: Too much information

“Why does this test pass instead of say ‘Detected nondeterminism’ outside the control of CHESS?”

Page 38: Concurrency Checking with CHESS:  Learning from Experience

“Is this good behavior for CHESS to return three different results for the same code?”

Page 39: Concurrency Checking with CHESS:  Learning from Experience

“CHESS Time Isn’t Real Time”: It’s a feature, not a bug.

“The call to WaitOne(60000, false) immediately returns false, which isn’t correct. If I use WaitOne() or WaitOne(Timeout.Infinite, false) instead of WaitOne(60000, false), the WaitHandle waits till the Event is set, returns true and everything goes fine. But waiting without a timeout isn't an option in my case.”
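One plausible reading of this report, consistent with the earlier slide saying CHESS controls when and in what order timers fire, is that a finite timeout is explored as a scheduling choice rather than measured against a clock. The Python sketch below models only that reading and is not taken from the CHESS sources; `wait_results` and the event names are hypothetical.

```python
from itertools import permutations
from typing import Optional, Set


def wait_results(timeout_ms: Optional[int]) -> Set[bool]:
    """Possible return values of a timed wait when timer expiry is a choice.

    Two things can end the wait: another thread sets the event, or the timer
    fires. With an infinite timeout (None) there is no timer event at all.
    """
    events = ["set"] if timeout_ms is None else ["set", "timeout"]
    results = set()
    for order in permutations(events):
        # The wait returns True only if the event is set before the timer fires.
        results.add(order[0] == "set")
    return results


print("infinite wait:", wait_results(None))
print("WaitOne(60000): ", wait_results(60000))
```

In this model the size of the timeout never matters: any finite value adds the schedule in which the timer wins, so a test has to handle a `false` return to pass on every schedule.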

Page 40: Concurrency Checking with CHESS:  Learning from Experience

The expected: “I can’t play CHESS on”

• x64
• Multi-process programs
• Message passing, distributed systems
• The Boost library
• .NET without the CLR Profiler
• Java
• Unix
• …

Page 41: Concurrency Checking with CHESS:  Learning from Experience

Learning from Experience:Forums, Champions

Chris Dern, Steve Hale, Ram Natarajan, Roy Tan

Page 42: Concurrency Checking with CHESS:  Learning from Experience

“Congratulations CHESS team!!!!! I have proven outside of CHESS that the issue it is finding in our product on the 106th thread schedule looks like a valid product bug!! I wrote a quick application to launch my CHESS test outside of CHESS and by freezing/thawing threads I was able to reproduce the issue independently. This is incredibly exciting!!! Many thanks for your patience, perseverance, and CHESS bug fixes as I’ve struggled to understand CHESS.”

Steve Hale, Microsoft , 2/12/2009

Page 43: Concurrency Checking with CHESS:  Learning from Experience
Page 44: Concurrency Checking with CHESS:  Learning from Experience
Page 45: Concurrency Checking with CHESS:  Learning from Experience

ConcurrentDictionary

ConcurrentBag

SemaphoreSlim ManualResetEventSlim

Barrier

BlockingCollection

Task

TaskScheduler

PLINQ Parallel.For

Page 46: Concurrency Checking with CHESS:  Learning from Experience

“As the true value of a test is in its ability to find bugs, let’s take a look at how our CHESS tests did. Over the development cycle to date, the CHESS test found seven bugs, and was used to reproduce another seven for a total of 14, out of the 276 high priority bugs over the same time. While only 14 bugs against 276 appear sadly anemic, it’s important to dig a bit deeper. If we address each of the issues raised, would we find more bugs?”

Chris Dern, PFX_CHESS_Review_Final.docx

Page 47: Concurrency Checking with CHESS:  Learning from Experience

“Early on the adoption of CHESS, we made a fatal mistake. Perhaps it was wishful thinking on our part, or perhaps we believed too much in the marketing hype and didn’t read the fine print. We believed early on that CHESS was a turnkey solution capable of using existing tests and test approaches and ‘finding the bugs’.” C. Dern

Page 48: Concurrency Checking with CHESS:  Learning from Experience

“The schedule for any product group is always under attack. Over the life cycle of a product, features are in constant flux, with managers always balancing risk and reward. In the face of this pressure, any untried tool, methodology, or approach faces an uphill battle.” C. Dern

Page 49: Concurrency Checking with CHESS:  Learning from Experience

“For tool developers, it’s important that once you engage with a customer you help find then drive to some level of success. Finding a single bug is a priceless commodity when arguing to continue the time investment in a specific tool. Take small bites, set modest goals and drive to success. Perfect is the enemy of good, or at least good enough right now.” C. Dern

Page 50: Concurrency Checking with CHESS:  Learning from Experience

Dern’s DO’s and DON’Ts

DO NOT expect that CHESS will ‘magically’ find your bugs. CHESS is a tool, mainly focused at enumerating schedules for a given bound. While it can find specific types of concurrency bugs, e.g. deadlocks, for ‘free’ the value and benefit of CHESS comes with deliberate tests.

Page 51: Concurrency Checking with CHESS:  Learning from Experience

DO develop an understanding of what properties, invariants, and behaviors your test is testing

DO run your tests. While this may seem a silly tip, but it’s important to remember that CHESS enables the familiar write, run, refactor test experience for concurrent tests, which we enjoy with sequential tests today.

Page 52: Concurrency Checking with CHESS:  Learning from Experience

DO NOT add artificial spinning/busy work in the test. CHESS will explore all schedules for your specified bound. Adding busy work, like you may find in a ‘stress’ test to increase coverage, only increases the test runtime when under CHESS.

Page 53: Concurrency Checking with CHESS:  Learning from Experience

AVOID blindly converting an existing ‘stress’ style unit test into a CHESS test. The size, scale, and assertions that one tends to find in those types of tests make for a weak CHESS test at best, or a unusable CHESS test at worst.

Page 54: Concurrency Checking with CHESS:  Learning from Experience

Stepping Back from the Fray: High-level Learnings

• Proper expectation setting
• Good methodology
• Good default behavior
• Good warnings and messages
• Minimize cognitive dissonance
• Cultivate champions
• Listen to them and learn!

Page 55: Concurrency Checking with CHESS:  Learning from Experience

Three CHESS Learnings

1. If you want
   – deterministic scheduling
   – with ability to explore all schedules
   – without changing the underlying scheduler
   Then it’s hard to achieve
   – high API coverage
   – robustness
   Action: we need observable and controllable schedulers!

2. Concurrency unit testing can be effective, but requires careful planning and scoping

3. Search/reduction strategies are absolutely essential

Page 56: Concurrency Checking with CHESS:  Learning from Experience

Uplifting Message and Blatant Advertisement for LineUp Talk

“Partnerships and Collaborations
The success of the LineUp work is a perfect example of [the benefits of] an open dialog between the teams along with continual experimentation by both sides. Combining innovations from both research and product testing group, we create[d] a complete solution to one area of concurrency testing.” C. Dern

Page 57: Concurrency Checking with CHESS:  Learning from Experience