Learning From Mistakes—A Comprehensive Study on Real World Concurrency Bug Characteristics

Post on 30-Dec-2015


Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou. Appeared in ASPLOS’08. Presented by Michelle Goodstein, LBA Reading Group, 3/27/08.


Introduction
• Multi-core computers are common
• More programmers are having to write concurrent programs
• Concurrent programs have different bugs than sequential programs
• However, without a study, it is hard to know what those bugs are
• First real-world study of concurrency bugs

Introduction
• Knowing the types of concurrent bugs that actually occur in software will:
  ◦ Help create better bug detection schemes
  ◦ Inform the testing process software goes through
  ◦ Provide information to programming language designers

Introduction
• Current state of affairs
  ◦ Reproducing concurrency bugs is difficult
  ◦ Test cases are critical to being able to diagnose a bug
  ◦ Most detection research focuses on data races and deadlock bugs, with some new work on detecting atomicity violations
• Few studies on real-world concurrency bugs
  ◦ Most use programs that were buggy by design for the study
• Most studies on bug characteristics focus on non-concurrent bugs

Methodology
• 4 representative open-source applications:
  ◦ MySQL
  ◦ Apache
  ◦ Mozilla
  ◦ OpenOffice
• Each application has
  ◦ 9-13 years of development history
  ◦ 1-4 million lines of code

Methodology
• Randomly selected bugs from bug databases that contained at least one keyword related to concurrency (e.g. “race”, “concurrency”, “deadlock”, “synchronization”)
• From these, randomly chose 500 bugs that have
  ◦ Root causes explained well and in detail
  ◦ Source code available
  ◦ Bug fix info available

Methodology
• Removed any bugs not truly caused by concurrency
• Result: 105 concurrency bugs
• Separate study of deadlock and non-deadlock bugs

Methodology
• Evaluated bugs in 3 dimensions
  ◦ Bug pattern: {atomicity-violation, order-violation, other}
  ◦ Manifestation: required conditions for bug to occur, # threads involved, # variables, # accesses
  ◦ Bug fix strategy: look at final patch, mistakes in intermediate patches, and whether transactional memory (TM) can help
• Results organized as a collection of findings

Motivation
• 34/105 concurrency bugs cause program crashes
• 37/105 concurrency bugs cause programs to hang
• Concurrency bugs are important

Bug Patterns

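Findings: Bug Patterns
• Atomicity violation: a group of accesses intended to execute atomically is interleaved by another thread
• Order violation: two accesses (or groups of accesses) execute in the opposite of the intended order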
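To make the atomicity-violation pattern concrete, here is a minimal Python sketch. It is not taken from the paper or from any of the studied applications: the `Account` class, the amounts, and the `Event`-based gates are all hypothetical, and the gates exist only to force the bad interleaving deterministically instead of waiting for it to happen by chance.

```python
import threading

class Account:  # hypothetical shared object, for illustration only
    def __init__(self, balance):
        self.balance = balance

def withdraw_unsafe(account, amount, checked, proceed):
    """Check-then-act with no lock: the two steps are not atomic."""
    if account.balance >= amount:    # step 1: check
        checked.set()                # signal that the check has passed
        proceed.wait()               # pause so the other thread can run its check too
        account.balance -= amount    # step 2: act on a check that may be stale

def withdraw_safe(account, amount, lock):
    """Same logic, but check and act form one critical section."""
    with lock:
        if account.balance >= amount:
            account.balance -= amount

def run_unsafe():
    account = Account(100)
    gates = [(threading.Event(), threading.Event()) for _ in range(2)]
    threads = [threading.Thread(target=withdraw_unsafe, args=(account, 80, c, p))
               for c, p in gates]
    for t in threads:
        t.start()
    for checked, _ in gates:
        checked.wait()               # both threads have now passed the check
    for (_, proceed), t in zip(gates, threads):
        proceed.set()                # let the threads finish one at a time
        t.join()
    return account.balance

def run_safe():
    account, lock = Account(100), threading.Lock()
    threads = [threading.Thread(target=withdraw_safe, args=(account, 80, lock))
               for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

In the forced schedule both threads pass the balance check before either one withdraws, so `run_unsafe()` overdraws the account, while `run_safe()` lets only one withdrawal through.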

Findings: Bug Patterns
• Most (72/74) of the examined non-deadlock concurrency bugs are either atomicity violations or order violations
  ◦ Focusing on atomicity and order violations should detect most non-deadlock concurrency bugs
• In fact, 24/74 are order violations
  ◦ Since current tools don’t address order violations, new tools must be developed
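An order violation can be sketched in a few lines of Python. This is an illustrative assumption, not code from the study: the `shared` dictionary, the `strategy` parameter, and the port value are hypothetical. The harness forces the "use before initialize" schedule, shows that wrapping both accesses in a lock does not change the outcome, and then enforces the intended order with a `threading.Event`.

```python
import threading

def run(strategy):
    """strategy is "none", "lock", or "event"; returns what the user thread observed."""
    shared = {"config": None}
    lock = threading.Lock()
    ready = threading.Event()
    seen = []

    def initializer():
        if strategy == "lock":
            with lock:
                shared["config"] = {"port": 8080}
        else:
            shared["config"] = {"port": 8080}
        ready.set()

    def user():
        if strategy == "event":
            ready.wait()                 # enforce "initialize before use"
        if strategy == "lock":
            with lock:                   # mutual exclusion only, no ordering
                seen.append(shared["config"])
        else:
            seen.append(shared["config"])

    u = threading.Thread(target=user)
    i = threading.Thread(target=initializer)
    u.start()
    if strategy != "event":
        u.join()                         # force the bad schedule: user runs first
    i.start()
    u.join()
    i.join()
    return seen[0]
```

With `"none"` and `"lock"` the user thread reads `None`, because nothing stops it from running before the initializer. Only the `"event"` variant expresses the required ordering, which is why a tool that looks only for unprotected accesses would not flag this kind of bug.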

Bug Manifestations

Findings: Bug Manifestations
• Most (101/105) bugs involved ≤ 2 threads
  ◦ Most communication is among a small number of threads
  ◦ Enforcing certain partial orderings among a small number of threads can expose bugs
  ◦ Heavy workloads can increase competition for resources, and make it more likely to observe a partial ordering that causes a bug
• Pairwise testing can find many bugs
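One way to picture "enforcing partial orderings" between a pair of threads is to enumerate every interleaving of their steps on a small model and check an invariant after each one. The sketch below is an assumption for illustration, not a tool from the paper: two hypothetical threads each perform a read and then a write to increment a shared counter, and a schedule is flagged as buggy when the final value is not 2.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each sequence's own order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)           # slots in the merged schedule taken by a
        ia, ib = iter(a), iter(b)
        yield [next(ia) if k in slots else next(ib) for k in range(n)]

def run(schedule):
    """Execute (thread, op) steps on a modelled shared counter."""
    counter = 0
    local = {}                           # each thread's private copy of the counter
    for thread, op in schedule:
        if op == "read":
            local[thread] = counter
        else:                            # "write": store the incremented private copy
            counter = local[thread] + 1
    return counter

t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]
schedules = list(interleavings(t1, t2))
buggy = [s for s in schedules if run(s) != 2]
```

For this pair there are 6 schedules and 4 of them lose an update. With more threads, the same check could be repeated for each pair chosen with `itertools.combinations`, which is the spirit of the pairwise testing suggested on the slide.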

Findings: Bug Manifestations
• Some (7/31) deadlock bugs involve only 1 thread!
  ◦ Easy to detect/avoid
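A single-thread deadlock typically arises when a thread waits for a resource it already holds. The Python sketch below is an illustration under that assumption, not an example from the studied applications; the helper name `reacquire` is hypothetical. A timeout stands in for the wait that would otherwise never end.

```python
import threading

def reacquire(lock):
    """Return True if the same thread can take `lock` a second time."""
    lock.acquire()
    try:
        got_it = lock.acquire(timeout=0.1)   # would block forever without the timeout
        if got_it:
            lock.release()
        return got_it
    finally:
        lock.release()
```

`reacquire(threading.Lock())` returns `False` because a plain lock is not reentrant, while `reacquire(threading.RLock())` returns `True`. This kind of self-deadlock is easy to detect because it needs no second thread to reproduce.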

Findings: Bug Manifestations
• Many (49/74) non-deadlock bugs involve 1 variable. However, 34% involve ≥ 2 variables
  ◦ Focusing on 1 variable is a good simplification
  ◦ However, new tools are also necessary to discover multi-variable concurrency bugs
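A multi-variable bug can be sketched as two variables that must stay consistent with each other but are updated in separate steps. The `Buffer` class and the `Event` gates below are hypothetical and only force the observation to happen between the two updates; nothing here is taken from the paper's bug set.

```python
import threading

class Buffer:  # hypothetical structure with a two-variable invariant
    def __init__(self):
        self.items = []
        self.count = 0               # invariant: count == len(items)

def append_unsafe(buf, item, mid_update, resume):
    buf.items.append(item)           # first variable updated
    mid_update.set()                 # signal: we are between the two updates
    resume.wait()
    buf.count += 1                   # second variable updated later

def is_consistent(buf):
    return buf.count == len(buf.items)

def demo():
    buf = Buffer()
    mid_update, resume = threading.Event(), threading.Event()
    writer = threading.Thread(target=append_unsafe, args=(buf, "x", mid_update, resume))
    writer.start()
    mid_update.wait()
    seen_mid = is_consistent(buf)    # another thread looking now sees a broken invariant
    resume.set()
    writer.join()
    return seen_mid, is_consistent(buf)
```

`demo()` reports an inconsistent view in the middle and a consistent one at the end. Each variable taken alone has only one writer, so a check that reasons about one variable at a time has nothing to report, which is the gap the slide points at.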

Findings: Bug Manifestations
• Most (30/31) deadlock bugs involved ≤ 2 resources
  ◦ Pairwise testing of the order in which resources are obtained and released should help reveal deadlocks
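Since most deadlocks involve at most two resources, one simple check is to compare, for every pair of threads, the order in which they acquire each pair of locks. The sketch below works on recorded traces; the trace format, the function names, and the lock names `A` and `B` are assumptions made for illustration, not something the paper prescribes.

```python
from itertools import combinations

def ordered_pairs(acquisitions):
    """Set of (held, wanted) pairs: `wanted` was acquired while `held` was held."""
    pairs, held = set(), []
    for op, lock in acquisitions:
        if op == "acquire":
            pairs.update((h, lock) for h in held)
            held.append(lock)
        else:                            # "release"
            held.remove(lock)
    return pairs

def conflicting_orders(traces):
    """Find lock pairs that two different threads acquire in opposite orders."""
    per_thread = {name: ordered_pairs(t) for name, t in traces.items()}
    conflicts = []
    for (n1, p1), (n2, p2) in combinations(per_thread.items(), 2):
        for a, b in sorted(p1):
            if (b, a) in p2:
                conflicts.append((n1, n2, a, b))
    return conflicts

traces = {
    "T1": [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
    "T2": [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
}
```

Here `conflicting_orders(traces)` reports that `T1` and `T2` take `A` and `B` in opposite orders, a potential deadlock even if the recorded run happened to finish.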

Findings: Bug Manifestations
• Most (92%) bugs manifest if certain partial orderings are enforced among ≤ 4 memory accesses
  ◦ Testing small groups of accesses takes polynomial time and exposes most bugs
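The polynomial-time claim can be checked with a short back-of-the-envelope calculation. The sketch below is an illustration rather than the paper's own analysis: it counts groups of at most 4 accesses among n, and compares that with the number of full interleavings of two threads. The function names and the sample sizes are hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb  # available in Python 3.8+

def small_group_count(n, max_size=4):
    """Number of ways to pick a group of 1..max_size accesses out of n."""
    return sum(comb(n, k) for k in range(1, max_size + 1))

def two_thread_interleavings(m):
    """Number of ways to interleave two threads that each perform m accesses."""
    return comb(2 * m, m)
```

For 20 accesses, `small_group_count(20)` is 6195, and each group of at most 4 accesses has at most 4! = 24 orderings to try. The number of groups grows like n^4, whereas `two_thread_interleavings(m)` grows exponentially in m: it is already 184756 for two threads with 10 accesses each.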

Bug Fixes

Findings: Bug Fixes
• Adding/changing locks accounts for only a minority (20/74) of non-deadlock concurrency bug fixes
  ◦ Locks aren’t enough to fix all concurrency bugs
  ◦ Locks don’t promise ordering, just atomicity
  ◦ Adding locks can hurt performance or create new deadlock bugs

Findings: Bug Fixes
• Most common fix (19/31) for deadlock bugs lets 1 thread give up acquiring a resource, like a lock
  ◦ This may get rid of deadlock bugs, but create other non-deadlock bugs
  ◦ Code may no longer be correct
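The "give up acquiring" style of fix can be sketched with a timed lock acquisition. This is a hypothetical helper, not one of the patches examined in the study: `with_both`, its `timeout` default, and the `action` callback are assumptions for illustration.

```python
import threading

def with_both(first, second, action, timeout=0.05):
    """Run action() holding both locks; give up instead of waiting forever for `second`."""
    with first:
        if not second.acquire(timeout=timeout):
            return False             # back off: the protected work is skipped
        try:
            action()
            return True
        finally:
            second.release()
```

If `second` is unavailable, the function returns `False` and releases `first`, so no thread is left waiting in a cycle. The caller then has to decide what to do about the skipped work, which is where the correctness risk mentioned on the slide comes from.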

Bug Fixes: Buggy Patches
• 17/57 Mozilla bugs have ≥ 1 buggy patch
• On average, 0.4 buggy patches are released for every final correct patch
• Of 23 distinct buggy patches for the 17 bugs:
  ◦ 6 decrease probability of occurrence but do not eliminate original bug
  ◦ 5 create new concurrency bugs
  ◦ 12 create new non-concurrency bugs

Findings: Bug Fixes
• In many (41/105) cases, TM can help avoid concurrency bugs

Findings: Bug Fixes
• Also in many cases (44/105), TM might be able to help with concurrency bugs
  ◦ Would need to allow for long regions, rollback of I/O, and the strange “nature” of the code

Findings: Bug Fixes
• In 20/105 cases, TM provides little help
  ◦ TM cannot help with many order-violation bugs
• While TM could be useful in preventing concurrency bugs, it will not fix all of them

Conclusion
• First real-world concurrent bug study
• Multiple findings on
  ◦ Type of concurrency bugs
  ◦ Conditions for manifestation
  ◦ Techniques for fixing concurrent bugs
• Several heuristics proposed for:
  ◦ Bug detection
  ◦ Testing
  ◦ Language design (i.e., TM)
• Future work can focus on detecting common types of errors
  ◦ Multi-variable bugs
  ◦ Order-violation bugs
  ◦ Multiple-access bugs