Towards Tractability of Dataflow Analysis for Concurrent Programs Vineet Kahlon NEC Labs, Princeton,...
Date posted: 20-Dec-2015
Towards Tractability of Dataflow Analysis
for Concurrent Programs
Vineet Kahlon, NEC Labs, Princeton, USA
Sequential Dataflow Analysis
• Program Design
• Debugging
• Optimization
• Maintenance
• Documentation
• …
Concurrent Dataflow Analysis
• The same applications (program design, debugging, optimization, maintenance, documentation, …) carry over to concurrent programs
• Hardly anything of interest is decidable
Pointer Analysis
int main(){              foo(int **t){
  int *p, *v;              if(…)
  rc = fork(foo,&v);         *t = d;
  v = c;                   else{
  p = v;                     *t = e;
  rc = join();               *t = f;}
}                        }
Pointer Analysis
int main(){              foo(int **t){
  int *p, *v;              if(…){
  rc = fork(foo,&v);         send(sig1);
  v = c;                     *t = d;
  wait(sig1);              } else{
  p = v;                     send(sig2);
  rc = join();               *t = e;
                           }
                           wait(sig1);
                           *t=f;
}                        }
Analysis of Concurrent Programs is Inherently Global
foo(…){
…
…
}
Pairwise Reachability
• Key Problem: Given a program location c in a thread, determine which locations in other threads can contribute to dataflow facts at c
• Technical formulation: Pairwise Reachability
EF (c ∧ d)
Concurrent Dataflow Analysis Framework
• Step 1: Abstractly interpret each thread based on the analysis
• Step 2: For each location c of a given thread T determine which locations in the concurrent program can contribute to dataflow facts at c
• Step 3: Compute dataflow facts using standard fixpoint computations
Inter-procedural Dataflow Analysis for Sequential Programs
• Close relationship between Data Flow Analysis for sequential programs and the model checking problem for Pushdown Systems (PDS)
– Use abstract interpretation to get a finite representation of the control part of the program
– Recursion is modeled as a stack
– Exploit the fact that the model checking problem for Pushdown Systems is efficiently decidable for very expressive linear and branching-time logics
[Bouajjani et al., Walukiewicz, Reps, Schwoon, Jha]
From Programs to PDSs
main(){
l1: sh = 0;
l2: if(…){
l3: sh = 1;
l4: foo(); } else{
l5: sh=z;
l6: foo(); }
l7: … }
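[Figure: control-flow graph of main() over locations l1 to l7 with return sites l'4 and l'6, plus locations m0 and m1 of foo; the call and return edges carry the stack symbols foo4 and foo6.]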
[Figure: the same graph with each location paired with an abstract value of sh (0, 1 or ⊤): (l1,0), (l2,0), (l3,1), (l4,1), (l'4,1), (l5,0), (l6,⊤), (l'6,⊤), (l7,⊤), and (m0,1), (m0,⊤) for foo.]
PDS
A PDS is a tuple (Q, Γ, Δ, q0), where
• Q is a finite set of control states
• Γ is a set of stack symbols
• Δ ⊆ (Q × Γ) × (Q × Γ*) is the transition relation; notation: c1 → c2
• q0 is the initial control state
Configurations:
⟨c, u⟩, where c is the control state and u is the stack content
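The definition above can be made concrete with a small sketch. It assumes a dictionary encoding of the transition relation and a depth-bounded breadth-first search; the names `successors`, `reachable` and `EXAMPLE_RULES` are illustrative and not from the talk, and the bounded search is only a stand-in for the saturation-based algorithms that PDS model checkers actually use.

```python
from collections import deque

# Rules map (control state, top stack symbol) to a list of
# (new control state, word that replaces the top symbol).
# A configuration is (control state, stack), with the stack top at index 0.

def successors(rules, config):
    """Yield every configuration reachable from config in one step."""
    control, stack = config
    if not stack:
        return  # no rule applies to an empty stack
    top, rest = stack[0], stack[1:]
    for new_control, word in rules.get((control, top), []):
        yield (new_control, word + rest)

def reachable(rules, start, max_depth):
    """Breadth-first search over configurations with stack height <= max_depth."""
    seen, queue = {start}, deque([start])
    while queue:
        config = queue.popleft()
        for nxt in successors(rules, config):
            if len(nxt[1]) <= max_depth and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical encoding of a call to foo at l4: the call pushes the return
# site l4', foo runs from m0 to m1, and the return pops back to l4'.
EXAMPLE_RULES = {
    ("q", "l4"): [("q", ("m0", "l4'"))],   # call: push return site
    ("q", "m0"): [("q", ("m1",))],         # body of foo
    ("q", "m1"): [("q", ())],              # return: pop
    ("q", "l4'"): [("q", ("l7",))],        # continue in main
}
```

Calling `reachable(EXAMPLE_RULES, ("q", ("l4",)), 4)` explores the call, the body of foo, the return, and the continuation at l7.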
Inter-procedural Dataflow Analysis for Concurrent Programs
• Dataflow analysis for concurrent programs reduces to the model checking problem for interacting PDS systems
• Fundamental Problem: To study the decidability of the model checking problem for PDSs interacting via the standard synchronization primitives
– Locks
• Non-nested Locks
• Nested Locks
– Pairwise and Asynchronous Rendezvous
– Broadcasts
– Boolean Guards
Undecidability Barrier
Undecidable for PDSs interacting via
– Pairwise Rendezvous [Ramalingam]
– Locks [Kahlon et al.]
Key Underlying Obstacle: Checking non-emptiness of the intersection of two context-free languages is undecidable
Consequence: If the PDSs are coupled tightly enough, either by making
– the synchronization primitive expressive enough, or
– the temporal property being checked strong enough,
we get undecidability of the model checking problem
Indexed Linear Temporal Logic
• LTL
– Atomic Propositions: If the atomic propositions are interpreted over the local states of k threads, then the formula is k-indexed
– Temporal Operators:
• F p : eventually p
• p U q : p until q
• X p : next time p
• G p : always p
– Boolean Connectives: ∧, ∨ and ¬
• Example: Data Race - F(c ∧ d)
• L(Op1,…,Opk): Only operators Op1,…,Opk are allowed
– Example: L(G,U) allows G, U, ∧, ∨ and atomic propositions
LTL Landscape
[Figure: the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), with regions marked Nested Locks, Non-Nested Locks, Broadcasts and Pairwise Rendezvous.]
Decidability ⇔ Loose Coupling
• One thread cannot force another thread to execute
• Model Checking is decidable for loosely coupled PDSs
• When model checking is decidable we can reduce the analysis for the program to its constituent threads
Frequently Used Primitives
• Locks
• Rendezvous
  Java: wait/notify
  Pthreads: pthread_cond_wait()
            pthread_cond_signal()
• Broadcasts
  Java: wait/notifyAll
Locks
• Locks:
– Nested: things of interest are decidable
– Non-Nested: nothing of interest is decidable
• Rendezvous: nothing of interest is decidable
– Solutions:
• Over-approximation via regular sets
• Over-approximation via parameterization
• …
Locks
[Figure: the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), with regions marked Nested and Non-Nested.]
Nested Locks
A concurrent multi-threaded program uses locks in a nested fashion iff along every computation each thread can only release the lock that it acquired last and that has not yet been released
f() {              g(){               h(){
  acquire(b);        release(b);        acquire(c);
  g();               acquire(c);        release(b);
  release(c);      }                  }
}
• Programming guidelines typically recommend that programmers use locks in a nested fashion
• Locks are guaranteed to be nested in Java ≤ 1.4 and C#
Nested Locks
f() {              g(){               h(){
  acquire(b);        release(b);        acquire(c);
  h();               acquire(c);        release(b);
  release(c);      }                  }
}
With h() called in place of g(), b is released while the later-acquired lock c is still held, so the locks are no longer nested.
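The nesting condition can be checked on a single thread's lock trace with a stack. The sketch below is illustrative only: the event encoding and the name `is_nested` are assumptions, and it checks one given trace rather than every computation of a program.

```python
def is_nested(trace):
    """Return True iff every release in the trace frees the most recently
    acquired lock that is still held.

    trace: list of ("acquire" | "release", lock_name) events of one thread.
    """
    held = []  # stack of currently held locks, most recent last
    for op, lock in trace:
        if op == "acquire":
            held.append(lock)
        elif op == "release":
            if not held or held[-1] != lock:
                return False  # releases a lock that was not acquired last
            held.pop()
        else:
            raise ValueError(f"unknown operation: {op}")
    return True

# f() calling g(): acquire(b); release(b); acquire(c); release(c)
TRACE_F_G = [("acquire", "b"), ("release", "b"), ("acquire", "c"), ("release", "c")]
# f() calling h(): acquire(b); acquire(c); release(b); release(c)
TRACE_F_H = [("acquire", "b"), ("acquire", "c"), ("release", "b"), ("release", "c")]
```

On these two traces, `is_nested` accepts the g() variant and rejects the h() variant.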
Coupling via Locks
[Figure: lock usage by two threads over locks l1 to l4, with control states c1 and c2.]
Nested locks can only enforce mutual exclusion but cannot constrain the order in which transitions can be executed
Locks can, in general, enforce synchronization through chaining
Nested Locks
[Figure: the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), with the region marked Nested.]
Nested Locks
• Pairwise reachability is (efficiently) decidable
• Reasoning about a concurrent program composed of threads interacting via nested locks can be decomposed into reasoning about its constituent threads
Acquisition History: Motivation
Thread1(){ Thread2(){
c1: acquire(a); g1: acquire(c);
c2: acquire(c); g2: acquire(a);
c3: release(c); g3: release(a);
c4: Error1; g4: Error2;
} }
Observation: c4 and g4 are not simultaneously reachable even though Lock-Set(c4) ∩ Lock-Set(g4) = ∅
Bottom line: Tracking Lock-Sets is not enough
Acquisition History: Cyclic Dependencies
Thread1(){ Thread2(){
c1: acquire(a); g1: acquire(c);
c2: acquire(c); g2: acquire(a);
c3: release(c); g3: release(a);
c4: Error1; g4: Error2;
} }
• acquire(a) must be executed by Thread1 before acquire(c) is executed by Thread2
• acquire(c) must be executed by Thread2 before acquire(a) is executed by Thread1.
Acquisition History: Definition
Thread1(){ Thread2(){
c1: acquire(a); g1: acquire(c);
c2: acquire(c); g2: acquire(a);
c3: release(c); g3: release(a);
c4: Error1; g4: Error2;
} }
The acquisition history of a lock lk at a control location of a thread T is the set of locks that have been acquired (and possibly released) by T since the last acquisition of lk by T
• Acq-Hist(c4,a) = {c}
• Acq-Hist(g4,c) = {a}
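As a rough illustration of the definition, the lock-set and the acquisition histories at the end of a local trace can be computed in one pass. The function name and the event encoding are assumptions made for this sketch; histories are kept only for locks that are currently held.

```python
def lock_set_and_acq_hist(trace):
    """Return (held, acq_hist) at the end of one thread's local trace.

    trace: list of ("acquire" | "release", lock_name) events.
    acq_hist[lk] is the set of locks acquired since lk was last acquired,
    recorded for every lock lk that is still held.
    """
    held = set()
    acq_hist = {}
    for op, lock in trace:
        if op == "acquire":
            # the new lock enters the history of every lock still held
            for h in held:
                acq_hist[h].add(lock)
            held.add(lock)
            acq_hist[lock] = set()  # history restarts at this acquisition
        elif op == "release":
            held.discard(lock)
            acq_hist.pop(lock, None)  # a released lock no longer has a history
        else:
            raise ValueError(f"unknown operation: {op}")
    return held, acq_hist

# Thread1 up to c4 and Thread2 up to g4, from the example above
THREAD1_TO_C4 = [("acquire", "a"), ("acquire", "c"), ("release", "c")]
THREAD2_TO_G4 = [("acquire", "c"), ("acquire", "a"), ("release", "a")]
```

For the two example traces this reproduces Acq-Hist(c4,a) = {c} and Acq-Hist(g4,c) = {a}.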
Acquisition History: Consistency
Thread1(){ Thread2(){
c1: acquire(a); g1: acquire(c);
c2: acquire(c); g2: acquire(a);
c3: release(c); g3: release(a);
c4: Error1; g4: Error2;
} }
Acq-Hist(c1, l1) is consistent with Acq-Hist(c2, l2) iff the following does not hold: l1 ∈ Acq-Hist(c2, l2) and l2 ∈ Acq-Hist(c1, l1)
Decomposition Result
Control states c1 and c2 of Thread1 and Thread2, respectively, are simultaneously reachable iff
• Lock-Set(c1) ∩ Lock-Set(c2) = ∅;
• There do not exist locks l, m:
– l ∈ Acq-Hist(c1, m)
– m ∈ Acq-Hist(c2, l)
Corollary: By tracking acquisition histories we can reduce the model checking problem from a concurrent program to its individual threads.
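The two conditions can be written down directly. This is a minimal sketch with hypothetical names; it tests one given pair of lock-sets and acquisition histories, whereas the result itself quantifies over the local paths that lead to c1 and c2.

```python
def consistent(locks1, ah1, locks2, ah2):
    """Check the two decomposition conditions for one pair of local states.

    locks1, locks2: sets of locks held at c1 and c2.
    ah1, ah2: dicts mapping each held lock to its acquisition history (a set).
    """
    if locks1 & locks2:
        return False  # both threads would hold the same lock
    for m in locks1:
        for l in locks2:
            # cyclic dependency: l in Acq-Hist(c1, m) and m in Acq-Hist(c2, l)
            if l in ah1.get(m, set()) and m in ah2.get(l, set()):
                return False
    return True

# c4 and g4 from the motivating example
C4 = ({"a"}, {"a": {"c"}})
G4 = ({"c"}, {"c": {"a"}})
# c2 and g2: each thread has only performed its first acquire
C2 = ({"a"}, {"a": set()})
G2 = ({"c"}, {"c": set()})
```

`consistent(*C4, *G4)` is false even though the lock-sets are disjoint, while `consistent(*C2, *G2)` is true.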
Decomposition Result
(c1, c2) is reachable from the initial state (in1, in2) iff there exist local paths of T1 and T2 along which the acquisition histories are consistent.
⇔ There exist consistent acquisition histories AH1 and AH2 such that the augmented local states (c1, AH1) and (c2, AH2) are reachable individually in T1 and T2, resp.
⇔ For each i, in_i ∈ pre*({(c_i, AH_i)})
Bottom line: pre*-closure for a multi-threaded program interacting via nested locks can be reduced to its individual constituent threads.
A decision procedure for EF(c1 ∧ c2)
1. Enumerate the set of all pairs p_i of augmented local states (c1, AH1_i) and (c2, AH2_i), where AH1_i and AH2_i are consistent
2. For a pair p_i, compute for each individual thread T_j the sets pre*(c1, AH1_i) and pre*(c2, AH2_i)
3. EF(c1 ∧ c2) holds iff for some i,
   1. in1 ∈ pre*(c1, AH1_i), and
   2. in2 ∈ pre*(c2, AH2_i)
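The procedure can be sketched for a simplified setting. The code below assumes finite, non-recursive threads given as control-flow graphs, non-re-entrant nested locks, and threads that start with no locks held; it explores each thread forwards instead of computing pre* on a pushdown system. All names (`local_states`, `pairwise_reachable`, `T1`, `T2`) are illustrative.

```python
def local_states(cfg, start):
    """Enumerate augmented local states (location, held locks, history) of one thread.

    cfg: dict mapping a location to a list of (op, lock, next_location).
    The history is a set of pairs (h, x): x was acquired while h was held,
    and h is still held.
    """
    init = (start, frozenset(), frozenset())
    seen, work = {init}, [init]
    while work:
        loc, held, hist = work.pop()
        for op, lock, nxt in cfg.get(loc, []):
            if op == "acquire":
                if lock in held:
                    continue  # assumption: locks are not re-entrant
                new_hist = hist | {(h, lock) for h in held}
                new_held = held | {lock}
            elif op == "release":
                if lock not in held:
                    continue  # ill-formed release, ignored in this sketch
                new_held = held - {lock}
                new_hist = frozenset(p for p in hist if p[0] != lock)
            else:
                raise ValueError(f"unknown operation: {op}")
            state = (nxt, frozenset(new_held), frozenset(new_hist))
            if state not in seen:
                seen.add(state)
                work.append(state)
    return seen

def pairwise_reachable(cfg1, start1, c1, cfg2, start2, c2):
    """Decide EF(c1 and c2) by pairing consistent augmented local states."""
    states1 = [s for s in local_states(cfg1, start1) if s[0] == c1]
    states2 = [s for s in local_states(cfg2, start2) if s[0] == c2]
    for _, held1, hist1 in states1:
        for _, held2, hist2 in states2:
            if held1 & held2:
                continue  # lock-sets overlap
            if any((x, h) in hist2 for h, x in hist1):
                continue  # acquisition histories are inconsistent
            return True
    return False

# The two threads of the motivating example; a label names the point
# just before the statement shown next to it on the slide.
T1 = {"c1": [("acquire", "a", "c2")], "c2": [("acquire", "c", "c3")],
      "c3": [("release", "c", "c4")]}
T2 = {"g1": [("acquire", "c", "g2")], "g2": [("acquire", "a", "g3")],
      "g3": [("release", "a", "g4")]}
```

On this example the pair (c4, g4) is reported unreachable, while (c2, g2) and (c4, g2) are reported reachable.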
Model Checking LTL Properties
Main Idea: Reduce the Model Checking Problem to multiple instances of reachability
Model Checking for Finite State Systems
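System: S
Temporal Property: f
[Figure: the product S × B_f of the system with the automaton for f; an accepting run consists of a stem followed by a cycle.]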
Model Checking a Single PDS for LTL properties
Decide whether the product BP of the given PDS P and the Büchi automaton for the given property f has an accepting lollipop, i.e., there exists a global configuration ⟨c, au⟩ such that
1. There is a path from the initial state ⟨in, ⊥⟩ to ⟨c, au⟩ [Stem]
2. For some v, there is a path from ⟨c, a⟩ to ⟨c, av⟩ containing an accepting state g of BP [Cycle]
Notation: ⟨c, au⟩
• c – control state
• a – top stack symbol
• u – stack content
Pumping diagram
[Figure sequence: a run from ⟨in, ⊥⟩ through the accepting state g to ⟨c, au⟩; repeating the cycle from ⟨c, a⟩ to ⟨c, av⟩ on top of u yields ⟨c, avu⟩, ⟨c, av²u⟩, …, ⟨c, avⁱu⟩.]
Dual Pumping
[Figure sequence: the runs ⟨in1, ⊥⟩, ⟨c1, a1u1⟩, ⟨c1, a1v1u1⟩ of the first thread and ⟨in2, ⊥⟩, ⟨c2, a2u2⟩, ⟨c2, a2v2u2⟩ of the second thread, followed by builds labeled f, lf1 and lf2 with control states in, c1, d1, c2 and d2.]
Reduction to Reachability
• Dual Pumping reduces model checking for F (c1 ∧ c2) to reachability in Dual-PDS systems
• Reachability is decidable for PDSs interacting via nested locks but undecidable for PDSs interacting via non-nested ones
• Reachability can be decided in a compositional manner
Rendezvous
Over-approximation via Regular Languages
T1(){                    T2(){
    …                        …
a1: if(..){              b1: wait(obj);
a2:   send(obj);         b2: …
a3:   counter++;         b3: counter++;
a4: } else{              }
a5:   counter = 0; }
}
Over-approximation via Regular Languages
• Compute the language of sends/waits at each control location
• c1 and c2 are simultaneously reachable only if
L(c1) ∩ L'(c2) ≠ ∅
L'(c2) is the language obtained from L(c2) by replacing each send with a wait, and vice versa
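The emptiness test above can be sketched with deterministic finite automata and a product construction. The automaton encoding, the symbol convention (`"obj!"` for a send, `"obj?"` for a wait) and the names `dualize` and `intersection_nonempty` are assumptions made for this illustration.

```python
from collections import deque

# A DFA is (start state, set of accepting states, {(state, symbol): next state}).

def dualize(dfa):
    """Swap sends and waits on every transition, giving an automaton for L'."""
    start, accepting, delta = dfa
    flip = {"!": "?", "?": "!"}
    new_delta = {(q, sym[:-1] + flip[sym[-1]]): r for (q, sym), r in delta.items()}
    return (start, accepting, new_delta)

def intersection_nonempty(dfa1, dfa2):
    """Search the product automaton for a jointly accepting state pair."""
    s1, acc1, d1 = dfa1
    s2, acc2, d2 = dfa2
    seen, queue = {(s1, s2)}, deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if p in acc1 and q in acc2:
            return True
        for (src, sym), dst in d1.items():
            if src == p and (q, sym) in d2:
                nxt = (dst, d2[(q, sym)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

# Hypothetical languages for the example threads:
# after a2, T1 has performed exactly one send(obj)
L_AFTER_SEND = (0, {1}, {(0, "obj!"): 1})
# after b1, T2 has performed exactly one wait(obj)
L_AFTER_WAIT = (0, {1}, {(0, "obj?"): 1})
# before b1, T2 has performed no synchronization yet
L_BEFORE_WAIT = (0, {0}, {})
```

With these automata, the location after the send can be paired with the location after the wait but not with the location before it.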
Over-approximation via Regular Languages
main(){                              foo(..){
  for (int i = 0; i < 100; i++)        if(){
    send(a);                             send(b);
  …                                    }else{
  foo();                                 send(c);
  …                                      foo();
}                                        wait(d);
                                       }
                                     }
L(ex_main) = (a!)^100 (c!)^n (b!) (d?)^n
L_o(ex_main) = (a!)* (c!)* (b!) (d?)*
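A quick way to see that the regular language over-approximates the exact one is to test sample words against it. The sketch below uses Python's `re` module; `exact_word` and `OVER_APPROX` are names introduced only for this illustration.

```python
import re

def exact_word(n):
    """A word of the exact language (a!)^100 (c!)^n (b!) (d?)^n."""
    return "a!" * 100 + "c!" * n + "b!" + "d?" * n

# The regular over-approximation (a!)*(c!)*(b!)(d?)*; '?' must be escaped.
OVER_APPROX = re.compile(r"(a!)*(c!)*(b!)(d\?)*")

def in_over_approx(word):
    """True iff the whole word belongs to the over-approximation."""
    return OVER_APPROX.fullmatch(word) is not None
```

Every exact word is accepted, and so are words such as `"a!b!d?"` that the program cannot produce, which is the precision lost by forgetting the counts.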
Parameterized Systems
• Systems are composed of many replicated copies of a few basic components
U1^n1 || … || Uk^nk
• Examples
– Multi-core processors
– Protocols
– Drivers
• PMCP: ∃ n1,…,nk : U1^n1 || … || Uk^nk ⊨ f
Example: Huge Tlb
static struct page *alloc_fresh_huge_page(struct page *page)
{
    static int nid = 0;
    page = alloc_pages_node(nid, GFP_HIGHUSER|__GFP_COMP|__GFP_NOWARN,
                            HUGETLB_PAGE_ORDER);
    nid = (nid + 1) % num_online_nodes();
    if (page) {
        // Data Race here !!!
        //++ spin_lock(&hugetlb_lock);
        nr_huge_pages++;
        nr_huge_pages_node[page_to_nid(page)]++;
        //++ spin_unlock(&hugetlb_lock);
    }
    return page;
}
Why Parameterization?
• Parameterized Applications:
– Example: Device drivers are supposed to be data race free irrespective of how many thread instances running the driver exist
• (Partial) Completeness: Many data races occur iff they occur in a parameterized setting
• Soundness: Data race freedom in a parameterized setting implies data race freedom for any concrete finite instance
Parameterization as Abstraction
T1 || T2 ⊨ EF (c1 ∧ c2)
versus
∃ n, m : T1^n || T2^m ⊨ EF (c1 ∧ c2)
Why Parameterization for Dataflow Analysis?
• Surprise: parameterized reachability is more tractable
• For threads communicating via locks, parameterization does not lead to an over-approximation of pairwise reachable states
• Can be used as a first step to cheaply filter out many interleavings
• Existing tools can be easily adapted to decide parameterized reachability
Tractability via Parameterization
[Figure: the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), with the region marked Pairwise Rendezvous.]
Parameterization vis-à-vis Ramalingam’s Result
• Avoid reasoning about instances of progressively increasing size
• Given a pair of parameterized pairwise reachable states c1 and c2, computing the smallest instance for which c1 and c2 are pairwise reachable is not possible in general
Unbounded Multiplicity Result
The multiplicity of any reachable state can be made to exceed any given m:
U^n ⊨ EF c ⇒ U^(n·m) ⊨ EF (≥m c)
Efficient Parameterized Reachability
[Figure sequence: a template with control states c0, c1, c2 and c3 and rendezvous transitions labeled a!, a?, b!, b?, c!, c? and d!; in successive builds the matched pairs a!/a? and then b!/b? are removed, leaving c?, d! and c!.]
Complexity of Parameterized Reachability
• At most s iterations
• Each iteration costs O(s³) time
• Total time: O(s⁴)
• Can use existing PDS model checking tools
s: number of control states
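The iteration can be sketched for a simplified setting: a single finite-state template (no stack) whose copies interact by pairwise rendezvous. By the unbounded multiplicity result, every state found reachable so far can be assumed populated, so a rendezvous fires as soon as both of its source states are in the current set. The function name and the example template are hypothetical, and the O(s³) per-iteration cost quoted above refers to the pushdown case, not to this sketch.

```python
def parameterized_reachable(transitions, init):
    """Fixpoint of control states reachable in some instance U^n.

    transitions: list of (source, label, target); label is None for an
    internal move, "x!" for a send on x and "x?" for the matching receive.
    """
    reach = {init}
    changed = True
    while changed:  # each productive pass adds at least one new state
        changed = False
        for src, label, dst in transitions:
            if src not in reach or dst in reach:
                continue
            if label is None:
                enabled = True
            else:
                # the partner action must be offered from a reachable state
                partner = label[:-1] + ("?" if label.endswith("!") else "!")
                enabled = any(s in reach and l == partner
                              for s, l, _ in transitions)
            if enabled:
                reach.add(dst)
                changed = True
    return reach

# Hypothetical template: d! has no matching d?, so c4 is never reached.
TEMPLATE = [
    ("c0", "a!", "c1"),
    ("c0", "a?", "c2"),
    ("c2", "b?", "c3"),
    ("c1", "b!", "c1"),
    ("c3", "d!", "c4"),
]
```

For this template, `parameterized_reachable(TEMPLATE, "c0")` yields c0, c1, c2 and c3.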
A Concise History
• Undecidable for PDSs interacting via rendezvous [Ramalingam]
• Undecidable for PDSs interacting via Locks [Kahlon et al.]
• Decidable for
– PA processes [Esparza et al., Lugiez et al.]
– Constrained Dynamic Pushdown Networks [Bouajjani et al.]
– Asynchronous Dynamic Pushdown Networks [Bouajjani et al.]
• Decidable for Asynchronous Programs [Sen et al., Jhala et al., Olm et al.]
A Concise History
• Over-approximation techniques for PDSs interacting via rendezvous [Chaki et al.]
• Dataflow Analysis from Partial Order Traces [Farzan and Madhusudan]
• Delineation of the decidability boundary for the standard synchronization primitives [Kahlon et al.]
• Parameterization as a form of abstraction [Kahlon]
Concluding Remarks
• Decidability ⇔ Threads Loosely Coupled
• For decidable cases one can reduce dataflow analysis for the given concurrent program to its individual threads
• Undesirables: Model Checking of L(F) is undecidable for PDSs interacting via primitives other than nested locks
• Exploit program structure to ensure tractability
– Parameterization
– Over-Approximation via Regular Languages