Hawkeye: Effective Discovery of Dataflow Impediments to Parallelization
Omer Tripp, John Field, Greta Yorsh, Mooly Sagiv
Dataflow Impediments to Parallelization
public void set(Object o) {
    this.f = calc_f(o);
}

public void process() {
    Object o = this.f;
    if (o == null) {
        doA();
    } else {
        doB();
    }
}

public void setAndProcess(Object o) {
    set(o);
    process();
}
set(o) || process()?
RAW dependency: process() reads this.f, which set(o) writes.
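The RAW dependency above can be made concrete by comparing the read and write sets of the two calls. What follows is a minimal sketch, assuming each operation's footprint is already known as a set of location names; the `DependenceKinds` class, its `Footprint` type, and the `classify` helper are hypothetical names introduced for illustration and are not part of Hawkeye.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

final class DependenceKinds {
    enum Kind { RAW, WAR, WAW }

    /** Read and write sets of one operation, as location names (illustrative). */
    static final class Footprint {
        final Set<String> reads;
        final Set<String> writes;

        Footprint(Set<String> reads, Set<String> writes) {
            this.reads = reads;
            this.writes = writes;
        }
    }

    /** Dependencies of `second` on `first`, assuming `first` runs before `second`. */
    static EnumSet<Kind> classify(Footprint first, Footprint second) {
        EnumSet<Kind> kinds = EnumSet.noneOf(Kind.class);
        if (intersects(first.writes, second.reads)) kinds.add(Kind.RAW);
        if (intersects(first.reads, second.writes)) kinds.add(Kind.WAR);
        if (intersects(first.writes, second.writes)) kinds.add(Kind.WAW);
        return kinds;
    }

    private static boolean intersects(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        // set(o) writes this.f; process() reads this.f (other accesses are ignored here).
        Footprint set = new Footprint(Set.of(), Set.of("this.f"));
        Footprint process = new Footprint(Set.of("this.f"), Set.of());
        System.out.println(classify(set, process)); // prints [RAW]
    }
}
```

An empty result from `classify` would mean the two calls touch disjoint locations, which is the situation in which running them in parallel is safe with respect to data dependencies.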
Sometimes It’s Less Obvious
for (Vertex cutpoint : this.cutpoints) {
    UndirectedGraph subgraph = new SimpleGraph();
    subgraph.addVertex(cutpoint);
    this.cutpointGraphs.put(cutpoint, subgraph);
    this.addVertex(subgraph);
    Set blocks = this.vertex2blocks.get(cutpoint);
    for (UndirectedGraph block : blocks) {
        int oldHitCount = this.block2hits.get(block);
        this.block2hits.put(block, oldHitCount + 1);
        this.addEdge(subgraph, block);
    }
}
Simplified version of the JGraphT algorithm for building a block-cutpoint graph
This code admits a lot of available parallelism, but a few impediments must be addressed before it can be parallelized.
How can we pinpoint these dependencies precisely and concisely?
Field-based Dependence Analysis
Static dependence analysis is challenged by dynamic containers, aliasing, etc.
So let’s use dynamic dependence analysis instead…
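[Figure: concrete hash-map state, with a modcount field (789) and a table whose buckets [0] … [8] hold chains of entries linked by next; one entry has key K and value 1, another has key K’ and value 2.]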
m = new ConcurrentHashMap();
m.put(k,1);
m.put(k’,2);
Spurious dependencies, which inhibit m.put(k,1) || m.put(k’,2)!
m.put(k,2);
Semantic dependency, which gets “lost” in the noise!
Eureka: Let’s Use Abstraction
Abstract Locking
Galois
Leveraging ADT semantics in STM conflict detection
Using ADT semantics in DB concurrency control (Muth et al., 93)
Exploiting commutativity in DB transactions (Bernstein, 66)
But…
We need a predictive tool; our code is still sequential
We want the tool to pinpoint impediments to parallelization before applying parallelization transformations
The Hawkeye Analysis Tool
[Figure: a representation function maps the concrete Map state (modcount 789, table buckets [0] … [8], entries chained by next, holding key K with value 1 and key K’ with value 2) to the Map ADT state (Key → Value: K → 1, K’ → 2).]
• Dynamic analysis tool
• Uses abstraction while tracking (certain) dependencies
• User specifies representation function for data structures of choice; rest tracked concretely
• Allows concentrating on semantic dependencies while suppressing spurious dependencies
Specification Language
Map
foreach key k in m.keySet()
    adtState.add(m -> k);
foreach entry (k,v) in m.entrySet()
    adtState.add(k -> v);

Graph
foreach node n in g.nodes()
    adtState.add(g -> n);
foreach edge (n1,n2) in g.edges()
    adtState.add(n1 -> n2);
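The Map specification above can be read as an ordinary function from a concrete map to a set of abstract facts. The following is a minimal sketch in Java, assuming facts are encoded as plain strings purely for illustration; `MapRepresentation` and `adtState` are hypothetical names, not Hawkeye's specification language or API.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class MapRepresentation {
    /** Abstract ADT state of map `m` (called `name`): "name -> k" and "k -> v" facts. */
    static Set<String> adtState(String name, Map<?, ?> m) {
        Set<String> state = new LinkedHashSet<>();
        for (Object k : m.keySet()) {
            state.add(name + " -> " + k);
        }
        for (Map.Entry<?, ?> e : m.entrySet()) {
            state.add(e.getKey() + " -> " + e.getValue());
        }
        return state;
    }

    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        hash.put("k", 1);
        hash.put("k'", 2);
        Map<String, Integer> tree = new TreeMap<>(hash);
        // Different concrete layouts (hash table vs. tree), same abstract state.
        System.out.println(adtState("m", hash).equals(adtState("m", tree))); // prints true
    }
}
```

Because the abstract state ignores buckets, chains, and counters such as modcount, two operations that only disturb those internals leave it unchanged.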
Specification Language
DistanceFunction
foreach instance i1 in instances()
    foreach instance i2 in instances()
        adtState.add((i1,i2) -> distance(i1,i2));
…
Specification Language
• No need to model ADT operations
  – Hawkeye uses heuristics for (sound) approximation of the footprint of an ADT operation
  – User can refine approximation (though our experience shows that the default is mostly accurate)
• No need for a commutativity spec
The Hawkeye Algorithm
[Figure: the concrete state of map M (modcount 789, table buckets [0] … [8], entries chained by next) next to its logical state, with abstract locations (M,X), (M,K), (M,K,1), (M,K’), (M,K’,2) and, after the last put, (M,K,2).]
m.put(k,1);    (R: {}, W: {(M,K),(M,K,1)})
m.put(k’,2);   (R: {}, W: {(M,K’),(M,K’,2)})
m.put(k,2);    (R: {}, W: {(M,K),(M,K,1),(M,K,2)})
WAW: m.put(k,1) and m.put(k,2) both write (M,K) and (M,K,1).
Our assumptions:
• linearizability – for trace abstraction
• encapsulation – for state abstraction
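The abstract footprints above can be reproduced with a small tracker. This is a minimal sketch, assuming the footprint rule suggested by the slide's example (a put writes the key location, the old value location if there is one, and the new value location); the class `AbstractPutTrace` and its methods are hypothetical and do not reproduce Hawkeye's implementation or its actual heuristics.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AbstractPutTrace {
    private final Map<String, Integer> adt = new HashMap<>();      // logical map state
    private final List<Set<String>> writeSets = new ArrayList<>(); // one per put, in trace order

    /** Performs put(k, v) and records its abstract write set. */
    void put(String k, int v) {
        Set<String> w = new LinkedHashSet<>();
        w.add("(M," + k + ")");
        Integer old = adt.put(k, v);
        if (old != null) {
            w.add("(M," + k + "," + old + ")");
        }
        w.add("(M," + k + "," + v + ")");
        writeSets.add(w);
    }

    /** Abstract write set of the i-th put in the trace. */
    Set<String> writeSet(int i) {
        return writeSets.get(i);
    }

    /** True iff puts i and j write a common abstract location (a WAW dependency). */
    boolean waw(int i, int j) {
        Set<String> common = new LinkedHashSet<>(writeSets.get(i));
        common.retainAll(writeSets.get(j));
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        AbstractPutTrace t = new AbstractPutTrace();
        t.put("K", 1);
        t.put("K'", 2);
        t.put("K", 2);
        System.out.println(t.writeSet(2)); // prints [(M,K), (M,K,1), (M,K,2)]
        System.out.println(t.waw(0, 1));   // prints false
        System.out.println(t.waw(0, 2));   // prints true
    }
}
```

The two puts on distinct keys do not conflict at this level, even though a field-level trace of a hash table would report conflicts on shared internals.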
Challenges
• What is the meaning of dependencies under abstraction?
• How can we track both concrete and abstract dependencies simultaneously?
We’ve developed a uniform framework for tracking data dependencies…
Best Write Set
• The write set of a transition $\tau = \langle \sigma, p, \sigma' \rangle$ is the union of
  – the locations whose value was changed by $\tau$;
  – the locations allocated by $\tau$; and
  – the locations de-allocated by $\tau$.
Intuitively, the write set of a transition is its observable effect, i.e., the delta between the entry and exit states.
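The "delta between entry and exit states" reading can be computed directly when states are small. Here is a minimal sketch, assuming a state is modeled as a `Map<String, Integer>` from location names to values and that a missing key means the location is not allocated; `WriteSetDelta` and `writeSet` are hypothetical names used only for this illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

final class WriteSetDelta {
    /** Locations changed, allocated, or de-allocated between entry and exit states. */
    static Set<String> writeSet(Map<String, Integer> entry, Map<String, Integer> exit) {
        Set<String> locations = new HashSet<>(entry.keySet());
        locations.addAll(exit.keySet());
        Set<String> written = new TreeSet<>();
        for (String loc : locations) {
            boolean allocationChanged = entry.containsKey(loc) != exit.containsKey(loc);
            boolean valueChanged = !Objects.equals(entry.get(loc), exit.get(loc));
            if (allocationChanged || valueChanged) {
                written.add(loc);
            }
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(writeSet(Map.of("y", 3), Map.of("y", 4)));         // prints [y]
        System.out.println(writeSet(Map.of("y", 3), Map.of("y", 3)));         // prints []
        System.out.println(writeSet(Map.of("y", 3), Map.of("y", 3, "z", 0))); // prints [z]
    }
}
```

Note that a statement which stores the value a location already holds contributes nothing to the write set under this definition.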
Best Read Set (More Tricky)
• $M$ is a sufficient read set of transition $\tau = \langle \sigma, p, \sigma' \rangle$ iff for every transition $\tau_1 = \langle \sigma_1, p, \sigma_1' \rangle$ such that $\sigma$ and $\sigma_1$ agree on $M$, $\mathit{write}(\tau) \equiv \mathit{write}(\tau_1)$.
• The read set of transition $\tau$ is the union of all its minimal sufficient read sets.
Intuitively, the read set of a transition is the set of locations whose values determine the observable effect of the transition.
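On a tiny finite universe of states, the definition of the read set can be checked by brute force. The sketch below is only an illustration under stated assumptions: states range over a fixed set of locations with no allocation, statements are modeled as functions on states, and equivalence of write sets is taken to mean "same locations with the same exit values". The class `BestReadSet` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

final class BestReadSet {
    /** Locations changed by the statement, with their exit values. */
    static Map<String, Integer> writeEffect(Map<String, Integer> entry,
                                            UnaryOperator<Map<String, Integer>> stmt) {
        Map<String, Integer> exit = stmt.apply(new HashMap<>(entry));
        Map<String, Integer> effect = new TreeMap<>();
        for (String loc : entry.keySet()) {
            if (!entry.get(loc).equals(exit.get(loc))) {
                effect.put(loc, exit.get(loc));
            }
        }
        return effect;
    }

    /** M is sufficient iff every state agreeing with entry on M yields the same effect. */
    static boolean sufficient(Set<String> m, Map<String, Integer> entry,
                              UnaryOperator<Map<String, Integer>> stmt,
                              List<Map<String, Integer>> allStates) {
        Map<String, Integer> effect = writeEffect(entry, stmt);
        for (Map<String, Integer> other : allStates) {
            boolean agree = true;
            for (String loc : m) {
                agree &= entry.get(loc).equals(other.get(loc));
            }
            if (agree && !writeEffect(other, stmt).equals(effect)) {
                return false;
            }
        }
        return true;
    }

    /** Union of all minimal sufficient read sets, enumerating subsets of locations. */
    static Set<String> readSet(Map<String, Integer> entry,
                               UnaryOperator<Map<String, Integer>> stmt,
                               List<Map<String, Integer>> allStates) {
        List<String> locs = new ArrayList<>(entry.keySet());
        List<Set<String>> sufficientSets = new ArrayList<>();
        for (int mask = 0; mask < (1 << locs.size()); mask++) {
            Set<String> m = new TreeSet<>();
            for (int i = 0; i < locs.size(); i++) {
                if (((mask >> i) & 1) == 1) m.add(locs.get(i));
            }
            if (sufficient(m, entry, stmt, allStates)) sufficientSets.add(m);
        }
        Set<String> result = new TreeSet<>();
        for (Set<String> m : sufficientSets) {
            boolean minimal = true;
            for (Set<String> other : sufficientSets) {
                if (other.size() < m.size() && m.containsAll(other)) minimal = false;
            }
            if (minimal) result.addAll(m);
        }
        return result;
    }

    public static void main(String[] args) {
        // Finite universe: locations x and y, each holding 3 or 4.
        List<Map<String, Integer>> states = new ArrayList<>();
        for (int x = 3; x <= 4; x++) {
            for (int y = 3; y <= 4; y++) {
                states.add(Map.of("x", x, "y", y));
            }
        }
        Map<String, Integer> entry = Map.of("x", 3, "y", 3);
        System.out.println(readSet(entry, s -> { s.put("y", 4); return s; }, states)); // prints [y]
        System.out.println(readSet(entry, s -> { s.put("y", 3); return s; }, states)); // prints [y]
    }
}
```

The enumeration is exponential in the number of locations and requires the whole state space, which is consistent with the definition being a specification rather than something a tool can compute on real programs.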
Simple Example
([y=3], set(y,4), [y=4])
Read set: { y } (secures y=4 in exit state)
Write set: { y }
([y=3], set(y,3), [y=3])
Read set: { y } (secures empty write set)
Write set: { }
Approximating the “Best” Definitions
• The good news: The “best” definitions apply both in concrete and in abstract semantics
• The bad news: The definition of the “best” read set is not computable in general
An approximation $(r, w)$ of $(\mathit{read}, \mathit{write})$ is sound iff
• $\mathit{read} \subseteq r \cup w$
• $\mathit{write} \subseteq w$
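As a small illustration, the soundness condition can be checked mechanically once the sets are given explicitly. This sketch assumes the condition is read as "read is contained in the union of r and w, and write is contained in w", with locations modeled as strings; `SoundApproximation` and `isSound` are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

final class SoundApproximation {
    /** Checks that read is a subset of (r union w) and that write is a subset of w. */
    static boolean isSound(Set<String> read, Set<String> write,
                           Set<String> r, Set<String> w) {
        Set<String> rUnionW = new HashSet<>(r);
        rUnionW.addAll(w);
        return rUnionW.containsAll(read) && w.containsAll(write);
    }

    public static void main(String[] args) {
        // set(y,3) from [y=3]: best read set {y}, best write set {}.
        Set<String> read = Set.of("y");
        Set<String> write = Set.of();
        System.out.println(isSound(read, write, Set.of(), Set.of("y"))); // prints true
        System.out.println(isSound(read, write, Set.of(), Set.of()));    // prints false
    }
}
```

Under this reading, an approximation that simply reports y as written by set(y,3) is sound even though it never reports y as read.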
Usage Scenario
[Figure: the concrete hash-map state (modcount, table buckets [0] … [8], entries chained by next, key K with value 1 and key K’ with value 2).]
Hmmm… Too many dependencies!
Usage Scenario
[Figure: the Map ADT state (key K with value 1, key K’ with value 2).]
Now I understand what’s going on!
Name        Description                                      Trace Length
Boruvka     Solver for MST problem                           813,382
Cobertura   Java code coverage analysis                      2,629,457
JFileSync   Utility for synchronizing pairs of directories   1,733,552
JGraphT     Graph library                                    710,580
PMD         Java source code analyzer                        2,190,213
Weka        Machine-learning library                         17,945,255
WebLech     Web-site download and mirror tool                4,840,544
[Figure: bar chart comparing Hawkeye and Baseline on Boruvka, Cobertura, JFileSync, JGraphT, PMD, and Weka; y-axis from 0 to 300.]
Number of inter-iteration dependencies at the level of ADT operations with and without abstraction
Only built-in spec (Java collections)
[Figure: bar chart comparing Hawkeye and Baseline on Boruvka, Cobertura, JFileSync, JGraphT, PMD, and Weka; y-axis from 0 to 300.]
Number of inter-iteration dependencies at the level of ADT operations with and without abstraction
Including user spec (for user types)
Thank You!
Backup
Preliminaries
• A state $\sigma : L \to V$ maps memory locations to values.
• A transition $\tau$ is a triple $\langle \sigma, p, \sigma' \rangle$, where $p$ is a program statement and $\sigma, \sigma'$ are states, such that $\sigma' = [\![p]\!](\sigma)$.
• A program trace is a sequence of transitions.
• We assume an interleaving semantics of concurrency.
Approximate Read Set
Take 1: all the locations reachable from arguments
Take 2: all the locations reachable from arguments that were accessed during the statement’s execution
Take 3: all the locations reachable from arguments that were accessed during the statement’s execution, with user specification of the frame
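The first two takes can be sketched over a toy heap. This is a minimal illustration, assuming the heap is given as a map from each location to the locations it points to and that the accessed locations were logged separately during execution; `ApproxReadSet`, `reachable`, and `reachableAndAccessed` are hypothetical names, and Take 3 is not modeled because the slides do not show what a frame specification looks like.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ApproxReadSet {
    /** Take 1: every location reachable from the arguments in the heap graph. */
    static Set<String> reachable(Map<String, List<String>> heap, Collection<String> args) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>(args);
        while (!work.isEmpty()) {
            String loc = work.pop();
            if (seen.add(loc)) {
                work.addAll(heap.getOrDefault(loc, List.of()));
            }
        }
        return seen;
    }

    /** Take 2: reachable locations that the statement actually accessed. */
    static Set<String> reachableAndAccessed(Map<String, List<String>> heap,
                                            Collection<String> args,
                                            Set<String> accessed) {
        Set<String> result = reachable(heap, args);
        result.retainAll(accessed);
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = Map.of(
                "m", List.of("table"),
                "table", List.of("e1", "e2"),
                "e1", List.of("k1"),
                "e2", List.of("k2"));
        // "log" was accessed but is not reachable from the argument m.
        Set<String> accessed = Set.of("m", "table", "e1", "log");
        System.out.println(reachable(heap, List.of("m")));
        // prints [e1, e2, k1, k2, m, table]
        System.out.println(reachableAndAccessed(heap, List.of("m"), accessed));
        // prints [e1, m, table]
    }
}
```

Take 2 is never larger than Take 1, since it only filters the reachable set by what the execution touched.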