Efficient Dependency Tracking for Relevant Events in Shared Memory Systems

Anurag Agarwal (anurag@cs.utexas.edu)
Vijay K. Garg (garg@ece.utexas.edu)

PDS Lab
University of Texas at Austin


Outline

  Motivation
  Background
  Chain Clock
  Instances of Chain Clock
  Experimental Results
  Conclusion


Motivation

Dependency between events is required for global state information
  Applications such as monitoring and debugging

Vector clock [Fidge 88, Mattern 89]
  O(N) operations for a system with N processes
  Dynamic creation of processes


Relevant Events

Events “useful” for the application

Predicate detection
  “There are no messages in the channel”

[Figure: computation with processes p1, p2, p3, p4]


Vector Clocks [Fidge 88, Mattern 89]

Assigns an N-tuple (V) to every relevant event

e → f iff e.V < f.V (clock condition)

Process Pi: V = (0, …, 0)

On an event e:

I. If e is a receive of message m: V = max(V, m.V)

II. If e is a relevant event: V[i] = V[i] + 1

III. If e is a send of message m: m.V = V
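The three rules above can be sketched in a few lines of Python. This is only an illustration: the class name `VectorClockProcess` and the helper `happened_before` are hypothetical names, and the system is assumed to have a fixed number of processes N known in advance.

```python
class VectorClockProcess:
    """One process P_i in a system with a fixed number of processes n."""

    def __init__(self, i, n):
        self.i = i
        self.v = [0] * n          # V = (0, ..., 0)

    def receive(self, message_v):
        # Rule I: componentwise maximum with the timestamp on the message.
        self.v = [max(a, b) for a, b in zip(self.v, message_v)]

    def relevant_event(self):
        # Rule II: increment this process's own component and return e.V.
        self.v[self.i] += 1
        return tuple(self.v)

    def send(self):
        # Rule III: the message carries a copy of the current vector.
        return tuple(self.v)


def happened_before(ev, fv):
    """Clock condition: e.V < f.V means every component is <= and the vectors differ."""
    return all(a <= b for a, b in zip(ev, fv)) and ev != fv


p0, p1 = VectorClockProcess(0, 2), VectorClockProcess(1, 2)
e = p0.relevant_event()      # relevant event on p0
g = p1.relevant_event()      # relevant event on p1, concurrent with e
p1.receive(p0.send())        # p1 learns about e through a message
f = p1.relevant_event()      # relevant event on p1 that depends on e and g
```

Every timestamp here carries N components and every receive touches all of them, which is the O(N) cost the Motivation slide refers to.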


Key Idea

Any chain in the computation poset can function as a process

[Figure: poset of events a to h on processes p1, p2, p3, p4, covered by two chains: a b c d and e f g h]


Chain Clocks

A component in the timestamp corresponds to a chain

Change “Rule II” in the vector clock algorithm:

II. If e is a relevant event: V[e.c] = V[e.c] + 1, where e.c is the component (chain) assigned to e

Theorem: Chain clocks guarantee the “clock condition”

Goal: Online decomposition of poset into as few chains as possible


Outline

  Motivation
  Background
  Chain Clock
  Instances of Chain Clock
    DCC
    ACC
    VCC
  Experimental Results
  Conclusion


Dynamic Chain Clocks (DCC)

Shared vector Z maintains up-to-date values of all components

Each process starts with an empty vector

Rule II:
  e.c = j such that Z[j] = V[j]
    Give preference to the component last updated by Pi
  V[e.c] = V[e.c] + 1
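A minimal sketch of the DCC rule in Python follows. The names `SharedZ` and `DCCProcess` are hypothetical. Two details are assumptions not stated on the slide: the shared vector Z is protected by a lock, and a new component is opened when no component j satisfies Z[j] = V[j].

```python
import threading


class SharedZ:
    """Shared vector Z holding the latest value of every component."""

    def __init__(self):
        self.z = []
        self.lock = threading.Lock()   # assumption: Z is read and updated atomically


class DCCProcess:
    def __init__(self, shared):
        self.shared = shared
        self.v = []          # each process starts with an empty vector
        self.last = None     # component last updated by this process

    def _pad(self, n):
        # Components this process has not seen yet count as 0.
        self.v.extend([0] * (n - len(self.v)))

    def receive(self, message_v):
        # Rule I: componentwise maximum.
        self._pad(len(message_v))
        for j, x in enumerate(message_v):
            self.v[j] = max(self.v[j], x)

    def relevant_event(self):
        # Rule II: choose e.c = j with Z[j] == V[j], preferring self.last.
        z = self.shared.z
        with self.shared.lock:
            self._pad(len(z))
            candidates = [j for j in range(len(z)) if z[j] == self.v[j]]
            if self.last in candidates:
                c = self.last
            elif candidates:
                c = candidates[0]
            else:
                # Assumption: no component matches, so open a new one.
                z.append(0)
                self.v.append(0)
                c = len(z) - 1
            self.v[c] += 1
            z[c] += 1
            self.last = c
        return tuple(self.v)

    def send(self):
        # Rule III: the message carries a copy of the current vector.
        return tuple(self.v)


shared = SharedZ()
p1, p2 = DCCProcess(shared), DCCProcess(shared)
a = p1.relevant_event()     # first component is created for p1
b = p2.relevant_event()     # p2 is not up to date on component 0, so it opens component 1
p1.receive(p2.send())       # p1 merges p2's timestamp
c = p1.relevant_event()     # both components match; p1 prefers the one it updated last
```

The condition Z[j] = V[j] means the process already knows the latest event on chain j, so the new event can safely extend that chain.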


DCC: Example

I. If e is a receive of message m: V = max(V, m.V)

II. If e is a relevant event:
  e.c = i s.t. Z[i] = V[i]
  V[e.c] = V[e.c] + 1
  Z[e.c] = Z[e.c] + 1

III. If e is a send of message m: m.V = V

[Figure: DCC run on processes p1, p2, p3 with local vectors V1, V2, V3 and shared vector Z. Timestamps shown: (1), (0,1), (1,1) = max{(1),(0,1)}, (2,1), (3,1), (3,2)]


Problem

Number of processes can be much larger than minimal number of chains

[Figure: processes p1 to p4, each adding a new component. Timestamps shown: p1: (1); p2: (0,1), (1,2); p3: (0,1,1), (1,2,2); p4: (0,1,1,1), (1,2,2,2)]


Optimal Chain Decomposition

Antichain: set of pairwise concurrent elements
Width: maximum size of an antichain

Dilworth’s Theorem [1950]: A poset of width k can be partitioned into k chains and no fewer.
  Requires knowledge of the complete poset
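To illustrate why the optimal decomposition needs the complete poset, here is a sketch of one standard offline method, which is not taken from the slides: build a bipartite graph with an edge from u to v whenever u < v, find a maximum matching, and read the chains off the matched pairs. The function name `min_chain_partition` and the divisibility example are hypothetical.

```python
def min_chain_partition(elements, lt):
    """Partition a finite poset into the minimum number of chains.

    `lt(a, b)` must be a strict partial order (in particular, transitive).
    """
    n = len(elements)
    succ = [[j for j in range(n) if lt(elements[i], elements[j])] for i in range(n)]
    match_next = [None] * n   # match_next[i]: element that follows i in its chain
    match_prev = [None] * n   # match_prev[j]: element that precedes j in its chain

    def augment(i, seen):
        # Augmenting-path step of the classic bipartite matching algorithm.
        for j in succ[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_prev[j] is None or augment(match_prev[j], seen):
                match_prev[j] = i
                match_next[i] = j
                return True
        return False

    for i in range(n):
        augment(i, set())

    chains = []
    for i in range(n):
        if match_prev[i] is None:          # i has no predecessor, so it starts a chain
            chain, cur = [], i
            while cur is not None:
                chain.append(elements[cur])
                cur = match_next[cur]
            chains.append(chain)
    return chains


def divides(a, b):
    return a != b and b % a == 0


# Divisibility on {1, 2, 3, 4, 6, 12} has width 2 (for example, the antichain {4, 6}).
chains = min_chain_partition([1, 2, 3, 4, 6, 12], divides)
```

The matching step looks at all pairs of elements at once, which an online algorithm cannot do because elements arrive one at a time.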


Online Chain Decomposition

Elements of the poset are presented in a total order consistent with the poset

Assign elements to chains as they arrive

Can be modeled as a game between
  Bob: presents elements
  Alice: assigns them to chains

Felsner [1997]: For a poset of width k, Bob can force Alice to use k(k+1)/2 chains


Chain Partitioning Algorithm (ACC)

Felsner gave an algorithm which meets the k(k+1)/2 bound
Our algorithm is simpler and more efficient

Sets of queues B1 … Bk, with |Bi| = i

For an element z:
  Insert z into the first queue q in Bi with head < z
  Swap the queues in Bi and Bi-1, leaving q in its place

[Figure: queue sets B1, B2, B3 and an incoming element z]
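The queue-swapping step can be sketched as follows. This is one reading of the slide, not the authors' code. The function name `acc_partition` is hypothetical, and two details are assumptions: an empty queue accepts any element, and a new set B(k+1) is created on demand instead of fixing k in advance.

```python
def acc_partition(elements, lt):
    """Online chain partition; `elements` must arrive in an order consistent with `lt`."""
    groups = []                       # groups[i - 1] plays the role of B_i (a list of i queues)

    def place(z):
        i = 0
        while True:
            if i == len(groups):      # assumption: grow the number of sets on demand
                groups.append([[] for _ in range(i + 1)])
            for q in groups[i]:
                # q[-1] is the head of q; assumption: an empty queue accepts anything.
                if not q or lt(q[-1], z):
                    q.append(z)
                    if i > 0:
                        # Swap B_(i-1) with B_i minus q, leaving q in B_i.
                        rest = [r for r in groups[i] if r is not q]
                        groups[i] = groups[i - 1] + [q]
                        groups[i - 1] = rest
                    return
            i += 1

    for z in elements:
        place(z)
    return [q for b in groups for q in b if q]


def divides(a, b):
    return a != b and b % a == 0


# Ascending order is a total order consistent with divisibility.
chains = acc_partition([1, 2, 3, 4, 6, 12], divides)
single = acc_partition([1, 2, 4, 8], divides)
```

Each queue is a chain by construction, because an element is appended only to an empty queue or to one whose head is smaller. The swap keeps the set sizes |Bi| = i unchanged.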


Drawback of DCC and ACC

Require a shared data structure
  Monitoring applications generally need a central server

Hybrid clocks
  Multiple servers, each responsible for a subset of processes
  Finds chains within a process group


Shared Memory System

Accesses to shared variables induce dependencies

Observation: Access events for a shared variable form a chain

Variable-based Chain Clocks (VCC)
  Associate a component with every variable
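One plausible way to realize this idea is sketched below; the slides do not give the mechanism, so everything here is an assumption. Each shared variable keeps the timestamp of its last access, and a lock serializes accesses so that they form a chain. The names `SharedVar`, `Worker`, and `happened_before` are hypothetical, and every write is treated as a relevant event.

```python
import threading


class SharedVar:
    def __init__(self, name, value=0):
        self.name, self.value = name, value
        self.clock = {}                  # timestamp of the last access to this variable
        self.lock = threading.Lock()     # serializes accesses, so they form a chain


class Worker:
    """A thread's view of the clock; components are keyed by variable name."""

    def __init__(self):
        self.clock = {}

    def write(self, var, value):
        with var.lock:
            # Learn everything the previous accessor of `var` knew.
            for k, x in var.clock.items():
                self.clock[k] = max(self.clock.get(k, 0), x)
            # The access is a relevant event on the chain owned by `var`.
            self.clock[var.name] = self.clock.get(var.name, 0) + 1
            var.value = value
            var.clock = dict(self.clock)
            return dict(self.clock)


def happened_before(e, f):
    # Componentwise <= with at least one strict <; missing components count as 0.
    keys = set(e) | set(f)
    return (all(e.get(k, 0) <= f.get(k, 0) for k in keys)
            and any(e.get(k, 0) < f.get(k, 0) for k in keys))


x, y = SharedVar("x"), SharedVar("y")
t1, t2 = Worker(), Worker()
a = t1.write(x, 1)     # first access to x
b = t2.write(y, 1)     # first access to y, concurrent with a
c = t2.write(x, 2)     # second access to x; depends on both a and b
```

With this scheme the timestamp size depends on the number of shared variables rather than the number of threads.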


VCC Application: Predicate Detection

Predicate: (x = 1) and (y = 1)
Only events changing x and y are relevant
Associate one component of VCC with x and the other with y

[Figure: computation with initial state x = 0, y = 0 and events x = 0, x = 1, x = 2, x = 1, y = 1, y = 2]


Experiments

Setup
  A multithreaded application
  Each thread generates a sequence of events
  Parameters:
    Number of processes
    Number of events
    Probability of a relevant event

Metrics
  Number of components used
  Execution time


Components Used

[Plot: number of components used. Events = 100, probability of a relevant event = 1%]


Execution Time

[Plot: execution time. Events = 100, probability of a relevant event = 1%]


Effect of Relevancy

[Plot: effect of the probability of a relevant event. Threads = 100, Events = 100]


Conclusion

Generalized vector clocks to a class of algorithms called Chain Clocks

Dynamic Chain Clock (DCC) can substantially speed up applications and reduce their memory requirements

Antichain-based Chain Clock (ACC) meets the lower bound for chain decomposition


Questions?


Example: Poset of width 2

For a poset of width 2, Bob can force Alice to use 3 chains

[Figure: width-2 poset whose elements are labeled with chain numbers 1, 2, 1, 3]


Happened Before Relation (→) [Lamport 78]

Distributed computation with N processes
Every process executes a series of events
  Internal, send or receive event

[Figure: computation with processes p1 and p2]

e → f if there is a path from e to f
e ║ f if there is no path between e and f


Future work

Lower bound for online chain decomposition when a decomposition into N chains is already known

Other chain decomposition strategies


Distributed System: Time vs Threads

[Plot: time vs threads. Events = 100, probability of a relevant event = 1%]


Distributed System: Events vs Time

[Plot: events vs time. Threads = 100, probability of a relevant event = 1%]


Effect of Number of Events

[Plot: effect of the number of events. Threads = 100, probability of a relevant event = 1%]


Example for DCC: is it appropriate? Is the content a bit too much for this amount

Where can I reduce it? Remove VCC or ACC?

Chain clock
  Generalizes vector clocks
  Reduces the time and memory overhead
  Elegantly handles dynamic process creation