Sparse Coding for Specification Mining and Error Localization

Runtime Verification, September 26, 2012

Wenchao Li, Sanjit A. Seshia

University of California - Berkeley

Runtime Verification 2012

Assertion-Based Verification

Problem: assertions are created manually

[Figure: verification flow with the blocks Tests, Circuit/Program, Simulator, Assertions, and Coverage. Annotations: “Generate stimulus to patch coverage holes” and “Find bugs with assertions”.]

“…typically 20% of specifications pass vacuously during the first formal verification runs of a new hardware design…” [IBM Haifa]

Error Localization

[Figure: a bit stream (01010101010101101101010101011111101010101) labeled “Fatal Error”, with the question “Where?”]

Challenges:
• Limited observability
• Long error detection latency
• Transient and hard-to-reproduce bugs

Idea: assertions can provide local observability and correctness checks

Related Work

• Specification Mining:
– Programs: single-state invariants, pre-/post-conditions, automata learning, alternating patterns
– Circuits: fixed-delay pairs, temporal logic patterns
– Require templates

• Error Localization:
– Programs: model checking, predicates
– Circuits: instruction footprints, SAT-based, mined assertion-based
– Require system model and good observability
– Require templates

Our technique is template-free and does not require a system model

What can you tell by just observing a trace?

1 0 0 1 1 1 0 0 0 0
0 0 1 1 1 0 0 0 1 0
1 0 0 1 0 1 0 0 0 0
0 0 1 0 1 0 0 0 1 0

Obj1.m1()
Obj1.m2()
Obj1.m1()
Obj2.m1()
Obj2.m1()

[Figure labels: Cloud, Hardware trace, Program trace, Human interaction/behavior, Sensor network, Distributed system]

A Sparse Coding Approach

[Figure: an input approximated as a weighted sum of three basis elements, with weights 0.8, 0.3, and 0.5]

$x \approx 0.8\, f_{3} + 0.3\, f_{30} + 0.5\, f_{61}$

Key idea: Express each subtrace as a Boolean combination of a few “basis subtraces”, which is a (sparsity-constrained) Boolean matrix factorization problem.

1 1 0 0 1

0 0 1 0 1

Sparsity helps to uncover latent structure of the data

Contributions and Outline

• A new formalism for discovering structure in a trace

• A definition of the sparsity-constrained Boolean matrix factorization problem and an algorithm for solving it

• Applications to specification mining and error localization
– Does not rely on predefined templates
– Simultaneously performs error localization and explanation

• Outline:
– Problem formulation
– Algorithm
– Error localization and explanation
– Results

Problem Formulation

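$$X = B \circ S$$

$$B = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix} \text{ (basis)}, \qquad S = \begin{pmatrix} 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \end{pmatrix} \text{ (coefficient)}$$

• Each column of $X$ is a subtrace
• Multiplication as “AND”, addition as “OR”
• Columns of $S$ are sparse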
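A small sketch of the Boolean product described on this slide, in plain Python. It assumes the two matrices shown are a 4×2 basis B and a 2×5 coefficient matrix S; the function name `bool_product` is illustrative and not taken from the talk.

```python
def bool_product(B, S):
    """Boolean matrix product: AND for multiplication, OR for addition."""
    rows, inner, cols = len(B), len(S), len(S[0])
    return [
        [int(any(B[i][k] and S[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(rows)
    ]

# Basis (4 x 2) and coefficient (2 x 5) matrices as read from the slide.
B = [[1, 1],
     [1, 0],
     [0, 0],
     [0, 1]]
S = [[1, 1, 0, 0, 1],
     [0, 0, 1, 0, 1]]

# Each column of X is the OR of the basis columns selected by that column of S.
X = bool_product(B, S)
for row in X:
    print(row)
```

The last column of S selects both basis columns, so the last column of X is their element-wise OR.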

Sparsity-Constrained Boolean Factorization

Given a data matrix $X$ and a positive integer $C$, the sparsity-constrained Boolean factorization problem is to find …, $B$ and $S$ such that

$X = B \circ S$

and

$\|S_{\cdot,i}\|_1 \le C$ for every column $i$

and … is maximized.

Example: $C = 2$
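A minimal check of the two constraints, assuming the sparsity constraint bounds the number of 1s in each column of S by C. The matrices are illustrative values in the shape used earlier in the talk, and `is_valid_factorization` is a hypothetical helper name.

```python
def bool_product(B, S):
    """Boolean matrix product: AND for multiplication, OR for addition."""
    return [
        [int(any(B[i][k] and S[k][j] for k in range(len(S)))) for j in range(len(S[0]))]
        for i in range(len(B))
    ]

def is_valid_factorization(X, B, S, C):
    """True if X is exactly the Boolean product of B and S and every
    column of S contains at most C ones (assumed form of the constraint)."""
    exact = bool_product(B, S) == X
    sparse = all(sum(row[j] for row in S) <= C for j in range(len(S[0])))
    return exact and sparse

B = [[1, 1], [1, 0], [0, 0], [0, 1]]
S = [[1, 1, 0, 0, 1], [0, 0, 1, 0, 1]]
X = [[1, 1, 1, 0, 1], [1, 1, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 1, 0, 1]]

print(is_valid_factorization(X, B, S, 2))  # last column of S uses two basis columns
print(is_valid_factorization(X, B, S, 1))  # too strict for that column
```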

Algorithm Idea

• Observe that the data matrix X can be viewed as the adjacency matrix of a bipartite graph.

• Idea: factorization → biclique cover (biclique ↔ basis subtrace)
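To illustrate the correspondence, the sketch below treats the rows of X as one side of the bipartite graph and the columns as the other (an assumption about the orientation). Basis column k together with row k of the coefficient matrix then defines one biclique, and the bicliques together cover exactly the edges of X. All names and matrix values are illustrative.

```python
def edges(X):
    """Edges of the bipartite graph whose adjacency matrix is X."""
    return {(i, j) for i, row in enumerate(X) for j, bit in enumerate(row) if bit}

def bicliques(B, S):
    """One biclique per basis column: (rows set in B[:, k], columns set in S[k, :])."""
    result = []
    for k in range(len(S)):
        left = {i for i in range(len(B)) if B[i][k]}
        right = {j for j in range(len(S[0])) if S[k][j]}
        result.append((left, right))
    return result

def covered_edges(cliques):
    """All edges contained in at least one biclique."""
    return {(i, j) for left, right in cliques for i in left for j in right}

B = [[1, 1], [1, 0], [0, 0], [0, 1]]
S = [[1, 1, 0, 0, 1], [0, 0, 1, 0, 1]]
X = [[1, 1, 1, 0, 1], [1, 1, 0, 0, 1], [0, 0, 0, 0, 0], [0, 0, 1, 0, 1]]

cover = bicliques(B, S)
print(cover)
print(covered_edges(cover) == edges(X))
```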

[Figure: bipartite graph with nodes labeled u and v]

Algorithm Overview

• Incrementally generate maximal bicliques
– Consensus-based algorithm
– Extend to a maximal biclique
• Keep track of closeness to sparsity constraint
• Heuristically optimize for basis sharing

[Figure: example bicliques over nodes A to E]

Algorithm Overview

• Step 1: Start with the set of v-rooted star bicliques
• Step 2: Pick two stars and form a consensus
• Step 3: Extend the consensus to a maximal biclique
• Step 4: Add the biclique to the cover if possible
• Step 5: Update the sparsity constraint at the covered nodes
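Steps 1 to 3 can be sketched as follows. The slides do not define the consensus operation, so this sketch assumes the usual form from consensus-based biclique enumeration: merge the roots and keep the neighbours they share. The graph, node names, and function names are hypothetical, and the cover and sparsity bookkeeping of steps 4 and 5 is not attempted.

```python
def star(adj, v):
    """Step 1: the star biclique rooted at v, i.e. v with all of its neighbours."""
    return (frozenset([v]), frozenset(adj[v]))

def consensus(b1, b2):
    """Step 2 (assumed form): union of the roots, intersection of their neighbours."""
    (vs1, us1), (vs2, us2) = b1, b2
    shared = us1 & us2
    return (vs1 | vs2, shared) if shared else None

def extend_to_maximal(adj, biclique):
    """Step 3: add every root adjacent to all shared neighbours, then take
    the common neighbours of the enlarged root set."""
    _, us = biclique
    vs = frozenset(v for v, nbrs in adj.items() if us <= nbrs)
    us = frozenset.intersection(*(frozenset(adj[v]) for v in vs))
    return (vs, us)

# Hypothetical bipartite graph: each root node maps to its set of neighbours.
adj = {
    "x1": {"r1", "r2"},
    "x2": {"r1", "r2"},
    "x3": {"r1", "r4"},
    "x5": {"r1", "r2", "r4"},
}

merged = consensus(star(adj, "x1"), star(adj, "x5"))
maximal = extend_to_maximal(adj, merged)
print(sorted(maximal[0]), sorted(maximal[1]))
```

In this example the consensus of the stars at x1 and x5 picks up x2 during extension, because x2 is also adjacent to both shared neighbours.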

[Figure: worked example on a bipartite graph with nodes A to G and X, Y, Z]

An Arbiter Example

A 2-input 2-output arbiter with round-robin scheme

p0 0 1 0 1 1 0 … …

p1 1 0 0 1 1 1 … …

q0 0 1 0 0 1 0 … …

q1 1 0 0 1 0 1 … …

Sample mined assertions (basis subtraces), with rows p0, p1, q0, q1 and one column per cycle:

p0  0 0 0    0 1 0    0 1 1
p1  1 0 0    0 0 0    0 1 0
q0  0 0 0    0 1 0    0 0 1
q1  1 0 0    0 0 0    0 1 0

[Figure: plot labeled “Number of subtraces”]

Error Localization and Explanation

• Error localization and explanation based on reconstruction: a subtrace has an error if it cannot be reconstructed from the basis subtraces

Minimize …
Subject to …

• A subtrace is error-free if …
• If not, a (minimum) error explanation is …, where … is the solution to the minimization problem above.

[Figure: the arbiter trace below, with consecutive subtraces marked $X_{\cdot,1}$ and $X_{\cdot,2}$ and a coefficient column $S_{\cdot,i}$]

0 1 0 1 1 0 … …
1 0 0 1 1 1 … …
0 1 0 0 1 0 … …
1 0 0 1 0 1 … …
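A brute-force sketch of reconstruction-based localization. The exact objective is not recoverable from the slide text, so this assumes one natural reading: choose at most C basis subtraces whose OR is closest to the subtrace in Hamming distance, and report the differing bits as the explanation. The basis, the value of C, and the function names are hypothetical, and exhaustive search is only practical for small bases.

```python
from itertools import combinations

def reconstruct(basis, chosen):
    """OR together the chosen basis subtraces (all zeros if none are chosen)."""
    return [int(any(basis[k][i] for k in chosen)) for i in range(len(basis[0]))]

def explain(subtrace, basis, C):
    """Return (distance, explanation) for the closest reconstruction that
    uses at most C basis subtraces; distance 0 means error-free."""
    best = None
    for size in range(C + 1):
        for chosen in combinations(range(len(basis)), size):
            approx = reconstruct(basis, chosen)
            diff = [a ^ b for a, b in zip(subtrace, approx)]
            if best is None or sum(diff) < best[0]:
                best = (sum(diff), diff)
    return best

# Hypothetical basis of two flattened subtraces, sparsity bound C = 2.
basis = [[1, 1, 0, 0],
         [1, 0, 0, 1]]

print(explain([1, 1, 0, 1], basis, 2))  # reconstructible, so no error
print(explain([1, 1, 1, 1], basis, 2))  # one bit cannot be explained
```

The second call flags the third bit, since no combination of the two basis subtraces can set it.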

Example Illustration

• Error localization and explanation (arbiter example):

Error trace:

1 0 0 0 1
1 1 0 1 0
1 0 0 0 1
0 1 0 0 0

Error subtrace:

0 0 1
0 1 0
0 0 1
0 0 0

Error explanation:

0 0 0
0 1 0
0 0 0
0 0 0

Alternative error explanation:

0 0 0
0 0 0
0 0 0
0 1 0

[Figure: diagram with regions labeled “All subtraces”, “Space spanned by the learned basis”, “Correct subtraces”, and “Error”]

Experimental Results

• Chip Multiprocessor Router:
– Observe 14 control signals
– Subtrace width of 2 cycles
– Learn the basis from a single error-free trace of 1000 cycles: 0.243 seconds to obtain 189 basis subtraces from 93 distinct subtraces

• Error Localization:
– Inject a single bit flip at a random cycle for each of 99 error traces
– Localize the error to the subtrace (out of 999) where it was injected

• Comparisons:
– Baseline approach (1): hash all distinct subtraces; reports an error even before an error is injected for the 99 traces
– Baseline approach (2): use unit basis; 0% localization
– Sparse Coding: 55.6% localization

[Figure: A CMP Router in a NoC]
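The count of 999 subtraces is consistent with a 2-cycle window slid one cycle at a time over a 1000-cycle trace (1000 − 2 + 1 = 999). The slides do not state the windowing scheme, so the sketch below is an assumption; it also shows baseline (1) as a plain set-membership test. Function names are illustrative, and the sample data is the first six cycles of the arbiter example.

```python
def subtraces(trace, width):
    """Slide a window of `width` cycles over the trace, one cycle at a time.
    `trace` is a list with one tuple of signal values per cycle."""
    return [tuple(trace[t:t + width]) for t in range(len(trace) - width + 1)]

# First six cycles of the arbiter example, one tuple (p0, p1, q0, q1) per cycle.
arbiter = [(0, 1, 0, 1), (1, 0, 1, 0), (0, 0, 0, 0),
           (1, 1, 0, 1), (1, 1, 1, 0), (0, 1, 0, 1)]

windows = subtraces(arbiter, 2)
print(len(windows))                      # 6 - 2 + 1 windows

# Same arithmetic for a 1000-cycle trace (signal values are placeholders).
print(len(subtraces([(0,)] * 1000, 2)))  # 1000 - 2 + 1 windows

# Baseline (1): remember every distinct subtrace and flag anything unseen.
known = set(windows)

def is_unseen(window):
    return window not in known

print(is_unseen(((0, 0, 0, 0), (0, 0, 0, 0))))
```

A lookup table like this flags every subtrace that did not occur in the training trace, whether or not it is an error, which may be why the slide reports errors before any fault is injected.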

Conclusion

• A template-free assertion miner that can explore embedded patterns in digital circuit traces

• Effective assertion-mining based error localization and explanation

• Potential applications to other domains, e.g. programs or distributed systems

THANK YOU