Declare Mining for Dummies

78
Online version of slides: westergaard.eu/short12157

Transcript of Declare Mining for Dummies

Page 2: Declare Mining for Dummies

Declarative ModelsIf bored, we have to go to a Britney concert

We can only be truly happy after going to a Britney concert

We cannot go to both a Britney and an Xtina concert

No explicit flow of control

Good for flexible processes

Page 3: Declare Mining for Dummies

Declarative Mining

0

1Xtina

3

Britney 2

Britney

Xtina

0

1Britney

2

Happy0 1Bored

Britney

Page 4: Declare Mining for Dummies

Declarative Mining

0

1Britney

2

Happy

Bored, Britney, Happy

Bored, Beer, Happy

If we accept enough

(say ≥ 50%) traces, we

accept the constraint

Page 5: Declare Mining for Dummies

Declarative Mining0

1Britney

2

HappyBored, Britney, HappyBored, Beer, Happy

2 parameters (p)

4 event classes (n)

4 • 3 = 12 instantiations

In general n • (n-1) • … • (n - p + 1)

Realistic: n ≥ 30 Sometimes: p = 5

Then:30 • 29 • 28 • 27 • 26 =

17.100.720 instantiations

Page 6: Declare Mining for Dummies

Classic Solution

If often (say ≥ 50%)

!

!

Then and often

Contraposition

Page 7: Declare Mining for Dummies

Classic Solution

If not and often

!

!

Then not often

Only try mining constraints for often occurring

parameters

Page 8: Declare Mining for Dummies

Problem

• XHigh Risk,High Cost

Low Risk,Low Cost

X

Happens Rarely

Happens Often

Page 9: Declare Mining for Dummies

OutlineSymmetry reduction

Prefix sharing

Parallelism

Superscalarity

Support

Use cases

.

Page 10: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Par Super

One 129m

Easy 550m 26 s 19 s 16 s 12 s 2 s

Almost All

117 s 35 s 27 s 7 s

All 215 m 154 s 108 s 56 s

Real data: 13.087 traces

n = 24 24 - 5.100.480

instances

p

2

≤ 2

≤ 3

≤ 5

7.224.024 checks

Page 11: Declare Mining for Dummies

Foundation

Convert log to efficient structure

Instead of instantiating constraints, use a single parametrized instance

Bored, Britney, HappyBored, Beer, Happy

1 2 3

1 4 3

0

1a

2

b

1 a

2 b

3,4

1 a

3 b

2,4

2 a

3 b

1,4

Page 12: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Par Super

One 129m

Easy 550m 26 s 19 s 16 s 12 s 2 s

Almost All

117 s 35 s 27 s 7 s

All 215 m 154 s 108 s 56 s

7.224.024 checks

158.928.528 checks

p

2

≤ 2

≤ 3

≤ 567.749.981.760

checks

Page 13: Declare Mining for Dummies

Symmetry

not co-existence(A, B) = not co-existence(B, A)

Symmetry group: set S = { A1, A2, …, An } so that any Ai ∈ S can be swapped for any Aj ∈ S

Page 14: Declare Mining for Dummies

SymmetryNormally, a constraint hasn • (n - 1) • … • (n - p + 1) instantiations

A symmetry group S means that |S|! of those are the same

If |S| = p, the constraint “only” hasn • (n - 1) • … • (n - p + 1) / p! instantiations

This number is also known as ( )np

Page 15: Declare Mining for Dummies

SymmetryNormally, a constraint hasn • (n - 1) • … • (n - p + 1) instantiations

A symmetry group S means that |S|! of those are the same

If |S| = p, the constraint “only” hasn • (n - 1) • … • (n - p + 1) / p! instantiations

This number is also known as ( )np

Choice 1 of 3: 12.144 instances

158.928.528 checks

Choice 1 of 3: 3! = 6

Choice 1 of 3: 2.024 instances

26.488.088 checks

Choice 1 of 5: 5.100.480 instances

66.749.981.760 checks

Choice 1 of 5: 5! = 120

Choice 1 of 5: 42.504 instances

556.249.848 checks

Page 16: Declare Mining for Dummies

Symmetry?

=0

1Xtina

3

Britney 2

Britney

Xtina 0

1Britney

3

Xtina 2

Xtina

Britney

Page 17: Declare Mining for Dummies

0

1Xtina

3

Britney 2

Britney

Xtina 0

1Britney

3

Xtina 2

Xtina

Britney

Symmetry

=

Here we mean whether L(A1) = L(A2)

We check this by checking that L(A1) \ L(A2) = L(A1) ∩ L(A2)c =

L(A1) • L(A2)c = Ø and L(A2) • L(A1)c = Ø

Page 18: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Par Super

One 129m

Easy 550m 26 s 19 s 16 s 12 s 2 s

Almost All

117 s 35 s 27 s 7 s

All 215 m 154 s 108 s 56 s

p

2

≤ 2

≤ 3

≤ 5

7.224.024 3.612.012

checks

66.749.981.760 556.249.848

checks 158.928.528 26.488.088 checks

Page 19: Declare Mining for Dummies

Prefix Sharing

0

1Britney

2

Happy

Bored, Britney, Happy

Bored, Beer, HappyWe replay the

“Bored” transition twice…

Page 20: Declare Mining for Dummies

Prefix Sharing

0

1Britney

2

Happy

Bored, Britney, Happy

Beer, Happy

0 1 1

0 2

Page 21: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Prefix Super

One 129m

Easy 550m 26 s 19 s 16 s 43 s 2 s

Almost All

117 s 35 s 112 s 7 s

All 215 m 154 s 437 s 56 s

While we do get sharing (factor 4), checking becomes

more expensive

Page 22: Declare Mining for Dummies

ParallelizationMine each constraint

on each core

Split log and mine each fragment on each core

Page 23: Declare Mining for Dummies

Parallelization

||

Page 24: Declare Mining for Dummies

Parallelization

||

Page 25: Declare Mining for Dummies

Parallelization

Mine constraints in parallel Mine traces in parallel

Good for many constraints Good for many traces

Assumes constraints are similar Assumes traces are similar

Requires models are composable Requires results are composable

|| ||

True for our approach as we just count matching traces Better

Page 26: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Par Super

One 129m

Easy 550m 26 s 19 s 16 s 12 s 2 s

Almost All

117 s 35 s 27 s 7 s

All 215 m 154 s 108 s 56 s

Just 2 computation

threads

Page 27: Declare Mining for Dummies

Superscalar Mining

Superscalar CPUs (90s): improve speed by executing multiple instructions at once

Superscalar mining: improve speed by mining multiple constraints at once

Page 28: Declare Mining for Dummies

Superscalar Mining

0

1Britney

2

Happy0 1BoredBritney

B H

Page 29: Declare Mining for Dummies

Superscalar Mining

0

1Britney

2

Happy0 1BoredBritney

BH

H•

0BH

1bH

Bored

2BH

Britney

4B

Happy

Britney

5bHappy

3bHBored

Britney

Bored

Britney

Page 30: Declare Mining for Dummies

Superscalar Mining

0BH

1bH

Bored

2BH

Britney

4B

Happy

Britney

5bHappy

3bHBored

Britney

Bored

Britney

Bored, Britney, Happy

Page 31: Declare Mining for Dummies

Superscalar Mining

We do not want to split up symmetry groups (speed)

We do not want to make the automaton too large (memory)

Page 32: Declare Mining for Dummies

Superscalar Mining

Given two sets of constraints, A and B

A can be merged into B if

We do not break any symmetry groups doing so

All constraints in A are special cases of all constraints in B

Page 33: Declare Mining for Dummies

Superscalar Mining

Given two sets of constraints, A and B

A can be merged into B if

We do not break any symmetry groups doing so

All constraints in A are special cases of all constraints in B

for all SA of A there is a SB of B such that SA ⊆ SB, and

for all S of A withS ∩ SB ≠ ø, we have S = SA

for all CA of A there is a CB of B so that CA ⇒ CB

Note: This is not

symmetrical! It captures that

it is ok to add something

easy to something hard, but

not the other way around

Page 34: Declare Mining for Dummies

Superscalar Mining[[A, B, C, D, E]]

exclusive choice[[A, B]]

not co-existence[[A, B]]

choice 1 of 2[[A, B]]

exclusive choice 2 of 3[[A, B, C]]

choice 2 of 3[[A, B, C]]

absence[[A]]

absence2[[A]]

exclusive choice 1 of 3[[A, B, C]]

choice 1 of 3[[A, B, C]]

init[[A]]

strong init[[A]]

existence[[A]]

existence3[[A]]

existence2[[A]]

absence3[[A]]

exactly1[[A]]

choice 1 of 5[[A, B, C, D, E]]

exactly2[[A]]

choice 1 of 4[[A, B, C, D]]

Page 35: Declare Mining for Dummies

[[A, B, C, D, E]]

exclusive choice[[A, B]]

not co-existence[[A, B]]

choice 1 of 2[[A, B]]

exclusive choice 2 of 3[[A, B, C]]

choice 2 of 3[[A, B, C]]

absence[[A]]

absence2[[A]]

exclusive choice 1 of 3[[A, B, C]]

choice 1 of 3[[A, B, C]]

init[[A]]

strong init[[A]]

existence[[A]]

existence3[[A]]

existence2[[A]]

absence3[[A]]

exactly1[[A]]

exactly2[[A]]

choice 1 of 4[[A, B, C, D]]

Superscalar Mining

for all SA of A there is a SB of B such that SA ⊆ SB, and

for all S of A withS ∩ SB ≠ ø, we have S = SA

for all CA of A there is a CB of B so that CA ⇒ CB

Page 36: Declare Mining for Dummies

Superscalar Mining[[A], [B]]

alternate[[A], [B]]

succession[[A], [B]]

precedence[[A], [B]]

response[[A], [B]]

alternate response[[A], [B]]

chain response[[A], [B]]

not chain succession[[A], [B]]

absence[[A]]

not succession[[A], [B]]

absence2[[A]]

init[[A]]

strong init[[A]]

existence[[A]]

existence3[[A]]

existence2[[A]]

responded existence[[A], [B]]

alternate succession[[A], [B]]

chain succession[[A], [B]]

absence3[[A]]

exactly1[[A]]

exactly2[[A]]

alternate precedence[[A], [B]]

Page 37: Declare Mining for Dummies

[[A], [B]]

alternate[[A], [B]]

succession[[A], [B]]

precedence[[A], [B]]

response[[A], [B]]

alternate response[[A], [B]]

chain response[[A], [B]]

not chain succession[[A], [B]]

absence[[A]]

not succession[[A], [B]]

absence2[[A]]

init[[A]]

strong init[[A]]

existence[[A]]

existence3[[A]]

existence2[[A]]

responded existence[[A], [B]]

alternate succession[[A], [B]]

chain succession[[A], [B]]

absence3[[A]]

exactly1[[A]]

exactly2[[A]]

alternate precedence[[A], [B]]

Superscalar Mining

for all SA of A there is a SB of B such that SA ⊆ SB, and

for all S of A with S ∩ SB ≠ ø, we have S = SA

for all CA of A there is a CB of B so that CA ⇒ CB

Page 38: Declare Mining for Dummies

Superscalar Mining

We can check 34 Declare constraints using just 3 checks:

20 “choices”: 53 states, SG {{A, B, C, D, E}}

22 “orders”: 112 states, SG {{A}, {B}}

2 “outliers”: 7 states, SG {{A}, {B}}

Page 39: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Par Super

One 129m

Easy 550m 26 s 19 s 16 s 12 s 2 s

Almost All

117 s 35 s 27 s 7 s

All 215 m 154 s 108 s 56 s

Using 8 (slower) CPUs, we get below

30 seconds

Page 40: Declare Mining for Dummies

PerformanceOld MINERful Naïve Sym Par Super

One 129m

Easy 550m 26 s 19 s 16 s 12 s 2 s

Almost All

117 s 35 s 27 s 7 s

All 215 m 154 s 108 s 56 s

If the old ProM miner has as good progression as our naïve miner, it would

take just under 259 days to mine all constraints; if it’s as good as our best, it

would take more than 10 days

Page 41: Declare Mining for Dummies

Support

Classically: Detect

this using complex

formula operations

Page 42: Declare Mining for Dummies

SupportClassically: Detect this using complex formula operations The idea of the formula

operations is to prevent certain sub-behavior and

see if the constraint becomes trivially true

Declare doesn’t have non-trivial sub-

behavior, so this just corresponds to

preventing a task

Page 43: Declare Mining for Dummies

Support

&

&

&

=

=

=

true?

true?

true?

Page 44: Declare Mining for Dummies

))

Support

&

&

&

=

=

=

Ø?

Ø?

Ø?

c

c

c

(((

)

Page 45: Declare Mining for Dummies

))

Support

&

&

&

=

=

=

Ø

Ø

Ø

c

c

c

(((

)

Page 46: Declare Mining for Dummies

)

Support

&

| = true( |

Page 47: Declare Mining for Dummies

)

Support

&

| = ?( |

c

This expression is our measurement of support for the

not co-existence constraint

Page 48: Declare Mining for Dummies

)

Support

&

| = ?(c

This expression is our measurement of support for the

precedence constraint

Page 49: Declare Mining for Dummies

)

Support

&

| = ?(c

This expression is our measurement of support for the

response constraint

The reason for having exclusions of more than one

constraint is disjunctive splits

Page 50: Declare Mining for Dummies

?

Support

)&

| = true( |

Conjunction with the support expression is

always true ⇒ positive support

&

)( = false

Conjunction with the support expression is always false ⇒

negative support

Page 51: Declare Mining for Dummies

Support

)&

| = true( |

&

)( = false

“Holds against all odds”

“Never had a chance to hold”

This is by the way very related to

apriori reduction

Page 52: Declare Mining for Dummies

SupportOur notion of support coincides with previous

Previous notion and ours induce a notion of confidence that is identical when previous is defined

Our is computed automatically and works for all constraints

Page 53: Declare Mining for Dummies

ComparisonProM Miner MINERful Unconstrained

Miner

Speed Slow!! Fast Faster

Extensibility Good None Better

Confidence choices

Good Good Medium

Confidence generality

None None Full

Page 54: Declare Mining for Dummies

ComparisonProM Miner MINERful Unconstrained

Miner

Speed Slow!! Fast Faster

Extensibility Good None Better

Confidence choices

Good Good Medium

Confidence generality

None None Full

The reason is that some choices of confidence do

not make sense in general

Page 55: Declare Mining for Dummies

Use CaseX X

A

B+ +

D

E

C

Assuming this block structure…

Events of block A and block B can never happen together

Events of block B can never happen after events of block C

not co-existence(A, B) not succession(C, B)

Page 56: Declare Mining for Dummies

Use Cases

X X

A

B+ +

D

E

C

Output: log with 8986 traces and 20 event classes

Page 57: Declare Mining for Dummies

Use Case

Page 58: Declare Mining for Dummies

Use Case

Page 59: Declare Mining for Dummies

Use Case 10.132 constraint instances

Page 60: Declare Mining for Dummies

Use Case Sort by name and parameters (to group similar constraints)

Page 61: Declare Mining for Dummies

Use CaseAggregate results fornot co-existence and

not succession

Page 62: Declare Mining for Dummies

Use Case We clearly recognize the

blocks

…except for parallelism

Page 63: Declare Mining for Dummies

Use Case

A xor B, C, D, E

X X

A

B+ +

D

E

C

B → C, D, E

C → D, E

Page 64: Declare Mining for Dummies

Use Case

A xor B, C, D, E

X X

A

B+ +

D

E

C

B → C, D, E

C → D, E

What about loops?

Page 65: Declare Mining for Dummies

Use Case

Support and location in the hierarchy

is computed automatically

Page 66: Declare Mining for Dummies

Use Caseweak relations

a or b

bottoma | b

a -> b a <- b

a xor b

!a or b

top!a | b

!a -> b !a <- b

!a xor b

Page 67: Declare Mining for Dummies

Use Caseweak relations

a or b

choice 1 of 2 choice 1 of 3

bottoma | b

a -> b

!loop

a <- b

not successionnot chain succession

a xor b

not co-existence

!a or b

!choice 1 of 2 !choice 2 of 3!exactly1

!exactly2

chain succession !strong init

top!a | b

last

loop

!precedence

!not chain succession

!responded existence!alternate

exactly1

!choice 1 of 5

exactly2

strong init

exclusive choice 1 of 3

exclusive choice 2 of 3

!a -> b

!a xor b

!a <- b

!not succession

!not co-existence choice 2 of 3

responded existence

!chain succession

!last

precedence!exclusive choice 1 of 3

!exclusive choice 2 of 3

alternate

choice 1 of 5

!choice 1 of 3

Page 68: Declare Mining for Dummies

Use Case

weak relations

a or b

choice 1 of 2 choice 1 of 3

bottoma | b

a -> b

!loop

a <- b

not successionnot chain succession

a xor b

not co-existence

!a or b

!choice 1 of 2 !choice 2 of 3!exactly1

!exactly2

chain succession !strong init

top!a | b

last

loop

!precedence

!not chain succession

!responded existence!alternate

exactly1

!choice 1 of 5

exactly2

strong init

exclusive choice 1 of 3

exclusive choice 2 of 3

!a -> b

!a xor b

!a <- b

!not succession

!not co-existence choice 2 of 3

responded existence

!chain succession

!last

precedence!exclusive choice 1 of 3

!exclusive choice 2 of 3

alternate

choice 1 of 5

!choice 1 of 3

Page 69: Declare Mining for Dummies

Use Caseweak relations

a or b

choice 1 of 2 choice 1 of 3

bottoma | b

a -> b

!loop

a <- b

not successionnot chain succession

a xor b

not co-existence

!a or b

!choice 1 of 2 !choice 2 of 3!exactly1

!exactly2

chain succession !strong init

top!a | b

last

loop

!precedence

!not chain succession

!responded existence!alternate

exactly1

!choice 1 of 5

exactly2

strong init

exclusive choice 1 of 3

exclusive choice 2 of 3

!a -> b

!a xor b

!a <- b

!not succession

!not co-existence choice 2 of 3

responded existence

!chain succession

!last

precedence!exclusive choice 1 of 3

!exclusive choice 2 of 3

alternate

choice 1 of 5

!choice 1 of 3

Page 70: Declare Mining for Dummies

Use Caseall relations

A

a or b

!a xor B

existence

B

!A xor b

choice 1 of 3 choice 1 of 2

bottoma | b

a -> b

!A <- B

!loop

a -> B

response

A -> b

precedence

A -> B

!a <- b

succession

a <- b

!A -> B

not chain succession

not succession

a <- B

!precedence

A <- b

!response

A <- B

!a -> b

a xor b

not co-existence

a xor B

!A

A xor b

!B

!responded existence

!A -> b!exactly2 chain response!strong init !exactly1!existence !a -> Bchain precedence

!a or b

chain succession!choice 2 of 3 !choice 1 of 2

top!a | b

exactly2

loop

exclusive choice 2 of 3 !not chain succession

!alternate

last

strong init

!choice 1 of 5

exactly1

exclusive choice 1 of 3

!a xor b

!a <- B!A <- b

!not succession

choice 2 of 3 !not co-existence

responded existence

!chain succession

!chain precedence

alternate

!chain response!succession

!last!exclusive choice 2 of 3

!exclusive choice 1 of 3

choice 1 of 5

!choice 1 of 3

Page 71: Declare Mining for Dummies

Use Case alpha relations

bottoma | b

a -> b a <- b

topa # b

Page 72: Declare Mining for Dummies

Use Cases

Page 73: Declare Mining for Dummies

Use CasesA B C D E

A [-­‐>] [-­‐>][?(<-­‐),  ?(#),?(-­‐>),  ?(||)] [-­‐>]

B [<-­‐] [||] [-­‐>][?(<-­‐),  ?(#),?(-­‐>),  ?(||)]

C [<-­‐] [||] [-­‐>][?(<-­‐),  ?(#),?(-­‐>),  ?(||)]

D [?(<-­‐),  ?(#),?(-­‐>),  ?(||)] [<-­‐] [<-­‐] [<-­‐]

E [<-­‐][?(<-­‐),  ?(#),?(-­‐>),  ?(||)]

[?(<-­‐),  ?(#),?(-­‐>),  ?(||)] [-­‐>]

We just implemented the

Alpha algorithm!

Page 74: Declare Mining for Dummies

Use Cases

Page 75: Declare Mining for Dummies

Use Cases

Page 76: Declare Mining for Dummies

ConclusionEfficient implementation

Tricks: symmetry, parallelism, superscalarity

General concept of supportAnything other

miners can produce,

we can produce…and we are fully

extensible!

Page 77: Declare Mining for Dummies

Future Work

!

Better recognition of modules

Use of model-checking techniques

Directed model-checking for LTL

Page 78: Declare Mining for Dummies

ConclusionEfficient implementation

Tricks: symmetry, parallelism, superscalarity

General concept of supportAnything other

miners can produce,

we can produce…and we are fully

extensible!