Diagnosis of Asynchronous Discrete Event Systems: Datalog to the Rescue!

Serge Abiteboul (INRIA & U. Paris 11) Zoë Abrams (INRIA & Stanford U.) Stefan Haar (INRIA) Tova Milo (Tel Aviv U.)

June 15th, 2005

History

Deductive databases were a hot topic in the late 1980s
  • Datalog
  • query optimization: magic sets and QSQ

Research in this area led to beautiful results, with little industrial impact

Current Context

Years later, with networks everywhere, recursive data management is becoming more essential

Datalog is hot again!
  • Jim and Suciu [2001]
  • Loo, Hellerstein, Stoica, and Ramakrishnan [2005]
  • PODS Tutorial 1, Monica Lam et al. [2005]
  • This paper: use Datalog for diagnosis of telecommunication systems

Diagnosis of Telecommunication Systems

A telecom system consists of software and hardware pieces distributed over a network

One piece fails and alarm signals are issued from throughout the network

[Figure labels: messages, ack; Alarms: messg. unprocessed, task incomplete]

Diagnosis of Telecom Systems cont.

Supervisor:
  • Collects alarms
  • Alarms are asynchronous
  • Knows peer behavior pattern
  • Goal: determine what could have happened in the global system

Deductive Database Formulation

Extensional data: a sequence of alarms received by the supervisor

Intensional data: the possible execution flows that could have created the alarm sequence

Can the diagnosis problem be stated in terms of query evaluation in deductive databases?

Yes – it can!

Outline

Technical
  • Datalog and Query-Sub-Query (QSQ)
  • Adapt QSQ to a distributed setting: dQSQ

Application: Distributed Diagnosis of Telecommunication Systems
  • Petri Nets and Unfoldings
  • Datalog formulation of the diagnosis problem
  • Benefits of using dQSQ

Deductive Database

  • Explicit information
  • Rules that enable inferences based on the stored data

Rules in first-order form:

∀x,y (anc(x,y) ← parent(x,y))
∀x,y,z (anc(x,y) ← anc(x,z), parent(z,y))

Datalog program:

anc(x,y) :- parent(x,y)
anc(x,y) :- anc(x,z), parent(z,y)

parent(x,y):

x      y
Alice  Nancy
Alice  Joyce
Joyce  Lois
Lois   Mark
Lois   Andy
Joyce  Ruth

Query Evaluation

Query: “Who has Joyce as an ancestor?”

q(y) :- anc("Joyce",y)

Naive evaluation: materialize everything, then evaluate query

Goal: Compute query with minimal data materialization
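The naive strategy can be sketched in a few lines of Python. This is only an illustration under simple assumptions: relations are plain sets of tuples, and the helper names `naive_anc` and `query_descendants` are made up for this sketch rather than taken from the talk.

```python
# parent(x, y) facts from the slides: x is a parent of y.
PARENT = {
    ("Alice", "Nancy"), ("Alice", "Joyce"), ("Joyce", "Lois"),
    ("Lois", "Mark"), ("Lois", "Andy"), ("Joyce", "Ruth"),
}


def naive_anc(parent):
    """Materialize the whole anc relation by iterating both rules to a fixpoint."""
    anc = set(parent)  # anc(x,y) :- parent(x,y)
    while True:
        # anc(x,y) :- anc(x,z), parent(z,y)
        derived = {(x, y) for (x, z) in anc for (z2, y) in parent if z == z2}
        if derived <= anc:
            return anc
        anc |= derived


def query_descendants(anc, person):
    """q(y) :- anc(person, y), evaluated only after full materialization."""
    return {y for (x, y) in anc if x == person}


anc = naive_anc(PARENT)
result = query_descendants(anc, "Joyce")
print(sorted(result))                      # ['Andy', 'Lois', 'Mark', 'Ruth']
print(len(anc), "anc tuples materialized")  # 12 anc tuples materialized
```

Only 4 of the 12 materialized tuples start with Joyce; the remaining 8 are wasted work for this query, which is the motivation for a query-directed technique.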

Query-Sub-Query (QSQ)

Known techniques for optimization of Datalog queries: magic sets and QSQ

QSQ rewrites the Datalog program according to the given query

Materializes tuples bottom-up

QSQ is based on two main notions:
  • Adorned relations
  • Supplementary relations

Adorned Relations

A variable in a relation can be “bound” to a constant

For each relation, adorned versions based on the bindings of the variables are considered

Example: in anc("Joyce",y), the first argument is bound to a constant and the second argument (y) is free

Adorned Relations

Rewriting using adorned relations:

anc_bf(x,y) :- parent(x,y)
anc_bf(x,y) :- anc_bf(x,z), parent(z,y)
q(y) :- anc_bf("Joyce",y)

In the adornment bf, the first argument is bound to a constant (b) and the second is free (f)

Different adornments of the same relation are treated as different relations during the QSQ computation
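As a small illustration of how an adornment such as bf can be derived, the sketch below marks each argument of an atom as bound or free. The function `adorn` and the convention that constants are written as quoted strings are assumptions of this sketch, not notation from the talk.

```python
def adorn(args, bound_vars):
    """Return the adornment of an atom: 'b' for a constant or an already
    bound variable, 'f' for a free variable.

    Assumption of this sketch: constants are written as quoted strings.
    """
    def is_bound(arg):
        return arg.startswith('"') or arg in bound_vars

    return "".join("b" if is_bound(arg) else "f" for arg in args)


# The query atom anc("Joyce", y): constant first, free variable second.
query_adornment = adorn(['"Joyce"', "y"], bound_vars=set())

# Body atom anc(x, z) of the recursive rule: x is already bound by the head.
body_adornment = adorn(["x", "z"], bound_vars={"x"})

print(query_adornment, body_adornment)  # bf bf
```

Both atoms receive the same adornment, which is why a single adorned relation anc_bf is enough for this program.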

Supplementary Relations

Datalog:

anc_bf(x,y) :- parent(x,y)
anc_bf(x,y) :- anc_bf(x,z), parent(z,y)
q(y) :- anc_bf("Joyce",y)

QSQ rewriting:

in_anc_bf("Joyce") :-
sup_10(x) :- in_anc_bf(x)
sup_11(x,y) :- sup_10(x), parent(x,y)
anc_bf(x,y) :- sup_11(x,y)
sup_20(x) :- in_anc_bf(x)
sup_21(x,z) :- sup_20(x), anc_bf(x,z)
sup_22(x,y) :- sup_21(x,z), parent(z,y)
anc_bf(x,y) :- sup_22(x,y)

Supplementary relations of the first rule: sup_10(x), sup_11(x,y)
Supplementary relations of the second rule: sup_20(x), sup_21(x,z), sup_22(x,y)

Supplementary relations accumulate the relevant bindings for each position in the rule

QSQ Example

Query: q(y) :- anc_bf("Joyce",y)

parent(x,y): (Alice, Nancy), (Alice, Joyce), (Joyce, Lois), (Lois, Mark), (Lois, Andy), (Joyce, Ruth)

Tuples materialized by the QSQ rewriting:

sup_10(x):    Joyce
sup_11(x,y):  (Joyce, Lois), (Joyce, Ruth)
sup_20(x):    Joyce
sup_21(x,z):  (Joyce, Lois), (Joyce, Ruth), (Joyce, Mark), (Joyce, Andy)
sup_22(x,y):  (Joyce, Mark), (Joyce, Andy)

query result: Lois, Ruth, Mark, Andy
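The rewritten program can be run bottom-up with a plain fixpoint loop. The following is a minimal sketch, assuming relations are Python sets and unary relations hold bare values; the function name `qsq_anc_bf` is hypothetical, and the loop is a naive fixpoint rather than an optimized evaluator.

```python
PARENT = {
    ("Alice", "Nancy"), ("Alice", "Joyce"), ("Joyce", "Lois"),
    ("Lois", "Mark"), ("Lois", "Andy"), ("Joyce", "Ruth"),
}


def qsq_anc_bf(parent, constant):
    """Evaluate the QSQ-rewritten ancestor program bottom-up to a fixpoint."""
    in_anc_bf = {constant}  # in_anc_bf("Joyce") :-
    names = ("sup_10", "sup_11", "sup_20", "sup_21", "sup_22", "anc_bf")
    rel = {name: set() for name in names}
    changed = True
    while changed:
        # One round: apply every rewritten rule to the current relations.
        new = {
            "sup_10": set(in_anc_bf),
            "sup_11": {(x, y) for x in rel["sup_10"]
                       for (p, y) in parent if p == x},
            "sup_20": set(in_anc_bf),
            "sup_21": {(x, z) for x in rel["sup_20"]
                       for (a, z) in rel["anc_bf"] if a == x},
            "sup_22": {(x, y) for (x, z) in rel["sup_21"]
                       for (p, y) in parent if p == z},
            "anc_bf": rel["sup_11"] | rel["sup_22"],
        }
        changed = False
        for name, tuples in new.items():
            if not tuples <= rel[name]:
                rel[name] |= tuples
                changed = True
    return rel


relations = qsq_anc_bf(PARENT, "Joyce")
answer = {y for (x, y) in relations["anc_bf"] if x == "Joyce"}
print(sorted(answer))  # ['Andy', 'Lois', 'Mark', 'Ruth']
```

No materialized tuple mentions Alice or Nancy: the input relation in_anc_bf restricts every derivation to bindings reachable from the query constant.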

Nice Properties of QSQ

Compute the correct answer to the query

Materialize only a minimal set of tuples

Guaranteed to terminate

Beyond Datalog

We allow “object creation” (using Skolem functions)
  • crucial for our application

In general, may not terminate
  • OK for our context

Previous Work

Distribution in Deductive Databases
  • Van Gelder, 1986
  • Hulin, 1989
  • Jim and Suciu, 2001

Distributed Environment

Centralized Datalog program:

r1  r(x,y) :- a(x,y)
r2  r(x,y) :- s(x,z), t(z,y)
r3  s(x,y) :- r(x,y), b(y,z)
r4  t(x,y) :- c(x,y)

Distribution of the program between 3 peers:
  • R, hosting r, a
  • S, hosting s, b
  • T, hosting t, c

r1  r@R(x,y) :- a@R(x,y)
r2  r@R(x,y) :- s@S(x,z), t@T(z,y)
r3  s@S(x,y) :- r@R(x,y), b@S(y,z)
r4  t@T(x,y) :- c@T(x,y)

If a relation is maintained at some peer, the rules defining it are known at that peer

Distributed QSQ Rewriting

For each rule: The peer in the head of the rule starts the rewriting

When a remote relation is encountered, the peer delegates the remainder of the rule to the remote peer in charge of that relation
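The delegation pattern can be illustrated on the four distributed rules r1 to r4. The sketch below is a simplification: it only records which peer handles which part of a rule body and does not generate the supplementary relations themselves. The function `delegation_chain` and the `(relation, peer)` encoding are assumptions of this sketch.

```python
from typing import List, Tuple

Atom = Tuple[str, str]  # (relation name, hosting peer), e.g. ("s", "S")


def delegation_chain(head: Atom, body: List[Atom]) -> List[Tuple[str, List[str]]]:
    """Split a rule body into consecutive segments, one per peer in charge.

    The head's peer starts the rewriting. Whenever the next body atom is
    hosted by another peer, the remainder of the rule is handed to that peer.
    Returns (peer, [relations handled at that peer]) pairs in order.
    """
    current_peer = head[1]
    chain: List[Tuple[str, List[str]]] = [(current_peer, [])]
    for relation, peer in body:
        if peer != current_peer:
            current_peer = peer
            chain.append((current_peer, []))
        chain[-1][1].append(relation)
    return chain


RULES = {
    "r1": (("r", "R"), [("a", "R")]),
    "r2": (("r", "R"), [("s", "S"), ("t", "T")]),
    "r3": (("s", "S"), [("r", "R"), ("b", "S")]),
    "r4": (("t", "T"), [("c", "T")]),
}

chains = {name: delegation_chain(head, body) for name, (head, body) in RULES.items()}
for name, chain in chains.items():
    print(name, chain)
```

For r2, peer R starts, hands the rule to S at s@S, and S hands the rest to T at t@T. For r3, control goes from S to R and back to S, so a peer may be visited more than once for the same rule.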

Nice Properties of dQSQ

Compute the correct answer to the query

Materialize only a minimal set of tuples
  • As good as QSQ

No need for global knowledge

Need, in general, some standard technique to detect termination

Petri Net Model

Each piece is described by a Petri Net

The communications are modeled as transitions

Petri Net Model

[Figure legend: place, marked place, transition, alarm symbol]

Circles denote places. Marked places model the current state of the peer

Squares denote transitions. A transition node can fire iff all its parent nodes are marked

When a transition fires, the current state changes. Children of the transition are marked and parents are unmarked. For example, if transition (i) fires, the marking moves from places 1,7 to places 2,3

When the transition fires, an alarm symbol is reported to the supervisor. In our example, alarm (b) is reported when (i) fires
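The firing rule can be written down directly. The sketch below models only transition (i) from the example, with parent places 1 and 7, child places 2 and 3, and alarm b; the class `Transition` and the functions `can_fire` and `fire` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple


@dataclass(frozen=True)
class Transition:
    name: str
    parents: FrozenSet[int]   # places that must be marked for firing
    children: FrozenSet[int]  # places marked after firing
    alarm: str                # alarm symbol reported to the supervisor


def can_fire(t: Transition, marking: Set[int]) -> bool:
    """A transition can fire iff all its parent places are marked."""
    return t.parents <= marking


def fire(t: Transition, marking: Set[int]) -> Tuple[Set[int], str]:
    """Unmark the parents, mark the children, and report the alarm."""
    if not can_fire(t, marking):
        raise ValueError(f"transition {t.name} is not enabled")
    return (marking - t.parents) | t.children, t.alarm


t_i = Transition("i", frozenset({1, 7}), frozenset({2, 3}), "b")
marking, alarm = fire(t_i, {1, 7})
print(sorted(marking), alarm)  # [2, 3] b
```

After firing, places 1 and 7 are no longer marked, so transition (i) cannot fire a second time from the new marking.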

The Diagnosis Problem

The supervisor receives an alarm sequence (b,p1),(a,p2),…,(c,p1)
  • a,b,c: alarm symbols
  • pi: the peer that emitted the alarm

Due to asynchronous communication
  • Alarms sent by different peers may not appear in the order they were emitted
  • We can only assume that the order of alarms is kept for each individual peer

Goal: Find an explanation for a given alarm sequence
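Because only the per-peer order is reliable, the information carried by a received sequence is its projection onto each peer. The sketch below shows this on a made-up three-alarm sequence (the elided middle of the slide's sequence is not reconstructed); `per_peer_order` and `same_observation` are hypothetical helper names.

```python
from collections import defaultdict


def per_peer_order(alarm_sequence):
    """Project a received alarm sequence onto each peer, keeping local order."""
    by_peer = defaultdict(list)
    for symbol, peer in alarm_sequence:
        by_peer[peer].append(symbol)
    return dict(by_peer)


def same_observation(seq1, seq2):
    """Two received sequences are indistinguishable to the supervisor
    if they agree on the order of alarms of every individual peer."""
    return per_peer_order(seq1) == per_peer_order(seq2)


received = [("b", "p1"), ("a", "p2"), ("c", "p1")]
projection = per_peer_order(received)
print(projection)  # {'p1': ['b', 'c'], 'p2': ['a']}

# Interleaving p2's alarm differently does not change the observation...
reordered = [("a", "p2"), ("b", "p1"), ("c", "p1")]
# ...but swapping two alarms of the same peer does.
swapped = [("c", "p1"), ("a", "p2"), ("b", "p1")]
```

Here `same_observation(received, reordered)` holds while `same_observation(received, swapped)` does not, so an explanation has to respect the order b before c on p1 but is free to place a anywhere.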

Unfolding Model

Unfoldings represent all possible sequences of transition firings

[Figure: a Petri Net and an unfolding of the Petri Net]

The set of shaded nodes in the unfolding is a diagnosis for the alarm sequence (b; p1), (c; p1)

The nodes circled in red form another diagnosis for the alarm sequence (b; p1), (c; p1)

Purple node: not useful in explaining alarm sequence (b; p1), (c; p1)

QSQ Goal: eliminate unnecessary portions of the unfolding

Relations for Unfolding

[Figure: an unfolding of the Petri Net]

causal(x,y) relation: the transition x was fired, and this eventually led to the firing of node y

conflict(x,y) relation: transitions x and y cannot coexist (i.e., it is not possible for both x and y to have occurred)

Constructing the Unfolding with Datalog

The conflict and causal relations capture the information needed to create the unfolding.

The causal relation is similar to the ancestor example

Formulating the conflict relation in Datalog (without negation) was a significant technical challenge: see paper for details
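To make the two relations concrete, the sketch below computes them on a tiny, made-up unfolding. It is not the paper's negation-free Datalog formulation: it uses the usual occurrence-net definitions, where two events are in direct conflict if they consume a common condition and conflict is inherited along causality. The event and condition names (e1 to e4, c1 to c5) are hypothetical.

```python
from itertools import product

# Hypothetical unfolding: event -> (conditions consumed, conditions produced).
EVENTS = {
    "e1": ({"c1"}, {"c2"}),
    "e2": ({"c1"}, {"c3"}),  # competes with e1 for condition c1
    "e3": ({"c2"}, {"c4"}),  # can only occur after e1
    "e4": ({"c3"}, {"c5"}),  # can only occur after e2
}


def causal_relation(events):
    """causal(x,y): transitive closure of 'x produces a condition y consumes'.

    Same shape as the ancestor program: a base rule plus a recursive rule.
    """
    causal = {(x, y) for x, y in product(events, repeat=2)
              if events[x][1] & events[y][0]}
    while True:
        new = {(x, w) for (x, y) in causal for (z, w) in causal if y == z} - causal
        if not new:
            return causal
        causal |= new


def conflict_relation(events, causal):
    """conflict(x,y): some causes (or x, y themselves) compete for a condition."""
    direct = {(x, y) for x, y in product(events, repeat=2)
              if x != y and events[x][0] & events[y][0]}

    def causes_or_self(e):
        return {e} | {x for (x, y) in causal if y == e}

    return {(x, y) for x, y in product(events, repeat=2)
            if any((a, b) in direct
                   for a in causes_or_self(x)
                   for b in causes_or_self(y))}


causal = causal_relation(EVENTS)
conflict = conflict_relation(EVENTS, causal)
print(sorted(causal))
print(("e3", "e4") in conflict, ("e1", "e3") in conflict)  # True False
```

Events e3 and e4 never share a condition, yet they are in conflict because their causes e1 and e2 are. A diagnosis can therefore contain e1 and e3 together, but never e3 together with e4.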

Diagnosis of an alarm sequence using Datalog

Describe unfoldings in distributed Datalog intensionally

Describe the alarm sequence in distributed Datalog extensionally

Alarm sequence (b;p1),(c;p1):

alarmSeq@s(a1,b,p1,root)
alarmSeq@s(a2,c,p1,a1)

Describe query in dist. Datalog

q@s(z,x) :- seqOut@p1(z,a2), transInSeq@p1(z,x)

Query result:

z  x
A  i
A  iii
B  ii
B  iii
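The extensional encoding of an alarm sequence can be generated mechanically. The sketch below reproduces the two alarmSeq facts for (b;p1),(c;p1), assuming that the fourth argument names the alarm received just before (with root for the first one) and that identifiers are a1, a2, and so on. The slide's example has a single peer, so it does not settle whether the link is per sequence or per peer; `alarm_seq_facts` is a hypothetical helper.

```python
def alarm_seq_facts(alarm_sequence, supervisor="s"):
    """Encode an alarm sequence as alarmSeq facts stored at the supervisor.

    Assumption of this sketch: each fact points to the previously received
    alarm, and the first alarm points to the special constant 'root'.
    """
    facts = []
    previous = "root"
    for index, (symbol, peer) in enumerate(alarm_sequence, start=1):
        alarm_id = f"a{index}"
        facts.append(f"alarmSeq@{supervisor}({alarm_id},{symbol},{peer},{previous})")
        previous = alarm_id
    return facts


facts = alarm_seq_facts([("b", "p1"), ("c", "p1")])
for fact in facts:
    print(fact)
# alarmSeq@s(a1,b,p1,root)
# alarmSeq@s(a2,c,p1,a1)
```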

The Benefits

We have stated the diagnosis problem using Datalog. So what?

Three major benefits:

1. Optimized distributed computation using dQSQ

2. Can solve more general diagnosis problems

3. Implementation language

Benefit 1: Efficiency of dQSQ

Minimal amount of unfolding materialized

Theorem: dQSQ achieves an optimization as good as that previously provided by the dedicated diagnosis algorithms [BFHJ03, BFHJ04]

Benefit 1 continued: Distributed Computation

dQSQ enables distributed computation
  • The dQSQ rewriting is performed locally at each peer, without any global knowledge
  • Limited communication: a peer is guaranteed to need to communicate only with its neighbours in the Petri Net
  • Diagnosis occurs without any global knowledge of the overall net structure

Benefit 2: Problem Generalizations

Hidden transitions: not all alarms are reported to the supervisor

Alarm patterns: alarm patterns described by some regular language (e.g., ab*)

Constraints on the configurations of interest: alarm sequences not containing some known pattern

Issues with termination
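As a reminder of what an alarm pattern such as ab* denotes, the sketch below checks sequences of alarm symbols against a regular expression using Python's standard `re` module. It assumes single-character alarm symbols, and it only tests membership in the language; it says nothing about how the pattern is handled inside the Datalog formulation. The helper `matches_pattern` is a hypothetical name.

```python
import re


def matches_pattern(alarm_symbols, pattern):
    """True iff the whole sequence of alarm symbols belongs to the language.

    Assumption of this sketch: every alarm symbol is a single character.
    """
    return re.fullmatch(pattern, "".join(alarm_symbols)) is not None


print(matches_pattern(["a", "b", "b"], "ab*"))  # True
print(matches_pattern(["a"], "ab*"))            # True
print(matches_pattern(["b", "a"], "ab*"))       # False
```

The pattern ab* describes infinitely many alarm sequences, which gives some intuition for why termination becomes an issue once patterns replace a fixed sequence.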

Benefit 3: Active XML (AXML)

AXML = XML with embedded calls to Web services [INRIA]

Implementation of dQSQ using AXML [Noam Pettel, Tel Aviv]
  • An AXML document contains both extensional and intensional data
  • Use of continuous services

Optimization of a fragment of AXML
  • The original motivation for dQSQ
  • Extended to “trees” (not in the paper)

Conclusion

Datalog strikes back: relevant to current P2P systems

Contribution
  • distributed QSQ
  • an application to network diagnosis

Future work
  • optimization and analysis (termination, confluence) of AXML and more generally P2P data management

Thank you