Diagnosis of Asynchronous Discrete Event Systems: Datalog to the Rescue!
Serge Abiteboul (INRIA & U. Paris 11) Zoë Abrams (INRIA & Stanford U.) Stefan Haar (INRIA) Tova Milo (Tel Aviv U.)
June 15th, 2005
History
Deductive databases were a hot topic in the late 80s
• Datalog
• Query optimization: magic sets and QSQ
Research in this area led to beautiful results, with little industrial impact
Current Context
Years later, with networks everywhere, recursive data management is becoming more essential
Datalog is hot again!
• Jim and Suciu [2001]
• Loo, Hellerstein, Stoica, and Ramakrishnan [2005]
• PODS Tutorial 1, Monica Lam et al. [2005]
• This paper: use Datalog for diagnosis of telecommunication systems
Diagnosis of Telecommunication Systems
A telecom system consists of software and hardware pieces distributed over a network
When one piece fails, alarm signals are issued from throughout the network
[Figure: peers exchanging messages and acks; alarms such as "message unprocessed" and "task incomplete"]
Diagnosis of Telecom Systems cont.
Supervisor:
• Collects alarms
• Alarms are asynchronous
• Knows peer behavior pattern
• Goal: determine what could have happened in the global system
Deductive Database Formulation
Extensional data: a sequence of alarms received by the supervisor
Intensional data: the possible execution flows that could have created the alarm sequence
Can the diagnosis problem be stated in terms of query evaluation in deductive databases?
Yes – it can!
Outline
Technical
• Datalog and Query-Sub-Query (QSQ)
• Adapt QSQ to a distributed setting: dQSQ
Application: Distributed Diagnosis of Telecommunication Systems
• Petri Nets and Unfoldings
• Datalog formulation of the diagnosis problem
• Benefits of using dQSQ
Deductive Database
Explicit information
Rules that enable inferences based on the stored data
Explicit information: parent(x,y)
Alice, Nancy
Alice, Joyce
Joyce, Lois
Lois, Mark
Lois, Andy
Joyce, Ruth
Datalog program
anc(x,y) :- parent(x,y)
anc(x,y) :- anc(x,z), parent(z,y)
Corresponding first-order formulas
∀x,y (anc(x,y) ← parent(x,y))
∀x,y,z (anc(x,y) ← anc(x,z), parent(z,y))
Query Evaluation
Query: “Who has Joyce as an ancestor?”
Naive evaluation: materialize everything, then evaluate query
Goal: Compute query with minimal data materialization
q(y) :- anc(“Joyce”,y)
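The naive strategy can be sketched in Python as follows. This is a minimal illustration, assuming the parent facts of the ancestor example; the names `PARENT`, `naive_anc`, `all_anc` and `answer` are hypothetical and do not come from the slides. It materializes the whole anc relation and only then filters for Joyce.

```python
# parent(x, y): x is a parent of y (facts from the ancestor example).
PARENT = {
    ("Alice", "Nancy"), ("Alice", "Joyce"), ("Joyce", "Lois"),
    ("Lois", "Mark"), ("Lois", "Andy"), ("Joyce", "Ruth"),
}

def naive_anc(parent):
    """Materialize all of anc by iterating both rules to a fixpoint."""
    anc = set(parent)  # anc(x,y) :- parent(x,y)
    while True:
        # anc(x,y) :- anc(x,z), parent(z,y)
        derived = {(x, y) for (x, z) in anc for (z2, y) in parent if z == z2}
        if derived <= anc:
            return anc
        anc |= derived

all_anc = naive_anc(PARENT)
# q(y) :- anc("Joyce", y), evaluated only after everything is materialized.
answer = {y for (x, y) in all_anc if x == "Joyce"}
```

On this data the sketch materializes 12 anc tuples, although only the 4 tuples starting at Joyce contribute to the answer.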
Query-Sub-Query (QSQ)
Known techniques for optimization of Datalog queries: magic sets and QSQ
QSQ rewrites the Datalog program according to the given query
Materializes tuples bottom-up
QSQ is based on two main notions:
• Adorned relations
• Supplementary relations
Adorned Relations
A variable in a relation can be “bound” to a constant
For each relation, adorned versions based on the bindings of the variables are considered
anc(“Joyce”,y): the first argument is bound to a constant, the second is free
Adorned Relations
Rewriting using adorned relations (b = bound to a constant, f = free)
anc_bf(x,y) :- parent(x,y)
anc_bf(x,y) :- anc_bf(x,z), parent(z,y)
q(y) :- anc_bf(“Joyce”,y)
Different adornments of the same relation are treated as different relations during the QSQ computation
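How an adornment is derived can be sketched as follows. This is an illustration only: the helper `adorn` and the convention that quoted strings stand for constants are assumptions, not notation from the slides. Each argument maps to b if it is a constant or an already bound variable, and to f otherwise.

```python
def adorn(args, bound_vars):
    """Return the adornment string for an atom.

    args: argument names; quoted strings such as '"Joyce"' are constants.
    bound_vars: variables already bound when the atom is evaluated.
    """
    return "".join(
        "b" if a.startswith('"') or a in bound_vars else "f" for a in args
    )

# Query atom anc("Joyce", y): first argument is a constant, second is free.
query_adornment = adorn(['"Joyce"', "y"], set())

# Body of anc(x,y) :- anc(x,z), parent(z,y), with x bound by the head.
body_anc = adorn(["x", "z"], {"x"})          # x bound, z free
body_parent = adorn(["z", "y"], {"x", "z"})  # z bound after anc(x,z)
```

All three atoms get the adornment bf here, which is why the example needs only one adorned version of anc.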
Supplementary Relations
Supplementary relations accumulate the relevant bindings for each position in the rule
Datalog
anc_bf(x,y) :- parent(x,y)
anc_bf(x,y) :- anc_bf(x,z), parent(z,y)
q(x) :- anc_bf(“Joyce”,x)
Supplementary relations for the first rule: sup_10(x), sup_11(x,y)
Supplementary relations for the second rule: sup_20(x), sup_21(x,z), sup_22(x,y)
QSQ rewriting
in_anc_bf(“Joyce”) :-
sup_10(x) :- in_anc_bf(x)
sup_11(x,y) :- sup_10(x), parent(x,y)
anc_bf(x,y) :- sup_11(x,y)
sup_20(x) :- in_anc_bf(x)
sup_21(x,z) :- sup_20(x), anc_bf(x,z)
sup_22(x,y) :- sup_21(x,z), parent(z,y)
anc_bf(x,y) :- sup_22(x,y)
QSQ Example
Datalog
anc_bf(x,y) :- parent(x,y)
anc_bf(x,y) :- anc_bf(x,z), parent(z,y)
q(y) :- anc_bf(“Joyce”,y)
parent(x,y): (Alice, Nancy), (Alice, Joyce), (Joyce, Lois), (Lois, Mark), (Lois, Andy), (Joyce, Ruth)
Tuples materialized by the QSQ rewriting
sup_10(x): Joyce
sup_11(x,y): (Joyce, Lois), (Joyce, Ruth)
sup_20(x): Joyce
sup_21(x,z): (Joyce, Lois), (Joyce, Ruth), (Joyce, Mark), (Joyce, Andy)
sup_22(x,y): (Joyce, Mark), (Joyce, Andy)
Query result: Lois, Ruth, Mark, Andy
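The QSQ rewriting of the ancestor program can be traced with a small Python sketch. It is a simplified illustration, not the authors' implementation: the function `qsq_anc` is hypothetical, and the input relation in_anc_bf is kept fixed because the recursive subquery reuses the same binding in this example.

```python
# parent(x, y): x is a parent of y (facts from the ancestor example).
PARENT = {
    ("Alice", "Nancy"), ("Alice", "Joyce"), ("Joyce", "Lois"),
    ("Lois", "Mark"), ("Lois", "Andy"), ("Joyce", "Ruth"),
}

def qsq_anc(parent, start):
    """Evaluate the QSQ-rewritten ancestor program bottom-up to a fixpoint."""
    in_anc_bf = {start}      # in_anc_bf("Joyce") :-
    sup_10 = set(in_anc_bf)  # sup_10(x) :- in_anc_bf(x)
    # sup_11(x,y) :- sup_10(x), parent(x,y)
    sup_11 = {(x, y) for (x, y) in parent if x in sup_10}
    anc_bf = set(sup_11)     # anc_bf(x,y) :- sup_11(x,y)
    sup_20 = set(in_anc_bf)  # sup_20(x) :- in_anc_bf(x)
    while True:
        # sup_21(x,z) :- sup_20(x), anc_bf(x,z)
        sup_21 = {(x, z) for (x, z) in anc_bf if x in sup_20}
        # sup_22(x,y) :- sup_21(x,z), parent(z,y)
        sup_22 = {(x, y) for (x, z) in sup_21 for (z2, y) in parent if z == z2}
        if sup_22 <= anc_bf:
            return anc_bf, sup_21, sup_22
        anc_bf |= sup_22     # anc_bf(x,y) :- sup_22(x,y)

anc_bf, sup_21, sup_22 = qsq_anc(PARENT, "Joyce")
```

Only tuples whose first component is Joyce are ever materialized; nothing is derived for Alice or Lois as starting points.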
Nice Properties of QSQ
Compute the correct answer to the query
Materialize only a minimal set of tuples
Guaranteed to terminate
Beyond Datalog
We allow “object creation” (using Skolem functions)
• Crucial for our application
In general, evaluation may not terminate
OK for our context
Previous Work
Distribution in Deductive Databases
• Van Gelder, 1986
• Jim and Suciu, 2001
• Hulin, 1989
Distributed Environment
Centralized Datalog program
r1 r(x,y) :- a(x,y)
r2 r(x,y) :- s(x,z), t(z,y)
r3 s(x,y) :- r(x,y), b(y,z)
r4 t(x,y) :- c(x,y)
Distribution of the program between 3 peers
• Peer R, hosting r, a
• Peer S, hosting s, b
• Peer T, hosting t, c
r1 r@R(x,y) :- a@R(x,y)
r2 r@R(x,y) :- s@S(x,z), t@T(z,y)
r3 s@S(x,y) :- r@R(x,y), b@S(y,z)
r4 t@T(x,y) :- c@T(x,y)
If a relation is maintained at some peer, the rules defining it are known at that peer
Distributed QSQ Rewriting
For each rule: The peer in the head of the rule starts the rewriting
When a remote relation is encountered, the peer delegates the remainder of the rule to the remote peer in charge of that relation
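The delegation order can be traced with a short Python sketch. It only records which peer handles which body atoms and does not produce the rewritten rules; the function `delegation_chain` and the encoding of a rule body as (relation, peer) pairs are assumptions made for illustration.

```python
def delegation_chain(head_peer, body):
    """List (peer, atoms) segments in the order the rewriting visits them.

    body: list of (relation, peer) pairs, in rule order.
    """
    chain = [(head_peer, [])]  # the peer in the head starts the rewriting
    for relation, peer in body:
        if peer != chain[-1][0]:
            # Remote relation: hand the remainder of the rule to its peer.
            chain.append((peer, []))
        chain[-1][1].append(relation)
    return chain

# r2: r@R(x,y) :- s@S(x,z), t@T(z,y)
r2 = delegation_chain("R", [("s", "S"), ("t", "T")])
# r3: s@S(x,y) :- r@R(x,y), b@S(y,z)
r3 = delegation_chain("S", [("r", "R"), ("b", "S")])
```

For r2 the rewriting starts at R, is delegated to S for s, and then to T for t; for r3 it goes from S to R and back to S.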
Nice Properties of dQSQ
Compute the correct answer to the query
Materialize only a minimal set of tuples• As good as QSQ
No need for global knowledge
Need, in general, some standard technique to detect termination
Petri Net Model
[Figure legend: place, marked place, transition, alarm symbol]
Circles denote places
Marked places model the current state of the peer
Squares denote transitions
A transition node can fire iff all its parent nodes are marked
When a transition fires, the current state changes. Children of the transition are marked and parents are unmarked. For example, if transition (i) fires, the marking moves from places 1,7 to places 2,3
When the transition fires, an alarm symbol is reported to the supervisor. In our example, alarm (b) is reported when (i) fires
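The firing rule can be sketched in Python as follows, assuming a marking is represented as a set of place numbers. The functions `enabled` and `fire` are hypothetical names; the example values are those of transition (i) above.

```python
def enabled(marking, parents):
    """A transition can fire iff all its parent places are marked."""
    return parents <= marking

def fire(marking, parents, children, alarm):
    """Fire a transition: unmark parents, mark children, report the alarm."""
    if not enabled(marking, parents):
        raise ValueError("transition is not enabled")
    return (marking - parents) | children, alarm

# Transition (i): parents {1, 7}, children {2, 3}, alarm symbol b.
new_marking, reported = fire(
    frozenset({1, 7}), frozenset({1, 7}), frozenset({2, 3}), "b"
)
```

After the call, the marking has moved from places 1,7 to places 2,3 and alarm b is reported.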
The Diagnosis Problem
The supervisor receives an alarm sequence (b,p1),(a,p2),…,(c,p1)
• a,b,c – alarm symbols
• pi – the peer that emitted the alarm
Due to asynchronous communication
• Alarms sent by different peers may not appear in the order they were emitted
• We can only assume that the order of alarms is kept for each individual peer
Goal: Find an explanation for a given alarm sequence
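The ordering assumption can be made concrete with a small Python sketch: a candidate emission order is compatible with the received sequence iff both agree on the order of alarms for every individual peer. The helper names and the three-alarm sequence below are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def per_peer_order(sequence):
    """Project a sequence of (alarm, peer) pairs onto each peer."""
    order = defaultdict(list)
    for alarm, peer in sequence:
        order[peer].append(alarm)
    return dict(order)

def compatible(observed, candidate):
    """True iff both sequences agree on the alarm order of every peer."""
    return per_peer_order(observed) == per_peer_order(candidate)

observed = [("b", "p1"), ("a", "p2"), ("c", "p1")]
# p2's alarm may have been emitted first: still compatible.
reordered_across_peers = [("a", "p2"), ("b", "p1"), ("c", "p1")]
# p1's own alarms are swapped: not compatible.
reordered_within_peer = [("c", "p1"), ("b", "p1"), ("a", "p2")]
```

Only the relative order of alarms from the same peer constrains the explanation; interleavings across peers are left open.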
Unfolding Model
Unfoldings represent all possible sequences of transition firings
The set of shaded nodes in the unfolding is a diagnosis for the alarm sequence (b;p1),(c;p1)
The nodes circled in red are another diagnosis for the alarm sequence (b;p1),(c;p1)
Purple node: not useful in explaining the alarm sequence (b;p1),(c;p1)
QSQ goal: eliminate unnecessary portions of the unfolding
[Figure: the Petri net and an unfolding of the Petri net]
Relations for Unfolding
causal(x,y) relation: the transition x was fired, and this eventually led to the firing of node y
conflict(x,y) relation: transitions x and y cannot coexist (i.e., it is not possible for both x and y to have occurred)
[Figure: an unfolding of the Petri net]
Constructing the Unfolding with Datalog
The conflict and causal relations capture the information needed to create the unfolding.
The causal relation is similar to the ancestor example
Formulating the conflict relation in Datalog (without negation) was a significant technical challenge: see paper for details
Diagnosis of an alarm sequence using Datalog
Describe unfoldings in distributed Datalog intensionally
Describe the alarm sequence in distributed Datalog extensionally
Alarm sequence (b;p1),(c;p1):
alarmSeq@s(a1,b,p1,root)
alarmSeq@s(a2,c,p1,a1)
Describe the query in distributed Datalog
q@s(z,x) :- seqOut@p1(z,a2), transInSeq@p1(z,x)
Query result (z,x): (A, i), (A, iii), (B, ii), (B, iii)
The Benefits
We have stated the diagnosis problem using datalog – so what?
Three major benefits:
1. Optimized distributed computation using dQSQ
2. Can solve more general diagnosis problems
3. Implementation language
Benefit 1: Efficiency of dQSQ
Minimal amount of unfolding materialized
Theorem: dQSQ achieves an optimization as good as that previously provided by the dedicated diagnosis algorithms [BFHJ03, BFHJ04]
Benefit 1 continued: Distributed Computation
dQSQ enables distributed computation
• The dQSQ rewriting is performed locally at each peer, without any global knowledge
• Limited communication: a peer is guaranteed to need to communicate only with its neighbours in the Petri net
• Diagnosis occurs without any global knowledge of the overall net structure
Benefit 2: Problem Generalizations
Hidden transitions: not all alarms reported to the supervisor
Alarm patterns: alarm patterns described by some regular language (e.g., ab*)
Constraints on the configurations of interest: alarm sequences not containing some known pattern
Issues with termination
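An alarm pattern such as ab* can be illustrated with Python's standard `re` module. This sketch only checks whether a given alarm sequence belongs to the regular language; it assumes single-character alarm symbols, and the function `matches_pattern` is a hypothetical name. It is not the Datalog encoding used for diagnosis.

```python
import re

# Pattern from the example: one a followed by any number of b's.
PATTERN = re.compile(r"ab*")

def matches_pattern(alarms):
    """True iff the concatenated alarm symbols form a word of the language."""
    return PATTERN.fullmatch("".join(alarms)) is not None

in_language = matches_pattern(["a", "b", "b"])
not_in_language = matches_pattern(["b", "a"])
```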
Benefit 3: Active XML (AXML)
AXML = XML with embedded calls to Web services [INRIA]
Implementation of dQSQ using AXML [Noam Pettel, Tel Aviv]
• An AXML document contains both extensional and intensional data
• Use of continuous services
Optimization of a fragment of AXML
• The original motivation for dQSQ
• Extended to “trees” – not in the paper
Conclusion
Datalog strikes back: relevant to current P2P systems
Contribution
• Distributed QSQ
• An application to network diagnosis
Future work
• Optimization and analysis (termination, confluence) of AXML and, more generally, P2P data management