Post on 19-Jun-2015
description
A Hybrid Diagnosis Approach Combining Black-‐Box and White-‐Box Reasoning
Mingmin Chen1 Shizhuo Yu1 Nico Franz2 Shawn Bowers3 Bertram Ludäscher1 4
1 Department of Computer Science , University of California, Davis 2 School of Life Sciences, Arizona State University
3 Department of Computer Science, Gonzaga University 4 GSLIS & NCSA, University of Illinois at Urbana-‐Champaign
Outline • The Taxonomy Alignment Problem
T1 + T2 + A => T3 (ambiguous .. unique .. inconsistent)
• Model-‐based Diagnosis [Reiter’87] – Black-‐box
• Hybrid Approach – Black-‐box & White-‐box combined
• Benchmark Results
Hybrid Diagnosis in Euler @ RuleML2014, Prague 2
Meet Prof. Nico Franz: Curator of Insects @ ASU
Hybrid Diagnosis in Euler @ RuleML2014, Prague 3
What Nico et al. do for a living …
Hybrid Diagnosis in Euler @ RuleML2014, Prague 4
What Nico does for a living (cont’d): The Indoors Part
• First: go fun places, find new bugs, study them … – “Bugs-‐R-‐Us” (see taxonbytes.org)
• Now: Compare, align and revise taxonomies, based on careful observafon, “character” data, experfse …
• Formally: – Input: T1 + T2 (taxonomies) + A (expert ar.cula.ons)
– Output: revised, “merged” taxonomy (-‐ies) T3
Hybrid Diagnosis in Euler @ RuleML2014, Prague 5
• Given: – Taxonomies T1 , T2
• incl. constraints (coverage, disjointness) – Set of articulations (alignment) A
• Find: – Combined (“merged”) taxonomy T3 (= T1 + T2 + A)
• Is it a taxonomy? Or a DAG? – Optional:
• Final alignment (should be minimal)
6
Taxonomy Alignment Problem (TAP)
T1
T2
A T3
TAP: Possible Outcomes
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
Hybrid Diagnosis in Euler @ RuleML2014, Prague 7
TAP: Possible Outcomes
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! è Diagnosis (Reiter) = Black-‐Box Provenance
Hybrid Diagnosis in Euler @ RuleML2014, Prague 8
TAP: Possible Outcomes
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! è Diagnosis (Reiter) = Black-‐Box Provenance
1.b2.e
1.c
1.a2.d
2.f
Ambiguous! è Mul5ple Possible Worlds
Hybrid Diagnosis in Euler @ RuleML2014, Prague 9
TAP: Possible Outcomes
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! è Diagnosis (Reiter) = Black-‐Box Provenance
1.b2.e
1.c
1.a2.d
2.f
Ambiguous! è Mul5ple Possible Worlds
1.c2.f
1.b
1.a2.d
2.e
Hybrid Diagnosis in Euler @ RuleML2014, Prague 10
TAP: Possible Outcomes
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! è Diagnosis (Reiter) = Black-‐Box Provenance
1.b2.e
1.c
1.a2.d
2.f
Ambiguous! è Mul5ple Possible Worlds
1.c2.f
1.b
1.a2.d
2.e
1.b1.a
2.e
2.d1.c
2.fHybrid Diagnosis in Euler @ RuleML2014, Prague 11
• FO reasoning about taxonomies (MFOL)
• Earlier: CleanTax – Prover9/Mace4
• Now: Euler – ASP Reasoners (DLV,
Clingo) – Specialized reasoners
(PyRCC) – … – X = ASP, RCC, …
Euler/X Toolkit and Workflow
Hybrid Diagnosis in Euler @ RuleML2014, Prague 12
Real-‐world examples: Turn this …
Hybrid Diagnosis in Euler @ RuleML2014, Prague 13
… into this! (Perellescus Alignment Result)
• T3 := T1 and T2 are “merged” – Blue dashed: overlaps è resolve via “zoom-in view”
14
1.16
1.14
2.40
2.44
2.47
1.11
2.382.35
1.20
1.23
2.52
2.53
2.54
1.172.41
1.222.46
1.252.48
1.122.36
1.262.49
1.132.37
1.182.42
1.192.43
1.152.39
1.212.45
1.12L2.36L
1.272.50
1.242.51
Nodes
Taxonomy 1 5Taxonomy 2 8
MERGED Taxa 13 Edges
Overlaps 10Input 24
INFERRED 5
Possible Outcome: Inconsistency!
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
{A1, A2, A3, A4}
{A1, A2, A3} {A1, A2, A4} {A1, A3, A4} {A2, A3, A4}
{A1, A2} {A1, A3} {A2, A3} {A1, A4} {A2, A4} {A3, A4}
{A1} {A2} {A3} {A4}
{ }
Inconsistent! è Diagnosis (Reiter) = Black-‐Box Provenance
• Need to debug the input arfculafons è (black-‐box) diagnosis!
• Focus: – How do we efficiently compute the diagnosfc lance?
• Also: – How to visualize..
Hybrid Diagnosis in Euler @ RuleML2014, Prague 15
Example Instance (from synthefc benchmark suite)
• Here: N = 10 taxa in T1, T2 • Euler/X finds:
inconsistent! • è diagnos\c la]ce of 210
= 1024 nodes è Find minimal inconsistent
subset (MIS) è maximal consistent subset
(MCS) .. è show to user!
Hybrid Diagnosis in Euler @ RuleML2014, Prague 16
Visualizing Diagnoses: Scalability Issues
Hybrid Diagnosis in Euler @ RuleML2014, Prague 17
N = 10 arfculafons è 210 = 1024 node diagnosfc lance (… ouch!) … but only one MIS …
Beaer Idea: Just show MIS, MCS
Hybrid Diagnosis in Euler @ RuleML2014, Prague 18
N = 4 arfculafons è 24 = 16 node diagnosfc lance, but 3 MCS and 2 MIS are enough!
.. but 4 MCS and 1 MIC tell it all!
Hybrid Diagnosis in Euler @ RuleML2014, Prague 19
1024 node lance
Visualizing Diagnoses: Focusing on MIS (and MCS)
Visualizing Diagnoses
Hybrid Diagnosis in Euler @ RuleML2014, Prague 20
Example from paper: N=12 è 4096 nodes .. but 7 MCS and 5 MIC tell it all!
Black-Box Inconsistency Analysis (Diagnostic Lattice)
• Then: – Repair: find & revise minimal inconsistent subsets (Min-Incons) – Expand: find maximal consistent subsets (Max-Cons) & revise outs
What happens if you can’t have all (here: 4) articulations together?
21
• Black-‐box Analysis (Hinng Set algo.) yields a Diagnosis (lance) – for n=4 arfculafons, there are 168 possible diagnoses – depending on expected “red/green areas” è explore space differently
• |arfculafons| = n è |possible diagnoses| = |monotonic Boolean funcfons| = Dedekind Number (n): 2, 3, 6, 20, 168, 7581, 7828354, ...
Inconsistency Analysis (Diagnostic Lattice)
• The Min-Incons (MIS) and Max-Cons (MCS) sets determine all others
è Repair MIS and/or Expand MCS
22
Improving Diagnosis
• Reiter’s “black-‐box” (model-‐based) diagnosis helps debug the arfculafons
• Limited scalability (inherent complexity) • But every bit helps:
– Hinng Set Algorithm (“logarithmic extracfon”)
• Our idea: – Exploit “white-‐box” reasoning informafon è RULES to the rescue
Hybrid Diagnosis in Euler @ RuleML2014, Prague 23
Key Idea: exploit white-‐box info • We use Answer Set Programming (ASP) to solve Taxonomy Alignment Problem (TAP)
• Inconsistency = “False” is derived in the head: False :-‐ <denial of integrity constraint>
• Apply provenance trick from databases J – What arfculafons contribute to a derivafon of “False” ?
– Eliminate those that don’t! è an example of reusing inferences across separate black-‐box tests!
Hybrid Diagnosis in Euler @ RuleML2014, Prague 24
The Provenance “Trick”
Hybrid Diagnosis in Euler @ RuleML2014, Prague 25
Hybrid Provenance
A3: c < f Black-‐box Provenance
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
Hybrid Diagnosis in Euler @ RuleML2014, Prague 26
Hybrid Provenance
A3: c < f Black-‐box Provenance
r7: d = e ∪ f
a = e ∪ f
A1: a = d
r3: a = b ∪ c
f < c
r4: b ∩ c = ∅ r8: e ∩ f = ∅ A2: b < e
A1+A2 + … => f < c
White-‐box Provenance
1.a 1.bisa
1.cisa
2.d
=
2.e<
<
2.f<isa
isa
Input Alignment
Hybrid Diagnosis in Euler @ RuleML2014, Prague 27
The Hybrid Approach
Hybrid Diagnosis in Euler @ RuleML2014, Prague 28
Hybrid Approach
Hybrid Diagnosis in Euler @ RuleML2014, Prague 29
What ar5cula5ons contribute to some inconsistency?
Good old black-‐box (HST)
Benchmark Results
• White-‐box < Hybrid < Black-‐box (runfmes) • Note: white-‐box does not give you a diagnosis • Potassco < DLV
Hybrid Diagnosis in Euler @ RuleML2014, Prague 30
Benchmark DLV
• White-‐box < Hybrid < Black-‐box (runfmes) • Potassco < DLV
Hybrid Diagnosis in Euler @ RuleML2014, Prague 31
Benchmark Clingo
• White-‐box < Hybrid < Black-‐box (runfmes) • Potassco < DLV
Hybrid Diagnosis in Euler @ RuleML2014, Prague 32
Conclusions
• ASP rules can be used to efficiently solve real-‐world taxonomy reasoning problems
• Reiter’s diagnosis useful to debug inconsistent alignments
• Adding a “white-‐box” provenance approach speeds up state-‐of-‐the-‐art HST algorithm by elimina\ng independent ar\cula\ons
• Future work: – Further improvements, including parallelism:
• Trade-‐off with sharing inferences across parallel instances
Hybrid Diagnosis in Euler @ RuleML2014, Prague 33