Slider: an Efficient Incremental Reasoner, by Jules Chevalier
-
Upload
opencloudware -
Category
Internet
-
view
71 -
download
0
Transcript of Slider: an Efficient Incremental Reasoner, by Jules Chevalier
Slider: an Efficient Incremental Reasoner
Jules [email protected]
LT2C, Télécom Saint Etienne, Université Jean Monnet
Octobre 2014
Supervisors :Fréférique LaforestChristophe GravierJulien Subercaze
Semantic Web & Description Logic
I Formalises concepts to represent themI Standardizes this representationI Makes it readable for both humans and computersI Links these data togetherI Allows automatic operations on these data
I Integrity constraint validationI Query the knowledge baseI Extraction of implicit data
3 / 27
Semantic Web & Description Logic
I Formalises concepts to represent themI Standardizes this representationI Makes it readable for both humans and computersI Links these data togetherI Allows automatic operations on these data
I Integrity constraint validationI Query the knowledge baseI Extraction of implicit data = Reasoning
4 / 27
Reasoning : Forward Chaining VS Backward Chaining
Abraham
Homer Marge
Liza Bart
I What we know :I Abraham father HomerI Homer father LizaI Homer father BartI Marge mother LizaI Marge mother bart
I What Forward Chaining do :I Abraham grandfather LizaI Abraham grandfather BartI ...I Abraham grandfather Liza ? → yes
I What Backward Chaining do :I Abraham grandfather Liza ?I Abraham father X & X father Liza ?I Abraham father Homer &
Homer father Liza → yes
5 / 27
Reasoning : Forward Chaining VS Backward Chaining
Abraham
Homer Marge
Liza Bart
I What we know :I Abraham father HomerI Homer father LizaI Homer father BartI Marge mother LizaI Marge mother bart
I What Forward Chaining do :I Abraham grandfather LizaI Abraham grandfather BartI ...I Abraham grandfather Liza ? → yes
I What Backward Chaining do :I Abraham grandfather Liza ?I Abraham father X & X father Liza ?I Abraham father Homer &
Homer father Liza → yes
5 / 27
Reasoning : Forward Chaining VS Backward Chaining
Abraham
Homer Marge
Liza Bart
I What we know :I Abraham father HomerI Homer father LizaI Homer father BartI Marge mother LizaI Marge mother bart
I What Forward Chaining do :I Abraham grandfather LizaI Abraham grandfather BartI ...I Abraham grandfather Liza ? → yes
I What Backward Chaining do :I Abraham grandfather Liza ?I Abraham father X & X father Liza ?I Abraham father Homer &
Homer father Liza → yes5 / 27
Rule-based ReasoningRules
I An antecedent: Allows the rule to be executedI A consequent: The statement inferred
c1 subClassOf c2, x type c1 (cax-sco)x type c2
FragmentsI A fragment is a set of inference rulesI Semantic Web standards suggest different pre defined
fragments (RDFS, OWL Lite, OWL Full, OWL DL, ...)I The more they have a high expressivity, the more the
operations are complex (from P to NEXPTIME)I Choosing one fragment is trade off between expressivity and
computational complexity
6 / 27
Problematic
What we want to doI Efficient forward-reasoning over Big Data (NP complete for
most complex ontologies)
What are the problemsI Rules form a cyclic graph
I Complexity depends on the fragment !I The amount of triples generated is quite unpredictable
I The complexity also depends on data !I Big Data is not static
I We need to handle data streams !
7 / 27
Problematic
What we want to doI Efficient forward-reasoning over Big Data (NP complete for
most complex ontologies)
What are the problemsI Rules form a cyclic graph
I Complexity depends on the fragment !I The amount of triples generated is quite unpredictable
I The complexity also depends on data !I Big Data is not static
I We need to handle data streams !
7 / 27
Batch reasoning approachesWebPie : a Web-scale Parallel Inference Engine
I 2009 - Jacopo Urbani Thesis [6]I Uses MapReduce for OWL Horst and RDFS reasoning
I 2011 - Fix some issues to improve OWL Horst reasoning [7]I Duplicates limitationI Indexation for sameAsI Greedy schedulingI Cleaner Job after some rules, or at the end
MapResolve [5]
I Based their work on WebPie to develop a OWL Horst reasonerI Use 3 sets for triples : usable, used, inferredI Limits overheads, optimiseI Points out MapReduce limitations
9 / 27
Analysis : MapReduce approaches
MapReduceFramework
I Allows to implement distributedtasks
I The Hadoop frameworkI Best suited to batch process huge
amounts of data
I MapReduce requires an acyclicdataflow
I Jobs run in isolationI Not suitable network shufflingI Hadoop distributed file system
WebPie and MapResolveContributions
I Only provide batch reasoningI Nodes must wait for each otherI Generate a lot of duplicatesI Fragment dependantI Naive partitioning
I Critical letter for WebPie [4]
10 / 27
Incremental solutionsHistory Matters: Incremental Ontology Reasoning UsingModules [2]
I Maintains classification of ontologies as they evolveI Provides encouraging resultsI Not viable for static hierarchy of ontologiesI Not adapted on high number of nominals
Incremental Reasoning in OWL EL without Bookkeeping [3]
I Handles both addition and deletion of knowledgeI Incremental classification of TBoxI Limited to the classification on the TBoxI Dedicated to the EL+ fragment
11 / 27
Proposed solution
SliderI Parallel and Scalable Execution
I Rules mapped to independent modulesI Multiple rule instances allowed to run in parallel
I Duplicates LimitationI Shared triple storeI Vertical partitioning [1] and multiple indexing
I Data Stream SupportI Streamed architectureI Parallel parsing/reasoning
I Fragment’s CustomizationI Dynamic support of rulesetI ρdf and RDFS natively supportedI Extendible to any other fragment
13 / 27
Architecture
TRIPLE STORE
EvolvingData
Newtriples
Explicit Triples
Implicit Triples
Streamed Triples
R2
Buffer R3Distributor R3
Distributor R2
Distributor R1
Buffer R2
Buffer R1
R2R1
R2
R1R3
Input Manager Rules Buffers Thread Pool Distributors
R1
R1
R2
R2
R3
R3
Incomingtriples
Concurrent Access
Rule Modules
InputManager
R1
R2
R3
R3
R1
R2
14 / 27
Architecture
Input ManagerI Receives incoming triplesI Sends them to
I The triple storeI The rules buffers
Rules BuffersI A buffer for each ruleI Run the rule when fullI Run the rule when
timed-outI Ensures completeness
Thread PoolI Manages a pool instancesI Ensures scalability
Rule instanceI Execute the inferenceI Access concurrently the
triple store
DistributorI Stores inferred triplesI Dispatches them to the
buffers
15 / 27
Triple Store
Vertical Partitioning
21
34
57
689
(1,2,3)(4,2,5)(6,7,8)(6,7,9)
TRIPLES ENCODING
Near-optimal indexingI Indexing by predicates, subjects
and objectsI Best trade-off for nearly all rules
from the OWL fragments
Concurrent AccessI ReentrantReadWriteLocks
ensure concurrencyI Write lock to add triplesI Read lock for other methods
Duplicates EliminationI HashMap of MultiMaps∗
I Bans duplicatesI Ensures uniqueness of triples
∗Google’s Guava libraries
17 / 27
Rules Dependency Graph
I Directed graphI Edges represent rulesI A → B: B can use the output
of A
I Created at initialisation timeI Used to route new triples by
I The input managerI The distributors
PRP-DOM
CAX-SCO
PRP-RNG
PRP-SPO1
SCM-SCO
SCM-DOM2
SCM-RNG2
SCM-SPO
Universal Input
Rules Dependency Graph for ρdf
18 / 27
Architecture
TRIPLE STORE
EvolvingData
Newtriples
Explicit Triples
Implicit Triples
Streamed Triples
R2
Buffer R3Distributor R3
Distributor R2
Distributor R1
Buffer R2
Buffer R1
R2R1
R2
R1R3
Input Manager Rules Buffers Thread Pool Distributors
R1
R1
R2
R2
R3
R3
Incomingtriples
Concurrent Access
Rule Modules
InputManager
R1
R2
R3
R3
R1
R2
19 / 27
Experimentations
BaselineI OWLIM-SE (Standard Edition)I Semantic repository with reasoning featuresI Fastest reasoner available to the best of our knowledgeI Outperforms Jena and SesameI Natively supports RDFS, custom rule configuration for ρdf
DatasetI 13 ontologies from 3 sets:
I 2 Real life ontologies: WordNet and WikipediaI 5 generated by BSBM, from 100,000 to 5 million triplesI 6 subClassOf ontologies (closure computation, duplicates
intensive)
21 / 27
Experiments
ρdf reasoning RDFS reasoningOntology OWLIM Slider OWLIM SliderBSBM_100k 9.907s 4.636s 7.487s 4.558sBSBM_200k 13.338s 6.059s 11.064s 6.198sBSBM_500k 23.595s 11.133s 20.580s 10.984sBSBM_1M 39.364s 22.357s 35.602s 22.192sBSBM_5M 170.151s 126.292s 160.699s 127.037swikipedia 18.802s 17.422s 17.186s 22.443swordnet - - 15.075s 8.828ssubClassOf10 3.507s 1.209s 1.423s 1.216ssubClassOf20 3.730s 1.316s 1.536s 1.330ssubClassOf50 4.159s 1.615s 1.865s 1.583ssubClassOf100 4.397s 1.827s 2.242s 1.805ssubClassOf200 4.962s 2.210s 2.837s 2.170ssubClassOf500 9.862s 8.102s 7.584s 7.625s
ImprovementI Average 71.47%I RDFS 36.08%I ρdf 106.86%
RDFS
owlimse
slider
5
10
15
20
25
30
35
40
Infe
rence
Tim
e(i
n s
eco
nds)
ρdf
51015202530354045
BSBM
_100
k
BSBM
_200
k
BSBM
_500
k
BSBM
_1M
wikip
edia
wordn
et
subC
lass
Of10
subC
lass
Of20
subC
lass
Of50
subC
lass
Of100
subC
lass
Of200
subC
lass
Of500
owlimse
slider
Infe
rence
Tim
e(i
n s
eco
nds)
Inference time for Slider and OWLIM-SE on ρdf and RDFS
22 / 27
Conclusion and Future Work
SliderI Efficient incremental rule-based reasoningI Fragment agnocismI Data streams supportI Improvement of 71.47% in average against baseline
Future WorkI Implementation of new rulesetsI Just-in-time optimisation of rules schedulingI Use of historical statistics for adaptation
25 / 27
Bibliography I
[1] Abadi, D. J., Marcus, A., Madden, S. R., and Hollenbach, K.Scalable semantic web data management using vertical partitioning.In Proceedings of the 33rd International Conference on Very Large Data Bases (2007), VLDB ’07, VLDBEndowment, pp. 411–422.
[2] Cuenca Grau, B., Halaschek-Wiener, C., and Kazakov, Y.History matters: Incremental ontology reasoning using modules.In The Semantic Web. 2007.
[3] Kazakov, Y., and Klinov, P.Incremental reasoning in owl el without bookkeeping.In ISWC 2013. 2013.
[4] Patel-Schneider, P.Letter: Comments on WebPIE: A Web-scale parallel inference engine using MapReduce.Web Semantics: Science, Services and Agents on . . . 15 (Sept. 2012), 69–70.
[5] Schlicht, A., and Stuckenschmidt, H.Mapresolve.Web Reasoning and Rule Systems 6902 (2011), 294–299.
[6] Urbani, J.RDFS/OWL reasoning using the MapReduce framework.PhD thesis, Vrije Universiteit - Faculty of Sciences, 2009.
[7] Urbani, J., Kotoulas, S., Maassen, J., van Harmelen, F., and Bal, H. E.WebPIE: A Web-scale parallel inference engine using MapReduce.J. Web Sem. 10 (2012), 59–75.
26 / 27
Slider: an Efficient Incremental Reasoner
Thank you for your [email protected]
https://github.com/juleschevalier/sliderhttp://demo-satin.telecom-st-etienne.fr/slider/
27 / 27