User-centric pattern mining on knowledge graphs: An ...frankh/postscript/JWS2019.pdf ·...

Accepted Manuscript

User-centric pattern mining on knowledge graphs: An archaeological casestudy

W.X. Wilcke, V. de Boer, M.T.M. de Kleijn, F.A.H. van Harmelen,H.J. Scholten

PII: S1570-8268(18)30066-0DOI: https://doi.org/10.1016/j.websem.2018.12.004Reference: WEBSEM 486

To appear in: Web Semantics: Science, Services and Agents onthe World Wide Web

Received date : 1 December 2017Revised date : 27 July 2018Accepted date : 8 December 2018

Please cite this article as: W.X. Wilcke, V. de Boer, M.T.M.d. Kleijn et al., User-centric patternmining on knowledge graphs: An archaeological case study, Web Semantics: Science, Services andAgents on the World Wide Web (2018), https://doi.org/10.1016/j.websem.2018.12.004

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service toour customers we are providing this early version of the manuscript. The manuscript will undergocopyediting, typesetting, and review of the resulting proof before it is published in its final form.Please note that during the production process errors may be discovered which could affect thecontent, and all legal disclaimers that apply to the journal pertain.

https://doi.org/10.1016/j.websem.2018.12.004

User-Centric Pattern Mining on Knowledge Graphs: an ArchaeologicalCase Study

W.X. Wilckea,b,∗, V. de Boerb, M.T.M. de Kleijna, F.A.H. van Harmelenb, H.J. Scholtena

aDepartment of Spatial Economics, Vrije Universiteit, Amsterdam, The NetherlandsbDepartment of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands

Abstract

In recent years, there has been a growing interest from the digital humanities in knowledge graphs asdata modelling paradigm. Already, this has led to the creation of many such knowledge graphs, many ofwhich are now available as part of the Linked Open Data cloud. This presents new opportunities for datamining. In this work, we develop, implement, and evaluate (both data-driven and user-driven) an end-to-end pipeline for user-centric pattern mining on knowledge graphs in the humanities. This pipeline combinesconstrained generalized association rule mining with natural language output and facet rule browsing to allowfor transparency and interpretability—two key domain requirements. Experiments in the archaeologicaldomain show that domain experts were positively surprised by the range of patterns that were discoveredand were overall optimistic about the future potential of this approach.

Keywords: User-Centric, Pattern Mining, Generalized Association Rules, Knowledge Graphs, DigitalHumanities, Archaeology

1. Introduction

Digital humanities communities have shown agrowing interest in the knowledge graph as a datamodelling paradigm [1]. Already, this interest hasinspired several large-scale international projects—amongst which are Europeana1, CARARE2, andARIADNE3—to actively explore the creation andpublication of knowledge graphs in their respectivedomains. These knowledge graphs, and many oth-ers like them, have been made available as part ofthe Linked Open Data (LOD) cloud—a vast and in-ternationally distributed network of heterogeneousknowledge—bringing large amounts of structureddata within arm’s reach of humanities researchers,who are now looking for ways to analyse this wealth

∗Corresponding authorEmail addresses: [email protected] (W.X. Wilcke),

[email protected] (V. de Boer), [email protected](M.T.M. de Kleijn), [email protected](F.A.H. van Harmelen), [email protected](H.J. Scholten)

1See www.europeana.eu2See www.carare.eu3See www.ariadne-infrastructure.eu

of knowledge. This presents new opportunities fordata mining [2].

Data mining is the process of identifying valid,novel, potentially useful, and ultimately under-standable patterns in data [3]. These patterns de-scribe regularities in a dataset which can help re-searchers gain more insight into their data. Re-searchers can then use this insight as a startingpoint to form new research hypotheses, as sup-port for existing ones, or simply to get a betterunderstanding of their data [4]. This entire pro-cess can take weeks or even months of hard workin the traditional setting. However, by incorporat-ing data mining into the workflow, much of thistime can be saved through the automatic discoveryof potentially-relevant patterns. This makes datamining interesting as a support tool for humanitiesresearchers.

Of course, the idea of using data mining as a sup-port tool in the humanities is, in itself, not novel.There have been various attempts before, for in-stance to classify coins [5] or to cluster cultural her-itage [6]. However, the majority of these studies in-volved mining unstructured data, most commonlyin the form of text mining, whereas mining struc-

Preprint submitted to Elsevier July 27, 2018

*ManuscriptClick here to view linked References

tured data has thus far been largely limited to tab-ular data and tailored to specific use cases. Withthe growing popularity of knowledge graphs in thehumanities, mining patterns from these structuresbecomes ever more important to researchers in thisdomain.

This work present the MINing On Semanticspipeline (MINOS) for pattern mining on knowledgegraphs in the humanities. Its aim is to support do-main experts in their analyses of such knowledgegraphs by helping them discover useful and inter-esting patterns in their data. To this end, the MI-NOS pipeline places users in the centre by lettingthem guide the mining process towards their topicsof interest and by letting them focus the results viaa facet pattern browser.

Under the hood, MINOS employs generalized as-sociation rule mining (ARM). An association ruleis an implication of the form X =⇒ y, where thepresence of a set of items X implies the presence ofanother item y. These implications are learned byiterating over a dataset of examples, called trans-actions. Generalized ARM works largely the same,except that the antecedent X holds the item classesrather then the items themselves.

This method was specifically chosen to help over-come two key issues with technological acceptancein the humanities, namely transparency and inter-pretability [7, 8]. With transparency, we refer tothe ease with which a method and its underlyingtheory can be understood: a black box method, forexample, is less transparent than a glass box one.With interpretability, we mean how easily one caninterpret the results of a method: it is, for instance,typically more difficult to interpret an n-order ten-sor than it is to interpret a set of symbolic state-ments.

Generalized ARM satisfies both of these con-straints: it employs basic statistical know-how toproduce human-readable rules in an overall deter-ministic process. A limited background in statis-tics, which most humanities researchers possess,therefore already suffices to understand how theserules map back to the input data and to checkwhether they are valid. This allows humanities re-searchers to put their trust in both the method andits results [9].

Of course, this trust is only gained if the pro-duced rules provide useful and interesting patternswhich can help these researchers to get a better un-derstanding of their data. We call this the effective-ness of the approach. To assess this effectiveness

we conducted experiments in the archaeological do-main, specifically on data from various excavations,during which domain experts were asked to evalu-ate a set of candidate rules on interestingness.

By placing domain experts at the centre of boththe pipeline and its evaluation, as opposed to datascientists, we contribute to an as yet largely unex-plored niche in this intersecting field of data min-ing, knowledge graphs, and humanities. Concretely,our main contributions are 1) insight into someof the challenges and possible solutions for intro-ducing data science tools to the humanities, 2) apipeline design for pattern mining on knowledgegraphs which is tailored to domain experts ratherthan to data scientists, and 3) a user-driven evalu-ation of our design choices instead of only a data-driven one.

With these contributions, our research aims toadd to the interdisciplinary field of the Digital Hu-manities. For this reason, we will refrain from de-veloping an ARM algorithm from scratch, but in-stead focus on how we can augment such an algo-rithm with complementary components to make itinto an effective tool for Humanities researchers.

A concise overview of related work is given next,followed by an overview of the pipeline, the dataset,and the experimental setup. This paper then dis-cusses the results from the user-driven evaluation,and concludes with a reflection on the chosen ap-proach in light of these results.

2. Related Work

Studies on data mining in the humanities havethus far largely focussed on unstructured data (textmining), whereas data mining on semi-structuredor structured data has been explored less fre-quently [4, 10]. An example of the latter kind isdiscussed in [11], in which the authors propose min-ing association rules from excavation data—sites astransactions, artefacts as items—using the provenApriori algorithm. This task is similar to that de-scribed in this work, but it is executed on a rela-tional database rather than a graph-shaped knowl-edge base.

Rule mining on (graph-shaped) knowledge basescan take many different forms. Initial efforts pri-marily focused on Inductive Logic Programming(ILP), due to its natural fit to logic-based sys-tems [12]. Performance issues led to the develop-ment of derivatives. Most well-known is arguablyAMIE [13], whose main difference is its use of the

2

Partial Completeness Assumption to cope with thelack of negative examples. In either case however,the rule-generation process is more complex thanthat of traditional ARM, making it less transpar-ent for non-experts and thus a less suitable choicefor the humanities domain.

A handful of studies have focussed on applyingARM algorithms to knowledge graphs. The moststraightforward approach simply converts all triplesto transactions (as used in traditional ARM) andthen feeds these to the Apriori algorithm [14]. Thishas the downside however, of 1) forcing relationaldata into an unnatural shape and potentially losinginformation in the process, and 2) limiting the pos-sible exploitation of both relations (via the graph’sstructure) and semantics [15]. Fortunately, thesecaveats can largely be avoided by using an ARMalgorithm that is specifically tailored to knowledgegraphs.

Several of these tailored ARM algorithms exist,for instance SWApriori; an adaptation of the com-mon Apriori algorithm to knowledge graphs [16].Its main selling points are its ability to discover pat-terns which span over multiple triples, i.e., a path,rather than over just a single one, and that it canmine multi-relational patterns. However, all seman-tic information is disregarded early on for efficiencyreasons, hence reducing the dataset to a directedgraph.

An alternative that does address this informa-tion is SWARM [17], which exploits RDF and RDFS

semantics to generalize patterns. Hereto, SWARMuses the rdf:type relation to infer the classes of ev-ery resource (item) in a set X, and then computesan inheritance tree for all resources in X using therdfs:subClassOf relation. Resources in the samebranch are grouped under their top-most commonclass, and are therefore said to share the same pat-terns.

SWARM’s ability to exploit common semanticsfor generalization is unique amongst ARM algo-rithms for knowledge graphs4. For this very reason,we have chosen SWARM for our implementation ofthe MINOS pipeline (see Section 3).

Other alternatives are discussed in [19] (aligninglocations), [20] (constructing ontologies), and [21](mapping categories), but these are designed to fita specific task and are therefore less applicable tothis work.

4Note that ARM algorithms that exploit general seman-tics as background knowledge do exist, for example [18].

Common amongst all these algorithm however,is that they treat resources as items—the things weare trying to find implications between. Anothercommonality is the focus on the graph’s structure—its vertices and edges—whereas the values of its lit-erals are left unaddressed. Therefore, similar valuesare still treated as completely different items.

Moving from the level of ARM to that of thepipeline, we can draw parallels between the ap-proach presented by Nebot&Berlanga [22, 23] andthe MINOS pipeline introduced in this work. Sim-ilar to MINOS, the scope of the mining processis tailored to the users’ interests provided via auser-defined pattern. However, where our pipelineencodes patterns as triples with optional unboundvariables, Nebot&Berlanga ask users to construct aformal SPARQL query using an extended grammar.While this offers more flexibility than our approach,it also increases the difficulty of entering such pat-terns for anyone not familiar with SPARQL.

Parallels can also be drawn with the approachpresented in [24, 25]. Here, the authors performdimension reduction by eliminating duplicate itemsets prior to mining, and by removing unwantedcandidate rules afterwards. This latter step is,again, similar to the data-driven filter used in theMINOS pipeline, whereas the former step is implic-itly dealt with by the uniqueness property of URIs.A final similarity between both pipelines is the useof generalized association rules.

A last but nevertheless important distinctionconcerns the evaluation of candidate rules: noneof the cited work so far has gone beyond a data-driven (objective) evaluation, despite its well knownlimitations for assessing the interestingness of pat-terns [26, 27]. Instead, this interestingness can onlybe assessed properly by combining a data-drivenevaluation with a user-driven one. In addition tothe common data-driven evaluation, we have there-fore asked the local archaeological community toassess a set of candidate rules via an online survey(see Section 5.2).

3. The MINOS pipeline

The MINOS pattern mining pipeline combinesan off-the-shelf ARM algorithm with a simple facetrule browser, and a number of crucial pre- andpost-processing components. These componentsenable users to integrate their interests into theprocess by restricting the search space beforehand,and by filtering the results afterwards. Hereto, the

3

pre-processing components translate user-providedtarget patterns into SPARQL queries, use thesequeries to retrieve relevant resources from the LODcloud, and perform context sampling to enrich themwith additional information. The post-processingcomponents constrain the mined rules based on theuser-specified patterns, and present them to theuser for evaluation using the facet rule browser. Aflowchart of this structure is provided in Figure 1.The processes depicted in this flowchart will be dis-cussed next.

3.1. Data Retrieval

To start the mining process, users are askedto provide a target pattern which describes theircurrent interest. This pattern takes the form ofone or more triples and serves as a language biaswhich restricts the search space to the relevant sub-graph [28]. Each triple in this pattern can convey asemantic range by leaving variables uninstantiated:( , p, o) to specify all entities for which (p, o)holds, ( , p, ) to specify all entities for which p

holds, ( , , ) to denote the entire graph, etc.The next step is the automatic transla-

tion from the provided target pattern to aSPARQL CONSTRUCT query, This query is thenused to construct an in-memory copy of thecorresponding subgraph. For instance, thepattern [( , rdf:type, :Burial Ground) ∧ ( ,

crm:contains, :Human Remains)] results in agraph which holds only instances of burial groundsat which human remains were found. We call theseinstances the target entities. Note that not alltriples in a pattern have to be hard constraints:users can specify soft constraints as well. If, forinstance, the second triple in the example patternwould be a soft constraint, then also ContextFindentities for which this (p, o)-pair is unknown areincluded in the resulting graph. Entities with con-flicting relations however, are still excluded.

3.2. Context Sampling

To find only relevant patterns we must first ac-curately capture the target entities’ semantic rep-resentations. These representations, which we calltheir contexts, include all information which mightbe relevant to them during the mining process.Hence, contexts serve a role similar to that of the“individuals” in tradition Data Mining. Capturingthese contexts is done using context sampling, andinvolves supplementing all target entities with ad-ditional triples, retrieved from the original graph,

which are directly or indirectly related to that en-tity.

In our implementation of the MINOS pipeline wechose a local-neighbourhood-based sampling strat-egy. This strategy was selected for its simplicity—itonly requires a single parameter—and for its abilityto produce good approximations without the needfor background knowledge. This strategy assumesthat the semantic representation of a target entitydiffuses with distance: closely-related entities aremore relevant than those further away in the graph.To reflect this, a local-neighbourhood strategy sam-ples the neighbouring nodes of a target pattern upto a certain depth.

3.3. Rule Mining

At the heart of the pipeline lies the ARM algo-rithm. For the reasons explained before—the ex-ploitation of RDF and RDFS semantics to general-ize patterns—we have chosen SWARM for our im-plementation of the pipeline. To understand howSWARM operates, we will now first expand on ourearlier brief introduction to ARM.

An association rule is an implication of the formX =⇒ y, where X is a finite set of items andy is a single item not present in X [29]. Withgeneralized ARM, X is reduced to the set of itemclasses C1, C2, . . . , Cn where Ci ∈ X if it coversat least θ% of the items. An example of such arule might be {Cemetery,BurialGround} =⇒HumanRemains, implying that human remainsare often found at both cemeteries and burialgrounds.

SWARM extends the notion of item sets fromsets of items with the same type to semantic itemsets, which contain items (entities) that frequentlyshare (p, o)-pairs. Instances of cemeteries andburial grounds, for instance, are likely to sharethe (:contains, :Human Remains)-pair. Once allsemantic item sets in the graph have been com-puted, SWARM proceeds by joining similar itemsets into common behaviour sets. In our exam-ple case, a likely combination might occur with in-stances of crypts and catacombs. The rationale un-derlying this is the assumption that entities withlargely overlapping contexts (sets of properties andinstances within a set distance) are likely to alsoshare other, as yet-unknown, (p, o)-pairs. Put dif-ferently: these entities follow the same pattern.To quantify this, SWARM introduces the similar-ity factor, which serves as a boundary that sepa-rates similar from dissimilar item sets. As a final

4

Figure 1: A flowchart of the MINOS pipeline. At start, a SPARQL query, automatically generated from a user-provided targetpattern, is executed to retrieve a relevant subset from the LOD-cloud. After capturing the contexts of the target entities inthis subset, these contexts are passed to the ARM algorithm. The resulting rules are then first filtered based on a predefinedconstraint template, with the remainder being presented to domain experts for evaluation.

step, SWARM generalizes the discovered patternsby exploiting class inheritance, ultimately produc-ing rules of the form ∀χ(Class(χ, c)→ (P (χ, φ)→Q(χ, ψ))). Here, χ, φ, and ψ are entities, c a class,and P and Q predicates.

There are several measures available to assess theinterestingness of the resulting rules. For our data-driven evaluation we use the well known confidenceand support measures. The confidence score of arule X =⇒ y conveys its strength and equals theproportion of triples (transactions) which satisfy Xthat also satisfy y. Its support score representsthe rule’s significance and equals the proportion oftriples which satisfy both X and y.

3.4. Dimension Reduction

Association rule mining algorithms commonlyproduce a large number of candidate rules, resultingin their own knowledge management problem [30].To combat this, our pipeline includes a data-drivenfilter which discards unwanted rules based on a pre-defined set of constraints. These constraints can in-clude minimum and maximum values for the sup-port and confidence scores, but also restrictions ontypes, on predicates, and on both entire antecedentsand consequents.

By default, the set of constraints filters allrules which are too common or too rare, orwhich are generally unwanted. Typical exam-ples of such unwanted rules are those which in-clude domain-independent relations—owl:sameAs,skos:inSchema, dcterms:medium, etc—which donot contribute to the pattern mining process. Op-tionally, the default filter can be overridden by pro-viding a custom constraints template at start. This

allows us to further reduce the number of candidaterules by tightening the constraints.

Note that the filter is run after the rules havebeen produced. This is a deliberate design choicethat allows users to revert any or all of its effectswithout having to repeat the whole mining process,and therefore offers more flexibility during the anal-ysis of the results. Hereto, the unfiltered result setis kept in memory.

3.5. Rule Browser

Candidate rules which pass the filter are pre-sented to users in an interactive facet browser. Toimprove their interpretability, these rules are au-tomatically translated to natural language by ex-ploiting the label-attributes of both entities andproperties. The resulting translations are then in-serted into predefined sentence templates; one perlanguage.

To change which rules are shown, users cansupply additional information about their interest.This involves modifying the same parameters asthose available for the constraints templates dis-cussed previously, and thus includes both restric-tions on assigned scores and on contents. Ruleswhich are deemed worthy of saving can be exportedto a human-readable file, or can be stored in a bi-nary file which can later be reopened by the rulebrowser for further analysis and mutation.

For our implementation of the pipeline we optedfor a virtual-terminal interface, which offered us thenecessary balance between simplicity and flexibilityneeded during our interactive sessions with domainexperts. If desired however, a full-fledged web inter-face can be used instead by running the rule browservia a client-side environment.

5

4. The package-slip Knowledge Graph

Excavation data is a valuable source of informa-tion in many archaeological studies [9]. These stud-ies are therefore likely to benefit from pattern min-ing on this type of data, and thus make it a suitablechoice to base our case study on. In agreement withdomain experts, we therefore selected the package-slip knowledge graph to run our experiments with.

Package slips are detailed summarizations of en-tire excavation projects. They are structured asspecified by the SIKB Protocol 0102, which is aDutch standard on modelling and sharing excava-tion data of various degree of granularity5. Specif-ically, package slips capture the following aspects:

• General information about individuals, compa-nies and organisations which are involved, aswell as the various locations at which the ex-cavations took place.

• Final and intermediate reports made duringthe project, as well as different forms of mediasuch as photos, drawings, and videos. Each ofthese is accompanied by meta-data and (cross)references to their respective file, subject, andmentioning.

• Detailed information about all artefacts dis-covered during the project, as well as their(geospatial and stratigraphic) relation and thearchaeological context in which they werefound.

• Fine-grained information about the precise lo-cations and their geometries at which artefactswere discovered, archaeological contexts wereobserved, and where media was created.

In its specification, the SIKB protocol makes useof the Extensible Markup Language (XML) to de-fine the various concepts that make up a packageslip. The limitations of this language for sharingand integrating data motivated the Dutch ARI-ADNE partners, amongst whom were we, to designan alternative data model based on semantic webstandards.

The semantic package slip is primarily builton top of the CIDOC conceptual reference model(CRM) and its archaeological extension CRMEnglish Heritage (CRM-EH). Every excavation

5www.sikb.nl/datastandaarden/richtlijnen/sikb0102

project is of the type DHProject (Dutch HeritageProject, a subclass of CRM-EH’s EHProject) andlinks to the discovered artefacts via productionevents. To classify these artefacts, and all relatedarchaeological contexts, the package-slip graph in-cludes a multi-schema SKOS vocabulary which con-tains more than 7.000 archaeological concepts.

A small and partial example of a package slipis depicted in Figure 2. There, a single artefact(crmeh:ContextFind) is shown with its archaeo-logical context (crmeh:Context). The artefact alsoholds several attributes of various types, and is it-self part of a unit (crm:Collection). These unitsare virtual containers that group artefacts whichwere found in the same archaeological context. Incontrast, bulk finds (crmeh:BulkFind) are physi-cal containers, often a box, which group artefactswith similar storage requirements (e.g. on humid-ity, temperature, or oxygen). These are linked to aDutch Heritage project via a production event.

4.1. Knowledge Integration

At present6, the package-slip knowledgegraph7contains the aggregated data from lit-tle over 70 package slips, totaling roughly 425.000triples [31]. These package slips were convertedfrom existing XML files8, and were subsequentlyintegrated into a single coherent graph by usinga central triple store which acted as an authorityserver: hash values were generated for all entitiesin a package slip—starting from the outer edge ofthe graph and recursively updating all encounteredparents—which were then cross referenced withthose on the server. By using this approach,we were able to unify different resources aboutthe same thing, most particular about people,companies, and locations.

The size of the package-slip knowledge graph isexpected to grow consistently, as the production ofpackage slips has recently been made mandatoryfor all Dutch institutes and companies which areinvolved with archaeological data. This increasesthe relevancy of this case study for the Dutch ar-chaeological community, and thus adds support forour choice of the package slip.

6Situation on July 20188The conversion tool is available at gitlab.com/

wxwilcke/pakbonLD8pakbon-ld.spider.d2s.labs.vu.nl

6

Figure 2: Small and partial example from the package slip knowledge graph. Each project produces (amongst others) oneor more containers. These containers are composed of one or more bulk finds, which in turn are composed of one or moreartefacts. Each of these artefacts is part of a unit of which its members share a common (multilayered) context. Note that,in this figure, solid and opaque circles represent instance and vocabulary resources, respectively. Three dots represent literals,and opaque boxes represent classes

5. Experiments

To assess the effectiveness of the MINOS pipeline,we have conducted four experiments on thepackage-slip knowledge graph (see Section 4). Eachof these experiments addressed a different granu-larity of the package-slip graph to investigate theeffects of these different granularities on the useful-ness of the discovered patterns for domain experts.In order from coarse-grained to fine-grained, theseare

Projects, which, amongst other, are of a projectclass, are held at a specific location, and duringwhich one or more artefacts are discovered.

Artefacts, which, amongst others, are of an arte-fact class, have dimensions and mass, consistof one or more parts, and are found in a cer-tain archaeological context and under specificconditions.

Archaeological Contexts, which, amongst oth-ers, are of a context class, have a geometry, astructure, and a dating, and which consist ofone or more subcontexts.

Archaeological Subcontexts, which, amongstothers, have a certain shape, colour, and tex-ture, which consist of zero or more nested sub-structures, and which hold an interpretation of

its significance as provided by archaeologists insitu.

The bottom three granularities—artefacts, con-texts, and subcontexts—roughly correspond to theinterests of three domain experts, with whom wehad several interviews during the experiment de-sign phase. The most coarse-grained granularity(projects) was added by us to create a balancedcross-section of the graph. From here on, we re-fer to these granularity levels as topics of interest(ToI).

All experiments were run using our implementa-tion of the MINOS pipeline9. Hereto, all four ofthese experiments were given the same set of hy-perparameters (Table 1). Concretely, we set local-neighbourhood context sampling depth (CSD) toa maximum of three hops, and varied the similar-ity factor (SF) over three values which range fromweakly similar to strongly similar. Therefore, eachexperiment was run three times (CSD×SF) withdifferent hyperparameters. Because different simi-larity factors may result in distinctly different, yetpossibly useful, patterns, we aggregated their out-put to produce a single result set per experiment.

Each experiment produced up to roughly 1800candidate rules, which was brought down from

9gitlab.com/wxwilcke/MINOS.

7

about 36.000 on average using the default filter set-tings. From these 1800 rules, we created a moremanageable evaluation sample by taking the topfifty candidates based on their confidence (first) andsupport (second) scores. We motivate our choice ofconfidence as the primary measure due to the em-phasis on producing correct, but also yet unknown,patterns.

Five candidate rules are listed in Table 2. Thesehave been hand picked from the evaluation sample,and are representative of the result set as a whole.We will use these five candidates as running exam-ples in our data-driven evaluation.

5.1. Data-Driven Evaluation

To assess the general effectiveness of the pipeline,we have chosen not to incorporate any dataset-specific adjustments into the data-driven evalua-tion. The reason for this is that we are not in-terested in how well our approach works on thisparticular dataset, but rather how effective the sup-port and confidence metrics are as criteria for ourdata-driven filter and, by extension, for the per-ceived interestingness of the discovered patterns todomain experts. For the purpose of this evaluation,we limit ourselves to the evaluation sample createdearlier.

An inspection of the evaluation sample revealsthat many of the candidate rules apply to classesother than the four selected topics of interests.Two examples of such rules are listed in Table 2:both rules R3 and R4 apply to entities of theSiteSubDivision class, which is not explicitlylisted in any of the target patterns. In fact, 16%of all rules in the evaluation sample apply to thisclass. If we look at the package-slip data modelhowever, we observe that said class lies within threehops of the DHProject class. Therefore, the pres-ence of these unexpected classes can likely be at-tributed to the chosen context sampling strategy.Indeed, the same is the case for all other unex-pected classes such as ContextStuff, Collection,and Place Appellation, which apply to 11.5%,8%, and 6% of all rules in the evaluation sample.

Another look at the evaluation sample revealsthat many of the candidate rules actually describevery similar patterns. This is especially evidentwith time spans, which are present in nearly half(43%) of the patterns. In fact, of the five exampleslisted in Table 2, all but one involves a time spanas consequent. The reason for their frequent occur-rence can likely be traced back to the package-slip

vocabulary: many of the artefact classes are in amany-to-one relationship with specific time spans.For instance, all known belly amphora stem fromthe pre-classical period (612 BCE -– 480 BCE).These predefined relationships are treated as pat-terns by the ARM algorithm, and, due to overrep-resented in the data, score high on confidence andsupport. This is also true for other predefined re-lationships, such as those that involve geographicalinformation: map areas are paired with their re-spective toponymies, and provinces are paired withmunicipalities.

A final revelation of the evaluation set is thatboth support and confidence scores alone provide apoor measure of pattern interestingness: over 96%and 99% of the patterns have support and confi-dence scores close (δ ≤ 0.05) or equal to 0 or 1,respectively. This makes it difficult to distinguishbetween patterns which are actually interesting andthose which describe predefined and therefore un-interesting relationships. While this difficulty is aknown problem with ARM [30], the extreme form ittakes here is likely caused by the nature of the pack-age slip data: nearly all information that might beinteresting to domain experts (artefacts, archaeo-logical contexts, etc.) is in a one-to-one relationshipwith each other. Therefore, this information is notshared within, and also not between, package-slipinstances.

5.2. User-Driven Evaluation

To assess the interestingness of the producedrules from a domain expert’s point of view weasked the local archaeological community—a Face-book group with over 3000 members from variousDutch academic and commercial organizations—toparticipate in an online survey (Fig. 3). At thestart of this survey, participants were greeted witha concise summary of this research and with a de-tailed description of the task we asked them to per-form: to rate a selection of forty candidate rules onplausibility (can this be true in the context of thedata? ), on relevancy (can this support you duringyour research? ), and on newness (is this unknownto you? ). Once completed, the participants werealso asked some final questions about their famil-iarity with the domain, and whether this approachof rule mining could contribute to their studies.

A five-point Likert scale was offered as an-swer template throughout the entire survey. Tostress the importance of the participants’ opinionsour Likert scale ranged from strongly disagree to

8

Table 1: Configurations of the four experiments. Column names indicate topic of interest (ToI), target pattern, number offacts in the subset, context sampling depth (CSD), and similarity factors (SFs).

ToI Target Pattern # facts CSD SFs

Projects [( , rdf:type, :DHProject)] 21.9k 3 (0.3, 0.6, 0.9)Artefacts [( , rdf:type, crmeh:ContextFind)] 192.8k 3 (0.3, 0.6, 0.9)Contexts [( , rdf:type, crmeh:Context)] 82.2k 3 (0.3, 0.6, 0.9)Subcontexts [( , rdf:type, crmeh:Context)

∧ ( , pbont:trench type, )]59.5k 3 (0.3, 0.6, 0.9)

Table 2: Five example rules selected from the top-50 candidates of all four experiments. Each rule applies to resources of typeTYPE, and consists of an antecedent and a consequent in the form of an IF-THEN statement. The last two columns listtheir confidence and support scores.

ID Semantic Association Rule Conf. Supp.

R1TYPE crmeh:ContextFind

IF :has artefact type, :AdzeTHEN crm:has time-span, [:Neolithic Period, :Bronze Age]

1.00 0.08

R2TYPE crmeh:ContextFind

IF :has artefact type, :Raw Earthenware (Nijmeegs/Holdeurns)

THEN crm:has time-span, [:Early Roman Age, :Late Roman Age]1.00 0.90

R3TYPE crmeh:SiteSubDivision

IF :has location type, :Base Camp

THEN crm:has time-span, [:Neolithic Period, :New Time]1.00 0.26

R4TYPE crmeh:SiteSubDivision

IF :has location type, :Flint Carving

THEN crm:has time-span, [:Mesolithic Period, :Neolithic Period]0.50 0.06

R5TYPE crmeh:Context

IF :has trench type, :Esdek

THEN :has sample method, :Levelling

1.00 0.71

Table 3: Translation in natural language of the five example rules given in Table 2. The last three columns list their normalizedmedian and (highest) mode scores for plausibility (P ), relevancy (R), and novelty (N).

ID Semantic Association Rule PMdn (PMode) RMdn (RMode) NMdn (NMode)

R1 For every artefact holds: if it concerns an adze, then it dates from theNeolithic period to the Bronze Age.

0.75 (0.75) 0.75 (0.75) 0.75 (0.75)

R2 For every artefact holds: if it concerns raw pottery Nijmeegs/Holdeurns,then it dates from Early Roman to Late Roman period.

0.75 (0.75) 0.50 (0.75) 0.25 (0.25)

R3 For every site holds: if it concerns a base camp, then it dates from theearly Neolithic period to Recent times.

0.25 (0.25) 0.25 (0.25) 0.50 (0.75)

R4 For every site holds: if it concerns flint carving, then it dates from theMesolithic to Neolithic period.

0.25 (0.25) 0.25 (0.25) 0.25 (0.25)

R5 For every context holds: if it concerns an esdek, then the collectionmethod involves levelling.

0.25 (0.25) 0.25 (0.25) 0.75 (0.75)

9

strongly agree, where our questions were stated suchthat a higher agreement corresponds with a higherinterestingness. Additionally, if desired, partici-pants could enter further remarks as free-form text.

All forty candidate rules were randomly sampledfrom the evaluation sample—ten per experiment(stratified)—and automatically10translated to nat-ural language using predefined language templates(see Section 3.5). We have listed five of these trans-lated rules in Table 3. Each of these rules is thetranslation of the rule with the same identifier inTable 2.

5.2.1. Survey Analysis

Twenty-one people participated in our survey. Astatistical analysis of their answers is provided inTable 4. For each of the four experiments, this tablelists the normalized median and mode scores forplausibility, relevancy, and novelty. Also listed arethe overall scores per experiment and per metric,and the score for the entire approach as a whole.For all scores holds that higher is better.

Looking at the four separate experiments it isclear that project and artefact related patterns arerated higher than those of contexts and subcon-texts, with an overall median score of 0.50 versusthat of 0.25, respectively. Patterns about artefactsalso score relatively high on the mode (0.75), ex-ceeding that of all other experiments (0.25). Thisimplies a cautious positive opinion towards arte-facts related patterns. Contributing to this posi-tiveness are the plausibility and relevancy ratings,on which artefacts score better than any of the otherthree experiments. Statistical tests (Kruskal-Walliswith α = 0.01) indicate that these findings are sig-nificant, and suggest a dependency between variousmetrics and the topics of interest: p = 7.66× 10−13

for plausibility, p = 6.97× 10−5 for relevancy, andp = 1.50× 10−3 for novelty.

The approach as a whole scores best on plausi-bility and novelty, both of which have a median of0.50. Between these two metrics, plausibility takesthe cake with a mode of 0.75 versus that of 0.25

for novelty. This implies a cautious positive opiniontowards the plausibility of the discovered patterns.More negative are the relevancy ratings with a me-dian and mode of 0.25. Together, the metrics com-bine to an overall score with median 0.50 and mode

10In some cases, minor edits were made to improve theflow of the sentence.

0.25, implying a neutral to slightly negative opin-ion about the approach as a whole. Remarks madeby domain experts suggest that this stems from thefrequent occurrence of patterns that are either toogeneral, too trivial, or which describe predefinedone-to-one or many-to-one relationships. This cor-responds with our findings during the data-drivenevaluation. Nevertheless, a median and mode scoreof 0.75 was given to the final separate questionabout the overall potential usefulness of this ap-proach, implying a more positive standpoint.

Table 5 lists the inter-rater agreements per ex-periment and per metric. The Krippendorff’s al-pha (αK) is used for this purpose, as it allowscomparisons over ordinal data with more than tworaters. Overall, the ratings are in fair agreementwith αK = 0.25. A similar agreement is found be-tween the different topics of interest, which rangefrom 0.18 to 0.30. A more extreme difference isfound between the different metrics: a moderateagreement on plausibility (αk = 0.41), and onlya slight agreement on both relevancy (αK = 0.08)and novelty (αK = 0.09). This stark difference maybe caused by the different familiarities and experi-ences of the domain experts: whereas plausibilitycomes down to everyday archaeological knowledge,novelty (and in a lesser degree: relevancy) is farmore dependent on one’s own view of the domain.

We can get a better understanding of these scoresby reading the remarks that have been left by thedomain experts. Overall, these remarks suggest adisconnect between how the evaluation task was in-structed and how the experts performed it: ratherthan assessing the patterns within the (limited)context of the data set, our panel of experts appearto have judged the patterns against the knowledgein the archaeological domain as a whole. Numerouspatterns have therefore only been scored as implau-sible because they do not necessary hold outside ofthe data set. Similarly, remarks on relevance andnovelty seems to indicate that the raters only as-sessed that exact pattern instance, and did not con-sider the potential of such patterns on different datasets. Interestingly, some experts appear to haveused this limited window to try to understand theknowledge creation process of fellow archaeologists:whether, for instance, the choice of an unexpectedclass could indicate an alternate interpretation ofthe same facts.

10

Figure 3: Collage of various screenshots of the web survey used during the user-driven evaluation. Participants were first askedabout their background and experience, and were then given a summary of this research and a detailed description of the task.Next, they were presented with 40 candidate rules and asked to rate these on interestingness. Afterwards, participants werealso asked about their opinion on the pipeline as a whole.

Table 4: Normalized separate and overall median and (highest) mode scores for plausibility (P ), relevancy (R), and novelty(N) per topic of interest as provided by 21 raters.

ToI PMdn (PMode) RMdn (RMode) NMdn (NMode) Overall

Projects 0.50 (0.25) 0.25 (0.25) 0.50 (0.75) 0.50 (0.25)Artefacts 0.75 (0.75) 0.50 (0.25) 0.25 (0.25) 0.50 (0.75)Contexts 0.25 (0.25) 0.25 (0.25) 0.50 (0.75) 0.25 (0.25)Subcontexts 0.25 (0.25) 0.25 (0.25) 0.50 (0.25) 0.25 (0.25)

Overall 0.50 (0.75) 0.25 (0.25) 0.50 (0.25) 0.50 (0.25)

Table 5: Inter-rater agreements (Krippendorff’s alpha: αK) over all raters per topic of interest (left) and per metric (right).Overall αK = 0.25.

ToI αK

Projects 0.22Artefacts 0.30Contexts 0.24Subcontexts 0.18

Metric αK

Plausibility 0.41Relevancy 0.08Novelty 0.09

11

6. Discussion

Our analysis of the survey’s results indicates thatthe panel of experts was cautiously positive aboutthe plausibility of the produced patterns. This(slight) positiveness does not come as surprise, asassociation rules describe the actual patterns whichexist in the data, rather than predict new ones. Wecan even further explain this observation by our de-cision to order the candidate rules on confidence—favoring accuracy above coverage—and because thepackage-slip knowledge graph only contains curateddata. Given that this is the case however, we maywonder why the patterns in our evaluation samplewere not rated even more positively on plausibility.

A possible reason is the presence of errors inthe data, but these would likely be incidental andshould therefore have only a minor effect on themean plausibility score. A more likely reason is thatwe used a suboptimal set of hyperparameters dur-ing our experiments. Prime suspects are the cho-sen similarity factors, which determine how muchgroups of entities (the semantic item sets) are per-mitted to overlap before they are combined intomore general groups (the common behaviour sets).Concretely, a too low value can result in overgener-alization: two or more item sets are erroneously at-tributed the same pattern. This results in the gen-eration of rules which do not hold for all membersof a set, despite a possible high confidence scoreimplying the opposite, and which are thus likely toscore poorly on plausibility in the eyes of domainexperts.

We can solve this problem to a certain extent byincreasing the similarity factor to a value at whichonly minor differences in set members are accepted.By doing so however, we risk trading one undesir-able situation for another: rather than overgener-alizing, we might undergeneralize such that similaritem sets are prevented from forming common be-haviour sets. In the worst case, we might even endup joining only very large items sets—the similar-ity factor is covariant with set size—which have anear perfect overlap, hence favouring sets with (p,o)-pairs that frequently co-occur across the graph.Examples of such patterns were present in our eval-uation set, most particular those that implied rela-tionships between two vocabulary concepts (typesof artefact and time spans, provinces and munici-palities, etc.) which naturally belong together.

Unfortunately, the optimal value(s) for the simi-larity factor are unknown beforehand, which is why

we varied over three different values—0.3, 0.6, and0.9—during our experiments. As explained before,our choice for these specific values came forth fromour preliminary findings, which showed that thesedifferent values result in distinctly different, yet po-tentially useful, patterns. Given the occurrence ofboth too general and too specific patterns in ourevaluation set, however, we can now surmise thatthe outer two values were likely too low and toohigh, respectively. This suggest that there existsbut a narrow range of optimal values in the sim-ilarity factor spectrum for which SWARM yieldsdesirable patterns. Determining this optimal rangewould require a great deal of time and effort fromour domain experts, unfortunately, which is why nofurther preliminary experiments have been run.

While the chosen hyperparameters thus seem tolargely correlate with the plausibility score, our re-sults suggests that it are the characteristics of thedataset which are correlated with the relevancy andthe novelty scores. On both of these scores, ourgroup of experts were less positive about the pre-sented patterns. The remarks left reveal that manyof these patterns were seen as trivial. Others wereseen as tautologies or were only thought to be appli-cable in a specific context. These remarks supportour findings made during the data-driven evalua-tion that the confidence and support metrics are in-effective measures for assessing the interestingnessof patterns to domain experts. Both measures arestrongly influenced by the size, variety, and quirksof the data. Peculiarities of the package slip data,specifically its inherent hierarchical structure, arelikely the reason which made this issue manifest it-self even more evidently in this work.

Another reason for the unsatisfying novelty andrelevancy scores may be found in the similarity be-tween patterns: many of the produced rules de-scribe variations of the same pattern, with onlya different vocabulary concepts in the consequent.One pattern, for example, might imply that somepots are made from red clay, whereas another pat-tern might imply that other pots are made fromgrey clay. Our dataset contained over 7.000 of theseconcepts separated into 22 categories (e.g., materialand period). If left unchecked, it is therefore likelyfor similar patterns to make their way into the re-sult set.

Additionally, many of the potentially interestingdata points were encoded via textual or numericalliterals. SWARM, as well almost all other methodsin this field, lacks the ability to compare literals

12

based on their raw values [15]. Instead, resourcesare only compared via their URIs, which are uniqueand therefore always spaced at equal distance fromeach other in the search space. Because of this lim-itation, our pipeline was unable to differentiate be-tween closely (or distantly) related literals. Thisbecame especially apparent with geometries, mea-surements, and descriptions, all three of which areabundantly found in archaeological data.

A further reason for the novelty and relevancyscores may lie in with the ontology that is used inthe package slip data model: many of the propertiesdefined in this model are expressed via rather longpaths (up to five hops). This characteristic is di-rectly inherited from the used CIDOC CRM ontology,which specifies that entities and properties are tobe linked via various events. Travelling along theselong paths is computationally expensive, which iswhy we had decided to leave such paths unexploredin favor for the more local patterns.

Zooming out from the individual scores can giveus an idea about the perceived overall effectivenessof the pipeline. The remarks left by the expertssuggest that this effectiveness may have been in-fluenced considerable by the size and scope of thedataset. Indeed, the relatively low number of inte-grated package slips might not have been enough toallow for well-generalized patterns to emerge, ratherthan the more-specific patterns which only makesense in a narrow context (e.g., within a single ex-cavation). Similarly, the limited scope of the data—the package slips were supplied by just two compa-nies, each with a particular specialisation—meantthat experts with different specialisations (e.g., adifferent culture or period) might have had insuffi-cient experience to evaluate the discovered patternson the criteria we asked them to.

A final aspect worthy of noting are the relativelylow number of raters we were able to muster, de-spite our greatest efforts, and the effect this mayhave had on the outcome of our evaluation: the in-fluence of possible outliers is greater, and the find-ings are less generalizable beyond our chosen usecase. An analysis of the survey’s access logs indi-cates that many potential raters did not actuallycomplete the survey, but instead gave up at an ear-lier point. From the remarks left by those who didcomplete the survey, we believe that this was likelydue to a combination of there being too many pat-terns to evaluate, and the limited novelty and rele-vance that these patterns provided to them.

7. Conclusion

In this work, we introduced the user-centric MI-NOS pipeline for pattern mining on knowledgegraphs in the humanities. With this pipeline, weaim to support domain experts in their analysesof such knowledge graphs by helping them dis-cover useful and interesting patterns in their data.Our pipeline therefore emphasizes the importanceof these experts and their requirements, rather thanthose of the usual data scientists. This has led toseveral design choices, most particular of which isthe use of generalized association rules to overcomethe lack of transparency and interpretability—twokey issues with technological acceptance in thehumanities—that persists with many other meth-ods.

To assess the effectiveness of our pipeline we con-ducted experiments in the archaeological domain.These experiments were designed together with do-main experts and were evaluated both objectively(data driven) and subjectively (user driven). Theresults indicate that the domain experts were cau-tiously positive about the plausibility of the dis-covered patterns, but less so about their noveltyand their relevance to archaeological research. In-stead, a large number of patterns were discardedby our experts for describing trivialities or tautolo-gies. Nevertheless, on average, the experts werepositively surprised by the range of patterns thatour pipeline was able to discover, and were opti-mistic about the future potential of this approachfor archaeological research.

During our research we encountered severalchallenges which limited the effectiveness of ourpipeline. We were unable to address them at thattime and therefore offer these as suggestions forfuture work. For the most part, these challengesconcern the nature of the data and the inability ofthe mining algorithms to exploit this. Concretely,the inability to 1) exploit common semantics otherthan RDF and RDFS (e.g., SKOS and, in a lesser de-gree, CIDOC CRM), and 2) cope with knowledge en-coded via literal attributes (rather than only viathe graph’s structure) which make up the majorityof knowledge in the humanities.

Solving these challenges would unlock a wealthof additional knowledge which is currently left un-used, and which can potentially lead to more usefuland more interesting patterns which humanities re-searchers can use to further their research.

13

Acknowledgements

We wish to express our deep gratitude to domainexperts Milco Wansleeben and Rein van ’t Veerfor their enthusiastic encouragement and useful cri-tiques during the various steps that have lead tothis work. We also wish to thank all domain expertswho participated in our survey for their willingnessto sacrifice their free time, and without whom wewould not have been able to complete this research.

This research has been partially funded bythe ARIADNE project through the EuropeanCommission under the Community’s SeventhFramework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193.

References

[1] M. Hallo, S. Lujan-Mora, A. Mate, J. Trujillo, Currentstate of linked data in digital libraries, Journal of Infor-mation Science 42 (2) (2016) 117–127.

[2] A. Rapti, D. Tsolis, S. Sioutas, A. Tsakalidis, A survey:Mining linked cultural heritage data, in: Proceedingsof the 16th International Conference on EngineeringApplications of Neural Networks (INNS), ACM, 2015,p. 24.

[3] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, From datamining to knowledge discovery in databases, AI maga-zine 17 (3) (1996) 37.

[4] J. Hagood, A brief introduction to data mining projectsin the humanities, Bulletin of the American Society forInformation Science and Technology 38 (4) (2012) 20–23.

[5] C. Velu, P. Vivekanadan, K. Kashwan, Indian coinrecognition and sum counting system of image datamining using artificial neural networks, InternationalJournal of Advanced Science and Technology 31 (2011)67–80.

[6] K. Makantasis, A. Doulamis, N. Doulamis, M. Ioan-nides, In the wild image retrieval and clustering for 3dcultural heritage landmarks reconstruction, MultimediaTools and Applications 75 (7) (2016) 3593–3629.

[7] L. Manovich, Trending: The promises and the chal-lenges of big social data, Debates in the digital human-ities 2 (2011) 460–475.

[8] B. R. T. Rohle, Digital methods: Five challenges, in:Understanding digital humanities, Springer, 2012, pp.67–84.

[9] H. Selhofer, G. Geser, D2.1: First report on users needs,Tech. rep., ARIADNE (2014).URL http://ariadne-infrastructure.eu/Resources/

D2.1-First-report-on-users-needs

[10] H. Kamada, Digital humanities roles for libraries?, Col-lege & Research Libraries News 71 (9) (2010) 484–485.

[11] H.-P. Kriegel, P. Kroger, C. H. Van Der Meijden,H. Obermaier, J. Peters, M. Renz, Towards archaeo-informatics: scientific data management for archaeobi-ology, in: International Conference on Scientific andStatistical Database Management, Springer, 2010, pp.169–177.

[12] V. Tresp, M. Bundschus, A. Rettinger, Y. Huang, To-wards machine learning on the semantic web., in: Ursw(lncs vol.), Springer, 2008, pp. 282–314.

[13] L. Galarraga, C. Teflioudi, K. Hose, F. M. Suchanek,Fast rule mining in ontological knowledge bases withamie+, The VLDB Journal 24 (6) (2015) 707–730.

[14] T. Anbutamilazhagan, M. K. Selvaraj, A novel modelfor mining association rules from semantic web data,Elysium Journal 1 (2).

[15] X. Wilcke, P. Bloem, V. de Boer, The knowledge graphas the default data model for learning on heterogeneousknowledge, Data Science (1) (2017) 1–19.

[16] R. Ramezani, M. Saraee, M. Nematbakhsh, Swapriori:a new approach to mining association rules from se-mantic web data, Journal of Computing and Security 1(2014) 16.

[17] M. Barati, Q. Bai, Q. Liu, SWARM: An Approachfor Mining Semantic Association Rules from SemanticWeb Data, Springer International Publishing, Cham,2016, pp. 30–43. doi:10.1007/978-3-319-42911-3_3.URL http://dx.doi.org/10.1007/

978-3-319-42911-3_3

[18] R. G. Miani, C. A. Yaguinuma, M. T. Santos, M. Bia-jiz, Narfo algorithm: Mining non-redundant and gener-alized association rules based on fuzzy ontologies, En-terprise Information Systems (2009) 415–426.

[19] T. Kauppinen, K. Puputti, P. Paakkarinen, H. Kuitti-nen, J. Vaatainen, E. Hyvonen, Learning and visual-izing cultural heritage connections between places onthe semantic web, in: Proceedings of the workshop oninductive reasoning and machine learning on the seman-tic web (irmles2009), the 6th annual european semanticweb conference (ESWC2009), Citeseer, 2009.

[20] J. Volker, M. Niepert, Statistical schema induction, in:Extended Semantic Web Conference, Springer, 2011,pp. 124–138.

[21] J. Kim, E.-K. Kim, Y. Won, S. Nam, K.-S. Choi, Theassociation rule mining system for acquiring knowl-edge of dbpedia from wikipedia categories., in: NLP-DBPEDIA@ ISWC, 2015, pp. 68–80.

[22] V. Nebot, R. Berlanga, Mining association rules fromsemantic web data, in: International Conference on In-dustrial, Engineering and Other Applications of Ap-plied Intelligent Systems, Springer, 2010, pp. 504–513.

[23] V. Nebot, R. Berlanga, Finding association rules insemantic web data, Knowledge-Based Systems 25 (1)(2012) 51–62.

[24] R. G. L. Miani, E. R. H. Junior, Exploring associationrules in a large growing knowledge base, InternationalJournal of Computer Information Systems and Indus-trial Management Applications (2015) 106–114.

[25] R. G. L. Miani, E. R. Hruschka, Analyzing the use of ob-vious and generalized association rules in a large knowl-edge base, in: Hybrid Intelligent Systems (HIS), 201414th International Conference on, IEEE, 2014, pp. 1–6.

[26] K. McGarry, A survey of interestingness measures forknowledge discovery, The knowledge engineering review20 (1) (2005) 39–61.

[27] A. A. Freitas, On rule interestingness measures,Knowledge-Based Systems 12 (5) (1999) 309–315.

[28] C. d’Amato, A. G. Tettamanzi, T. D. Minh, Evolution-ary discovery of multi-relational association rules fromontological knowledge bases, in: Knowledge Engineer-ing and Knowledge Management: 20th InternationalConference, EKAW 2016, Bologna, Italy, November 19-

14

23, 2016, Proceedings 20, Springer, 2016, pp. 113–128.[29] R. Agrawal, T. Imielinski, A. Swami, Mining associa-

tion rules between sets of items in large databases, in:Acm sigmod record, Vol. 22, ACM, 1993, pp. 207–216.

[30] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivo-nen, A. I. Verkamo, Finding interesting rules from largesets of discovered association rules, in: Proceedings ofthe third international conference on Information andknowledge management, ACM, 1994, pp. 401–407.

[31] X. Wilcke, H. Dimitropoulos, D16.3: Final report ondata mining, Tech. rep., ARIADNE (2017).URL http://ariadne-infrastructure.eu/Resources/

D16.3-Final-Report-on-Data-Mining

15

User-centric pattern mining on knowledge graphs: An ...frankh/postscript/JWS2019.pdf ·...

Documents

Transcript of User-centric pattern mining on knowledge graphs: An ...frankh/postscript/JWS2019.pdf ·...