Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Gong Cheng, Yanan Zhang, Yuzhong Qu, Websoft Research Group, State Key Laboratory for Novel Software Technology, Nanjing University, China


Presented at ISWC 2014, Riva del Garda, Italy.

Transcript of Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Page 1: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Gong Cheng, Yanan Zhang, Yuzhong Qu

Websoft Research Group, State Key Laboratory for Novel Software Technology

Nanjing University, China

Page 2: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Association search

Page 3: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Association search

[Figure: "air pollution" and "autism", linked by question marks]

Page 4: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Association search

You ? ? ? ?

Page 5: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Association search on the Web of documents

associations hidden in text

Page 6: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Association search on an entity-relation graph

[Figure: entity-relation graph connecting Alice and Bob. Nodes: Alice, Bob, paper-A, paper-B, paper-C, paper-D, article-A, conf-A, conf-B. Edge labels: firstAuthor, secondAuthor, inProcOf, cites, extends, reviewer, chair.]

associations exposed as graph

Page 7: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

association = path

Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

Alice --firstAuthor-- paper-B --inProcOf-- conf-B --chair-- Bob

Alice --firstAuthor-- paper-B --cites-- paper-C --firstAuthor-- Bob

Alice --secondAuthor-- paper-D --cites-- paper-C --firstAuthor-- Bob

Alice --secondAuthor-- paper-D --extends-- article-A --firstAuthor-- Bob
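
To make "association = path" concrete, here is a minimal Python sketch that enumerates the paths between Alice and Bob in the example graph. The edge list is read off the five paths on this slide. Treating edges as undirected, the function name `find_associations`, and the `max_hops` parameter are assumptions made for illustration; this is not the authors' path-finding implementation.

```python
from collections import defaultdict

# Edges of the example graph (subject, property, object).
# Edge direction is ignored in this sketch (an assumption).
EDGES = [
    ("Alice", "secondAuthor", "paper-A"),
    ("paper-A", "inProcOf", "conf-A"),
    ("conf-A", "reviewer", "Bob"),
    ("Alice", "firstAuthor", "paper-B"),
    ("paper-B", "inProcOf", "conf-B"),
    ("conf-B", "chair", "Bob"),
    ("paper-B", "cites", "paper-C"),
    ("paper-C", "firstAuthor", "Bob"),
    ("Alice", "secondAuthor", "paper-D"),
    ("paper-D", "cites", "paper-C"),
    ("paper-D", "extends", "article-A"),
    ("article-A", "firstAuthor", "Bob"),
]

def find_associations(edges, start, end, max_hops):
    """Return all simple paths from start to end using at most max_hops edges."""
    adjacency = defaultdict(list)
    for subject, prop, obj in edges:
        adjacency[subject].append((prop, obj))
        adjacency[obj].append((prop, subject))

    results = []

    def dfs(node, path, visited):
        if node == end:
            results.append(tuple(path))
            return
        if len(visited) - 1 >= max_hops:  # hops used so far
            return
        for prop, neighbor in adjacency[node]:
            if neighbor not in visited:
                dfs(neighbor, path + [prop, neighbor], visited | {neighbor})

    dfs(start, [start], {start})
    return results

associations = find_associations(EDGES, "Alice", "Bob", max_hops=3)
for association in associations:
    print(" -- ".join(association))
```

On this toy graph a 3-hop limit yields exactly the five associations listed above; on a graph the size of DBpedia the same exhaustive search would need pruning or indexing.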

Page 8: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Challenge

over 1,000 associations in DBpedia

(within 4 hops)

How to explore them?

Page 9: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Exploration methods (1)

• Clustering

• Facets

Page 10: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

cluster = pattern

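associations:

Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

Alice --firstAuthor-- paper-B --inProcOf-- conf-B --chair-- Bob

pattern (matched by both associations):

Alice --author-- Paper --inProcOf-- Conference --role-- Bob

Positions 1 to 5 run from the first property to the last. The pattern holds a common super-property at positions 1, 3 and 5, and a common class at positions 2 and 4.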
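
One way to read "common super-property / common class" is as a position-wise generalization of two associations. The sketch below assumes a tiny ontology inferred from the slides (`secondAuthor` and `firstAuthor` under `author`, `reviewer` and `chair` under `role`, `ConfPaper` under `Paper`). The type assigned to paper-B and the helper names `ancestors` and `generalize` are hypothetical.

```python
# Hypothetical mini-ontology: child -> direct super-class or super-property.
SUPER = {
    "ConfPaper": "Paper",
    "secondAuthor": "author",
    "firstAuthor": "author",
    "reviewer": "role",
    "chair": "role",
}

# Entity types. Only the types of paper-A and conf-A appear in the slides;
# the other two are assumptions.
TYPE = {
    "paper-A": "ConfPaper",
    "paper-B": "Paper",
    "conf-A": "Conference",
    "conf-B": "Conference",
}

def ancestors(term):
    """Return term followed by all of its super-classes or super-properties."""
    chain = [term]
    while term in SUPER:
        term = SUPER[term]
        chain.append(term)
    return chain

def generalize(assoc_a, assoc_b):
    """Most specific common term at each position, or None if a position has none."""
    pattern = []
    for index, (x, y) in enumerate(zip(assoc_a, assoc_b)):
        if index % 2 == 1:  # positions 2 and 4 hold entities: use their classes
            x, y = TYPE[x], TYPE[y]
        common = next((t for t in ancestors(x) if t in ancestors(y)), None)
        if common is None:
            return None
        pattern.append(common)
    return tuple(pattern)

# Associations written as their five positions (endpoints Alice and Bob omitted).
a1 = ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer")
a2 = ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair")
pattern = generalize(a1, a2)
print(pattern)
```

Under these assumptions the result is the pattern shown on this slide: author, Paper, inProcOf, Conference, role.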

Page 11: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Problem: To recommend k patterns

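Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

Alice --firstAuthor-- paper-B --inProcOf-- conf-B --chair-- Bob

Alice --firstAuthor-- paper-B --cites-- paper-C --firstAuthor-- Bob

Alice --secondAuthor-- paper-D --cites-- paper-C --firstAuthor-- Bob

Alice --secondAuthor-- paper-D --extends-- article-A --firstAuthor-- Bob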

Page 12: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Step 1: Mining all significant patterns

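pattern:

Alice --author-- Paper --inProcOf-- Conference --role-- Bob

associations:

Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

Alice --firstAuthor-- paper-B --inProcOf-- conf-B --chair-- Bob

Alice --firstAuthor-- paper-B --cites-- paper-C --firstAuthor-- Bob

Alice --secondAuthor-- paper-D --cites-- paper-C --firstAuthor-- Bob

Alice --secondAuthor-- paper-D --extends-- article-A --firstAuthor-- Bob

frequency = 2/5 > threshold (the pattern matches the first two of the five associations)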
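
The frequency on this slide can be reproduced with a small matching routine. This is a sketch under assumptions: the ontology and entity types are inferred or invented for the example (only paper-A's and conf-A's classes appear in the slides), the threshold value 0.3 is hypothetical, and `matches` and `frequency` are illustrative names.

```python
# Hypothetical mini-ontology: child -> direct super-class or super-property.
SUPER = {
    "ConfPaper": "Paper",
    "secondAuthor": "author",
    "firstAuthor": "author",
    "reviewer": "role",
    "chair": "role",
}

# Entity types; all but paper-A and conf-A are assumptions.
TYPE = {
    "paper-A": "ConfPaper", "paper-B": "Paper", "paper-C": "Paper",
    "paper-D": "Paper", "article-A": "Article",
    "conf-A": "Conference", "conf-B": "Conference",
}

# The five associations, written as positions 1 to 5.
ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]

def ancestors(term):
    """Return term followed by all of its super-classes or super-properties."""
    chain = [term]
    while term in SUPER:
        term = SUPER[term]
        chain.append(term)
    return chain

def matches(association, pattern):
    """True if every pattern term equals or generalizes the association's term."""
    if len(association) != len(pattern):
        return False
    for index, (value, term) in enumerate(zip(association, pattern)):
        actual = TYPE[value] if index % 2 == 1 else value  # entity -> its class
        if term not in ancestors(actual):
            return False
    return True

def frequency(pattern, associations):
    """Fraction of associations matched by the pattern."""
    return sum(matches(a, pattern) for a in associations) / len(associations)

PATTERN = ("author", "Paper", "inProcOf", "Conference", "role")
THRESHOLD = 0.3  # hypothetical value
freq = frequency(PATTERN, ASSOCIATIONS)
print(freq, freq > THRESHOLD)
```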

Page 13: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Formulated as frequent itemset mining

1. transaction = association; item = <position, class> or <position, property>

2. Mining frequent itemsets

3. itemset → pattern

Example transaction: Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

Position 1: <1, secondAuthor>, <1, author>

Position 2: <2, ConfPaper>, <2, Paper>

Position 3: <3, inProcOf>

Position 4: <4, Conference>

Position 5: <5, reviewer>, <5, role>

Page 14: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Formulated as frequent itemset mining

1. transaction = association; item = <position, class> or <position, property>

2. Mining frequent itemsets

3. itemset → pattern

Example transaction: Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

Position 1: <1, author>

Position 2: <2, ConfPaper>, <2, Paper>

Position 3: <3, inProcOf>

Position 4: <4, Conference>

Position 5: <5, role>

Page 15: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Formulated as frequent itemset mining

1. transaction = association; item = <position, class> or <position, property>

2. Mining frequent itemsets

3. itemset → pattern

Example transaction: Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

itemset: <1, author>, <2, ConfPaper>, <2, Paper>, <3, inProcOf>, <4, Conference>, <5, role>

pattern: Alice --author-- Paper --inProcOf-- Conference --role-- Bob
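
The three steps above can be sketched with a naive level-wise (Apriori-style) miner. This is an illustration only: the ontology, the entity types other than paper-A's and conf-A's, and the threshold are assumptions, and the rule used to turn an itemset into a pattern (exactly one item per position) is a simplification that may differ from the authors' method.

```python
from itertools import combinations

# Hypothetical mini-ontology: child -> direct super-class or super-property.
SUPER = {"ConfPaper": "Paper", "secondAuthor": "author", "firstAuthor": "author",
         "reviewer": "role", "chair": "role"}
# Entity types; all but paper-A and conf-A are assumptions.
TYPE = {"paper-A": "ConfPaper", "paper-B": "Paper", "paper-C": "Paper",
        "paper-D": "Paper", "article-A": "Article",
        "conf-A": "Conference", "conf-B": "Conference"}
ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]

def ancestors(term):
    """Return term followed by all of its super-classes or super-properties."""
    chain = [term]
    while term in SUPER:
        term = SUPER[term]
        chain.append(term)
    return chain

def to_transaction(association):
    """Step 1: one <position, class-or-property> item per term and ancestor."""
    items = set()
    for position, value in enumerate(association, start=1):
        term = TYPE[value] if position % 2 == 0 else value
        items.update((position, t) for t in ancestors(term))
    return frozenset(items)

def frequent_itemsets(transactions, threshold):
    """Step 2: level-wise search for itemsets with support above threshold."""
    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)
    singletons = {frozenset([item]) for t in transactions for item in t}
    level = [s for s in singletons if support(s) > threshold]
    found = {}
    while level:
        for itemset in level:
            found[itemset] = support(itemset)
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c) > threshold]
    return found

def to_pattern(itemset, length=5):
    """Step 3 (simplified): keep itemsets with exactly one item per position."""
    by_position = {}
    for position, term in itemset:
        if position in by_position:
            return None
        by_position[position] = term
    if len(by_position) != length:
        return None
    return tuple(by_position[p] for p in range(1, length + 1))

transactions = [to_transaction(a) for a in ASSOCIATIONS]
patterns = {}
for itemset, sup in frequent_itemsets(transactions, threshold=0.3).items():
    pattern = to_pattern(itemset)
    if pattern is not None:
        patterns[pattern] = sup
for pattern, sup in sorted(patterns.items()):
    print(sup, pattern)
```

A production system would more likely use an established miner (for example FP-growth); the brute-force candidate generation here is only practical for toy inputs.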

Page 16: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Step 2: Finding k frequent, informative, and small-overlapping patterns

• Frequency (as previous)

• Informativeness

• Overlap

Page 17: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Step 2: Finding k frequent, informative, and small-overlapping patterns

• Frequency (as previous)

• Informativeness

  • informativeness of a class = self-information of its occurrence (more informative = having fewer instances), e.g. ConfPaper > Paper

  • informativeness of a property = entropy of its values (more informative = having more diverse values), e.g. is-author-of > nationality

• Overlap

Alice --author-- Paper --inProcOf-- Conference --role-- Bob
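
The two informativeness measures can be written with the standard definitions of self-information and Shannon entropy. The sketch below uses made-up counts and values; how the authors estimate the probabilities, normalize the scores, or combine them over a whole pattern is not stated on the slide, so those details are assumptions.

```python
import math
from collections import Counter

def class_informativeness(num_instances, num_entities):
    """Self-information -log2 P(class), with P estimated as the instance fraction."""
    return -math.log2(num_instances / num_entities)

def property_informativeness(values):
    """Shannon entropy (in bits) of the observed values of a property."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical data set with 1,000 entities.
paper_score = class_informativeness(400, 1000)      # a broad class
conf_paper_score = class_informativeness(100, 1000)  # a narrower class

# Hypothetical property values: eight distinct papers vs. two nationalities.
author_of_score = property_informativeness([f"paper-{i}" for i in range(8)])
nationality_score = property_informativeness(["China"] * 6 + ["Italy"] * 2)

print(conf_paper_score > paper_score)       # fewer instances, more informative
print(author_of_score > nationality_score)  # more diverse values, more informative
```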

Page 18: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Step 2: Finding k frequent, informative, and small-overlapping patterns

• Frequency (as previous)

• Informativeness

• Overlap

  • Ontological overlap: holding subClassOf/subPropertyOf relations

  • Contextual overlap: matched by common associations in the results

Example of ontological overlap:

Alice --firstAuthor-- Paper --cites-- Paper --author-- Bob

Alice --author-- ConfPaper --inProcOf-- Conference --role-- Bob
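
Both kinds of overlap can be given a simple numeric form. The slide does not define the measures, so the choices below are assumptions: ontological overlap is counted as the number of positions whose terms are in a strict sub/super relation, and contextual overlap is the Jaccard similarity of the sets of matched associations.

```python
# Hypothetical mini-ontology: child -> direct super-class or super-property.
SUPER = {
    "ConfPaper": "Paper",
    "secondAuthor": "author",
    "firstAuthor": "author",
    "reviewer": "role",
    "chair": "role",
}

def ancestors(term):
    """Return term followed by all of its super-classes or super-properties."""
    chain = [term]
    while term in SUPER:
        term = SUPER[term]
        chain.append(term)
    return chain

def related(x, y):
    """True if one term is a strict sub-class or sub-property of the other."""
    return x != y and (x in ancestors(y) or y in ancestors(x))

def ontological_overlap(pattern_a, pattern_b):
    """Number of positions whose terms are ontologically related (assumed measure)."""
    return sum(related(x, y) for x, y in zip(pattern_a, pattern_b))

def contextual_overlap(matched_a, matched_b):
    """Jaccard similarity of the associations matched by two patterns (assumed measure)."""
    union = matched_a | matched_b
    return len(matched_a & matched_b) / len(union) if union else 0.0

# The two patterns from the slide, written as positions 1 to 5.
p1 = ("firstAuthor", "Paper", "cites", "Paper", "author")
p2 = ("author", "ConfPaper", "inProcOf", "Conference", "role")
print(ontological_overlap(p1, p2))  # positions 1 and 2 are related

# Hypothetical matched sets: one pattern matches a3 and a4, the other only a3.
print(contextual_overlap({"a3", "a4"}, {"a3"}))
```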

Page 19: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Formulated as multidimensional 0-1 knapsack

• Find k patterns that maximize frequency × informativeness (goal)

and do not share considerably large overlap (constraints)

• Solved by a greedy algorithm
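
A greedy selection of this kind might look like the following sketch. It is not the authors' algorithm: the candidate records, the pairwise Jaccard overlap, and the `max_overlap` bound are assumptions chosen to show the shape of the approach (rank by frequency × informativeness, skip candidates that overlap too much with those already chosen).

```python
def jaccard(a, b):
    """Share of associations matched by both patterns (assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_top_k(candidates, k, max_overlap):
    """Greedily pick up to k high-scoring patterns with bounded pairwise overlap."""
    ranked = sorted(candidates,
                    key=lambda c: c["frequency"] * c["informativeness"],
                    reverse=True)
    selected = []
    for candidate in ranked:
        if len(selected) == k:
            break
        if all(jaccard(candidate["matched"], s["matched"]) <= max_overlap
               for s in selected):
            selected.append(candidate)
    return selected

# Hypothetical candidates; "matched" holds indices of matched associations.
CANDIDATES = [
    {"name": "P1", "frequency": 0.4, "informativeness": 3.0, "matched": {3, 4}},
    {"name": "P2", "frequency": 0.4, "informativeness": 2.5, "matched": {3, 4}},
    {"name": "P3", "frequency": 0.4, "informativeness": 2.0, "matched": {1, 2}},
    {"name": "P4", "frequency": 0.2, "informativeness": 1.0, "matched": {5}},
]

chosen = [c["name"] for c in select_top_k(CANDIDATES, k=2, max_overlap=0.5)]
print(chosen)
```

Here P2 is skipped even though it scores second, because it matches exactly the same associations as P1.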

Page 20: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Exploration methods (2)

• Clustering

• Facets

  • facet values = classes of entities and properties appearing in associations in the results

  • Problem: To recommend k facet values (solved in a similar way)

Example: Alice --secondAuthor-- paper-A --inProcOf-- conf-A --reviewer-- Bob

facet values (classes): ConfPaper, Paper, Conference
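
Facet values can be collected directly from the associations and then used as filters. In this sketch the ontology and most entity types are assumptions (only paper-A's and conf-A's classes appear in the slides), property facets are taken literally without super-properties, and the function names are illustrative.

```python
from collections import Counter

# Hypothetical mini-ontology: child -> direct super-class.
SUPER = {"ConfPaper": "Paper"}
# Entity types; all but paper-A and conf-A are assumptions.
TYPE = {"paper-A": "ConfPaper", "paper-B": "Paper", "paper-C": "Paper",
        "paper-D": "Paper", "article-A": "Article",
        "conf-A": "Conference", "conf-B": "Conference"}
ASSOCIATIONS = [
    ("secondAuthor", "paper-A", "inProcOf", "conf-A", "reviewer"),
    ("firstAuthor", "paper-B", "inProcOf", "conf-B", "chair"),
    ("firstAuthor", "paper-B", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "cites", "paper-C", "firstAuthor"),
    ("secondAuthor", "paper-D", "extends", "article-A", "firstAuthor"),
]

def ancestors(term):
    """Return term followed by all of its super-classes."""
    chain = [term]
    while term in SUPER:
        term = SUPER[term]
        chain.append(term)
    return chain

def facet_values(association):
    """Classes of the intermediate entities, and the properties, of one association."""
    classes, properties = set(), set()
    for index, value in enumerate(association):
        if index % 2 == 1:  # positions 2 and 4 hold entities
            classes.update(ancestors(TYPE[value]))
        else:
            properties.add(value)
    return classes, properties

def class_facet_counts(associations):
    """How many associations carry each class facet value."""
    counts = Counter()
    for association in associations:
        counts.update(facet_values(association)[0])
    return counts

def filter_by_class(associations, cls):
    """Keep only the associations that contain an entity of the given class."""
    return [a for a in associations if cls in facet_values(a)[0]]

classes, properties = facet_values(ASSOCIATIONS[0])
print(sorted(classes), sorted(properties))
counts = class_facet_counts(ASSOCIATIONS)
print(counts.most_common())
conference_only = filter_by_class(ASSOCIATIONS, "Conference")
print(len(conference_only))
```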

Page 21: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Demo based on DBpedia: ws.nju.edu.cn/explass

Page 22: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Demo based on DBpedia: ws.nju.edu.cn/explass

facet values (classes)

facet values (properties)

Page 23: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Demo based on DBpedia: ws.nju.edu.cn/explass

an expanded pattern

a collapsed pattern

associations not matching any pattern above

Page 24: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

User study

• 26 association exploration tasks over DBpedia

  • Derived from QALD queries and “People also search for”

• Example (from QALD): Suppose you will write an article about the associations between Abraham Lincoln and George Washington. Use the given system to explore their associations and identify several themes to discuss in the article.

• 20 subjects

• 3 approaches

  • Explass: clustering + facets

  • RelClus: clustering into a hierarchy of patterns

  • RF: facets only (similar to RelFinder)

Page 25: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Post-task questionnaire results

Page 26: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Usability scores (SUS)

Page 27: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

User behavior

Page 28: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Conclusion

1. Provide patterns wisely.

  • To avoid deep, complicated hierarchy

  • To avoid very general, almost meaningless concepts

2. Combine patterns and facets wisely.

  • Patterns as meaningful summaries of results

  • Facets as filters for refining the search

Page 29: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Future work

• Performance optimization

  • (online) path finding

  • (online) frequent itemset mining

• Exploring associations between several entities, or a data set

Page 30: Explass: Exploring Associations between Entities via Top-K Ontological Patterns and Facets

Questions?