Generating Researcher Networks with Identified Persons on a Semantic Service Platform

Copyright © 2004-2009, KISTI. BlogTalk2009, 15 Sep. 2009. Hanmin Jung (KISTI)

Page 1

Generating Researcher Networks with Identified Persons on a Semantic Service Platform

15 Sep. 2009

Hanmin Jung

KISTI

Page 2

Agenda

Researcher networks would be useful for finding:

Collaborators

Speakers (key persons of a researcher group)

Issues:

Getting sources

Resolving identities

Finding experts

Generating networks

Page 3

Getting sources …

Page 4

Sources

Identified Entities

Papers: 453,124 Elsevier international journal papers with full texts and metadata

Persons: 1,352,220

Topics: 339,947

Institutions: 91,514

Locations: 409,575 (with GPS coordinates)

RDF Triples: 283,087,518 (2008.11)

Page 5

Resolving identities …

How to resolve identities?

How to merge different identifiers into one?

Page 6

OntoFrame

[Figure: architecture of the OntoFrame 2008 Service. Labeled components: RDF Triple Store, OntoURI®, OntoReasoner®, Listener, Triple Generator, Search Engine, Ontologies (Ontology Schemata, Ontology Instances), Field Information, DB Tables, and Legacy DB Tables. Labeled interfaces and data flows: WS API/SPARQL, WS API, XML, SQL/Expanded Triples, Answers.]

Page 7

Ontology

Reference and Academic Knowledge Ontologies

Page 8

OntoFrame

Syntactic-to-Semantic Process

[Figure: process diagram divided into a Modeling-Time Process, an Indexing-Time Process, and a Run-Time Process. Step labels in the figure: Select Database & Ontology, Design Ontology Model, Edit Mapping Rules, Edit URI Generation Rules, Edit Identity Resolution Rules, Test Mapping Process, Crawl Database, Normalize Field Values, Extract Topics, Resolve Identities, Refer Authority Data, Apply sameAs Relations, Apply Mapping Rules, Apply Identity Resolution Rules, Apply URI Generation Rules, Assign URIs, Generate RDF Triples.]

Page 9

Identity Resolution

One or two persons?

Case 1: "Barry G.T. Lowden" and "Barry Lowden"

Case 2: "Christian Becker" and "Christian Becker"

Page 10

Identity Resolution

Rules for Resolving Personal Identities

Class   Resource        Kind     Match   Relation  Source        Weight
Person  Order 1
Person  Name            Pivot    Exact   Single    OntoURI
Person  hasInstitution  Feature  Exact   Single    OntoURI       2
Person  Email           Feature  Number  Single                  4
Person  hasCoauthor     Feature  Number  Multiple  OntoReasoner  1
Person  hasTopic threshold 0.8
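
The rule table can be read as a weighted comparison of two person records that share the same pivot name. The Python sketch below is one possible reading, not OntoFrame's implementation. It assumes that the score is the summed weight of the matching features divided by the total weight, and that two records are merged when the score reaches the 0.8 threshold. The record layout, the function names, and the simplification of co-author matching to "at least one shared co-author" are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Weights copied from the rule table; how they are combined is an assumption.
WEIGHTS = {"institution": 2, "email": 4, "coauthors": 1}
THRESHOLD = 0.8


@dataclass
class PersonRecord:
    name: str
    institution: str = ""
    email: str = ""
    coauthors: set = field(default_factory=set)


def similarity(a: PersonRecord, b: PersonRecord) -> float:
    """Return a score in [0, 1]; 0.0 when the pivot (name) differs."""
    if a.name != b.name:  # pivot: exact match required
        return 0.0
    matched = 0
    if a.institution and a.institution == b.institution:
        matched += WEIGHTS["institution"]
    if a.email and a.email == b.email:
        matched += WEIGHTS["email"]
    if a.coauthors & b.coauthors:  # at least one shared co-author
        matched += WEIGHTS["coauthors"]
    return matched / sum(WEIGHTS.values())


def same_person(a: PersonRecord, b: PersonRecord) -> bool:
    """Decide whether two records should be merged under the assumed scoring."""
    return similarity(a, b) >= THRESHOLD
```

Under these assumptions, two "Christian Becker" records with the same institution and email score 6/7 and are merged, while two records that share only one co-author score 1/7 and stay separate.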

Page 11

Identity Resolution

Authority Data

Normalized Form  Variant Form                                 Kind          Class
IBM              International Business Machines Corporation  Abbreviation  Institution
Microsoft        MS                                           Abbreviation  Institution
Microsoft        마이크로소프트                               Korean        Institution
London           런던                                         Korean        Location
Academic Inc.    Academic Press Inc, LTD                      Alternative   Publication
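
Authority data of this kind can be applied as a lookup from a variant form to its normalized form. The Python sketch below uses only the five rows shown on the slide. The dictionary layout and the `normalize` function are hypothetical and are not part of OntoFrame.

```python
# (class, variant form) -> normalized form, using the rows from the slide.
AUTHORITY = {
    ("Institution", "International Business Machines Corporation"): "IBM",
    ("Institution", "MS"): "Microsoft",
    ("Institution", "마이크로소프트"): "Microsoft",
    ("Location", "런던"): "London",
    ("Publication", "Academic Press Inc, LTD"): "Academic Inc.",
}


def normalize(entity_class: str, value: str) -> str:
    """Return the normalized form, or the stripped input if no variant is registered."""
    cleaned = value.strip()
    return AUTHORITY.get((entity_class, cleaned), cleaned)
```

Keying the lookup by class as well as by variant keeps a string such as "MS" from being rewritten when it appears in a field of another class.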

Page 12

Identity Resolution

sameAs

Authorization

Page 13

Identity Resolution

sameAs

Candidates

Page 14

ReSIST (2006 ~ 2008)

Page 15

ReSIST (2006 ~ 2008)

Resilience Knowledge Base

"Deliverable D31: Final Workshop report" by ReSIST

Page 16

LOD Project

http://richard.cyganiak.de/2007/10/lod/

Linking Open Data Community Project

Available in RDF and SVG (Scalable Vector Graphics) versions

KISTI

Page 17

Finding experts …

How to extract topics?

How to determine topics of a researcher?

Page 18

Topic Extraction

System Architecture

Page 19

Topic Propagation

Propagating Topics of Entities

[Figure: topic propagation between Article and Person.]
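
One way to picture topic propagation is that each person accumulates the topics of the articles they wrote. The Python sketch below assumes that direction (from article to person) and a simple occurrence count. Neither the direction nor any weighting scheme is spelled out on the slide, and the input dictionaries are hypothetical.

```python
from collections import Counter, defaultdict


def propagate_topics(article_topics, article_authors):
    """Count, per person, how often each topic occurs across their articles.

    article_topics:  article id -> list of topic labels
    article_authors: article id -> list of person ids
    """
    person_topics = defaultdict(Counter)
    for article, authors in article_authors.items():
        for person in authors:
            person_topics[person].update(article_topics.get(article, ()))
    return person_topics
```

The resulting counts can then be ranked per person to pick that person's leading topics.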

Page 20

Experts Finding

Process

Knowledge expansion: making direct relations for a shorter access path

Experts retrieval: querying with SPARQL for a given topic, converting SPARQL to SQL, and using a backward chaining path

Post-processing: grouping and counting retrieved authors, ranking by names or the number of achievements, and making an XML document as the result
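
The post-processing step (group, count, rank, emit XML) can be sketched in a few lines of Python. The element and attribute names below are invented for illustration, since the slide does not show the actual XML schema. The sketch also assumes that each retrieved row contributes one achievement to its author.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def experts_to_xml(retrieved_authors, by="count"):
    """Group retrieved authors, rank them, and serialize the ranking as XML.

    by="count" ranks by number of achievements (ties broken by name);
    by="name" ranks alphabetically.
    """
    counts = Counter(retrieved_authors)
    if by == "name":
        ranked = sorted(counts.items(), key=lambda kv: kv[0])
    else:
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    root = ET.Element("experts")  # hypothetical element name
    for name, n in ranked:
        ET.SubElement(root, "expert", name=name, achievements=str(n))
    return ET.tostring(root, encoding="unicode")
```

Calling `experts_to_xml(["Kim", "Lee", "Kim"])` lists Kim before Lee, because Kim appears in two retrieved rows.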

Page 21

Knowledge Expansion

Inference Rule

@prefix isrl: <http://www.kisti.re.kr/isrl/ResearchRefOntology#>

(?x isrl:hasCreatorInfo ?y) (?y isrl:hasCreator ?z) -> (?x isrl:createdByPerson ?z)

[Figure: an Article is linked to a CreatorInfo by hasCreatorInfo, and the CreatorInfo is linked to a Person by hasCreator; the rule adds the direct relation createdByPerson from the Article to the Person.]
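
The effect of this rule can be reproduced on a small in-memory set of triples. The Python sketch below is a hand-written stand-in for a rule engine and does not reflect how OntoReasoner evaluates rules. Triples are plain (subject, predicate, object) tuples, and the function name is hypothetical.

```python
ISRL = "http://www.kisti.re.kr/isrl/ResearchRefOntology#"


def expand_created_by_person(triples):
    """Apply (?x hasCreatorInfo ?y) (?y hasCreator ?z) -> (?x createdByPerson ?z)."""
    triples = set(triples)
    # Index: CreatorInfo node -> persons it points to.
    creators = {}
    for s, p, o in triples:
        if p == ISRL + "hasCreator":
            creators.setdefault(s, set()).add(o)
    # Join every hasCreatorInfo triple with the index to derive the shortcut.
    inferred = set()
    for s, p, o in triples:
        if p == ISRL + "hasCreatorInfo":
            for person in creators.get(o, ()):
                inferred.add((s, ISRL + "createdByPerson", person))
    return triples | inferred
```

Materializing the shortcut once means later queries can reach a person from an article in one step instead of two.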

Page 22

Experts Retrieval

Backward Chaining Path

Page 23

Generating networks …

How to find a researcher group?

How about similar researchers?

Page 24

OntoFrame 2008

Page 25

Researcher Networks (T, P)

Page 26

Researcher Networks (T, P)

Process

Getting co-author pairs for a target topic (T)

# startYear, endYear, and <topURI> are placeholders filled in per request.
SELECT DISTINCT ?person1 ?person2
WHERE {
  ?article aca:yearOfAccomplishment ?year .
  FILTER(?year >= startYear && ?year <= endYear) .
  ?article aca:hasTopicOfArticle <topURI> .
  ?article aca:createdByPerson ?person1 .
  ?article aca:createdByPerson ?person2 .
  # Keep each pair once. STR() is used because "<" is not defined for IRIs in standard SPARQL.
  FILTER(STR(?person1) < STR(?person2)) .
}

Selecting a target researcher (P) in the pairs

Tracing group members connected with that researcher (the seed)
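
The last step, tracing group members from the seed, can be sketched as a breadth-first traversal over the co-author pairs returned by the query. The Python sketch below assumes that "connected" is meant transitively, so co-authors of co-authors are included. The slide does not say how far the tracing goes, and the function name is hypothetical.

```python
from collections import defaultdict, deque


def trace_group(pairs, seed):
    """Return every researcher reachable from `seed` through co-author pairs."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    group = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        # Visit co-authors that are not yet part of the group.
        for other in neighbors[current] - group:
            group.add(other)
            queue.append(other)
    return group
```

With the pairs (A, B), (B, C), and (D, E) and seed A, the group is A, B, and C; D and E form a separate group for the same topic.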

Page 27

Researcher Networks (P)

Page 28

Researcher Networks (P)

Process

Getting co-author pairs including a target researcher (P)

# startYear, endYear, and <perURI> are placeholders filled in per request.
SELECT ?per1 ?per2
WHERE {
  ?article aca:yearOfAccomplishment ?year .
  FILTER(?year >= startYear && ?year <= endYear) .
  ?article aca:createdByPerson ?per1 .
  ?article aca:createdByPerson ?per2 .
  # Keep each pair once per article. STR() is used because "<" is not defined for IRIs in standard SPARQL.
  FILTER(STR(?per1) < STR(?per2)) .
  FILTER(?per1 = <perURI> || ?per2 = <perURI>) .
}

Ranking them by the frequency of co-authorship
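
Because the query has no DISTINCT, a pair is returned once for every co-authored article, so counting rows gives the co-authorship frequency. The Python sketch below ranks the co-authors of the target researcher on that basis. The function name and the alphabetical tie-breaking are assumptions.

```python
from collections import Counter


def rank_coauthors(pairs, target):
    """Rank co-authors of `target` by how many result rows they share with it."""
    counts = Counter()
    for per1, per2 in pairs:
        if per1 == target:
            counts[per2] += 1
        elif per2 == target:
            counts[per1] += 1
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

For the rows (A, P), (P, B), (A, P) and target P, the ranking is A with two co-authored articles followed by B with one.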

Page 29

Similar Researchers

Page 30

Similar Researchers (P)

Process (1/2)

Getting topics of a target researcher (P)

# <perURI> is a placeholder for the target researcher's URI.
SELECT ?per1 ?topic
WHERE {
  ?article aca:createdByPerson ?per1 .
  ?article aca:hasTopicArea ?topicArea .
  ?topicArea aca:hasTopicOfTopicArea ?topic .
  FILTER(?per1 = <perURI>) .
}

Ranking and selecting the top n topics for that researcher
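
Ranking and selecting the top n topics can be done by counting how often each topic appears in the query result. The Python sketch below assumes that ranking means ordering by occurrence count and defaults n to 5, to match the five topic placeholders used in the second half of the process. Both choices are assumptions rather than something the slide states.

```python
from collections import Counter


def top_topics(rows, n=5):
    """Return the n most frequent topics from (person, topic) result rows."""
    counts = Counter(topic for _person, topic in rows)
    # Most frequent first; ties are broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _count in ranked[:n]]
```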

Page 31

Similar Researchers

Process (2/2)

Getting researchers who largely share topics with that researcher

# <perURI> and <topic[0]> to <topic[4]> are placeholders: the target researcher and that researcher's top five topics.
SELECT DISTINCT ?per2
WHERE {
  ?per2 aca:hasTopicOfPerson ?topic1 .
  ?per2 aca:hasTopicOfPerson ?topic2 .
  ?per2 aca:hasTopicOfPerson ?topic3 .
  ?per2 aca:hasTopicOfPerson ?topic4 .
  FILTER(?per2 != <perURI>) .
  # Four distinct topics in a fixed order. STR() is used because "<" is not defined for IRIs in standard SPARQL.
  FILTER(STR(?topic1) < STR(?topic2) && STR(?topic2) < STR(?topic3) && STR(?topic3) < STR(?topic4)) .
  # Each of the four topics must be one of the five target topics.
  FILTER(?topic1=<topic[0]> || ?topic1=<topic[1]> || ?topic1=<topic[2]> || ?topic1=<topic[3]> || ?topic1=<topic[4]>) .
  FILTER(?topic2=<topic[0]> || ?topic2=<topic[1]> || ?topic2=<topic[2]> || ?topic2=<topic[3]> || ?topic2=<topic[4]>) .
  FILTER(?topic3=<topic[0]> || ?topic3=<topic[1]> || ?topic3=<topic[2]> || ?topic3=<topic[3]> || ?topic3=<topic[4]>) .
  FILTER(?topic4=<topic[0]> || ?topic4=<topic[1]> || ?topic4=<topic[2]> || ?topic4=<topic[3]> || ?topic4=<topic[4]>) .
}
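
The query amounts to asking which other researchers have at least four of the target researcher's five top topics. The same condition can be written as a set intersection, as in the Python sketch below. The `person_topics` mapping, the function name, and the `min_shared` parameter are hypothetical; only the "four of five" condition comes from the query.

```python
def similar_researchers(person_topics, target, top_topics, min_shared=4):
    """Return researchers sharing at least `min_shared` of the target's top topics.

    person_topics: person id -> iterable of topic ids for that person
    top_topics:    the target researcher's top topics
    """
    wanted = set(top_topics)
    return sorted(
        person
        for person, topics in person_topics.items()
        if person != target and len(wanted & set(topics)) >= min_shared
    )
```

Expressing the condition as a count also makes it easy to relax or tighten the overlap without rewriting the filter for every combination of topics.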

Page 32

Conclusions

Processes to Generate Researcher Networks

Getting sources: Papers

Resolving identities: Rules, Authority data, sameAs

Finding experts: Topics, Reasoning

Generating networks: Topic- and Person-constrained

Next Research Topic

Service mashup to get researcher networks directly

Page 33

Thank you

“A lot of times, people don’t know what they want until you show it to them.”

by Steve Jobs