SEDE: An Ontology For Scholarly Event Description

Senator Jeong, Biomedical Knowledge Engineering Lab., Seoul National University

description

I present the design and implementation of an ontology for scholarly event description (SEDE) that provides a backbone to represent, collect, share and allow inference from scholarly event information.

Transcript of SEDE: An Ontology For Scholarly Event Description

Page 1: SEDE:  An Ontology For Scholarly Event Description

SEDE: An Ontology for Scholarly Event Description

Senator Jeong

Biomedical Knowledge Engineering Lab., Seoul National University

Page 2: SEDE:  An Ontology For Scholarly Event Description

Publications

Senator Jeong. Toward Scholarly Event Digital Library Services. Bulletin of IEEE Technical Committee on Digital Libraries. Fall 2008;4(2).

Senator Jeong, Hong-Gee Kim. “SEDE: An Ontology for Scholarly Event Description”. Journal of Information Science. [in press] DOI: 10.1177/0165551509358487.

Senator Jeong, Sungin Lee, Hong-Gee Kim. “Are You an Invited Speaker?: A Bibliometric Analysis of Elite Groups for Scholarly Events in Bioinformatics”. Journal of the American Society for Information Science and Technology. 2009;60(6). pp.1118-1131.

Senator Jeong, Hong-Gee Kim. “Intellectual Structure of Biomedical Informatics reflected in Scholarly Events”. Scientometrics. [in press].

Page 3: SEDE:  An Ontology For Scholarly Event Description

Table of Contents

1 Introduction & Background

2 Generic Event Model

3 The SEDE Model & Implementation

4 Application Use Case Scenarios

5 Ontology Evaluation

6 Discussion & Conclusion

Page 4: SEDE:  An Ontology For Scholarly Event Description

INTRODUCTION & BACKGROUND

Page 5: SEDE:  An Ontology For Scholarly Event Description

Scholarly Events

• Conferences, Workshops, Seminars, Symposia
• A sequentially and spatially organized collection of scholars’ interactions
• with the intention of
– Delivering and sharing knowledge,
– Exchanging research ideas, and
– Performing related activities.

Page 6: SEDE:  An Ontology For Scholarly Event Description

Scholarly Events

Publish up-to-date scientific research results,

Get feedback from scientific communities

Exchange research interests and ideas with each other

Demonstrate current research trends

Page 7: SEDE:  An Ontology For Scholarly Event Description

Information Needs wrt Scholarly Events

Basic information needs

• Event Name, Topics
• Event Date, Venue, Organizer
• Due dates for Calls for Papers

A scientist does not get a full and exhaustive picture of the scholarly events held around the world

• Due to the sheer volume of events held by various academic societies and organizations
• No single information channel has been successful at keeping track of the ever-growing number of conferences and providing their information to scientists

Page 8: SEDE:  An Ontology For Scholarly Event Description

Information Needs wrt Scholarly Events

Scientifically meaningful inference
• prominent scientists
• prominent events
• best scientists suited for consultations and collaboration

These needs might be met partially, at a minimal level
• since almost all event websites list leadership members such as general chairs, committee members, invited speakers and/or award winners
• Users are not able to get the whole picture
• Existing library services do not provide this kind of meaningful information in an integrated and collective manner

Page 9: SEDE:  An Ontology For Scholarly Event Description

Research Goal

Satisfy scientists’ basic information needs
• by collecting, archiving and providing access to scholarly event information.

Satisfy users’ in-depth information needs
• by excavating scientifically meaningful information through reasoning about knowledge

Define a description base for scholarly events
• to enable software agents to crawl and extract event data, and
• to facilitate unified access to, and reasoning about, the collected data

Page 10: SEDE:  An Ontology For Scholarly Event Description

Previous Work

• EventSeer, PapersInvited, Conference Alerts
– focus on calls for papers
– simple metadata about forthcoming events
– proprietary description formats

• Semantic Web Conference ontology
– best only for the ESWC conference

• Event Driven Model
– ABC ontology, INDECS, OntologX, FRBR, CIDOC-CRM, Enterprise Architecture, Event Ontology

Page 11: SEDE:  An Ontology For Scholarly Event Description

GENERIC EVENT MODEL

Provide enough descriptive power and granularity to span multiple scientific disciplines and capture as many varied event types as possible

Page 12: SEDE:  An Ontology For Scholarly Event Description

Generic Event Model

Event ≡ (∃Agent) ∧ (∃Action) ∧ (∃Entity) ∧ (∃Place) ∧ (∃Time)

• Agent (Who): “John Smith”
• Action (How): “Present”
• Entity (What): “Biomedical Modeling”
• Place (Where): “Washington, DC”
• Time (When): “2008-11-08”

Example (Presentation Event):

(∃Agent(John.Smith)) ∧ (∃Action(present)) ∧ (∃Entity(Biomedical Modelling)) ∧ (∃Place(Washington)) ∧ (∃Time(2008-11-08))
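The five-facet definition can be sketched in Python as a record that refuses to exist unless every facet is filled. This is only an illustration of the conjunction in the generic event model; the class name `GenericEvent` and the use of plain strings for each facet are assumptions of this sketch, not part of SEDE.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GenericEvent:
    """One event in the generic model: all five facets must be present."""
    agent: str   # Who
    action: str  # How
    entity: str  # What
    place: str   # Where
    time: str    # When (kept as a plain string in this sketch)

    def __post_init__(self):
        # Event ≡ ∃Agent ∧ ∃Action ∧ ∃Entity ∧ ∃Place ∧ ∃Time:
        # reject any event that leaves a facet empty.
        missing = [f.name for f in fields(self) if not getattr(self, f.name)]
        if missing:
            raise ValueError(f"missing facets: {missing}")


# The presentation event used as the example on the slide.
presentation = GenericEvent(
    agent="John Smith",
    action="present",
    entity="Biomedical Modeling",
    place="Washington, DC",
    time="2008-11-08",
)
```

Constructing an event with an empty facet, for example an empty place, raises `ValueError`, which mirrors the requirement that all five existential conditions hold.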

Page 13: SEDE:  An Ontology For Scholarly Event Description

The classes of the generic event model

Page 14: SEDE:  An Ontology For Scholarly Event Description

THE SEDE MODEL & IMPLEMENTATION

Ontology modelling principle

Scholarly event description structure

Key concepts in the SEDE ontology

n-ary relations and reification heuristics

Ontology improvement

Page 15: SEDE:  An Ontology For Scholarly Event Description

Scholarly Event Description Structure

[Figure: nested boxes labelled Scholarly Event, Track, Session and AtomEvent, showing an event broken down into tracks, sessions and atom events.]

Page 16: SEDE:  An Ontology For Scholarly Event Description

[Figure: SEDE ontology diagram.]

Classes: Event, Event Series, AtomEvent, Track, Session, Program, Call, Committee, CommitteeRole, Role, Presentation, Paper, Proceedings, Artifact, VideoClip, Time, Place, Country, City, Venue

Reused classes: foaf:Agent, foaf:Person, foaf:Group, foaf:Document, skos:Concept, skos:ConceptScheme, geo:SpatialThing

Properties: hasChildEvent, isMemberEventOf, hasProgram, hasTrack, hasSession, hasAtomEvent, hasSessionChair, hasPresenter, hasCommittee, hasCommitteeRole, playedBy, hasCall, hasTopic, hasTheme, hasProceedings, hasArtifact, heldAt, startDate, endDate, skos:inScheme
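A rough sketch of how the containment properties could be walked, assuming triples are held as plain Python tuples. The property names (`hasTrack`, `hasSession`, `hasAtomEvent`, `hasChildEvent`, `hasPresenter`) are taken from the diagram; the `ex:` instances and the assumption that a Track is the subject of `hasSession` are hypothetical.

```python
# Triples are (subject, predicate, object) tuples. The ex: instances are
# hypothetical; the sede: property names come from the ontology diagram.
EVENT_TRIPLES = [
    ("ex:Conf2009", "sede:hasTrack", "ex:ResearchTrack"),
    ("ex:ResearchTrack", "sede:hasSession", "ex:Session1"),
    ("ex:Session1", "sede:hasAtomEvent", "ex:Talk1"),
    ("ex:Session1", "sede:hasAtomEvent", "ex:Talk2"),
    ("ex:Talk1", "sede:hasPresenter", "ex:JohnSmith"),
]

# Properties treated as "contains" links in this sketch.
CONTAINMENT = {
    "sede:hasTrack",
    "sede:hasSession",
    "sede:hasAtomEvent",
    "sede:hasChildEvent",
}


def descendants(node, triples):
    """Return every node reachable from `node` through containment links,
    depth first, in the order the triples are listed."""
    found = []
    for subject, predicate, obj in triples:
        if subject == node and predicate in CONTAINMENT:
            found.append(obj)
            found.extend(descendants(obj, triples))
    return found


parts = descendants("ex:Conf2009", EVENT_TRIPLES)
```

Here `parts` lists the track, its session and the two talks; the presenter is not included because `hasPresenter` is not a containment link.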

Page 17: SEDE:  An Ontology For Scholarly Event Description
Page 18: SEDE:  An Ontology For Scholarly Event Description

RDFS/OWL


Page 19: SEDE:  An Ontology For Scholarly Event Description

http://eventography.org/sede

Page 20: SEDE:  An Ontology For Scholarly Event Description

http://eventography.org/sede

Page 21: SEDE:  An Ontology For Scholarly Event Description

UML representation of Scholarly Event.


Page 22: SEDE:  An Ontology For Scholarly Event Description

The reified relationship between Committee and Agent via CommitteeRole
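The reification can be illustrated with a small Python sketch: instead of linking a committee directly to an agent, an intermediate role node carries both the role label and the agent. The property names `hasCommitteeRole` and `playedBy` appear in the ontology diagram; the `ex:` instances, the use of `rdfs:label` for the role name, and the tuple representation are assumptions made here for illustration.

```python
# Hypothetical committee data. Each role node (ex:role1, ex:role2) turns the
# three-way relation committee / role / agent into binary triples.
ROLE_TRIPLES = [
    ("ex:ProgramCommittee", "sede:hasCommitteeRole", "ex:role1"),
    ("ex:role1", "rdfs:label", "co-chair"),
    ("ex:role1", "sede:playedBy", "ex:JohnSmith"),
    ("ex:ProgramCommittee", "sede:hasCommitteeRole", "ex:role2"),
    ("ex:role2", "rdfs:label", "member"),
    ("ex:role2", "sede:playedBy", "ex:JaneDoe"),
]


def objects(triples, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def committee_members(triples, committee):
    """Return (agent, role label) pairs by following the reified role nodes."""
    pairs = []
    for role in objects(triples, committee, "sede:hasCommitteeRole"):
        for agent in objects(triples, role, "sede:playedBy"):
            for label in objects(triples, role, "rdfs:label"):
                pairs.append((agent, label))
    return pairs


members = committee_members(ROLE_TRIPLES, "ex:ProgramCommittee")
```

Because the role name is a label on the role node rather than a fixed subclass, variant names such as co-chair or vice-chair need no change to the schema.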

Page 23: SEDE:  An Ontology For Scholarly Event Description

APPLICATION USE CASE SCENARIOS

Page 26: SEDE:  An Ontology For Scholarly Event Description


Ontology-based Information Extraction

Page 27: SEDE:  An Ontology For Scholarly Event Description

Ontology-based Information Extraction

• The limitations of fully automatic information extraction techniques
• The heterogeneous nature of event web pages
• Strategy
– use a simpler approach to data extraction,
– use manually defined patterns of text content and HTML formatting, based on general conventions for listing data in human-readable formats on the web.

Page 28: SEDE:  An Ontology For Scholarly Event Description

Method: Rule-based Pattern Matching

Components: HTML Parser, Text Tokenizer, Extender, Directory, Chainer (Grammar Parser), Realmer, Exporter

Start → HTML Document → Parse HTML → Text string → Tokenize text → Token Array → Assign Tags → Tag Array → Chainer → Realmer → Exporter → Extracted Data → End

• Parse HTML (HTML Parser)
– Opening HTML tags: tr, p, div → newlines; td → tab; li → bullet
– Closing HTML tags: p, table, li, h1-5, br → newlines

• Tokenize text (Text Tokenizer)
– pre-tag
– separate punctuation marks (\n, “”, ,, !, (), :, ;, .)
– append EOF tag
– split text by spaces
– return array of tokens

• Assign Tags (Directory)
– the Directory class calls the ‘createTagIndex’ function
– match tags using regular-expression keyword matches and gazetteer lookup (matchLookup: Regular Expression Keyword, Gazetteer)
– tag form: /aBCD, where a is the tag category and BCD is the tag description

• Chainer (Grammar Parser)
– list of rules (RuleLookup) for identifying similar patterns of tags
– output: string + chain index + chain type

• Realmer
– holds a hierarchy of realms (Add Realm, Modify Realm Data → Realm Data)
– each realm corresponds to a different chain in the document

• Exporter
– data extraction → Extracted Data
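The first two stages, HTML parsing and tokenizing, might look roughly like the following Python sketch. It follows the tag-to-whitespace mapping and tokenizer steps listed on the slide, but the function names, the `<EOF>` marker string and the regular expressions are assumptions of this sketch rather than the original implementation.

```python
import re

# Whitespace markers for opening tags, as listed on the slide.
OPENING = {"tr": "\n", "p": "\n", "div": "\n", "td": "\t", "li": "• "}
# Closing tags that become newlines (h1-5 expanded).
CLOSING = {"p", "table", "li", "h1", "h2", "h3", "h4", "h5", "br"}
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def html_to_text(html):
    """Replace layout tags with whitespace markers and drop all other tags."""
    def replace(match):
        closing, name = match.group(1), match.group(2).lower()
        if closing:
            return "\n" if name in CLOSING else ""
        if name == "br":  # <br> normally appears without a closing form
            return "\n"
        return OPENING.get(name, "")
    return TAG.sub(replace, html)


def tokenize(text):
    """Separate punctuation, split on whitespace, and append an EOF marker."""
    text = text.replace("\n", " \\n ")             # keep newlines as "\n" tokens
    text = re.sub(r'([",!():;.])', r" \1 ", text)  # isolate punctuation marks
    return text.split() + ["<EOF>"]


tokens = tokenize(html_to_text("<p>Submission due date: September 5th, 2009</p>"))
```

Keeping newlines as explicit tokens preserves the line structure of the page, which later pattern rules can use as a boundary.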

Page 29: SEDE:  An Ontology For Scholarly Event Description

Method: Tag Classification

Tag categories (with one example tag each)

• Punctuations: /pCOM
• Literal: /lEML
• Data & Numbers: /iYEA
• Name-Related: /nTTL
• Keywords: /kUNI
• Additional: /xCAP
• Grammar related: /gOF

Grammar category (Tag: Meaning)

• /gART: article (e.g. the|this|its|...)
• /gOF: of
• /gFOR: for
• /gON: on
• /gAT: at
• /gIN: in
• /gABT: about
• /gFRM: from
• /gTO: to | through | until
• /gCNJ: conjunction (and | or | &)

Page 30: SEDE:  An Ontology For Scholarly Event Description

Method: Tag Classification

Keyword category (Tag, Meaning: Example pattern)

• /UNI, university: university|college|academy|Universitat...
• /CTR, center: center|centre|institute|department|division
• /ORG, organization: society|association|council|consortium
• /EVT, event: conference|conf|symposium|meeting|congress|roundtable|colloquium|seminar|summit|convention|forum|program
• /QUA, qualifier: annual|biannual|biennial|interdisciplinary|special|joint|asian|european|international|metropolitan|national|polytechnic|global|graduate|limited|ltd(\\.)?|incorporated|inc(\\.)?|int(\\.)|applied
• /SBJ, subject: (Aeronautics|aerospace|Agriculture|applications|Astronomy|Biology|Biotechnology|Biochemistry|bioinformatics|business|Chemistry|Cryptology|Ecology|economics|Electronics|Energy|Engineering|Environment|Forensics|Geography|health|informatics|information|Mathematics|Mechanical|medicine|Meteorology|Nanotechnology|Oceanography|Paleontology|Physics|Policy|Psychology|Research|science(s)?|security|securities|solution(s)?|Space|systems|technology|Vibrations|Wireless)
• /OTH, other (webpage-related): (Main|Media|Home|you|of|(Us)|((?i)(tutorial|proceeding(s?)|download|PDF|PostScript|HTML|MSWord|LaTex|Format|ASCII|collocated|copyright|see|contact)))
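Tag assignment by keyword patterns can be sketched in Python as an ordered list of rules tried against each token. The tag names follow the /aBCD form from the slides and the keyword lists are shortened from the slide's examples; the year pattern for /iYEA, the rule order and the `None` fallback are assumptions of this sketch.

```python
import re

# (tag, pattern) pairs tried in order; the first full match wins.
TAG_RULES = [
    ("/gOF", re.compile(r"of", re.IGNORECASE)),
    ("/gAT", re.compile(r"at", re.IGNORECASE)),
    ("/gCNJ", re.compile(r"and|or|&", re.IGNORECASE)),
    ("/iYEA", re.compile(r"(19|20)\d{2}")),  # assumed pattern for years
    ("/kUNI", re.compile(r"university|college|academy", re.IGNORECASE)),
    ("/kEVT", re.compile(r"conference|conf|symposium|meeting|congress", re.IGNORECASE)),
]


def assign_tags(tokens):
    """Return one tag per token, or None where no rule matches."""
    tags = []
    for token in tokens:
        for tag, pattern in TAG_RULES:
            if pattern.fullmatch(token):
                tags.append(tag)
                break
        else:
            tags.append(None)  # a gazetteer lookup could be tried here
    return tags


tags = assign_tags(["University", "of", "Texas", "2009"])
```

Using `fullmatch` keeps short grammar patterns such as "or" from firing inside longer words; a real tagger would also consult the gazetteer for names such as "Texas".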

Page 31: SEDE:  An Ontology For Scholarly Event Description

Realms: Example

HTML Text → Realms

• “There were few surprises about the submission of the paper. It will take place at the University of Technology, Brahms, Canada.”
– TEXT_CHUNK, SUBMISSION_MARKER, UNIVERSITY_NAME, COUNTRY

• “Submission due date: September 5th, 2009”
– DEADLINE_CONTAINER, SUBMISSION_MARKER, DATE

• “Notification date: November 6th, 2009”
– DEADLINE_CONTAINER, NOTIFICATION_MARKER, DATE

• “Program Committee: Dolldrum Flannery, University of Texas, USA”
– COMMITTEE_MARKER, AFFILIATION_GROUP, NAME, UNIVERSITY_NAME, COUNTRY
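As a simplified illustration of a deadline realm (a marker followed by a date), the Python sketch below pulls the two dates out of the example text with a single hand-written pattern. It stands in for the chain and realm rules rather than reproducing them; the pattern, the function name and the returned dictionary shape are all assumptions.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Marker word, optional words up to a colon, then "Month Dayth, Year".
DEADLINE = re.compile(
    r"(?P<marker>Submission|Notification)\b[^:\n]*:\s*"
    r"(?P<month>[A-Z][a-z]+)\s+(?P<day>\d{1,2})(?:st|nd|rd|th)?,\s*(?P<year>\d{4})"
)


def extract_deadlines(text):
    """Map each marker (lower-cased) to the date that follows it."""
    found = {}
    for match in DEADLINE.finditer(text):
        month = MONTHS.get(match.group("month"))
        if month is None:  # capitalized word that is not a month name
            continue
        found[match.group("marker").lower()] = date(
            int(match.group("year")), month, int(match.group("day")))
    return found


sample = (
    "There were few surprises about the submission of the paper\n"
    "Submission due date: September 5th, 2009\n"
    "Notification date: November 6th, 2009"
)
deadlines = extract_deadlines(sample)
```

The first sentence mentions a submission but has no marker-plus-date shape, so it yields nothing, which is the distinction the realm labels on the slide draw between a plain text chunk and a deadline container.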

Page 32: SEDE:  An Ontology For Scholarly Event Description

Implementation: Workbench


Page 33: SEDE:  An Ontology For Scholarly Event Description

Implementation: Export to RDF KB


Page 35: SEDE:  An Ontology For Scholarly Event Description

Semantic S&R on Scholarly Events(1)

• Finding events with a specific call-for-paper topic, a submission deadline, and an event start date

SELECT DISTINCT ?Topic ?Event ?Deadline ?Event_Start
WHERE {
  ?x a sede:Event ;
     rdfs:label ?Event ;
     sede:hasCall ?y ;
     sede:startDate ?Event_Start .
  ?y rdfs:label ?Call ;
     sede:hasTopic ?z ;
     sede:submissionDeadline ?Deadline .
  ?z skos:prefLabel ?Topic .
  FILTER ( regex(?Topic, "data mining", "i") || regex(?Topic, "ontolog", "i") )
}
ORDER BY ?Topic

Page 36: SEDE:  An Ontology For Scholarly Event Description

Semantic S&R on Scholarly Events(2)

• Retrieving artifacts from an atom event:

• A user missed an invited talk session on the topic of “semantic search” at the ESWC2008 Conference, so the user searches for the invited talk session covering that topic to retrieve its video clip URI.

Page 38: SEDE:  An Ontology For Scholarly Event Description

SELECT ?Topic ?Presenter ?Video_Clip ?Event ?Session
WHERE {
  ?x a sede:Event ;
     skos:altLabel ?Event ;
     sede:hasSession ?y .
  ?y rdfs:label ?Session ;
     sede:hasAtomEvent ?z .
  ?z rdfs:label ?AtomEvent ;
     sede:hasPresenter ?p ;
     sede:hasArtifact ?c ;
     sede:hasTopic ?t .
  ?p foaf:name ?Presenter .
  ?c dc:identifier ?Video_Clip .
  ?t skos:prefLabel ?Topic .
  FILTER ( regex(?Event, "ESWC")
        && regex(?Session, "invited talk", "i")
        && regex(?Topic, "semantic search", "i") )
}

Page 39: SEDE:  An Ontology For Scholarly Event Description

Semantic S&R on Scholarly Events(3)

• Finding domain experts

SELECT DISTINCT ?Domain ?Expert ?Affiliation
WHERE {
  ?x a sede:Session ;
     sede:hasTopic ?topic ;
     sede:hasSessionChair ?chair .
  ?topic skos:prefLabel ?Domain .
  ?chair foaf:name ?Expert .
  FILTER ( regex(?Domain, "decision", "i") )
  OPTIONAL { ?chair sede:hasAffiliation ?y . ?y foaf:name ?Affiliation . }
}
ORDER BY ?Domain

Page 41: SEDE:  An Ontology For Scholarly Event Description

Coupling of Events and Scientists

\[
\mathrm{sim}(E_i, E_j) = \frac{\sum_{t} w_{t,i}\, w_{t,j}}{\sqrt{\sum_{t} w_{t,i}^{2}}\;\sqrt{\sum_{t} w_{t,j}^{2}}}
\]
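A small Python sketch of the event similarity sim(E_i, E_j) on this slide, read as a cosine over weight vectors. Each event is represented here as a dictionary from an item t to its weight; whether t stands for topics or for scientists is not stated on the slide, so the keys in the example are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse weight vectors (dict: item -> weight)."""
    dot = sum(weight * v[item] for item, weight in u.items() if item in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical weights for two events.
event_a = {"ontology": 1.0, "data mining": 1.0}
event_b = {"ontology": 1.0}
similarity = cosine(event_a, event_b)
```

Only items shared by both events contribute to the numerator, so two events with no items in common get a similarity of zero.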

Page 43: SEDE:  An Ontology For Scholarly Event Description

Domain Knowledge Structure Analysis

(data mining and its usage context in Bioinformatics; cosine ≥ 0.1; k-nn 2; n = 69)

Page 44: SEDE:  An Ontology For Scholarly Event Description

*Co-word Analysis: Assumption

• An article carries Topic A and Topic B
• These two topics are likely to be related

[Figure: articles linked to Topic A, Topic B, Topic C, ……]
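The co-word assumption amounts to counting how often two topics appear on the same article. A minimal Python sketch, with a made-up list of articles:

```python
from collections import Counter
from itertools import combinations


def cooccurrence(articles):
    """Count, for every unordered topic pair, the articles that carry both."""
    counts = Counter()
    for topics in articles:
        # sorted(set(...)) gives each pair once per article, in a fixed order
        for first, second in combinations(sorted(set(topics)), 2):
            counts[(first, second)] += 1
    return counts


# Hypothetical topic assignments for three articles.
articles = [
    ["Topic A", "Topic B"],
    ["Topic A", "Topic B", "Topic C"],
    ["Topic C"],
]
pair_counts = cooccurrence(articles)
```

Topic A and Topic B share two articles here, so under the assumption they are more strongly related than either is to Topic C.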

Page 45: SEDE:  An Ontology For Scholarly Event Description

*Co-word Analysis

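\[
\mathrm{Cosine}(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\left(\sum_{i=1}^{n} x_i^{2}\right)\left(\sum_{i=1}^{n} y_i^{2}\right)}} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}} \times \sqrt{\sum_{i=1}^{n} y_i^{2}}}
\]

\[
W_{i,j} = TF \times IDF = \frac{f_{i,j}}{\sum_{k} n_{k,j}} \times \log \frac{N}{n_i}
\]

[Figure: a document-by-term matrix (d1–d3 × t1–t4) and a term-by-term matrix (t1–t3 × t1–t3), with the labels “Papers from Events”, “Event Topics” and “SNA.dat file”.]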
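The TF × IDF weight W on this slide can be sketched in Python as follows, assuming each document is a list of terms, TF is a term's count divided by the total number of terms in the document, and IDF is the logarithm of the number of documents over the number of documents containing the term. The toy documents are made up, and the natural logarithm is an assumption since the slide does not give a base.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one dict per document mapping term -> TF x IDF weight."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical documents over three terms.
docs = [["t1", "t2"], ["t1", "t3"], ["t1"]]
doc_weights = tf_idf(docs)
```

A term that occurs in every document, such as t1 here, gets weight zero, so it does not contribute to cosine similarities computed on these weights.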

Page 46: SEDE:  An Ontology For Scholarly Event Description

*Tool: BiKE Text Analyzer (BTA)

• Java Application

• Vocabulary Manager

• Synonym Manager

• Stopword Manager

• Stemming Manager


Page 47: SEDE:  An Ontology For Scholarly Event Description

*Tool: BTA: Identify variables


Page 48: SEDE:  An Ontology For Scholarly Event Description

*Tool: BTA: SNA data file


Page 50: SEDE:  An Ontology For Scholarly Event Description

Generation of Domain KOS

<skos:Concept rdf:ID="BiomedicalInformaticsAndComputation">
  <skos:prefLabel>Biomedical informatics and computation</skos:prefLabel>
  <skos:inScheme rdf:resource="#BIBE2007Themes"/>
  <skos:narrower rdf:resource="#Bio-molecularAndPhylogeneticDatabases"/>
  <skos:narrower rdf:resource="#DataVisualization"/>
  <skos:narrower rdf:resource="#Interoperability"/>
  <skos:narrower rdf:resource="#BiomedicalImaging"/>
  <skos:narrower rdf:resource="#DrugDiscoveryGeneExpressionAnalysis"/>
  <skos:narrower rdf:resource="#MolecularEvolutionAndPhylogeny"/>
  <skos:narrower rdf:resource="#Bio-Ontology"/>
  <skos:narrower rdf:resource="#BioinformaticsEngineering"/>
  <skos:narrower rdf:resource="#ProteinStructurePredictionAndMolecularSimulation"/>
  <skos:narrower rdf:resource="#SystemBiology"/>
  <skos:narrower rdf:resource="#SignalingAndComputationBiomedicalDataEngineering"/>
  <skos:narrower rdf:resource="#ModelingAndSimulation"/>
  <skos:narrower rdf:resource="#QueryLanguages"/>
  <skos:narrower rdf:resource="#SequenceSearchAndAlignment"/>
  <skos:narrower rdf:resource="#Proteomics"/>
  <skos:narrower rdf:resource="#Telemedicine"/>
  <skos:narrower rdf:resource="#FunctionalGenomics"/>
  <skos:narrower rdf:resource="#IdentificationAndClassificationOfGenes"/>
  <skos:narrower rdf:resource="#Biolanguages"/>
</skos:Concept>

<skos:Concept rdf:ID="Semantic_Web">
  <skos:prefLabel>Semantic Web</skos:prefLabel>
  <skos:inScheme rdf:resource="#ICSD2009CfPTopics"/>
  <skos:topConceptOf rdf:resource="#ICSD2009CfPTopics"/>
  ……………..
  <skos:narrower rdf:resource="#Knowledge_Organization_and_Ontologies"/>
</skos:Concept>

<skos:Concept rdf:ID="Bio-Ontologies">
  <skos:prefLabel>Bio-Ontologies</skos:prefLabel>
  <skos:inScheme rdf:resource="#Bio-OntologiesBioLink2006Topics"/>
  <skos:narrower rdf:resource="#Current_Research_In_Ontology_Languages_and_its_implication_for_Bio-Ontologies"/>
  <skos:narrower rdf:resource="#Biological_Applications_of_Ontologies"/>
  <skos:narrower rdf:resource="#Reports_on_Newly_Developed_or_Existing_Bio-Ontologies"/>
  <skos:narrower rdf:resource="#Tools_for_Developing_Ontologies"/>
  <skos:narrower rdf:resource="#Use_of_Semantic_Web_technologies_in_Bioinformatics"/>
  <skos:narrower rdf:resource="#The_implications_of_Bio-Ontologies_or_the_Semantic_Web_for_the_drug_discovery_process"/>
</skos:Concept>

<skos:Concept rdf:ID="ComputingLearningOrBehaviour">
  <skos:prefLabel>Computing learning or behaviour</skos:prefLabel>
  <skos:topConceptOf rdf:resource="#BSBT2009Theme"/>
  <skos:inScheme rdf:resource="#BSBT2009Theme"/>
  <rdfs:label>Computing learning or behaviour</rdfs:label>
  <skos:narrower rdf:resource="#Ontologies"/>
  <skos:narrower rdf:resource="#MathematicalBiology"/>
  <skos:narrower rdf:resource="#ModellingLearningInLivingSystems"/>
  <skos:narrower rdf:resource="#TeachingHumanoidRobots"/>
</skos:Concept>

[Figure labels: skos:related, skos:broader, owl:sameAs]
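To show how such SKOS concept descriptions could be read back, here is a Python sketch that lists the narrower concepts of each concept using only the standard library. The sample wraps a shortened copy of one concept from this slide in an `rdf:RDF` root with the standard RDF and SKOS namespace declarations, which the slide excerpt omits; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Shortened copy of one concept, wrapped in a root element so it parses.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:ID="ComputingLearningOrBehaviour">
    <skos:prefLabel>Computing learning or behaviour</skos:prefLabel>
    <skos:inScheme rdf:resource="#BSBT2009Theme"/>
    <skos:narrower rdf:resource="#Ontologies"/>
    <skos:narrower rdf:resource="#MathematicalBiology"/>
  </skos:Concept>
</rdf:RDF>"""


def narrower_concepts(rdf_xml):
    """Map each concept's prefLabel to the IDs of its narrower concepts."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for concept in root.findall("skos:Concept", NS):
        label = concept.findtext("skos:prefLabel", namespaces=NS)
        result[label] = [
            child.get(RDF_RESOURCE).lstrip("#")
            for child in concept.findall("skos:narrower", NS)
        ]
    return result


hierarchy = narrower_concepts(SAMPLE)
```

A full RDF library would resolve the fragment identifiers against a base URI; this sketch only strips the leading `#` to recover the local IDs.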

Page 52: SEDE:  An Ontology For Scholarly Event Description

Academic Performance Evaluation


Page 53: SEDE:  An Ontology For Scholarly Event Description

Scholar’s Prominence Evaluation

Definition (1): Prominence of Scholar S

[Equation: P(S), computed as a sum over elite group types t ∈ T. Quantities labelled on the slide: weight (w), number of elite group memberships (k), field (f), number of events in a specific field, and normalizer.]

Page 54: SEDE:  An Ontology For Scholarly Event Description

Scholarly Event’s Prominence Evaluation Metrics


Page 55: SEDE:  An Ontology For Scholarly Event Description

Scholarly Event’s Prominence Evaluation Metrics

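Definition (2): Event’s Prominence

[Equation: P(E), computed as a sum of scholar’s prominence P(S) (Def. 1) over s ∈ S. Quantities labelled on the slide: scholar’s prominence (Def. 1) and number of elite group members for an event that belong to a specific field (f).]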

Page 56: SEDE:  An Ontology For Scholarly Event Description

Event Series’ Prominence Evaluation


Page 57: SEDE:  An Ontology For Scholarly Event Description

Event Series’ Prominence Evaluation

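Definition (3): Event Series Prominence

[Equation: event series prominence, computed as a sum of event prominence P(E) (Def. 2) over g ∈ G. Quantities labelled on the slide: event prominence (Def. 2) and number of event instances (e.g., AMIA2009) belonging to an event series (AMIA) in a given subject field (Medical Informatics).]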

Page 58: SEDE:  An Ontology For Scholarly Event Description

ONTOLOGY EVALUATION

Page 59: SEDE:  An Ontology For Scholarly Event Description

Ontology Evaluation

Page 60: SEDE:  An Ontology For Scholarly Event Description

Ontology Evaluation

Competency questions: SEDE vs. SWC

• Does it have a container for topics?
– SEDE: Yes. It uses SKOS to describe topics.
– SWC: No. It uses SWRC’s research topic, which has a limited number of topics.

• Does it have a container for committees?
– SEDE: Yes. It has the Committee class.
– SWC: No.

• Does it identify various roles in a committee?
– SEDE: Yes. It defines a generic class Role identifiable with a label.
– SWC: No. It enumerates Chair, Delegate, Presenter and Program Committee Member, leaving no mechanism to identify variant names such as co-chair, vice-chair, founder, etc.

• Does it support the representation of an event’s structure in a flexible way?
– SEDE: Yes. It is more flexible than SWC, in that it furnishes classes from the top level (Event) down to the leaf level (AtomEvent).
– SWC: Arguable. The WorkshopEvent, TutorialEvent, ConferenceEvent and PanelEvent classes should be deprecated, since they can be described with top-level classes such as AcademicEvent, TrackEvent and SessionEvent.

• Does it have a container for Call?
– SEDE: Yes. It has the Call class.
– SWC: No. The Call class was deprecated, and it uses the CfP ontology [1].

[1] CfP Vocabulary Specification, http://sw.deri.org/2005/08/conf/cfp.html

Page 61: SEDE:  An Ontology For Scholarly Event Description

DISCUSSION & CONCLUSION

Page 62: SEDE:  An Ontology For Scholarly Event Description

Discussion & Conclusion

• The SEDE ontology provides a backbone to represent, collect, share and allow inference from scholarly event information in a logical way

• Basic information needs
– semantic search and retrieval using the facts stored in the KB

• Scientifically meaningful information needs
– unearth hidden knowledge for the academic community

• SEDE
– helps to improve information accessibility through greater semantic interoperability of information.
– makes it possible to build a scholarly semantic web

• Isolated pieces of scholarly event data are integrated through relationships with other scientific data on the web, thus creating added information.

Page 63: SEDE:  An Ontology For Scholarly Event Description

SEDE: An Ontology for Scholarly Event Description

Senator Jeong

Biomedical Knowledge Engineering Lab., Seoul National University