Post on 17-Dec-2015
1
MedOnto: Medical Ontology Learning System
(Work in Progress)
Syed Farrukh Mehdi, Reza Fathzadeh, S. M. Faisal Abbas (Presenter)
{fmehdi,reza,fabbas}@cs.dal.ca
2
Introduction
Ontology
◦ Machine-readable information
Text
◦ Human-readable information; most current information is text
Ontology Learning
◦ (Semi-)automatic extraction of relevant concepts and relations
Medical Domain
3
Methodology
Syntax-based concept learning augmented with domain-specific subject corpora
◦ Domain-Specific Knowledge Base
◦ Syntax-Based Extraction
4
Domain-Specific Corpus
Medical Domain Terminology
◦ OpenGALEN project
◦ GALEN Terminology Server
For other domains, a domain-specific terminology corpus should be used.
5
Syntax-Based Extraction Levels
[Figure: extraction levels, credited to Paul Buitelaar]
6
Term Extraction
Parsing
◦ Linguistic method: uses production rules specified by linguists
◦ Statistical method: uses statistical models derived from written text
We used the Stanford NLP Parser, which is a statistical parser
Dependency trees instead of parse trees
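A minimal sketch of term extraction over a dependency parse, assuming the parser output has already been converted into simple token records. The `Token` type, the sample sentence, and its tags, heads, and labels are hand-written for illustration; they are not output from the Stanford parser or code from MedOnto.

```python
from dataclasses import dataclass


@dataclass
class Token:
    index: int  # 1-based position in the sentence
    word: str
    tag: str    # Penn Treebank part-of-speech tag
    head: int   # index of the governing token, 0 for the root
    rel: str    # dependency label linking this token to its head


# Hand-written parse of "The patient has chronic asthma" (illustrative only).
SENTENCE = [
    Token(1, "The", "DT", 2, "det"),
    Token(2, "patient", "NN", 3, "nsubj"),
    Token(3, "has", "VBZ", 0, "root"),
    Token(4, "chronic", "JJ", 5, "amod"),
    Token(5, "asthma", "NN", 3, "dobj"),
]


def extract_terms(tokens):
    """Return candidate terms: every token whose tag marks it as a noun."""
    return [t.word for t in tokens if t.tag.startswith("NN")]


print(extract_terms(SENTENCE))
```

Keeping the head index and label on each token is what lets later rules look at governor/dependent pairs rather than constituents.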
7
Synonym Extraction
Domain-specific terminology corpus
◦ GRAIL Terminology Server for the medical domain
Language corpus for general concepts
◦ WordNet for the English language
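A minimal sketch of a two-tier synonym lookup, assuming the domain corpus is consulted first and the general-language corpus is the fallback (the slides do not state the precedence). Both dictionaries are hypothetical in-memory stand-ins with made-up entries; a real system would query the terminology server and WordNet instead.

```python
# Hypothetical stand-ins for the two resources; the entries are illustrative.
DOMAIN_SYNONYMS = {"myocardial infarction": {"heart attack"}}
GENERAL_SYNONYMS = {"doctor": {"physician"}}


def synonyms(term):
    """Prefer the domain corpus; fall back to the general-language corpus."""
    key = term.lower()
    if key in DOMAIN_SYNONYMS:
        return set(DOMAIN_SYNONYMS[key])
    return set(GENERAL_SYNONYMS.get(key, set()))


print(synonyms("Myocardial Infarction"))
print(synonyms("doctor"))
```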
8
Concept Extraction
Intension
◦ Formal and informal definition of terms
Extension
◦ Deriving concepts
Linguistic Realization
◦ Concept coverage
9
Terminal and Compound Concepts
Terminal Concepts
◦ Nouns, noun phrases
Compound Concepts
◦ Defined rules
10
Relation Extraction
Concepts are related
Defined rules
11
Rules (IN)
IN: subordinating conjunction (FUNC_WORD) or preposition (PREP)
◦ “of”
Candidate for taxonomy
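A minimal sketch of the IN rule, assuming the input is a flat list of (word, tag) pairs and that the nearest noun on each side of "of" forms the candidate pair. The function name and the sample phrase are hypothetical; the slides do not say which side of the pair is the broader concept, so the pair is returned without a direction.

```python
def in_rule(tagged):
    """Return (left, right) noun pairs joined by the preposition "of"."""
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag == "IN" and word.lower() == "of":
            # Nearest noun before and after the preposition.
            left = next((w for w, t in reversed(tagged[:i]) if t.startswith("NN")), None)
            right = next((w for w, t in tagged[i + 1:] if t.startswith("NN")), None)
            if left and right:
                pairs.append((left, right))
    return pairs


phrase = [("inflammation", "NN"), ("of", "IN"), ("the", "DT"), ("liver", "NN")]
print(in_rule(phrase))
```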
12
Rules (CC)
CC: coordinating conjunction
◦ “and”, “or”, etc.
◦ Compound concepts, broken into terminal concepts
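A minimal sketch of the CC rule, assuming a flat list of (word, tag) pairs: the compound phrase is cut at each coordinating conjunction and the pieces are returned as candidate terminal concepts. The function name and sample phrase are hypothetical.

```python
def split_on_cc(tagged):
    """Split a tagged phrase into sub-phrases at coordinating conjunctions."""
    parts, current = [], []
    for word, tag in tagged:
        if tag == "CC":
            if current:
                parts.append(current)
            current = []
        else:
            current.append(word)
    if current:
        parts.append(current)
    return [" ".join(part) for part in parts]


print(split_on_cc([("heart", "NN"), ("and", "CC"), ("lung", "NN")]))
```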
13
Rules (RB, DT, PDT)
RB: adverb and adverbial phrase
DT: determiner/demonstrative pronoun
PDT: predeterminer
Ignored in our work so far
14
Rule (VB)
The verb is used as the relation between subject and object
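A minimal sketch of the VB rule, assuming dependencies arrive as (label, governor, dependent) triples and that `nsubj` and `dobj` (Stanford typed-dependency labels) mark the subject and object. Keying verbs by word form rather than token index is a simplification for brevity, and the sample sentence is made up.

```python
def verb_relations(dependencies):
    """Pair each verb's subject with its object as (subject, verb, object)."""
    subjects, objects = {}, {}
    for rel, governor, dependent in dependencies:
        if rel == "nsubj":
            subjects[governor] = dependent
        elif rel == "dobj":
            objects[governor] = dependent
    # Only verbs that have both a subject and an object yield a relation.
    return [(subjects[v], v, objects[v]) for v in subjects if v in objects]


deps = [("nsubj", "treats", "aspirin"), ("dobj", "treats", "headache")]
print(verb_relations(deps))
```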
15
Rule (JJ+NN -> NP)
JJ: adjective
NN: common noun
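A minimal sketch of the JJ+NN rule, assuming a flat list of (word, tag) pairs: an adjective immediately followed by a common noun is merged into one noun-phrase token tagged `NP`. The function name and sample input are hypothetical.

```python
def merge_jj_nn(tagged):
    """Merge each adjacent JJ + NN pair into a single NP token."""
    merged = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag == "JJ" and i + 1 < len(tagged) and tagged[i + 1][1] == "NN":
            merged.append((word + " " + tagged[i + 1][0], "NP"))
            i += 2  # skip the noun that was just consumed
        else:
            merged.append((word, tag))
            i += 1
    return merged


print(merge_jj_nn([("chronic", "JJ"), ("asthma", "NN")]))
```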
16
Algorithm
Recursive, until the dependency tree is exhausted
Create compound concepts, relate them using the rule, and then apply the rules to the sub-phrases
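A minimal sketch of the recursive idea, under a strong simplification: it works on a flat list of (word, tag) pairs rather than a real dependency tree, and it only splits at IN and CC words. Each phrase is recorded as a concept, split at the first rule word, the two halves are related by that word, and the same procedure is applied to each half. All names and the sample phrase are hypothetical.

```python
IGNORED_TAGS = {"RB", "DT", "PDT"}  # tags the rules above leave out
SPLIT_TAGS = {"CC", "IN"}           # tags whose rules split a phrase in two


def phrase_text(tagged):
    return " ".join(word for word, _ in tagged)


def extract(tagged, concepts, relations):
    """Record the phrase, split at the first rule word, relate the halves, recurse."""
    tagged = [(w, t) for w, t in tagged if t not in IGNORED_TAGS]
    if not tagged:
        return
    concepts.append(phrase_text(tagged))
    for i, (word, tag) in enumerate(tagged):
        left, right = tagged[:i], tagged[i + 1:]
        if tag in SPLIT_TAGS and left and right:
            relations.append((phrase_text(left), word, phrase_text(right)))
            extract(left, concepts, relations)
            extract(right, concepts, relations)
            return  # the sub-phrases have been handled recursively


concepts, relations = [], []
extract(
    [("cancer", "NN"), ("of", "IN"), ("the", "DT"),
     ("lung", "NN"), ("or", "CC"), ("liver", "NN")],
    concepts,
    relations,
)
print(concepts)
print(relations)
```

The recursion stops when a phrase contains no rule word, which corresponds to reaching a terminal concept.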
17
Other Work
Framework   Institution                         Reference
ASIUM       INRIA, Jouy-en-Josas                Faure and Nedellec 1999
TextToOnto  AIFB, University of Karlsruhe       Maedche and Volz 2001
HASTI       Amir Kabir University, Teheran      Shamsfard and Barforoush 2004
OntoLT      DFKI, Saarbrücken                   Buitelaar et al. 2004
DOODLE      Shizuoka University                 Morita et al. 2004
Text2Onto   AIFB, University of Karlsruhe       Cimiano and Völker 2005
OntoLearn   University of Rome                  Velardi et al. 2005
OLE         Brno University of Technology       Novacek and Smrz 2005
OntoGen     Institute Jozef Stefan, Ljubljana   Fortuna et al. 2007
GALeOn      Technical University of Madrid      Manzano-Macho et al. 2008
DINO        DERI, Galway                        Novacek et al. 2008
OntoLancs   Lancaster University                Gacitua et al. 2008
RELExO      AIFB, University of Karlsruhe       Völker and Rudolph 2008
OntoComp    University of Dresden               Sertkaya 2008
18
References
[Buitelaar05] Paul Buitelaar et al., Ontology Learning from Text, October 3rd, 2005
[Kim09] Jin-Dong Kim et al., Overview of BioNLP’09 Shared Task on Event Extraction
[Stuck] Heiner Stuckenschmidt and Johanna Völker, Semantic Technologies: Ontology Learning
[Biemann] Chris Biemann, Ontology Learning from Text: A Survey of Methods
[StanParser] http://nlp.stanford.edu/software/lex-parser.shtml
[WordNet] http://wordnet.princeton.edu/
[OpenGALEN] http://www.opengalen.org/
19
Please share your comments and directions.
Thank you.