(c) 2012 Adam Pease – [email protected]
Formal Ontology and the Suggested Upper Merged Ontology (SUMO)
Adam Pease, Articulate [email protected]
http://www.ontologyportal.org/
NLP Application of Formal Ontology
• Sentiment Analysis
• Information Extraction
• Question Answering/Textual Entailment
• Word Sense Disambiguation
• Document classification and clustering
Technology Map: Semantics
• Structure databases
– Ensure that all fields are clear and consistent
– Database consistent with SUMO
• Expose features
– Finding correlations is only possible when the features that correlate are exposed
– Abstraction based on superclasses etc.
• Capture knowledge
– Express in a computable language knowledge extracted from text
• Inference
– Deduce new facts from existing facts and rules
Sentiment Analysis
• Emotional content of text
• Pilot project combining
– Sentiment analysis (computational linguistics)
– Concept extraction (linguistic semantics/ontology)
• Note that this is only a pilot project; the computational linguistic method used is very basic, not state of the art
• Applications:
– Fine-grained search by features
– Ratings by review, not by stars, and integrated across sources
– Merge hotel ratings from different services that use different scales by using sentiment
Meadowood, St. Helena: Restaurant: 10
“In recent years the elegant but unstuffy dining room has won rave reviews, becoming a destination restaurant.”
Marys Lake Lodge and Resort, CO: Roadway: -8
“Not to mention it is very expensive and located in a place that doesn't get much sun so it's icy and cold; and the maintenance of roads is terrible in winter.”
Word Sense Disambiguation
• Brown corpus of English
– One million words
– “Balanced” corpus from many sources: newspapers, novels etc.
• WordNet SemCor
– Brown corpus marked up by hand with parts of speech and word senses
WordNet SemCor
• “Bed” sense 1
– air_mattress curtain sleep sleeping_bag slipper
– SUMOTerm: Bed
• “Bed” sense 2
– compost decayed manure pansy spade spread_over yard
– SUMOTerm: CultivatedLandArea
• “Bed” sense 3
– dry face homely river tilt
– SUMOTerm: GeographicArea
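The idea of picking a sense from its co-occurring words can be sketched as follows. This is a minimal illustration, not the method used in the pilot project: the word lists are copied from the slide, while the function name `disambiguate` and the plain overlap count (instead of corpus frequencies) are assumptions made to keep the example short.

```python
# Co-occurring words per sense of "bed", copied from the slide.
SENSE_CONTEXT = {
    "Bed": {"air_mattress", "curtain", "sleep", "sleeping_bag", "slipper"},
    "CultivatedLandArea": {"compost", "decayed", "manure", "pansy", "spade",
                           "spread_over", "yard"},
    "GeographicArea": {"dry", "face", "homely", "river", "tilt"},
}


def disambiguate(context_words):
    """Return the SUMO term whose co-occurring words overlap the context most.

    Hypothetical helper: a simple overlap count stands in for the
    frequency-weighted scoring a real system would use.
    """
    context = set(context_words)
    overlaps = {term: len(words & context) for term, words in SENSE_CONTEXT.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None


sentence = ["she", "dug", "manure", "into", "the", "bed", "with", "a", "spade"]
print(disambiguate(sentence))  # CultivatedLandArea
```

Returning `None` when no context word matches avoids guessing a sense from an empty overlap.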
Sentiment words
type    word        POS   polarity
weak    abandon     verb  negative
weak    abate       verb  negative
weak    abdicate    verb  negative
strong  aberration  adj   negative
weak    able        adj   positive
Created at the University of Pittsburgh by having a team of students score words in WordNet, then validated with inter-subject agreement measures.
Sentiment Finding
[Diagram: processing pipeline with the components Reviews, Stop Word List, Word Sentiment List, Scorer, Sentiment Score, Disambiguation Corpus, Word Sense Disambiguation, Word Senses, SUMO-WordNet Mapping, SUMO Concepts and Concept Sentiment]
"The guest rooms, ... are somewhat shopworn or downright shabby ..."
room (in a building) – hall 12, guest 10 ...
room (space) – grow 22,
Associate word senses with frequency of co-occurring words
shabby -5
shopworn -1
...
room (in a building) → Room
SF Drake: Room (-6)
Application: Remove amenities with negative sentiment from hotel search
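The arithmetic behind "Room (-6)" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores for "shabby" and "shopworn" come from the slide, but the stop-word list, the function name `concept_sentiment` and the hard-wired mapping from "rooms" to the SUMO term Room are hypothetical stand-ins for the word sense disambiguation and SUMO-WordNet mapping steps.

```python
import re

# Word scores taken from the slide; a real lexicon would be far larger.
WORD_SENTIMENT = {"shabby": -5, "shopworn": -1}

# Hypothetical stop-word list and word-to-concept mapping (assumptions).
STOP_WORDS = {"the", "are", "or", "somewhat", "downright"}
WORD_TO_SUMO = {"room": "Room", "rooms": "Room"}


def concept_sentiment(sentence):
    """Sum the sentiment of a sentence and credit it to each SUMO concept mentioned."""
    words = [w for w in re.findall(r"[a-z]+", sentence.lower())
             if w not in STOP_WORDS]
    score = sum(WORD_SENTIMENT.get(w, 0) for w in words)
    concepts = {WORD_TO_SUMO[w] for w in words if w in WORD_TO_SUMO}
    return {concept: score for concept in concepts}


review = "The guest rooms are somewhat shopworn or downright shabby"
print(concept_sentiment(review))  # {'Room': -6}
```

A hotel search could then drop any amenity whose accumulated score is negative, which is the application named on the slide.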
Outline
• Current Application
• General concepts
• SUMO and WordNet
– Ontology mapping
– Why upper ontology
• Sigma
• Pitfalls (in ontology development)
• SUMO details
• Language to Logic
Terms and Concepts
[Figure: semiotic triangle. The Term “Orange” symbolizes a Concept, the Concept refers to a Referent, and the Term stands for the Referent.]
C.K. Ogden/I.A. Richards, The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism, London 1923, 10th edition 1969
From the slide of [Bargmeyer, Bruce, Open Metadata Forum, Berlin, 2005]
Slide adapted from (c) Key-Sun Choi for Pan Localization 2005
Ontology work should be here, since logic is needed to substitute for human thought.
Lots of “ontology” work has really been here.
Imagine... your view of the web
CV
name: Joe Smith
education: BS Case Western Reserve, 1982; MS UC Davis, 1984
work: 1985-1990 ACME Software, programmer
private: Married, 2 children
...and the Computer's View
name
CV
education
work
private
Εα σολυμ μινιμυμ ευμΝιβχ σολετ υβικυε εα σιθ
Ει κυις σιμιλικυε ρεφορμιδανς πρω, αν λαβωρε δισερετ μινιμυμ δυο, σεδ εα σαλυταθυς σορρυμπιθ. Ιλλυμ φασιλις ιν πρι.
Μεα συ ιψυμ υλλυμ Μελ ευ κυωδ μεδιοσριθαθεμ
Υταμυρ θραξιθ
But wait, we've got XML -
<job name="Joe Smith" title="Programmer">
<x83 m92="|||||||||" title="..............">
Wait, we've got semantics -
[Diagram: JoeSmith is an instance of Person, and Person is a subclass of Mammal; this implies that JoeSmith is an instance of Mammal.]
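[Diagram, with every symbol made opaque: p3489 stands in relation r53 to u8475, and u8475 stands in relation r22 to x9834; this implies that p3489 stands in relation r53 to x9834.]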
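The point of this slide is that the inference survives even when every name is meaningless to the reader, because it depends only on the rule attached to the relations. A minimal sketch, assuming that u8475 stands in for Person, x9834 for Mammal, p3489 for JoeSmith, r53 for instance and r22 for subclass (the function name `infer_instances` is hypothetical):

```python
def infer_instances(facts, instance_rel, subclass_rel):
    """Apply (instance x C) and (subclass C D) => (instance x D) until nothing new appears."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for (r1, x, c) in list(facts):
            if r1 != instance_rel:
                continue
            for (r2, c2, d) in list(facts):
                new_fact = (instance_rel, x, d)
                if r2 == subclass_rel and c2 == c and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


readable = {("instance", "JoeSmith", "Person"), ("subclass", "Person", "Mammal")}
opaque = {("r53", "p3489", "u8475"), ("r22", "u8475", "x9834")}

print(("instance", "JoeSmith", "Mammal") in infer_instances(readable, "instance", "subclass"))  # True
print(("r53", "p3489", "x9834") in infer_instances(opaque, "r53", "r22"))  # True
```

The same function handles both fact sets; only the names of the two relations are passed in.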
Taxonomy
• What's an automobile?
– truck or sedan
– Alone it might be taken as not including trucks
– Does truck include 18-wheelers?
[Diagram: taxonomy with the nodes automobile, truck, sedan and AdamsHonda]
Automation
• If d is an a, a can't be a d (usually)
[Diagram: taxonomy with the nodes a, b, c and d]
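A check of this kind is easy to automate once the taxonomy is explicit. The sketch below is an assumption-laden illustration: the slide does not say how a, b, c and d are connected, so the edges used here are invented for the example, and `find_cycle` is a hypothetical helper that runs a depth-first search over (child, parent) pairs.

```python
def find_cycle(subclass_edges):
    """Return one cycle in the subclass graph as a list of nodes, or None."""
    graph = {}
    for child, parent in subclass_edges:
        graph.setdefault(child, []).append(parent)

    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the current path: report the loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for parent in graph.get(node, []):
            cycle = visit(parent)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical edges: b and c are kinds of a, d is a kind of b.
ok = [("b", "a"), ("c", "a"), ("d", "b")]
# Adding "a is a kind of d" closes a loop.
bad = ok + [("a", "d")]

print(find_cycle(ok))   # None
print(find_cycle(bad))  # ['b', 'a', 'd', 'b']
```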
Fixing Meaning
A horse is a mammal that has four legs and is capable of carrying a human rider who largely controls its actions.
Language Formality & Expressiveness
[Figure: languages plotted by formality against expressiveness, ranging from weak semantics to strong semantics, with Human Language and SUO-KIF marked]
Levels and their characteristic relation:
• Taxonomy: Is Sub-Classification of
• Thesaurus: Has Narrower Meaning Than
• Conceptual Model: Is Subclass of
• Logical Theory: Is Disjoint Subclass of, with transitivity property
Languages shown: Relational Model, XML, ER, Extended ER, DB Schemas, XML Schema, XTM, RDF/S, UML, Description Logic, DAML+OIL, OWL, OWL+RuleML, First Order Logic, Higher Order Logic
Interoperability bands: Syntactic Interoperability, Structural Interoperability, Semantic Interoperability
Thanks to Leo Obrst, MITRE
Note: these are languages, not ontologies.
Frame Restrictions
• b is between a and c
– (between1 a betweenness1)
– (between2 b betweenness1)
– (between3 c betweenness1)
– vs
– (between a b c)
• Adam is not an accountant
– (notOccupation Adam Accountant)
– vs
– (not (occupation Adam Accountant))
• Existential vs. Universal quantification
• Similar problems for many description logics
• Very efficient computation however
Existential vs Universal Quantification
• All farmers like tractors.
• Some farmer likes a tractor.
(forall (?F ?T) (=> (and (instance ?F Farmer) (instance ?T Tractor)) (likes ?F ?T)))
(exists (?F ?T) (and (instance ?F Farmer) (instance ?T Tractor) (likes ?F ?T)))
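The difference between the two formulas can be seen by evaluating them over a small finite model. The sketch below is illustrative only: the farmers, tractors and `likes` pairs are invented data, and Python's `all` and `any` stand in for `forall` and `exists`.

```python
# Invented model: two farmers, two tractors, and who likes what.
farmers = {"Ann", "Raj"}
tractors = {"T1", "T2"}
likes = {("Ann", "T1"), ("Ann", "T2"), ("Raj", "T1")}


def all_farmers_like_tractors(farmers, tractors, likes):
    """forall ?F ?T: Farmer(?F) and Tractor(?T) => likes(?F, ?T)."""
    return all((f, t) in likes for f in farmers for t in tractors)


def some_farmer_likes_a_tractor(farmers, tractors, likes):
    """exists ?F ?T: Farmer(?F) and Tractor(?T) and likes(?F, ?T)."""
    return any((f, t) in likes for f in farmers for t in tractors)


print(all_farmers_like_tractors(farmers, tractors, likes))    # False
print(some_farmer_likes_a_tractor(farmers, tractors, likes))  # True
```

The universal statement fails because Raj does not like T2, while a single matching pair is enough for the existential one.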
First Order vs Higher Order
• (believes Mary (likes John Sue)) – higher order
• Higher order logic is very useful, but much harder to compute
Digression: Implementation is Different from Representation
• Why lose meaning at design time just because of runtime issues?
– We can’t reason with English definitions, but that doesn’t mean we shouldn’t document our terms
• Many different implementations may be done from the same representation
• This does not mean that runtime issues should be ignored at design time
– If you represent information you know can’t be reasoned with, it had better not be essential in most conceivable applications
Open Source
• There’s too much knowledge for any one entity to capture and code it
• Network effect: the more people that use the ontology, the more valuable it is
– Needed to remove barriers to adoption
• Can’t anticipate how it will be used
– Testing theories of linguistic analogy!
Ontology vs Language and Knowledge
Ontology
- expandable
- language independent
- machine understandable
Language
- understood by humans
- ambiguous
Knowledge
- changes rapidly
- may be local to an entity
Suggested Upper Merged Ontology
• Initial versions: 1000 terms, 4000 axioms, 750 rules
• Mapped by hand to all of WordNet 1.6
– then ported to 3.0 and continually updated
• Associated domain ontologies totalling 20,000 terms and 80,000 axioms
– Now linked with factbases including YAGO for millions of facts
– New ontologies of Hotels and Dining
• Free
– SUMO is owned by IEEE but basically public domain
– Domain ontologies are released under GNU
– www.ontologyportal.org
SUMO (continued)
• Formally defined, not dependent on a particular implementation
• Open source toolset for browsing and inference
– https://sourceforge.net/projects/sigmakee/
• Many uses of SUMO (independent of the SUMO authors and funders)
– http://www.ontologyportal.org/Pubs.html
SUMO Structure
Structural Ontology
Base Ontology
Set/Class Theory Numeric Temporal Mereotopology
Graph Measure Processes Objects
Qualities
SUMO+Domain Ontology
[Figure: SUMO (Structural Ontology, Base Ontology, Set/Class Theory, Numeric, Temporal, Mereotopology, Graph, Measure, Processes, Objects, Qualities) with a Mid-Level ontology and domain ontologies layered around it]
Domain ontologies shown: Military, Geography, Elements, Terrorist Attack Types, Communications, People, Transnational Issues, Financial Ontology, Terrorist, Economy, NAICS, Terrorist Attacks, France, Afghanistan, United States, Distributed Computing, Biological Viruses, WMD, ECommerce Services, Government, Transportation, World Airports, Hotel, Food & Dining, …
Total Terms: 20977
Total Axioms: 88257
Total Rules: 4730
Relations: 1280
SUMO Validation
• Mapping to all of the WordNet lexicon
– A check on coverage and completeness (at a given level of generality)
• Peer review
– Open source since its inception
• Formal validation with a theorem prover
– Free of contradictions (within a generous time bound for search)
• Application to dozens of domain ontologies
WordNet
• A dictionary for computational linguistics applications
• 100,000 word senses, hand-created
• Open source
• Concise - No need for OED-style etymology
• Precise data structures
• Semantic links
– Aid in computation
– Verification of meaning during construction
Formal Ontology
• WordNet has synsets for “earlier” etc.
• But nothing in WordNet would allow a computer to assert that the end of one event precedes the start of another if one event is earlier than the other
• This is not a criticism of WordNet
(<=> (earlier ?INTERVAL1 ?INTERVAL2) (before (EndFn ?INTERVAL1) (BeginFn ?INTERVAL2)))
[Diagram: timeline on which Interval 1 ends before Interval 2 begins]
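The axiom reads directly as a computable definition. A minimal sketch, assuming intervals are represented by numeric begin and end points (the `Interval` type and its field names are illustrative, not part of SUMO):

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Illustrative time interval; begin and end play the roles of BeginFn and EndFn."""
    begin: float
    end: float


def earlier(i1: Interval, i2: Interval) -> bool:
    """earlier(i1, i2) holds exactly when the end of i1 is before the begin of i2."""
    return i1.end < i2.begin


morning = Interval(begin=8, end=12)
evening = Interval(begin=18, end=22)
print(earlier(morning, evening))  # True
print(earlier(evening, morning))  # False
```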
Internationalization
• Translation of SUMO paraphrases into multiple diverse languages
– Some confidence there’s no cultural or linguistic bias
– Chinese, Hindi, Tagalog, Czech, German, Italian, Korean, Romanian, Arabic
• SUMO is linked to multiple very large lexicons (EuroWordNet, BalkaNet, HowNet etc.)
– English, Chinese, Italian, Arabic
Example #1 – Simple statement
• Robert has an orange.
(exists (?orange) (and (attribute Robert-1 Male) (instance Robert-1 Human) (instance ?orange OrangeFruit) (possesses Robert-1 ?orange)))
∃o attribute(Robert-1,Male) ^ Human(Robert-1) ^ OrangeFruit(o) ^ possesses(Robert-1,o)
Example #2 – Simple Query
• Who has a fruit?
(exists (?fruit) (and (instance ?fruit FruitOrVegetable) (instance ?who Human) (possesses ?who ?fruit)))
∃f FruitOrVegetable(f) ^ Human(w) ^ possesses(w,f)
Counter-example
• Brutus stabbed Caesar with a knife on Tuesday.
(exists (?S ?K ?T) (and (instance ?S Poking) (instance ?K Knife) (instance ?T Tuesday) (agent ?S Brutus) (patient ?S Caesar) (time ?S ?T) (instrument ?S ?K)))
(exists (?S ?K ?T) (and (instance ?S stabs) (instance ?K knife) (instance ?T Tuesday) (agent ?S Brutus) (object ?S Caesar) (on ?S ?T) (with ?S ?K)))
● Logical translation - “word translation”
Example #3 – Stative
• Dickens writes Oliver Twist in 1837.
(and (authors Dickens OliverTwist) (exists (?EV) (and (instance ?EV Writing) (agent ?EV Dickens) (equals (YearFn 1837) (WhenFn ?EV)) (result ?EV OliverTwist))))
Example #4 – Another Stative
• Bob is a pianist.
(and (attribute Bob-1 Male) (instance Bob-1 Human) (attribute Bob-1 Musician))
Example #5 - Counting
• Bob kills 5 rats.
(exists (?CLNrats ?event) (and (attribute Robert-1 Male) (forall (?I) (=> (member ?I ?CLNrats) (instance ?I Rat))) (instance Robert-1 Human) (instance ?CLNrats Collection) (member-count ?CLNrats 5) (agent ?event Robert-1) (instance ?event Killing) (patient ?event ?CLNrats)))
Example #6 - Preposition
• Bob is on the boat.
(exists (?boat) (and (attribute Robert-1 Male) (instance Robert-1 Human) (instance ?boat Watercraft) (orientation Robert-1 ?boat On)))
Example #7 – Another Preposition
• The party is on Monday.
(exists (?party ?monday) (and (instance ?party SocialParty) (instance ?monday Monday) (during ?party ?monday)))
Prepositions
Preposition    Class     SUMO relation
at, in, on     location  location
at, in, on     time      during
for            person    destination
for, through   time      duration
with           person    agent
with           object    instrument
across         path      traverses
within, into   object    properlyFills
from           object    origin
from           time      BeginFn
through        object    traverses
until          time      EndFn
after          time      greaterThan
before         time      lessThan
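The preposition table amounts to a lookup keyed on the preposition and the class of its object. A minimal sketch of that lookup, with the rows copied from the slide; the names `ROWS`, `PREPOSITION_RELATIONS` and `relation_for` are hypothetical, and deciding the class of the object is assumed to happen elsewhere:

```python
# (prepositions, class of the prepositional object, SUMO relation), from the slide.
ROWS = [
    (("at", "in", "on"), "location", "location"),
    (("at", "in", "on"), "time", "during"),
    (("for",), "person", "destination"),
    (("for", "through"), "time", "duration"),
    (("with",), "person", "agent"),
    (("with",), "object", "instrument"),
    (("across",), "path", "traverses"),
    (("within", "into"), "object", "properlyFills"),
    (("from",), "object", "origin"),
    (("from",), "time", "BeginFn"),
    (("through",), "object", "traverses"),
    (("until",), "time", "EndFn"),
    (("after",), "time", "greaterThan"),
    (("before",), "time", "lessThan"),
]

# Flatten the rows into a dictionary keyed by (preposition, class).
PREPOSITION_RELATIONS = {
    (prep, cls): relation for preps, cls, relation in ROWS for prep in preps
}


def relation_for(preposition, object_class):
    """Return the SUMO relation for a preposition and object class, or None if unlisted."""
    return PREPOSITION_RELATIONS.get((preposition, object_class))


print(relation_for("on", "time"))      # during
print(relation_for("with", "object"))  # instrument
```

For "The party is on Monday", the object of "on" is a time, so the lookup gives `during`; for "with a knife" it gives `instrument`.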
Example #8 – Quantification
• Some horses eat hay.
(exists (?event ?hay ?horse) (and (instance ?horse Horse) (instance ?event Eating) (instance ?hay Hay) (patient ?event ?hay) (agent ?event ?horse)))
Example #9 – Possessives
• Tom's father is rich. Bob's nose is big. Mary's car is fast.
(exists (?father) (and (attribute Tom-1 Male) (instance Tom-1 Human) (attribute ?father Rich) (father ?father Tom-1)))
(exists (?nose) (and (attribute Robert-1 Male) (instance Robert-1 Human) (attribute ?nose SubjectiveAssessmentAttribute) (instance ?nose Nose) (part ?nose Robert-1)))
(exists (?car) (and (attribute Mary-1 Female) (instance Mary-1 Human) (attribute ?car Fast) (possesses Mary-1 ?car)))
Example #10 – Negation
• Bob did not write a book. Bob did not write the book.
(not (exists (?book) (and (attribute Robert-1 Male) (instance Robert-1 Human) (authors Robert-1 ?book) (instance ?book Book))))
(exists (?book) (and (attribute Robert-1 Male) (instance Robert-1 Human) (not (authors Robert-1 ?book)) (instance ?book Book)))
High Level Distinctions
The first fundamental distinction is that between ‘Physical’ (things which have a position in space/time) and ‘Abstract’ (things which don’t)
Entity
Physical Abstract
High Level Distinctions
Partition of ‘Physical’ into ‘Objects’ and ‘Processes’
Physical
Object Process
Objects
Object
  SelfConnectedObject
    Substance
    CorpuscularObject
  Region
  Collection
Processes
• DualObjectProcess: Substituting, Transaction, Comparing, Attaching, Detaching, Combining, Separating
• InternalChange: BiologicalProcess, QuantityChange, Damaging, ChemicalProcess, SurfaceChange, Creation, StateChange, ShapeChange
• IntentionalProcess: IntentionalPsychologicalProcess, RecreationOrExercise, OrganizationalProcess, Guiding, Keeping, Maintaining, Repairing, Poking, ContentDevelopment, Making, Searching, SocialInteraction, Maneuver
• Motion: BodyMotion, DirectionChange, Transfer, Transportation, Radiating
Abstract
SetOrClass
Relation
Proposition
Quantity
  Number
  PhysicalQuantity
Attribute
Graph
GraphElement
Case Roles
• Roles that entities play in a Process
– agent, patient, instrument etc.
Case Roles
• “Brutus stabbed Caesar with a knife on Tuesday.”
(exists (?S ?K ?T) (and (instance ?S Stabbing) (instance ?K Knife) (instance ?T Tuesday) (agent ?S Brutus) (patient ?S Caesar) (time ?S ?T) (instrument ?S ?K)))
[Diagram: a Stabbing event linked to Brutus (agent), Caesar (patient), a Knife (instrument) and a Tuesday (time)]
Example Rules
(=> (instance ?DRIVE Driving) (exists (?VEHICLE) (and (instance ?VEHICLE Vehicle) (patient ?DRIVE ?VEHICLE))))
“If there's an instance of Driving, there's a Vehicle that participates in that action.”
Not just an English definition for humans to read, but a logical definition that can be used in proofs.
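One way to see that the rule is usable by a program is to check a set of facts against it. The sketch below is a simplification and an assumption: a theorem prover would use the rule to conclude that some Vehicle exists for every Driving, whereas this hypothetical helper `drives_without_vehicle` treats the facts as complete and merely lists the Driving events for which no Vehicle patient is recorded. The fact names are invented.

```python
def drives_without_vehicle(facts):
    """List Driving instances that have no recorded Vehicle as patient."""
    instances = {(x, c) for (rel, x, c) in facts if rel == "instance"}
    patients = {(e, p) for (rel, e, p) in facts if rel == "patient"}
    drives = sorted(x for (x, c) in instances if c == "Driving")
    return [
        d for d in drives
        if not any((p, "Vehicle") in instances for (e, p) in patients if e == d)
    ]


# Invented facts: Drive1 has a vehicle as patient, Drive2 does not.
facts = {
    ("instance", "Drive1", "Driving"),
    ("instance", "Car1", "Vehicle"),
    ("patient", "Drive1", "Car1"),
    ("instance", "Drive2", "Driving"),
}

print(drives_without_vehicle(facts))  # ['Drive2']
```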
http://www.ontologyportal.org/Book.html