NLP & Semantic Computing Group
Semantic Relation Classification: Task Formalisation and Refinement
Vivian S. Silva, Manuela Hürliman, Brian Davis, Siegfried Handschuh, André Freitas
Outline
• Motivation
• Revisiting Semantic Relation Classification Using Foundational Ontologies (DOLCE)
• Systematic Analysis
• Summary
Motivation
Introduction
Semantic Relation Classification (SRC) is a fundamental task in NLP, allowing the induction of semantic representation models for both commonsense and domain-specific data.
Image sources: Ontotext, W3C, Semantrix
Our Goals
• Improve the coverage, description and formalisation of the semantic relation classification task
• Provide a critique and generalization of the existing SemEval-2010 task 8
Semantic Relation Classification
• SemEval-2010 task 8: the most common semantic relation set
• Relations covered:
  • Cause-Effect (CE)
  • Instrument-Agency (IA)
  • Product-Producer (PP)
  • Content-Container (CC)
  • Entity-Origin (EO)
  • Entity-Destination (ED)
  • Component-Whole (CW)
  • Member-Collection (MC)
  • Message-Topic (MT)
Semantic Relations Classification
The <e1> burst </e1> has been caused by water hammer <e2> pressure </e2>.
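The tagged sentence above marks the two nominals to be related with <e1> and <e2>. As a minimal sketch of how such input could be read, the snippet below pulls out the two marked nominals and the untagged sentence. The function name and the regular expressions are assumptions made for illustration and are not part of the task definition. Read against the relation list, "pressure" (e2) would be the cause and "burst" (e1) the effect.

```python
import re

# Pattern for the <e1>...</e1> and <e2>...</e2> markers used in the example sentence.
ENTITY_TAG = re.compile(r"<(e[12])>\s*(.*?)\s*</\1>")

def extract_nominals(tagged_sentence):
    """Return ({'e1': ..., 'e2': ...}, plain_sentence) for a tagged sentence."""
    entities = {tag: text for tag, text in ENTITY_TAG.findall(tagged_sentence)}
    # Drop the markers, then normalise the whitespace they leave behind.
    plain = re.sub(r"</?e[12]>", "", tagged_sentence)
    plain = re.sub(r"\s+", " ", plain).strip()
    # Remove the stray space that ends up in front of punctuation.
    plain = re.sub(r"\s+([.,;:!?])", r"\1", plain)
    return entities, plain

sentence = "The <e1> burst </e1> has been caused by water hammer <e2> pressure </e2>."
entities, plain = extract_nominals(sentence)
print(entities)
print(plain)
```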
Semantic Relations Classification
• Despite the obvious intuition around the utility of SRC…
  1. The set of semantic relations and its expressive coverage have not been fully grounded with regard to an ontological framework
  2. When projecting these semantic relations back to the corpus level, it can be observed that the majority of the words within a text do not have a direct semantic relationship connecting them
Semantic Relations Classification
• SemEval-2010 task 8 has some constraints that bring some limitations:
  • Focus on Nominals: only noun phrases are considered
    Limitation: no relations between events, when represented by verbs, and their objects
  • Locality Constraint: only relations for arguments in the same clause
    Limitation: no relations between terms belonging to different, subordinate clauses
  • Focus on Concrete Relations: most relations refer to physical objects
    Limitation: no relations for abstract entities or quantitative/qualitative roles
  • Exclusion of Conditionals: conditional clauses not considered
    Limitation: no relations expressing conditional dependencies
Revisiting Semantic Relation Classification
Main question
• Given two sets of content words in a sentence, can we provide a semantic relation between them?
• Can this task be useful as a semantic interpretation mechanism?
Main strategy
• Start using foundational ontologies for this task
• Define relation compositions
• Expand the model with custom abstract relations that stand on the interface between dependency relations and an ontology-based representation
Why Foundational Ontologies?
Foundational ontologies are intended to represent the world in the way people perceive it, classifying entities into categories that are familiar to people’s common sense.
• Representation: can represent data in a formal way
• Reasoning: can reason over data using high-level restrictions
When is a foundational ontology useful?
1. When subtle distinctions are important
2. When recognizing disagreement is important
3. When rigorous referential semantics is important
4. When general abstractions are important
5. When careful explanation and justification of ontological commitment is important
6. When mutual understanding is more important than interoperability
(Guarino, 2006)
DOLCE
• DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering)
• Strong cognitive/linguistic bias:
  • Descriptive (as opposed to prescriptive) attitude
  • Categories mirror cognition, common sense, and the lexical structure of natural language
  • Emphasis on cognitive invariants
DOLCE
• Any term can be mapped to a DOLCE high-level category (class)
• It’s always possible to find a relation between any two DOLCE categories and, therefore, between the entities mapped to them
DOLCE Relations
• 23 immediate relations and 25 mediated (composed) relations, many of them having sub-relations. Some examples (relation names recovered from the slide diagram):
  • immediate-relation
  • mediated-relation
  • instrument
  • performed-by
  • target
  • functional-participant
  • part
  • references
  • resource
  • temporally-coincides
  • precedes
  • temporal-relation
  • abstract-location
  • co-participates-with
  • temporally-overlaps
  • temporally-includes
  • …
Applications: Simple Text Entailment Example
• Assumption: Mary is a mother
• Hypothesis: Mary gave birth
• Commonsense KB: a mother is a woman who has given birth

Foundational Ontology Mapping
• Commonsense concepts: Mary, mother, give birth
• Foundational classes: agent, role, action
• Foundational relations: (agent plays role), (role performs action), (agent performs action)
• Rule: (agent plays role) and (role performs action) -> (agent performs action)
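The entailment example can be traced with a few lines of code. The sketch below applies the single rule from the slide, (agent plays role) and (role performs action) -> (agent performs action), to two facts. The triple encoding and the function name are assumptions made for illustration and are not the authors' implementation.

```python
# Toy facts for the example; the triple encoding is an assumption of this sketch.
facts = {
    ("Mary", "plays", "mother"),           # from the assumption "Mary is a mother"
    ("mother", "performs", "give birth"),  # from the commonsense KB definition
}

def apply_role_rule(facts):
    """(agent plays role) and (role performs action) -> (agent performs action)."""
    derived = set(facts)
    for agent, rel1, role in facts:
        if rel1 != "plays":
            continue
        for subject, rel2, action in facts:
            if rel2 == "performs" and subject == role:
                derived.add((agent, "performs", action))
    return derived

closure = apply_role_rule(facts)
hypothesis = ("Mary", "performs", "give birth")  # "Mary gave birth"
entailed = hypothesis in closure
print(entailed)
```

The rule only mentions the foundational classes and relations, so the same function would cover any other agent, role and action mapped in the same way.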
Systematic Analysis
Corpus-based Analysis
1. Corpus construction
• Focused on the financial domain (merges commonsense with domain-specific discourse)
• Contains both factoid and definition types of discourse
• We created a financial corpus by crawling two distinct types of sources:
  a) definitions, from three sources: Bloomberg Financial Glossary, SGM Glossary, Investopedia Definitions
  b) articles, from two sources: Wikipedia, Investopedia
Corpus Construction
• Definitions
  • Bloomberg Financial Glossary (8324 definitions; 212,421 tokens)
  • SGM Glossary (1007 definitions; 43,638 tokens)
  • Investopedia Definitions (15476 definitions; 2,462,801 tokens)
• Articles
  • Investopedia (5890 articles; 5,129,793 tokens)
  • Wikipedia (articles on Investment and Finance; 8306 articles; 6,714,129 tokens)
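The slide does not give overall totals for the corpus sizes. As a small worked sum, the snippet below adds up the per-source figures listed above. The variable and function names are illustrative only.

```python
# (documents, tokens) per source, copied from the figures listed above.
definitions = {
    "Bloomberg Financial Glossary": (8324, 212_421),
    "SGM Glossary": (1007, 43_638),
    "Investopedia Definitions": (15476, 2_462_801),
}
articles = {
    "Investopedia": (5890, 5_129_793),
    "Wikipedia": (8306, 6_714_129),
}

def totals(sources):
    """Sum documents and tokens over a {name: (documents, tokens)} mapping."""
    docs = sum(d for d, _ in sources.values())
    tokens = sum(t for _, t in sources.values())
    return docs, tokens

definition_docs, definition_tokens = totals(definitions)
article_docs, article_tokens = totals(articles)
corpus_tokens = definition_tokens + article_tokens

print(definition_docs, definition_tokens)
print(article_docs, article_tokens)
print(corpus_tokens)
```

Under these figures the articles contribute far more tokens than the definitions, even though there are fewer of them.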
Corpus-based Analysis
1. Corpus construction
• Word pair selection:
  • Corpus split into sentences
  • First word randomly selected among the sentence tokens
  • Second word manually selected
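The automatic part of word pair selection (sentence splitting and the random choice of the first word) can be sketched as follows. The regular-expression sentence splitter, the whitespace tokenisation and the fixed seed are simplifying assumptions of this sketch. The slides do not say which tools were used, and the second word of each pair was selected manually.

```python
import random
import re

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def pick_first_word(sentence, rng):
    """Randomly select one token from the sentence (whitespace tokenisation)."""
    tokens = [t.strip(".,;:!?()\"'") for t in sentence.split()]
    tokens = [t for t in tokens if t]
    return rng.choice(tokens)

text = ("The lessor is the legal owner of the asset. "
        "Operating activities include net income, accounts receivable, "
        "accounts payable and inventory.")

rng = random.Random(0)  # fixed seed so the sketch is repeatable
candidates = [(s, pick_first_word(s, rng)) for s in split_sentences(text)]
for sentence, first_word in candidates:
    print(first_word, "|", sentence)
```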
Corpus-based Analysis
2. Manual Classification Analysis
• 300 pairs of words manually annotated
• Words mapped to DOLCE classes
• Relation between them chosen among the set of relations that exist between the classes assigned to the words
• 3 different scenarios occurred:
  a) Direct relationship
  b) Relation composition
  c) No relation found: concepts too far away
• Example sentences from the slide figure:
  • […] the legislation's include a lifting of a 40-year ban on the United States' exporting of crude oil
  • After 30 days the trustee can then use the contributions to pay the insurance policy premium
• Relation labels shown in the figure: target (three occurrences), indirect-target
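The three annotation scenarios can be illustrated with a toy lookup. In the sketch below, the class assignments and the table of allowed relations are stand-ins borrowed from the entailment example and are NOT the actual DOLCE classes or axioms. The two-step search through an intermediate word is likewise an assumption about how a composition might be found, since the annotation itself was manual.

```python
# Toy stand-ins: NOT the actual DOLCE classes or axioms.
WORD_CLASS = {"Mary": "agent", "mother": "role", "give birth": "action"}
ALLOWED = {
    ("agent", "role"): {"plays"},
    ("role", "action"): {"performs"},
}

def relations_between(word_a, word_b):
    """Relations licensed by the classes assigned to the two words."""
    key = (WORD_CLASS[word_a], WORD_CLASS[word_b])
    return ALLOWED.get(key, set())

def classify_pair(word_a, word_b, sentence_words):
    """Return (scenario, relations) for one word pair."""
    direct = relations_between(word_a, word_b)
    if direct:
        return "direct", sorted(direct)
    # Try a two-step composition through another word of the sentence.
    for mid in sentence_words:
        if mid in (word_a, word_b):
            continue
        first = relations_between(word_a, mid)
        second = relations_between(mid, word_b)
        if first and second:
            # A human annotator would choose here; the sketch takes the first name.
            return "composite", [sorted(first)[0], sorted(second)[0]]
    return "unclassified", []

words = ["Mary", "mother", "give birth"]
direct_case = classify_pair("Mary", "mother", words)
composite_case = classify_pair("Mary", "give birth", words)
no_relation_case = classify_pair("give birth", "Mary", words)
print(direct_case, composite_case, no_relation_case)
```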
Custom Relations
• DOLCE relations can be defined specifically for a class, or be inherited from an ancestor class
  • In the second case, the kind of relationship can become too general
  • To avoid semantically vague relations, we proposed a small set of custom relations. A few examples:

Relation | Example
Correlated variation | It also decreases the value of the currency - potentially stimulating exports and decreasing imports - improving the balance of trade.
Ownership | The lessor is the legal owner of the asset.
Sibling concept | Operating activities include net income, accounts receivable, accounts payable and inventory.
Value component | Valuation of life annuities may be performed by calculating the actuarial present value of the future life contingent payments.
Some Statistics
• Most common DOLCE relations: patient, patient-of, target, target-of
• Most common custom relations: qualifier, indirect-target, ownership

Relation type | DOLCE Relations | Custom Relations | Total
Direct | 35.32% | 64.68% | 72.67%
Composite | 48.65% | 51.35% | 24.67%
Unclassified | - | - | 2.66%
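The "Total" column can be turned back into pair counts. The sketch below assumes that these percentages are taken over the 300 annotated pairs. That reading is an assumption, although it is consistent with the three shares summing to 100%.

```python
TOTAL_PAIRS = 300  # number of annotated pairs reported in the slides

# "Total" column of the table, as fractions of all pairs.
share_of_pairs = {"direct": 0.7267, "composite": 0.2467, "unclassified": 0.0266}

# Convert each share to the nearest whole number of pairs.
pair_counts = {kind: round(TOTAL_PAIRS * share)
               for kind, share in share_of_pairs.items()}

print(pair_counts)
print(sum(pair_counts.values()))
```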
Semantic Relation X Semantic Relatedness
• The corpus was further annotated by two domain experts in finance
• Two human annotators scored each of the 300 concept pairs for semantic relatedness on a scale from 0 (unrelated) to 10 (identical or highly related)
  • Average of their scores taken as final score
  • The semantic relation was compared to the semantic relatedness score assigned to the same pair (comparison figure not recovered)
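The scoring step described above (two scores per pair on a 0 to 10 scale, averaged into a final score) can be sketched as below. The word pairs and their scores are made up purely for illustration and are not taken from the annotated data.

```python
def final_relatedness(score_a, score_b):
    """Average two annotator scores given on the 0-10 scale."""
    for score in (score_a, score_b):
        if not 0 <= score <= 10:
            raise ValueError(f"score out of range: {score}")
    return (score_a + score_b) / 2

# Made-up scores for two hypothetical pairs, purely for illustration.
example_scores = {
    ("lessor", "asset"): (8, 7),
    ("trustee", "premium"): (4, 6),
}
final_scores = {pair: final_relatedness(a, b)
                for pair, (a, b) in example_scores.items()}
print(final_scores)
```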
Summary
• This work described a preliminary study on improving the coverage, description and formalisation of the semantic relation classification task
• A foundational ontology (DOLCE), composite relations and custom abstract semantic relations were used
• DOLCE accounted for 38.2% of the semantic relations
• 67% of the pairs were assigned to a direct relation
• 2.66% of the pairs could not be classified
• Relevant research questions:
  • The impact of foundational ontology models in distributional and compositional-distributional semantics
Data available at: http://bit.ly/2gpTkHT
Some Limitations (Currently being addressed)
• Scaling corpus annotation size (currently 300 elements)
• Grounding the custom relations in the foundational ontology
Work in Progress
• Train an automatic annotator capable of identifying foundational ontology (FO) classes and semantic relations in text.