Typed Tensor Decomposition of Knowledge Bases for Relation Extraction
Kai-Wei Chang, Scott Wen-tau Yih, Bishan Yang & Chris Meek
Microsoft Research
Knowledge Base
• Captures world knowledge by storing properties of millions of entities, as well as relations among them
• Examples: Freebase, DBpedia, YAGO, NELL, OpenIE/ReVerb
• Useful resources for NLP applications
  • Semantic Parsing & Question Answering [e.g., Berant+, 2014]
  • Information Extraction [Riedel+, 2013]
Reasoning with Knowledge Base
• Knowledge base is never complete!
  • Extract previously unknown facts from new corpora
  • Predict new facts via inference
• Modeling multi-relational data
  • Statistical relational learning [Getoor & Taskar, 2007]
  • Path ranking methods (e.g., random walk) [e.g., Lao+ 2011]
  • Knowledge base embedding
    • Very efficient
    • Better prediction accuracy
Knowledge Base Embedding
• Each entity in a KB is represented by a real-valued vector
• Predict whether a triple $(e_1, r, e_2)$ is true with a scoring function over the two entity vectors, either linear or bilinear
• Recent work on KB embedding
  • RESCAL [Nickel+, ICML-11], SME [Bordes+, AISTATS-12], NTN [Socher+, NIPS-13], TransE [Bordes+, NIPS-13]
  • Train on existing facts (e.g., triples)
  • Ignore relational domain knowledge available in the KB (e.g., ontology)
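Relational Domain Knowledge
• Example (type constraint): a triple $(e_1, r, e_2)$ can be true only if the types of $e_1$ and $e_2$ are compatible with $r$; for born-in, $e_1$ must be a person and $e_2$ a location
• Example (common sense): a triple can be true only if it does not contradict common-sense knowledge about the relation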
Typed Tensor Decomposition: TRESCAL
• KB embedding via Tensor Decomposition
  • Entities are represented as vectors, relations as matrices
• Relational domain knowledge
  • Type information and constraints
  • Only legitimate entities are included in the loss
• Benefits of leveraging type information
  • Faster model training time
  • Highly scalable to large KB
  • Higher prediction accuracy
• Application to Relation Extraction
Road Map
• Introduction
• KB embedding via Tensor Decomposition
• Typed tensor decomposition (TRESCAL)
• Experiments
• Discussion & Conclusions
Knowledge Base Representation (1/2)
• Collection of subj-pred-obj triples:

Subject        Predicate    Object
Obama          Born-in      Hawaii
Bill Gates     Nationality  USA
Bill Clinton   Spouse-of    Hillary Clinton
Satya Nadella  Work-at      Microsoft
…              …            …
Knowledge Base Representation (2/2)
• $n$: # entities, $m$: # relations
• The KB is stored as an $n \times n \times m$ tensor $\mathcal{X}$ whose rows and columns are indexed by the entities $e_1, \dots, e_n$; the $k$-th slice $\mathcal{X}_k$ holds one relation
• Example: in the slice for born-in, the entry at (Obama, Hawaii) is 1
• A zero entry means either:
  • Incorrect (false)
  • Unknown
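The slice-per-relation layout described on this slide can be built from a list of triples. The sketch below is only an illustration: it uses the four toy triples from the example table, assigns integer ids in order of first appearance (an arbitrary choice), and stores a dense array, which would not be practical for a KB with hundreds of thousands of entities.

```python
import numpy as np

# Toy triples from the example table (subject, predicate, object).
triples = [
    ("Obama", "Born-in", "Hawaii"),
    ("Bill Gates", "Nationality", "USA"),
    ("Bill Clinton", "Spouse-of", "Hillary Clinton"),
    ("Satya Nadella", "Work-at", "Microsoft"),
]

# Assign integer ids in order of first appearance (illustrative choice).
entities, relations = {}, {}
for s, p, o in triples:
    entities.setdefault(s, len(entities))
    entities.setdefault(o, len(entities))
    relations.setdefault(p, len(relations))

n, m = len(entities), len(relations)

# X[k] is the n x n slice of relation k.
# X[k, i, j] = 1 if (e_i, r_k, e_j) is in the KB; 0 means false or unknown.
X = np.zeros((m, n, n))
for s, p, o in triples:
    X[relations[p], entities[s], entities[o]] = 1.0
```

A real implementation would more likely keep one sparse matrix per relation, since almost every entry is zero.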
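Tensor Decomposition Objective
• RESCAL [Nickel+, ICML-11] approximates the slice of the $k$-th relation as $\mathcal{X}_k \approx A R_k A^T$
• Objective:
$$\underbrace{\frac{1}{2} \sum_k \| \mathcal{X}_k - A R_k A^T \|_F^2}_{\text{Reconstruction Error}} + \underbrace{\frac{\lambda}{2} \Big( \|A\|_F^2 + \sum_k \|R_k\|_F^2 \Big)}_{\text{Regularization}}$$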
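The objective on this slide, a squared Frobenius reconstruction error plus a regularization term, can be evaluated numerically as in the sketch below. The array shapes, the function name `rescal_objective`, and the placement of the weight `lam` on the regularizer are assumptions made for illustration.

```python
import numpy as np

def rescal_objective(X, A, R, lam):
    """Evaluate the regularized reconstruction objective.

    X:   (m, n, n) array, one n x n slice per relation
    A:   (n, d) array, one row per entity
    R:   (m, d, d) array, one matrix per relation
    lam: regularization weight (assumed to scale the whole penalty)
    """
    recon = sum(
        np.linalg.norm(X_k - A @ R_k @ A.T, "fro") ** 2
        for X_k, R_k in zip(X, R)
    )
    reg = np.linalg.norm(A, "fro") ** 2 + sum(
        np.linalg.norm(R_k, "fro") ** 2 for R_k in R
    )
    return 0.5 * recon + 0.5 * lam * reg

# Tiny example: one relation, two entities, one latent dimension.
X = np.array([[[1.0, 0.0], [0.0, 0.0]]])
A = np.array([[1.0], [0.0]])
R = np.array([[[1.0]]])
value = rescal_objective(X, A, R, lam=0.5)
```

In this toy case the reconstruction is exact, so only the regularization term contributes to `value`.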
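Measure the Degree of a Relationship
• The degree to which born-in holds between Obama and Hawaii is the bilinear product of the two entity vectors and the relation matrix:
$$a_{\text{Obama}}^T \, R_{\text{born-in}} \, a_{\text{Hawaii}}$$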
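As a small worked example of this bilinear score, the sketch below uses made-up two-dimensional vectors and a made-up relation matrix; none of the numbers come from the talk. It also shows that the score is not symmetric in subject and object unless the relation matrix is.

```python
import numpy as np

def score(a_subj, R_k, a_obj):
    """Bilinear score a_subj^T R_k a_obj for one candidate triple."""
    return float(a_subj @ R_k @ a_obj)

# Hypothetical embeddings, chosen only to make the arithmetic easy to follow.
a_obama = np.array([1.0, 0.0])
a_hawaii = np.array([0.0, 1.0])
R_born_in = np.array([[0.0, 0.9],
                      [0.1, 0.0]])

forward = score(a_obama, R_born_in, a_hawaii)   # picks out R_born_in[0, 1]
backward = score(a_hawaii, R_born_in, a_obama)  # picks out R_born_in[1, 0]
```

Here `forward` is larger than `backward`, which is the behavior one would want for a directed relation such as born-in.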
Road Map
• Introduction
• KB embedding via Tensor Decomposition
• Typed tensor decomposition (TRESCAL)
  • Basic idea
  • Training procedure
  • Complexity analysis
• Experiments
• Discussion & Conclusions
Typed Tensor Decomposition Objective
• Reconstruction error without type information, over all entity pairs:
$$\frac{1}{2} \sum_k \| \mathcal{X}_k - A R_k A^T \|_F^2, \qquad \mathcal{X}_k \approx A R_k A^T$$
• Example relation: born-in. Only people can appear as subjects and only locations as objects, so the slice is restricted to the people rows and the location columns.
• Reconstruction error with type information, over type-compatible entity pairs only:
$$\frac{1}{2} \sum_k \| \mathcal{X}'_k - A_{k_l} R_k A_{k_r}^T \|_F^2, \qquad \mathcal{X}'_k \approx A_{k_l} R_k A_{k_r}^T$$
where $\mathcal{X}'_k$ is the sub-matrix of $\mathcal{X}_k$ that keeps the rows of entities allowed as subject and the columns of entities allowed as object of relation $k$, and $A_{k_l}$, $A_{k_r}$ are the corresponding rows of $A$.
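The restriction to type-compatible rows and columns can be sketched with index lists. In the code below, `left[k]` and `right[k]` are hypothetical names for the indices of entities allowed as subject and object of relation k; the toy data (two people, one location) is invented for illustration.

```python
import numpy as np

def typed_reconstruction_error(X, A, R, left, right):
    """0.5 * sum_k || X'_k - A_l R_k A_r^T ||_F^2 over type-compatible cells.

    left[k], right[k]: lists of entity indices allowed as subject / object
    of relation k (assumed to be given).
    """
    total = 0.0
    for k in range(len(X)):
        l, r = left[k], right[k]
        X_sub = X[k][np.ix_(l, r)]      # keep compatible rows and columns
        pred = A[l] @ R[k] @ A[r].T     # reconstruction of the sub-matrix
        total += np.linalg.norm(X_sub - pred, "fro") ** 2
    return 0.5 * total

# Entities 0 and 1 are people, entity 2 is a location; relation 0 is born-in.
X = np.zeros((1, 3, 3))
X[0, 0, 2] = 1.0
A = np.zeros((3, 2))
R = np.zeros((1, 2, 2))
err_typed = typed_reconstruction_error(X, A, R, left=[[0, 1]], right=[[2]])
```

With these index lists only a 2 x 1 block of the 3 x 3 slice enters the loss; passing every entity in both lists recovers the untyped reconstruction error.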
Training Procedure: Alternating Least-Squares (ALS) Method
• Fix $R_k$, update $A$:
$$A \leftarrow \Big[ \sum_k \mathcal{X}_k A R_k^T + \mathcal{X}_k^T A R_k \Big] \Big[ \sum_k B_k + C_k + \lambda I \Big]^{-1}$$
where $B_k = R_k A^T A R_k^T$ and $C_k = R_k^T A^T A R_k$.
• Fix $A$, update $R_k$:
$$\mathrm{vec}(R_k) \leftarrow (Z^T Z + \lambda I)^{-1} Z^T \mathrm{vec}(\mathcal{X}_k)$$
where $Z = A \otimes A$, $\mathrm{vec}$ is vectorization, and $\otimes$ is the Kronecker product.
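The two alternating updates can be written directly in NumPy for a toy-sized problem. This is a sketch under several assumptions: dense arrays, a column-major `vec`, the matrix Z formed explicitly as the Kronecker product of A with itself, and the regularization term added once to the summed denominator. Forming Z explicitly is only feasible for tiny n.

```python
import numpy as np

def update_A(X, A, R, lam):
    """Fix every R_k and recompute A. X: (m, n, n), A: (n, d), R: (m, d, d)."""
    d = A.shape[1]
    AtA = A.T @ A
    numerator = np.zeros_like(A)
    denominator = lam * np.eye(d)
    for X_k, R_k in zip(X, R):
        numerator += X_k @ A @ R_k.T + X_k.T @ A @ R_k
        denominator += R_k @ AtA @ R_k.T + R_k.T @ AtA @ R_k  # B_k + C_k
    # Equivalent to numerator @ inv(denominator), via a linear solve.
    return np.linalg.solve(denominator.T, numerator.T).T

def update_R(X_k, A, lam):
    """Fix A and solve the ridge regression for one relation matrix."""
    d = A.shape[1]
    Z = np.kron(A, A)                    # (n*n, d*d), toy sizes only
    x = X_k.reshape(-1, order="F")       # column-major vec(X_k)
    r = np.linalg.solve(Z.T @ Z + lam * np.eye(d * d), Z.T @ x)
    return r.reshape(d, d, order="F")

# Noiseless toy tensor generated from known factors.
rng = np.random.default_rng(0)
n, d, m = 5, 2, 3
A = rng.normal(size=(n, d))
R = rng.normal(size=(m, d, d))
X = np.stack([A @ R_k @ A.T for R_k in R])

# One ALS sweep.
R_new = np.stack([update_R(X_k, A, 0.1) for X_k in X])
A_new = update_A(X, A, R_new, 0.1)
```

The column-major reshape matters: with that convention the vectorized product of A, R_k and the transpose of A equals Z times vec(R_k), which is what makes the ridge solution match the update rule.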
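Training Procedure: Alternating Least-Squares (ALS) Method, with type constraints
• Fix $R_k$, update $A$:
$$A \leftarrow \Big[ \sum_k \mathcal{X}'_k A_{k_r} R_k^T + \mathcal{X}'^T_k A_{k_l} R_k \Big] \Big[ \sum_k B_{k_r} + C_{k_l} + \lambda I \Big]^{-1}$$
where $B_{k_r} = R_k A_{k_r}^T A_{k_r} R_k^T$ and $C_{k_l} = R_k^T A_{k_l}^T A_{k_l} R_k$.
• Fix $A$, update $R_k$:
$$\mathrm{vec}(R_k) \leftarrow \big( A_{k_r}^T A_{k_r} \otimes A_{k_l}^T A_{k_l} + \lambda I \big)^{-1} \times \mathrm{vec}\big( A_{k_l}^T \mathcal{X}'_k A_{k_r} \big)$$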
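The typed update of a relation matrix only touches the rows of A that belong to type-compatible entities, and the linear system it solves has one unknown per entry of the relation matrix regardless of the number of entities. The sketch below assumes dense arrays, a column-major `vec`, and hypothetical index lists `l_k` and `r_k` for the allowed subjects and objects.

```python
import numpy as np

def update_R_typed(X_k, A, l_k, r_k, lam):
    """Ridge update of one relation matrix using only compatible entities."""
    d = A.shape[1]
    A_l, A_r = A[l_k], A[r_k]            # rows for allowed subjects / objects
    X_sub = X_k[np.ix_(l_k, r_k)]        # type-compatible block of the slice
    lhs = np.kron(A_r.T @ A_r, A_l.T @ A_l) + lam * np.eye(d * d)
    rhs = (A_l.T @ X_sub @ A_r).reshape(-1, order="F")  # column-major vec
    return np.linalg.solve(lhs, rhs).reshape(d, d, order="F")

# Entities 0, 1 are people and 2, 3 are locations (toy data).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 2))
X_k = np.zeros((4, 4))
X_k[0, 2] = 1.0
l_k, r_k = [0, 1], [2, 3]
lam = 0.1
R_k = update_R_typed(X_k, A, l_k, r_k, lam)
```

Setting the gradient of the typed, regularized loss with respect to the relation matrix to zero gives exactly the system solved here, which is a convenient way to check the Kronecker ordering.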
Complexity Analysis
• Without type information (RESCAL), the cost depends on:
  • # entities
  • # non-zero entries
  • # dimensions of projected entity vectors
• With type information (TRESCAL), the number of entities is replaced by the average # entities satisfying the type constraint
Road Map
• Introduction
• KB embedding via Tensor Decomposition
• Typed tensor decomposition (TRESCAL)
• Experiments
  • KB Completion
  • Application to Relation Extraction
• Discussion & Conclusions
Experiments: KB Completion
• KB: Never Ending Language Learning (NELL)
  • Training: version 165
  • Development: new facts between v.166 and v.533
  • Testing: new facts between v.534 and v.745
• Data statistics of the training set:

# Entities                 753k
# Relation Types           229
# Entity Types             300
# Entity-Relation Triples  1.8M
Tasks & Baselines
• Entity Retrieval:
  • One positive entity with 100 negative entities
• Relation Retrieval:
  • Positive entity pairs with equal number of negative pairs
• Baselines:
  • RESCAL [Nickel+, ICML-11] (bilinear score of two entity vectors through a relation matrix)
  • TransE [Bordes+, NIPS-13]
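Both retrieval tasks are scored with Mean Average Precision. The slides do not spell out the formula, so the sketch below uses the standard definition: for each query, average the precision at the rank of every positive item, then average over queries. Function names are illustrative.

```python
def average_precision(ranked_labels):
    """AP for one query; ranked_labels are booleans sorted by model score."""
    hits = 0
    precisions = []
    for rank, is_positive in enumerate(ranked_labels, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(queries):
    """MAP over a list of ranked label lists, one list per query."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Entity-retrieval style query: one positive among 100 negatives, ranked 4th.
ranking = [False, False, False, True] + [False] * 97
ap_single = average_precision(ranking)
```

With a single positive per query, as in the entity retrieval setting, average precision reduces to the reciprocal of the rank of that positive.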
Training Time Reduction
• Both models finish training in 10 iterations.
• TRESCAL filters out 96% of entity triples, those with incompatible types.
• Model training time (hours): TRESCAL 4.46 vs. RESCAL 20.5, a 4.6x speed-up
Training Time Reduction
• # iterations for TransE is set to 500 (the default value).
• Model training time (hours): TRESCAL 4.46 vs. TransE 96, a 21.5x speed-up
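The two speed-up figures follow from the training times shown in the charts. The check below reads the bars as TRESCAL 4.46 h, RESCAL 20.5 h, and TransE 96 h; that assignment is inferred from the speed-up labels rather than stated next to each bar.

```python
# Training times in hours, as read from the two charts.
trescal_hours = 4.46
rescal_hours = 20.5
transe_hours = 96.0

speedup_vs_rescal = round(rescal_hours / trescal_hours, 1)
speedup_vs_transe = round(transe_hours / trescal_hours, 1)

print(speedup_vs_rescal)  # 4.6
print(speedup_vs_transe)  # 21.5
```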
Entity Retrieval
• Mean Average Precision (MAP) of the three compared models, in chart order: 67.56%, 62.91%, 69.26%
Relation Retrieval
• Mean Average Precision (MAP) of the three compared models, in chart order: 70.71%, 73.08%, 75.70%
Experiments: Relation Extraction
• Example: the sentence "Satya Nadella is the CEO of Microsoft." yields the triple (Satya Nadella, work-at, Microsoft)

Relation Extraction as Matrix Factorization [Riedel+ 13]
• Row: entity pair
• Column: relation
• Figure: Fig. 1 of [Riedel+ 13]
Data & Task Description
• Raw data: NY Times corpus & Freebase
  • Entities in NY Times and Freebase are aligned
• Raw tensor construction
  • 80,698 entities & 1,652 relations
  • Type information from Freebase & NER
  • Type constraints are derived from training data
• Task: identify FB relations of entity pairs in text
  • 10,000 entity pairs: 2,048 have both entities in FB
  • Evaluation metric: weighted mean average precision (MAP) on 19 relations
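The slide says that type constraints are derived from training data but does not give the procedure. One plausible reading, offered purely as an assumption, is to record which entity types occur in the subject and object positions of each relation in the training triples and then treat every entity of those types as compatible. All names and the toy data below are hypothetical.

```python
from collections import defaultdict

def derive_type_constraints(triples, entity_type):
    """Collect the entity types seen as subject / object of each relation."""
    subj_types = defaultdict(set)
    obj_types = defaultdict(set)
    for s, p, o in triples:
        subj_types[p].add(entity_type[s])
        obj_types[p].add(entity_type[o])
    return subj_types, obj_types

def compatible_entities(allowed_types, entity_type):
    """All entities whose type is among the allowed types, sorted by name."""
    return sorted(e for e, t in entity_type.items() if t in allowed_types)

# Toy training data (invented for illustration).
entity_type = {
    "Obama": "person",
    "Bill Gates": "person",
    "Hawaii": "location",
    "Microsoft": "organization",
}
train_triples = [("Obama", "born-in", "Hawaii")]

subj_types, obj_types = derive_type_constraints(train_triples, entity_type)
born_in_subjects = compatible_entities(subj_types["born-in"], entity_type)
born_in_objects = compatible_entities(obj_types["born-in"], entity_type)
```

Under this reading, Bill Gates becomes a legitimate subject for born-in even though no born-in fact about him was observed, while Microsoft is excluded from both positions.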
Relation Extraction [updated version]
• Weighted MAP of the five compared systems, in chart order: 0.49, 0.52, 0.58, 0.7, 0.72
• Evaluated using only 2,048 FB entity pairs
Relation Extraction
• Weighted MAP of the five compared systems, in chart order: 0.33, 0.36, 0.39, 0.47, 0.57
• Evaluated using all 10,000 entity pairs
Conclusions
• TRESCAL: A KB embedding model via tensor decomposition
  • Leverages entity type constraint
  • Faster model training time
  • Highly scalable to large KB
  • Higher prediction accuracy
  • Application to relation extraction
• Challenges & Future Work
  • Capture more types of relational domain knowledge
  • Support more sophisticated inferential tasks