Towards Semantic Embedding in Visual Vocabulary Towards Semantic Embedding in Visual Vocabulary The...

Towards Semantic Embedding in Visual Vocabulary

The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition

Session: Object Recognition II: using Language, Tuesday 15, June 2010

Rongrong Ji1, Hongxun Yao1, Xiaoshuai Sun1, Bineng Zhong1, and Wen Gao1,2

1Visual Intelligence Laboratory, Harbin Institute of Technology2School of Electronic Engineering and Computer Science, Peking University

Overview

✴ Problem Statement

✴ Building patch-labeling correspondence

✴ Generative semantic embedding

✴ Experimental comparisons

✴ Conclusions

Overview

✴ Conclusions

Introduction

• Building visual codebooks– Quantization-based approaches

• K-Means, Vocabulary Tree, Approximate K-Means et al.

– Feature-indexing-based approaches• K-D Tree, R Tree, Ball Tree et al.

• Refining visual codebooks– Topic model decompositions

• pLSA, LDA, GMM et al.

– Spatial refinement• Visual pattern mining, Discriminative visual phrase et al.

Introduction

• With the prosperity of Web community

User Tags

Our Contribution

Introduction

• Problems to achieve this goal

– Supervision @ image level

– Correlative semantic labels

– Model generality

A traditional San Francisco street A traditional San Francisco street view with car and buildingsview with car and buildings

??Flower

Introduction

• Our contribution– Publish a ground truth path-labeling set

http://vilab.hit.edu.cn/~rrji/index_files/SemanticEmbedding.htm

– A generalized semantic embedding framework• Easy to deploy into different codebook models

– Modeling label correlations• Model correlative tagging in semantic embedding

Introduction

• The proposed framework

Build patch-level correspondenceSupervised visual codebook construction

Overview

✴ Conclusions

Building Patch-Labeling Correspondence

• Collecting over 60,000 Flickr photos– For each photo

• DoG detection + SIFT

• “Face” labels (for instance)

• Purify the path-labeling correspondences– Density-Diversity Estimation (DDE)– Formulation

• For a semantic label , its initial correspondence set is

– Patches extracted from images with label

– A correspondence from label to local patch

1 2, { , ,..., },

{ , ,..., }i

i i n ii

i i n i

d s d s d s

D s d d d s

le le le

1 2{ , ,..., }i nD d d d

ijleis

Purify le links in <Di, si>

• Density– For a given , Density reveals its

representability for :

• Diversity– For a given , Diversity reveals it

unique score for :

ld ldDen

1exp( || || )

d l j Lj

Den d dm

ldDivld

ln( )l

n nDiv

Average neighborhood distance in m

neighborsnm: number of images in

neighborhood ni: number of total images with

label si

Purifyi j d

D d DDE T

s t DDE Den Div

High T: Concerns more on Precision,

not Recall

• Case study: “Face” label (before DDE)

• Case study: “Face” label (after DDE)

Overview

✴ Conclusions

✴Generative Hidden Markov Random Field

✴ A Hidden Field for semantic modeling

✴ An Observed Field for local patch quantization

Generative Semantic Embedding

• Hidden Field– Each produces correspondence links (le) to

a subset of patch in the Observed Field– Links denotes semantic correlations

between and

1{ }mi iS s

WordNet::SimilarityPedersen et al. AAAI 04

• Observed Field– Any two nodes follow visual metric (e.g. L2)– Once there is a link between and , we

constrain by from the hidden field

1{ }ni iD d

jsjsid

• In the ideal case

– Each is conditionally independent given :

Neighbors in the Hidden Field

( | ) { ( | ) | ( | ) 0}i j i ji m

P D S P d s P d s

Feature set D is regarded as (partial) generative from the Hidden Field

– Formulize Clustering procedure

• Assign a unique cluster label to each

• Hence, D is quantized into a codebook with corresponding features

• Cost for codebook candidate C

P(C|D)=P(C)P(D|C)/P(D)

1{ }Kk kW w 1{ }Kk kV v

P(D) is a constraintSemantic Constraint

Visual Distortion

• Semantic Constraint P(C)– Define a MRF on the Observed Field

– For a quantizer assignments C, its probability can be expressed as a Gibbs distribution from the Hidden Field as

• That is– Two data points and in contribute to

if and only ifjdid kw

Observed FieldObserved Field

Hidden FieldHidden Field

ccii=c=cii

ssxx ssyy

ddjjddii

llxyxy

• Visual Constraint P(D|C)– Whether the codebook C is visually consistent

with current data D• Visual distortion in quantization

• Overall Cost• Finding MAP of P(C|D) can be converted into

maximizing its posterior probability

The visual constraint P(D|C)

The semantic constraints P(C)

• EM solution

• E step

• M step

Assign local patches to the closest clusters

Update the visual word center

• Model generation– To supervised codebook with label

independence assumption

– To unsupervised codebooks• Making all l=0

Overview

✴ Conclusions

Experimental Comparisons

Case study of ratios between inter-class distance and intra-class distance with and without semantic embedding in the Flickr dataset.

MAP@1 comparisons between our GSE model to Vocabulary Tree [1] and GNP [34] in Flickr 60,000 database.

Confusion table comparisons on PASCAL VOC dataset with method in [24].

Overview

✴ Conclusions

Conclusion

• Our contribution• Propagating Web Labeling from images to local

patches to supervise codebook quantization• Generalized semantic embedding framework for

supervised codebook building• Model correlative semantic labels in supervision

• Future works• Adapt one supervised codebook for different tasks• Move forward supervision into local feature

detection and extraction

Thank you!Thank you!

Towards Semantic Embedding in Visual Vocabulary Towards Semantic Embedding in Visual Vocabulary The...

Documents

Transcript of Towards Semantic Embedding in Visual Vocabulary Towards Semantic Embedding in Visual Vocabulary The...

Visual vocabulary with a semantic twist: Supplementary materialvgg/software/fast_semantic... · 2014. 10. 30. · Visual vocabulary with a semantic twist: Supplementary material 3

NSEEN: Neural Semantic Embedding for Entity Normalization · entity normalization based on syntactic and semantic information. We compute these similarities via embedding the entity

Towards Semantic Embedding in Visual Vocabulary

Deep semantic-visual embedding with localization

Semantic expansion using word embedding clustering and ...

Graph-RISE: Graph-Regularized Image Semantic Embedding · Graph-RISE: Graph-Regularized Image Semantic Embedding Da-Cheng Juan, Chun-Ta Lu, Zhen Li, Futang Peng, Aleksei Timofeev,

The BODC Parameter Markup and Usage Vocabulary Semantic Model

Semantic IR: Exploring Dependency and Word Embedding ... · ©2013 MFMER | slide-1 Semantic IR: Exploring Dependency and Word Embedding Features in Biomedical IR Rastegar-Mojarad,

The Effect of Semantic Mapping as a Vocabulary Instruction ...

Semantic Word Clouds with Background Corpus Normalization ... · Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding ERICH SCHUBERT,

Towards Semantic Embedding in Visual Vocabulary semantic embedding in visual... · Visual vocabulary serves as a fundamental component in many computer vision tasks, such as object

Multiple Visual-Semantic Embedding for Video Retrieval ... · guitar Similarity Aggregation s N Final Similarity Global Visual Network Textual Embedding Network Sequential Visual

Vocabulary Acquisition and Lexical Training by Semantic ...

DeViSE: A Deep Visual-Semantic Embedding Model

Semantic Enrichment of Pretrained Embedding Output for ...

A Dual Attention Network with Semantic Embedding for Few-shot … · 2019. 6. 10. · A Dual Attention Network with Semantic Embedding for Few-shot Learning Shipeng Yan∗, Songyang

Semantic Regularisation for Recurrent Image Annotation · supervised semantic regularisation to the embedding layer. (2) Our proposed semantic regularisation enables reliable ne-tuning

TEACHING VOCABULARY THROUGH SEMANTIC MAPPING TECHNIQUE …digilib.unila.ac.id/22733/3/A SCRIPT RESULT AND DISCUSSION.pdf · TEACHING VOCABULARY THROUGH SEMANTIC MAPPING ... 2.5 Teaching

NIPS2013読み会 DeViSE: A Deep Visual-Semantic Embedding Model

Semantic Web Publishing Vocabulary (SWP) User Manualwifo5-03.informatik.uni-mannheim.de/bizer/wiqa/swp/... · Chapter 1 The Semantic Web Publishing Vocabulary Graph names provide