Relevance Aggregation Projections for Image Retrieval
Wei Liu, Wei Jiang, Shih-Fu Chang ([email protected])
CIVR 2008
Outline
Motivations and Formulation
Our Approach: Relevance Aggregation Projections
Experimental Results
Conclusions
Liu et al. Columbia University 2/28
Motivations and Formulation

Relevance feedback
- closes the semantic gap;
- explores knowledge about the user's intention;
- selects features and refines models.

Relevance feedback mechanism
- The user selects a query image.
- The system presents the highest-ranked images to the user, excluding already labeled ones.
- During each iteration, the user marks images as "relevant" (positive) or "irrelevant" (negative).
- The system gradually refines the retrieval results.
Problems

Small sample learning – the number of labeled images is extremely small.
High dimensionality – feature dimension > 100, while the number of labeled samples is < 100.
Asymmetry – relevant data are coherent while irrelevant data are diverse.
Possible Solutions

Small sample learning → semi-supervised learning
Curse of dimensionality → dimensionality reduction
Asymmetry → ?
Previous Work

[Figure: query point with a unit margin (margin = 1) illustrating the margin-based separation used in prior work.]
Methods

Image dim: d, total sample #: n, labeled sample #: l. In CBIR, n > d > l.

Method            Uses labeled   Uses unlabeled   Handles asymmetry   Dimension bound
LPP (NIPS'03)          –               √                 –                  d
ARE (ACM MM'05)        √               √                 √                  l−1
SSP (ACM MM'06)        √               √                 –                  l−1
SR  (ACM MM'07)        √               √                 –                  2
Disadvantages

LPP: unsupervised, so label information is ignored.
SSP and SR: fail to engage the asymmetry. SSP emphasizes the irrelevant set; SR treats the relevant and irrelevant sets equally.
ARE, SSP and SR: produce very low-dimensional subspaces (at most l−1 dimensions; SR even yields only a 2D subspace).
Symbols

F+: relevant set; F−: irrelevant set; l+: relevant #; l−: irrelevant #
n: total sample #; l: labeled sample #; d: original dim; r: reduced dim
A ∈ R^{d×r}: subspace; a: projecting vector
G(V, E, W): graph; L = D − W: graph Laplacian
X = [x_1, ..., x_l, x_{l+1}, ..., x_n] ∈ R^{d×n}: all samples
X_l = [x_1, ..., x_l] ∈ R^{d×l}: labeled samples
Graph Construction

Build a k-NN graph G: establish an edge between x_i and x_j if x_i is among the k nearest neighbors of x_j, or x_j is among the k nearest neighbors of x_i. The edge weights are

W_ij = exp(−||x_i − x_j||² / σ²),  if x_i ∈ N_k(x_j) ∨ x_j ∈ N_k(x_i);
W_ij = 0,  otherwise.

The graph Laplacian L = D − W ∈ R^{n×n} is used in smoothness regularizers.
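As a concrete sketch, this graph construction can be written in a few lines of numpy (the function name and the default k and σ are our choices, not the paper's):

```python
import numpy as np

def knn_graph_laplacian(X, k=2, sigma=1.0):
    """Build a symmetrized k-NN affinity W with Gaussian weights and
    return the unnormalized graph Laplacian L = D - W.
    X: (d, n) data matrix whose columns are samples."""
    n = X.shape[1]
    # pairwise squared Euclidean distances between columns
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    W = np.zeros((n, n))
    for i in range(n):
        # the k nearest neighbors of x_i (index 0 is x_i itself)
        nn = np.argsort(sq[i])[1:k + 1]
        W[i, nn] = np.exp(-sq[i, nn] / sigma ** 2)
    W = np.maximum(W, W.T)  # edge if i in kNN(j) OR j in kNN(i)
    D = np.diag(W.sum(axis=1))
    return D - W
```

By construction the result is symmetric and its rows sum to zero, the defining properties of an unnormalized Laplacian.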
Our Approach

min_{A ∈ R^{d×r}}  tr(A^T X L X^T A)                                  (1.1)
s.t.  A^T x_i = (Σ_{j∈F+} A^T x_j) / l+,   ∀ i ∈ F+                   (1.2)
      ||A^T (x_i − Σ_{j∈F+} x_j / l+)||² ≥ r,   ∀ i ∈ F−              (1.3)
Target – a subspace A reducing raw data from d dims to r dims.
Obj (1.1) – minimize local scatter using both labeled and unlabeled data.
Cons (1.2) – aggregate positive data (in F+) to the positive center.
Cons (1.3) – push negative data (in F−) away from the positive center by at least r unit distances.
Cons (1.2) and (1.3) directly address the asymmetry in CBIR.

Core Idea: Relevance Aggregation

An ideal subspace is one in which the relevant examples are aggregated into a single point while the irrelevant examples are simultaneously separated by a large margin.
Relevance Aggregation Projections (RAP)

We transform eq. (1) into eq. (2) in terms of each column vector a of A (a is a projecting vector):

min_{a ∈ R^d}  a^T X L X^T a                       (2.1)
s.t.  a^T x_i = a^T c+,   ∀ i ∈ F+                 (2.2)
      (a^T x_i − a^T c+)² ≥ 1,   ∀ i ∈ F−          (2.3)

where c+ = Σ_{j∈F+} x_j / l+ is the positive center.
Solution

Eq. (2.1–2.3) is a quadratically constrained quadratic program and thus hard to solve directly. We want to remove the constraints first and minimize the cost function afterwards, so we adopt a heuristic procedure:
- Find ideal 1D projections which satisfy the constraints.
- Removing the constraints, solve one part of the solution.
- Solve the remaining part of the solution.
Solution: Find Ideal Projections

Run PCA to get the r principal eigenvectors and renormalize them to obtain V = [v_1, ..., v_r] ∈ R^{d×r} such that V^T X X^T V = I and |v^T x_i − v^T x_j| < 2 for all i, j = 1, ..., n.

On each projecting direction v in V, form the ideal 1D projections y = [y_1, ..., y_l]^T ∈ R^l:

y_i = v^T c+,        if i ∈ F+
      v^T x_i,       if i ∈ F− ∧ |v^T x_i − v^T c+| ≥ 1          (3)
      v^T c+ + 1,    if i ∈ F− ∧ 0 ≤ v^T x_i − v^T c+ < 1
      v^T c+ − 1,    if i ∈ F− ∧ −1 < v^T x_i − v^T c+ < 0

The vector y is formed according to each PCA vector v: it collapses the projections v^T X_l ∈ R^{1×l} of the relevant samples onto the projected positive center v^T c+, and places every irrelevant sample at least unit distance from it, i.e. |y_i − v^T c+| ≥ 1 for i ∈ F−.
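The case analysis of eq. (3) can be sketched directly in numpy (function and argument names are ours; pos_idx and neg_idx index the columns of the labeled block X_l):

```python
import numpy as np

def ideal_targets(v, Xl, pos_idx, neg_idx):
    """Ideal 1D projections y (eq. 3) for one PCA direction v.
    Xl: (d, l) labeled samples; pos_idx/neg_idx index its columns."""
    proj = v @ Xl                 # v^T x_i for each labeled sample
    c = proj[pos_idx].mean()      # v^T c+, the projected positive center
    y = np.empty(Xl.shape[1])
    y[pos_idx] = c                # relevant samples collapse to the center
    for i in neg_idx:
        gap = proj[i] - c
        if abs(gap) >= 1:
            y[i] = proj[i]        # already outside the unit margin: keep
        elif gap >= 0:
            y[i] = c + 1          # push right to the margin
        else:
            y[i] = c - 1          # push left to the margin
    return y
```

Every irrelevant entry of the returned y is thus at least unit distance from the positive center, while every relevant entry coincides with it.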
Solution: QR Factorization

Remove constraints (2.2)–(2.3) by solving the linear system

X_l^T a = y.   (4)

Because l < d, eq. (4) is underdetermined and can thus be satisfied exactly. Perform the QR factorization

X_l = QR = [Q_1 Q_2] [R_1; 0].

The optimal solution is the sum of a particular solution and a complementary solution, i.e.

a = Q_1 b_1 + Q_2 b_2,   (5)

where b_1 = (R_1^T)^{-1} y.
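A sketch of the particular part using numpy's complete QR factorization (assumes X_l has full column rank and l < d; the function name is ours):

```python
import numpy as np

def particular_solution(Xl, y):
    """Solve X_l^T a = y (eq. 4) via QR: X_l = [Q1 Q2][R1; 0].
    Returns Q1, Q2 and b1 = (R1^T)^{-1} y, so a = Q1 b1 solves the system."""
    d, l = Xl.shape
    Q, R = np.linalg.qr(Xl, mode='complete')  # Q: (d, d), R: (d, l)
    Q1, Q2 = Q[:, :l], Q[:, l:]
    R1 = R[:l, :]                              # upper-triangular l x l block
    b1 = np.linalg.solve(R1.T, y)
    return Q1, Q2, b1
```

Because X_l^T Q_2 = 0, the complementary part Q_2 b_2 can be chosen freely without violating eq. (4); that freedom is exactly what the regularization step exploits.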
Solution: Regularization

We want the final solution not to deviate too much from the PCA solution, so we develop a regularization framework:

f(a) = ||a − v||² + γ a^T X L X^T a,   (6)

where γ > 0 controls the trade-off between the PCA solution and data locality preserving (the original loss function); the second term behaves as a regularization term.

Plugging a = Q_1 b_1 + Q_2 b_2 into eq. (6), we solve

b_2 = (I + γ Q_2^T X L X^T Q_2)^{-1} (Q_2^T v − γ Q_2^T X L X^T Q_1 b_1).
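The closed-form solve for b_2 is a single regularized linear system (a minimal sketch; names are ours, and S = X L X^T is assumed symmetric positive semidefinite):

```python
import numpy as np

def complementary_solution(v, S, Q1, Q2, b1, gamma=1.0):
    """Minimize f(a) = ||a - v||^2 + gamma * a^T S a over b2,
    with a = Q1 b1 + Q2 b2 and b1 held fixed (S = X L X^T)."""
    M = np.eye(Q2.shape[1]) + gamma * (Q2.T @ S @ Q2)
    rhs = Q2.T @ (v - gamma * (S @ (Q1 @ b1)))
    return np.linalg.solve(M, rhs)
```

Setting the gradient of f with respect to b_2 to zero and using Q_2^T Q_1 = 0 gives exactly this system; since S is PSD and γ > 0, the matrix M is always invertible.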
Algorithm

① Construct a k-NN graph: W, L, S = X L X^T
② PCA initialization: V = [v_1, ..., v_r]
③ QR factorization: Q_1, Q_2, R_1
④ Transductive regularization:
   for j = 1 : r
       form y with v_j
       b_1 = (R_1^T)^{-1} y
       b_2 = (I + γ Q_2^T S Q_2)^{-1} Q_2^T (v_j − γ S Q_1 b_1)
       a_j = Q_1 b_1 + Q_2 b_2
   end
⑤ Projecting: x → [a_1, ..., a_r]^T x
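The five steps above can be assembled into one self-contained numpy sketch. All names, defaults, and the use of uncentered PCA on X X^T are our assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np

def rap(X, pos_idx, neg_idx, r=2, k=5, sigma=1.0, gamma=1.0):
    """Sketch of the RAP algorithm (steps 1-5). X: (d, n), columns are
    samples; the first l columns are labeled, indexed by pos_idx
    (relevant) and neg_idx (irrelevant). Assumes l < d < n."""
    d, n = X.shape
    l = len(pos_idx) + len(neg_idx)
    # 1) k-NN graph Laplacian and smoothness matrix S = X L X^T
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    W = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(sq[i])[1:k + 1]
        W[i, nn] = np.exp(-sq[i, nn] / sigma ** 2)
    W = np.maximum(W, W.T)
    S = X @ (np.diag(W.sum(1)) - W) @ X.T
    # 2) PCA directions, rescaled so that V^T X X^T V = I
    evals, evecs = np.linalg.eigh(X @ X.T)       # ascending order
    V = evecs[:, -r:] / np.sqrt(evals[-r:])
    # 3) Complete QR factorization of the labeled block X_l
    Xl = X[:, :l]
    Q, R = np.linalg.qr(Xl, mode='complete')
    Q1, Q2, R1 = Q[:, :l], Q[:, l:], R[:l, :]
    A = np.zeros((d, r))
    for j in range(r):
        v = V[:, j]
        # 4a) ideal 1D targets y (eq. 3)
        proj = v @ Xl
        c = proj[pos_idx].mean()
        y = np.empty(l)
        y[pos_idx] = c
        for i in neg_idx:
            gap = proj[i] - c
            y[i] = proj[i] if abs(gap) >= 1 else (c + 1 if gap >= 0 else c - 1)
        # 4b) particular + complementary parts (eqs. 4-6)
        b1 = np.linalg.solve(R1.T, y)
        M = np.eye(Q2.shape[1]) + gamma * (Q2.T @ S @ Q2)
        b2 = np.linalg.solve(M, Q2.T @ (v - gamma * (S @ (Q1 @ b1))))
        A[:, j] = Q1 @ b1 + Q2 @ b2
    # 5) project all samples into the learned subspace
    return A, A.T @ X
```

In the returned subspace, the labeled relevant samples project to a single point and every labeled irrelevant sample sits at least unit distance from it along each dimension, exactly the relevance-aggregation property the slides describe.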
Experimental Setup

Corel image database: 10,000 images, 100 images per category.
Features: two types of color features and two types of texture features, 91 dims in total.
Five feedback iterations; the top-10 ranked images are labeled in each iteration.
The statistical average top-N precision is used for performance evaluation.
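For reference, top-N precision is simply the fraction of relevant images among the top N returned results (a hypothetical helper, not code from the paper):

```python
def top_n_precision(ranked_ids, relevant_ids, n=20):
    """Top-N precision: fraction of the top-n ranked images that are relevant."""
    relevant = set(relevant_ids)
    return sum(1 for i in ranked_ids[:n] if i in relevant) / n
```

The reported curves average this quantity over many query images and feedback iterations.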
Conclusions

We develop RAP to simultaneously address three fundamental issues in relevance feedback:
- asymmetry between classes
- small sample size (by incorporating unlabeled samples)
- high dimensionality

RAP learns a semantic subspace in which the relevant samples collapse while the irrelevant samples are pushed outward with a large margin.

RAP can also be used to solve imbalanced semi-supervised learning problems with few labeled data.

Experiments on Corel demonstrate that RAP achieves significantly higher precision than the state of the art.