The 2012 ICSI/Berkeley Video Location Estimation System

Jaeyoung Choi, Venkatesan Ekambaram, Gerald Friedland and Kannan Ramchandran

ICSI / UC Berkeley, USA

October 4th, 2012

Agenda

• Baseline Approach

• Drawbacks

• Graphical Model Framework

• Result


Baseline Approach

• Investigate the ‘spatial variance’ of each feature:

• small spatial variance: the feature is likely location-indicative

• large spatial variance: the feature is likely not indicative
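As a rough illustration of the idea above, the sketch below scores a tag by how widely its training occurrences are scattered. The slides do not define the exact formula or units, so the function name `spatial_variance`, the mean-squared-offset definition, and the coordinates are all assumptions made for this example.

```python
from statistics import fmean

def spatial_variance(coords):
    """Return the spread of (lat, lon) points around their centroid, or None.

    Assumption: variance is the mean squared Euclidean offset in degrees;
    the slides do not state the formula or the units they used.
    """
    if not coords:
        return None  # tag never matched in the training set
    lat0 = fmean(lat for lat, _ in coords)
    lon0 = fmean(lon for _, lon in coords)
    return fmean((lat - lat0) ** 2 + (lon - lon0) ** 2 for lat, lon in coords)

# Made-up coordinates, for illustration only.
training = {
    "ucberkeley": [(37.87, -122.26), (37.88, -122.26), (37.87, -122.25)],
    "video": [(37.87, -122.26), (48.85, 2.35), (-33.87, 151.21)],
}
for tag in ("ucberkeley", "video", "greektheatre"):
    print(tag, spatial_variance(training.get(tag, [])))
```

A tag concentrated in one place gets a small score, a tag used worldwide gets a large one, and a tag with no training matches yields no score at all.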


Example

Tag               Matches in training set   Spatial variance
pavement          2                         5.739
ucberkeley        4                         0.132
berkeley          14                        68.138
greek             0                         N/A
greektheatre      0                         N/A
spitonastranger   0                         N/A
live              91                        6453.109
video             2967                      6735.844


Problem: Sparsity caused by a biased dataset


The effect of sparsity

[Figure: bar chart. x-axis: distance error (e) between ground truth and estimation [km], with bins 0≤e<1, 10≤e<100, 100≤e<1000, 1000≤e<10000, 10000≤e; y-axis: percentage [%], 0 to 60; legend: >6400, 6400, 1600, 400, 100]

* A test video from a dense area has a higher chance of being estimated with a lower distance error.


Geo-tagging: an estimation-theoretic viewpoint

Observations:

Images: [four images, not reproduced]

Tags: $\{t_1^k\}, \{t_2^k\}, \{t_3^k\}, \{t_4^k\}$, e.g. {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}

Estimate:

Geolocations: $x_1, x_2, x_3, x_4$

Interpreting traditional approaches

Locations are random variables: $\{x_1, x_2, \ldots, x_N\}$

Traditional approaches estimate:

$p(x_i \mid \{t_i^k\}) \propto \prod_k p(x_i \mid t_i^k)$

where $p(x_i \mid t_i^k)$ is obtained from the training set

Probability of location given tags

$\prod_k p(x_i \mid t_i^k)$

Example: the distribution for the tag “washington” is depicted here [figure not reproduced]

Location estimate:

$\int x_i \, p(x_i \mid \{t_i^k\}) \, dx_i$
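The product-then-mean estimate can be sketched on a discretized map. This is only an illustration: the grid representation, the helper names `combine_tag_distributions` and `posterior_mean`, the smoothing constant `eps`, and the choice to skip tags unseen in training are assumptions, not details taken from the slides.

```python
def combine_tag_distributions(tags, tag_dists, eps=1e-6):
    """Multiply per-tag distributions p(x | tag) cell by cell and normalize.

    tag_dists maps tag -> {cell_center: probability}; cells are (lat, lon).
    Assumptions: tags missing from training are skipped, and eps smooths
    cells a tag never hit so that one tag cannot zero out the product.
    """
    known = [tag_dists[t] for t in tags if t in tag_dists]
    if not known:
        return {}
    cells = set().union(*known)  # every cell mentioned by any known tag
    post = {}
    for c in cells:
        p = 1.0
        for dist in known:
            p *= dist.get(c, 0.0) + eps
        post[c] = p
    total = sum(post.values())
    return {c: p / total for c, p in post.items()}

def posterior_mean(post):
    """Discrete version of the integral of x * p(x | tags)."""
    lat = sum(c[0] * p for c, p in post.items())
    lon = sum(c[1] * p for c, p in post.items())
    return lat, lon

# Made-up distributions, for illustration only.
tag_dists = {
    "berkeley": {(37.87, -122.26): 0.9, (40.0, -75.0): 0.1},
    "campanile": {(37.87, -122.26): 1.0},
}
post = combine_tag_distributions(["berkeley", "campanile", "unseen"], tag_dists)
print(posterior_mean(post))
```

Averaging raw latitude and longitude ignores wrap-around at the date line, which is acceptable for a sketch but not for a global system.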



Drawbacks

Data sparsity: Not all tags in the test set are available in the training set. Hence the estimate of $p(x_i \mid t_i^k)$ can be bad.

Sub-optimality: The approaches are suboptimal given the data.

What we ideally want: $p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$

The mean of the above distribution gives the best estimate of the locations, i.e. for each image we want

$p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$

Traditional algorithms only give: $p(x_i \mid \{t_i^k\})$


Bayesian graphical framework

Example tag sets (one per node): {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}

Node: Geolocation of the image

Edge: Correlated locations (e.g. common tag)

Edge potential: Strength of an edge (e.g. posterior distribution of locations given common tags)

$p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$

Figure labels on the nodes: $p(x_i \mid \{t_i^k\})$ and $p(x_j \mid \{t_j^k\})$
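One way to picture the graph is to connect every pair of images that shares at least one tag. The sketch below does this for the four example tag sets; the function name `build_edges` and the all-pairs scan are illustrative choices, not the authors' implementation.

```python
from itertools import combinations

def build_edges(tag_sets):
    """Return {(i, j): common_tags} for every pair sharing at least one tag."""
    edges = {}
    for i, j in combinations(range(len(tag_sets)), 2):
        common = tag_sets[i] & tag_sets[j]  # set intersection of the two tag sets
        if common:
            edges[(i, j)] = common
    return edges

# The four example tag sets from the slide, one per node.
tag_sets = [
    {"berkeley", "sathergate", "campanile"},
    {"berkeley", "haas"},
    {"campanile"},
    {"campanile", "haas"},
]
edges = build_edges(tag_sets)
for pair, common in sorted(edges.items()):
    print(pair, sorted(common))
```

The all-pairs scan is quadratic in the number of images; for a large collection an inverted index from tag to image ids would likely be a better fit.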


Cooperative geo-tagging

Intuition: Images in the training set having common tags have correlated geo-locations captured by the joint distribution

Joint probability modeling:

$p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\}) \propto \prod_i p(x_i \mid \{t_i^k\}) \prod_{(i,j)} p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$

$p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$: pairwise distribution given at least one common tag




$p(x_1, x_2, \ldots, x_N \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\}) \propto \prod_i p(x_i \mid \{t_i^k\}) \prod_{(i,j)} p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$

$p(x_i \mid \{t_i^k\})$ is obtained from the training set as before

$p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$ is modeled as an indicator function $I(x_i = x_j)$ if the common tag has low spatial variance or occurs infrequently, e.g. if the common tag is “haas”, it is very likely the locations are the same

Question: How to estimate the optimal marginal distribution $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$?


Belief propagation updates

Iterative algorithm to approximate the posterior distribution $p(x_i \mid \{t_1^k\}, \{t_2^k\}, \ldots, \{t_N^k\})$

Gaussian modeling: $p(x_i \mid \{t_i^k\}) \sim N(\mu_i, \sigma_i^2)$

At iteration 0 each node calculates $(\mu_i, \sigma_i^2)$

At iteration t each node updates its location as a weighted mean of its previous location and that of its neighbors

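$\mu_i^{(t)} = (\sigma_i^{(t)})^2 \left[ \frac{1}{(\sigma_i^{(t-1)})^2}\,\mu_i^{(t-1)} + \sum_{k \in N(i)} \frac{1}{(\sigma_k^{(t-1)})^2}\,\mu_k^{(t-1)} \right]$

$\frac{1}{(\sigma_i^{(t)})^2} = \frac{1}{(\sigma_i^{(t-1)})^2} + \sum_{k \in N(i)} \frac{1}{(\sigma_k^{(t-1)})^2}$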

The weights reflect the confidence in the measurements, i.e. the higher the spatial variance, the lower the weight
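The weighted-mean update can be sketched for a single coordinate as follows. The sketch assumes the standard precision-weighted form in which every quantity on the right-hand side comes from the previous iteration, so that the weights sum to the new precision; that indexing, the function name `bp_step`, and the toy numbers are assumptions rather than details confirmed by the slides.

```python
def bp_step(mu, var, neighbors):
    """One synchronous update of every node's Gaussian belief (mean, variance).

    mu, var: per-node means and variances from the previous iteration.
    neighbors: neighbors[i] lists the nodes that share an edge with node i.
    Assumption: all right-hand-side values are taken from iteration t-1.
    """
    new_mu, new_var = [], []
    for i in range(len(mu)):
        # New precision: own precision plus the precisions of all neighbors.
        precision = 1.0 / var[i] + sum(1.0 / var[k] for k in neighbors[i])
        # Precision-weighted sum of the node's own mean and its neighbors' means.
        weighted = mu[i] / var[i] + sum(mu[k] / var[k] for k in neighbors[i])
        new_var.append(1.0 / precision)
        new_mu.append(weighted / precision)
    return new_mu, new_var

# Toy example: a confident node (variance 1) and an uncertain one (variance 4).
mu, var = [0.0, 10.0], [1.0, 4.0]
neighbors = [[1], [0]]
new_mu, new_var = bp_step(mu, var, neighbors)
print(new_mu, new_var)
```

In the toy example both nodes move to a mean of about 2, much closer to the confident node's initial value of 0 than to 10, which is the inverse-variance weighting described above.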


Belief propagation

[Figure: graph nodes labeled $(\mu_1, \sigma_1^2)$, $(\mu_2, \sigma_2^2)$, $(\mu_3, \sigma_3^2)$]

Posterior mean and variance assuming Gaussian beliefs

Audio-visual features are incorporated in modeling the edge and node potentials


Incorporating audio-visual features

• GIST features are extracted for the images.

• MFCC features are extracted for the audio.

• These are now incorporated into the node and edge potentials as exponential distributions.

$p(x_i, x_j \mid a_i, a_j) \propto \exp(-\|x_i - x_j\| \text{ [illegible] } \|a_i - a_j\|)$

$a_i$ are the audio features associated with image i

The intuition is that the closer the audio features are, the higher the probability that the geo-locations are closer. Similarly, this can be included in the node potentials, as well as for the visual features.
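The intuition can be illustrated with a small edge-potential function. The operator that joins the two norms is not recoverable from this transcript, so the ratio form used here is purely an assumption chosen because it matches the stated intuition; the function name `audio_edge_potential` and the constant `eps` are also hypothetical.

```python
import math

def audio_edge_potential(x_i, x_j, a_i, a_j, eps=1e-9):
    """Unnormalized potential exp(-||x_i - x_j|| / ||a_i - a_j||).

    Assumption: the ratio form is one reading consistent with the stated
    intuition, not a confirmed formula. eps avoids division by zero when
    the two audio feature vectors are identical.
    """
    d_loc = math.dist(x_i, x_j)    # distance between candidate locations
    d_audio = math.dist(a_i, a_j)  # distance between audio feature vectors
    return math.exp(-d_loc / (d_audio + eps))

# Same location gap, different audio similarity (made-up feature vectors).
similar_audio = audio_edge_potential((0.0, 0.0), (1.0, 0.0), (0.0,), (0.1,))
different_audio = audio_edge_potential((0.0, 0.0), (1.0, 0.0), (0.0,), (10.0,))
print(similar_audio, different_audio)
```

Under this assumed form, a pair with similar audio is penalized far more for being placed apart than a pair with dissimilar audio, so similar-sounding videos are pulled toward the same location.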


Result

• Percentage of test videos (out of 4182 videos) correctly estimated under distances in the top row from the ground-truth location.

– run1 - baseline approach without using gazetteer

– run2 - graphical model based approach with gazetteer

– run3 - baseline approach with gazetteer

– run4 - k-NN with GIST visual feature

• Graphical model approach with gazetteer outperforms baseline approaches in the range above 1 km.
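The evaluation metric described above (share of test videos placed within a given distance of the ground truth) can be computed along these lines. The haversine formula and the Earth radius are standard; the function names, the example coordinates, and the threshold values are assumptions made for the sketch, since the result table itself is not reproduced in this transcript.

```python
import math

def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, h)))

def percent_within(estimates, truths, thresholds_km):
    """Percentage of videos whose distance error is at most each threshold."""
    errors = [haversine_km(e, t) for e, t in zip(estimates, truths)]
    return {r: 100.0 * sum(err <= r for err in errors) / len(errors)
            for r in thresholds_km}

# Made-up estimates and ground truth, for illustration only.
estimates = [(37.87, -122.26), (0.0, 0.0)]
truths = [(37.87, -122.26), (0.0, 1.0)]
scores = percent_within(estimates, truths, [1, 100, 1000])
print(scores)
```

Here the first estimate is exact and the second is off by one degree of longitude at the equator (roughly 111 km), so it only counts at the 1000 km threshold.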



Conclusion

• The graphical model framework can achieve performance improvement over the baseline approach by incorporating results from test data

• Various issues remain to be explored

– the modeling of the edge potential

• text: hard threshold (current) --> soft

• visual/audio features

– the assumption of conditional independence of location distribution given multiple tags


Thank You!


Questions?

Work together with: Venkatesan Ekambaram, Kannan Ramchandran, Giulia Fanti, Howard Lei, Adam Janin, and Gerald Friedland

http://mmle.icsi.berkeley.edu


http://mmle.icsi.berkeley.edu/mmle/

