Context-based Visual Concept Detection Using Domain Adaptive Semantic Diffusion Yu-Gang Jiang, Jun...

Context-based Visual Concept Detection Using Domain Adaptive Semantic Diffusion

Yu-Gang Jiang‡, Jun Wang‡, Shih-Fu Chang‡, Chong-Wah Ngo†

† VIREO Research Group (VIREO), City University of Hong Kong
‡ Digital Video and Multimedia Lab (DVMM), Columbia University

NIST TRECVID Workshop, Nov. 2009

Overview: framework

[Figure: local and global features feed SVM classifiers; VIREO-374 (374 LSCOM concept detectors) feeds Domain Adaptive Semantic Diffusion. Figure labels: 1-4, 5, 6.]

Overview: performance

[Figure: performance of local feature alone, local + global features, and DASD]

• Local feature is still the most powerful component (MAP=0.150)

• Global features help a little bit (MAP=0.156)

• DASD further contributes incrementally to the final detection


Local feature representation

Chang et al TRECVID 2008; Jiang, Yang, Ngo & Hauptmann, IEEE TMM, to appear

[Figure: keypoint extraction (DoG, Hessian Affine) in SIFT feature space; vocabulary generation (visual word vocabulary 1, visual word vocabulary 2); BoW representation (BoW histograms using soft-weighting); SVM classifiers]
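The soft-weighted BoW histogram named in the figure can be illustrated with a minimal Python sketch. It assumes one common form of soft-weighting, in which each keypoint votes for its few nearest visual words with a weight that is halved at each rank. The function name, the `top_n` default, and the distance-to-similarity mapping are illustrative assumptions, not details given in the slides.

```python
import math

def soft_weighted_bow(descriptors, vocabulary, top_n=4):
    """Accumulate a soft-weighted bag-of-visual-words histogram.

    Each descriptor votes for its top_n nearest visual words; the vote for
    the word at rank r (r = 0 for the nearest) is similarity / 2**r.
    """
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        # Euclidean distance from this descriptor to every visual word.
        dists = sorted((math.dist(d, w), k) for k, w in enumerate(vocabulary))
        for rank, (dist, k) in enumerate(dists[:top_n]):
            # Assumed similarity function (hypothetical, not from the slides).
            similarity = 1.0 / (1.0 + dist)
            hist[k] += similarity / (2 ** rank)
    return hist

# Toy example: one 2-D "descriptor" and a three-word vocabulary.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
hist = soft_weighted_bow([(0.0, 0.0)], vocabulary, top_n=2)
```

In the toy example the descriptor contributes fully to its nearest word, half of its similarity to the second nearest word, and nothing to the distant third word.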

Context-based concept detection



[Figure: framework diagram with VIREO-374 (374 LSCOM concept detectors) and DASD: Domain Adaptive Semantic Diffusion]

DASD - motivation

• Most existing methods assign concept labels individually, but concepts do not occur in isolation!

[Figure: example keyframe with co-occurring concepts: military personnel, smoke, explosion_fire, road, outdoor, vehicle, building]

• Domain change between training and testing data was not considered

[Figure: documentary videos vs. broadcast news videos]

Jiang, Wang, Chang & Ngo, ICCV 2009

DASD - overview

[Figure: example detection scores for the concepts road, vehicle, water, and sky]

• Domain adaptive semantic diffusion (DASD)
  – Semantic graph
    • Nodes are concepts
    • Edges represent concept correlation
  – Graph diffusion
    • Smooth concept detection scores w.r.t. the concept correlation

[Figure: semantic graph over vehicle, road, water, and sky, each node carrying a vector of detection scores]

DASD - formulation

• Energy function

[Equation: energy function, annotated with the concept affinity and the detection score of concept ci on test samples]

DASD - formulation (cont.)

• Gradually smoothing the function brings the detection scores into accordance with the concept relationships

[Equation: detection score smoothing process]
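A minimal sketch of the smoothing idea, assuming a standard normalized-Laplacian smoothness energy on the concept graph. The transcript does not preserve the equations from the slides, so the energy, the step size `alpha`, and all function names below are assumptions chosen for illustration, not the published formulation. The sketch also assumes a symmetric affinity matrix with non-negative entries.

```python
import math

def smoothness_energy(G, W):
    """0.5 * sum_ij W[i][j] * ||G[i]/sqrt(d_i) - G[j]/sqrt(d_j)||^2 (assumed form)."""
    deg = [sum(row) for row in W]
    total = 0.0
    for i, row in enumerate(W):
        for j, w in enumerate(row):
            if w == 0:
                continue
            diff = [a / math.sqrt(deg[i]) - b / math.sqrt(deg[j])
                    for a, b in zip(G[i], G[j])]
            total += 0.5 * w * sum(x * x for x in diff)
    return total

def normalized_laplacian(W):
    """L = I - D^{-1/2} W D^{-1/2} for a symmetric, non-negative affinity matrix W."""
    deg = [sum(row) for row in W]
    inv = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    m = len(W)
    return [[(1.0 if i == j else 0.0) - inv[i] * W[i][j] * inv[j]
             for j in range(m)] for i in range(m)]

def diffuse(G, W, alpha=0.1, iterations=20):
    """Gradient descent on the energy: G <- G - alpha * L G.

    G[i][s] is the detection score of concept i on test shot s. The constant
    factor 2 of the exact gradient is absorbed into alpha.
    """
    L = normalized_laplacian(W)
    m, n = len(G), len(G[0])
    for _ in range(iterations):
        G = [[G[i][s] - alpha * sum(L[i][j] * G[j][s] for j in range(m))
              for s in range(n)] for i in range(m)]
    return G

# Toy example: two strongly related concepts with inconsistent scores on one shot.
W = [[0.0, 1.0], [1.0, 0.0]]
G0 = [[1.0], [0.0]]
G1 = diffuse(G0, W, alpha=0.1, iterations=1)
```

After one step the two scores move toward each other and the energy drops; further iterations continue to pull the scores of related concepts together.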

DASD - formulation (cont.)

• Graph adaptation

[Equation: graph adaptation process]

Graph adaptation - example

[Figure: edge weights of the concept graph over VEHICLE, DESERT, SKY, CLOUDS, WEAPON, PARKING_LOT, and CAR at iterations 0, 4, 8, 12, 16, and 20; panels labeled Broadcast news video domain and Documentary video domain]

Experiments on TV ’05-’07

• Baseline detectors
  – VIREO-374

• Graph construction:
  – Ground-truth labels on TRECVID 2005

[Figure: example concepts. TRECVID 05/06 (Broadcast News Videos): SPORTS, WEATHER, OFFICE, BUILDING, DESERT, MOUNTAIN, WALKING, PEOPLE-MARCHING, EXPLOSION-FIRE, MAP, TRUCK, CORP. LEADER. TRECVID 07 (Documentary Videos): SPORTS, WEATHER, OFFICE, DESERT, MOUNTAIN, WATER, POLICE, MILITARY, ANIMAL, TWO PEOPLE, NIGHT TIME, TELEPHONE, STREET, CLASSROOM, BUS.]
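The graph construction step can be sketched as follows, assuming that the affinity between two concepts is the Pearson correlation of their binary ground-truth label vectors over the training shots. The slide only states that the graph is built from ground-truth labels on TRECVID 2005; the choice of Pearson correlation, the clipping of negative values to zero, and the function names are assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a concept with constant labels carries no correlation signal
    return cov / math.sqrt(vx * vy)

def build_affinity(labels):
    """Build a symmetric concept affinity matrix.

    labels[i][s] is 1 if concept i is annotated in shot s, else 0.
    """
    m = len(labels)
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            # Assumption: clip negative correlations so W stays non-negative.
            w = max(0.0, pearson(labels[i], labels[j]))
            W[i][j] = W[j][i] = w
    return W

# Toy annotations for three concepts over four shots (hypothetical data).
labels = [
    [1, 1, 0, 0],  # road
    [1, 1, 0, 0],  # vehicle
    [0, 0, 1, 1],  # water
]
W = build_affinity(labels)
```

In this toy data road and vehicle always co-occur, so their edge gets the maximum weight, while water never co-occurs with either and its edges are clipped to zero.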

Results on TV ’05-’07

• Performance gain on TRECVID 05-07 Datasets

TRECVID                    2005     2006     2007
# of evaluated concepts    39       20       20
Baseline (MAP)             0.166    0.154    0.099
SD                         11.8%    15.6%    12.1%
DASD                       11.9%    17.5%    16.2%

• SD: semantic diffusion (without graph adaptation). Consistent improvement over all 3 data sets.

• DASD: domain adaptive semantic diffusion. Graph adaptation further improves the performance.

Results on TV ’05-’07 (cont.)

[Figure: results on TRECVID 2006 test data]

Comparison with the state of the art

TRECVID    Jiang et al    Aytar et al    Weng et al    DASD
2005       2.2%           4.0%           N/A           11.9%
2006       N/A            N/A            16.7%         17.5%

Results on TRECVID ’09

[Figure: TRECVID ’09 results; percentage labels shown: 30%, 10%, 5%]

Results on TRECVID ’09 (cont.)

• Quality of contextual detectors (VIREO-374)

[Figure: VIREO-374 used as context for TV06, TV07, and TV09 detectors; DASD performance gain labels: 18%, 16%, 5%]

DASD - computational time

• Complexity is O(mn) – m: # concepts; n: # video shots

• Only 2 milliseconds per shot/keyframe!

        TRECVID 05    TRECVID 06    TRECVID 07
SD      59s           84s           12s
DASD    89s           165s          28s

Summary

• A well-designed approach using local features achieves good results for concept detection.

• Context information is helpful!
  – Domain adaptive semantic diffusion
    • effective for enhancing concept detection accuracy
    • can alleviate the effect of data domain changes
    • highly efficient!
  – Future directions include:
    • detector reliability: diffusion over directed graph
    • web data annotation: utilize contextual information to improve the quality of tags
  – Source code available for download from DVMM lab research page
