Large-Scale Image and Video Processing

Lexing Xie lexing.xie@anu.edu.au Large-Scale Image and Video Processing @ANU ML Workshop, Sept 23, 2011

Transcript of Large-Scale Image and Video Processing

Page 1: Large-Scale Image and Video Processing

Lexing Xie

lexing.xie@anu.edu.au

Large-Scale Image and Video Processing

@ANU ML Workshop, Sept 23, 2011

Page 2: Large-Scale Image and Video Processing

Obama @ Texas

Page 3: Large-Scale Image and Video Processing

Reality in this online world

one year of digital life

news broadcast ten channels, one year

1,300 GB, 1,830 hrs

~200 GB?

Oct’09: 4 billion photos 6000+/minute ~ 500 TB

Apr’09 : 15 billion photos +220 million/week ~ 1.5PB

Mar’10 : 24 hrs/minute 10% of internet traffic ~ 12 PB/yr ??

• Many other examples:
  – Android / iPhone GMap traces
  – Mobile phone call records
  – All environmental features sensed daily over Australia?

Page 4: Large-Scale Image and Video Processing

Outline

• Large-scale, why?
  – Less prone to overfit? Better generalization
  – Data vs intelligence
• Scaling up an SVM classifier
• Processing considerations
• Data-intensive processing framework
  – Hadoop + applications

Page 5: Large-Scale Image and Video Processing

Data ?= Power

[Banko and Brill ACL 01]

Task: confusion set disambiguation

the winning approaches and intervals

Page 6: Large-Scale Image and Video Processing

Example: Scene Completion

[Hays and Efros SIGGRAPH07]

“… initial experiments with the GIST descriptor on ten thousand images were very discouraging … however increasing the dataset to one million yielded a qualitative leap.”

Page 7: Large-Scale Image and Video Processing

ML Systems

Page 8: Large-Scale Image and Video Processing

Visual Concept Detection

Task: score each image independently w.r.t. a set of pre-defined visual concepts.

Page 9: Large-Scale Image and Video Processing

[demo]

Aggregated performance over 50 “core” visual categories [Xie et al’11].

[Figure: tagging precision (%) versus number of tags per image. Curves: raw classifier tags (baseline), normalized classifier tags, precision-calibrated tags, taxonomy-refined tags. Annotation: 80% precision at 4 tags per image. Comparison points: “ImageNet-1000”, KNN; “Social 20” KNN-voting [Li, Snoek’09]; “ImageNet-1000”, UIUC-NEC; “ImageNet-1000”, libLin*.]

Page 10: Large-Scale Image and Video Processing

Support Vector Machines

Page 11: Large-Scale Image and Video Processing

SVM (in ML class)

• Training
• Testing

Dual form: A QP

matlab code:

alpha_star = quadprog(diag(z)*y'*y*diag(z), -ones(1, Nf), zeros(1, Nf) … )
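For reference, the quadprog call on this slide has the shape of the standard soft-margin SVM dual, written here as a minimization. The symbol choices are assumptions made to match the MATLAB variable names: $z_i$ are the labels, the columns of $y$ are the training vectors, and $C$ is the misclassification penalty. The constraint arguments hidden behind the "…" in the slide are not reconstructed; the constraints below are the textbook ones.

```latex
\min_{\alpha}\;\; \tfrac{1}{2}\,\alpha^{\top} H\,\alpha \;-\; \mathbf{1}^{\top}\alpha
\qquad \text{s.t.} \qquad 0 \le \alpha_i \le C, \quad \sum_{i} \alpha_i z_i = 0,
\qquad \text{where } H_{ij} = z_i z_j\, K(y_i, y_j).
```

With a linear kernel, $K(y_i, y_j) = y_i^{\top} y_j$, the matrix $H$ equals diag(z)*y'*y*diag(z), which is the first argument of the call.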

Page 12: Large-Scale Image and Video Processing

What hyper-parameters?

Polynomial degree d?

how much mis-classification?

what kernel?

Page 13: Large-Scale Image and Video Processing

SVM Hyper-parameter Selection

Page 14: Large-Scale Image and Video Processing

How to select C and gamma?

[Duan et al., Neurocomputing 2003]

• The performance surface w.r.t. the hyper-parameters is highly non-regular
• K-fold cross-validation seems to approximate test error well.
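A minimal sketch of K-fold cross-validation, to make the estimate in the second bullet concrete. The function names and the `fit` / `error` callables are hypothetical placeholders for a real trainer and scorer; nothing here comes from Duan et al.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


def cross_val_error(n_samples, k, fit, error):
    """Average validation error over k folds.

    fit(train_idx) returns a model; error(model, val_idx) returns a float.
    Both callables are hypothetical stand-ins for the real training code.
    """
    errs = [error(fit(tr), va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(errs) / len(errs)
```

Each hyper-parameter setting gets one `cross_val_error` value, and the setting with the lowest value is kept.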

Page 15: Large-Scale Image and Video Processing

Where, and how much to search?

• LibSVM Guide
• http://blog.smola.org
• Normalize your data
• Center gamma around quantile stats of inverse distance in the data.
• Search C in log space.

Q: what evaluation criteria to search on?
Q: what’s the error upper bound for leave-one-out cross-validation?
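The two search hints (C in log space, gamma centered on a distance statistic) can be sketched as below. Using the median squared pairwise distance is an assumption; the slide only says "quantile stats", and the grid ranges are illustrative rather than recommended values.

```python
import statistics


def log_grid(lo_exp, hi_exp, step=2, base=2.0):
    """Return base**lo_exp, base**(lo_exp + step), ... up to base**hi_exp."""
    return [base ** e for e in range(lo_exp, hi_exp + 1, step)]


def median_gamma(points):
    """gamma = 1 / (median squared Euclidean distance over all pairs).

    The median is one possible quantile statistic (an assumption here).
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(p, q))
          for i, p in enumerate(points) for q in points[i + 1:]]
    return 1.0 / statistics.median(d2)


def gamma_grid(points, n_steps=2, base=2.0):
    """Candidate gammas spaced by powers of `base` around the data-driven center."""
    g0 = median_gamma(points)
    return [g0 * base ** e for e in range(-n_steps, n_steps + 1)]
```

For example, `log_grid(-5, 15)` gives candidate C values from 2^-5 to 2^15 in steps of a factor of four, and `gamma_grid(points)` gives five gammas around the data-driven center. Normalizing the data first, as the slide advises, keeps the distance statistic meaningful across feature dimensions.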

Page 16: Large-Scale Image and Video Processing

SVM: Where is the computation bottleneck?

• Training
• Testing
• Parameter selection

QP: O(N^2)~O(N^3)

O(N^2) brute-force
Can we even store K for large N? e.g. N=1e6

Adds up quickly: 5 fold x 4 values of C x 3 kernel types x 5 parameter values …
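A quick arithmetic check of the two costs on this slide. The 8 bytes per kernel entry (double precision) is an assumption; the counts are the ones listed above.

```python
# Number of QPs solved by the grid search listed on the slide.
folds, c_values, kernel_types, param_values = 5, 4, 3, 5
qp_runs = folds * c_values * kernel_types * param_values


def kernel_matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed for a dense n x n kernel matrix (8-byte doubles assumed)."""
    return n * n * bytes_per_entry


n = 10 ** 6
kernel_terabytes = kernel_matrix_bytes(n) / 1e12

print(qp_runs, "QP solves;", kernel_terabytes, "TB for a dense kernel matrix")
```

So the search costs 300 QP solves, and a dense kernel matrix for N = 1e6 would need about 8 TB, which is why the approaches on the following slides avoid materializing K.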

Page 17: Large-Scale Image and Video Processing

GPU Speed up of Kernel Computation

• Training speedup ~10x
  – 6000-dimension input data
  – Up to 3000+ training instances
• Testing speedup ??
  – Should be even more

http://mklab.iti.gr/project/GPU-LIBSVM

Page 18: Large-Scale Image and Video Processing

GPU-Tailored SVM [Cotter et al, KDD 2011]

• Key ideas
  – Load+solve working set on GPU (size 16)
  – Cluster on sparsity patterns to improve I/O
  – 1st order heuristics for choosing a working set

http://ttic.uchicago.edu/~cotter/projects/gtsvm

Page 19: Large-Scale Image and Video Processing

PSVM [Chang et. al. NIPS’07]

• Key idea:
  – Use ICF (incomplete Cholesky Factorization) to approximate kernel matrix.
  – Perform column-based ICF and distributed matrix multiplication to achieve parallelization
• ~41x speed up using 50 machines
  – On 200K images

code.google.com/p/psvm
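To illustrate the ICF idea, here is a single-machine sketch of pivoted incomplete Cholesky that builds an n x p factor G with K ≈ G Gᵀ. It is not the PSVM code and does none of the distributed work; the function names and the tolerance are assumptions.

```python
import math


def incomplete_cholesky(K, max_rank, tol=1e-10):
    """Return G (n rows, at most max_rank columns) with K approx. G * G^T.

    K is a symmetric positive semi-definite matrix given as a list of lists.
    Each step picks the column with the largest remaining diagonal residual.
    """
    n = len(K)
    residual = [K[i][i] for i in range(n)]  # diagonal of K - G G^T
    G = [[0.0] * max_rank for _ in range(n)]
    rank = 0
    for j in range(max_rank):
        p = max(range(n), key=residual.__getitem__)  # pivot index
        if residual[p] <= tol:  # nothing left to approximate
            break
        root = math.sqrt(residual[p])
        for i in range(n):
            dot = sum(G[i][t] * G[p][t] for t in range(j))
            G[i][j] = (K[i][p] - dot) / root
        for i in range(n):
            residual[i] -= G[i][j] ** 2
        rank = j + 1
    return [row[:rank] for row in G]


def reconstruct(G):
    """Compute G * G^T, to check the quality of the approximation."""
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in G] for gi in G]
```

The point of the factorization is storage and arithmetic: G needs n x p numbers instead of n x n, and only p columns of K are ever evaluated.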

Page 20: Large-Scale Image and Video Processing


random subspace bagging

[Yan, Tesic and Smith KDD07]

Features

Training Examples

SVM1 SVM2

Classifiers
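The diagram's idea, each base SVM seeing a random subset of the training examples and a random subset of the features, can be sketched as follows. The sampling fractions, the bootstrap choice, and fusing by plain averaging are assumptions for illustration, not details taken from Yan, Tesic and Smith.

```python
import random


def random_subspace_bags(n_examples, n_features, n_models,
                         example_frac=0.5, feature_frac=0.5, seed=0):
    """Return one (row_indices, feature_indices) pair per base model."""
    rng = random.Random(seed)
    n_ex = max(1, int(n_examples * example_frac))
    n_ft = max(1, int(n_features * feature_frac))
    bags = []
    for _ in range(n_models):
        # Bagging: sample training examples with replacement.
        rows = [rng.randrange(n_examples) for _ in range(n_ex)]
        # Random subspace: sample features without replacement.
        cols = sorted(rng.sample(range(n_features), n_ft))
        bags.append((rows, cols))
    return bags


def average_scores(per_model_scores):
    """Fuse base models by averaging their scores for each test item."""
    n_models = len(per_model_scores)
    return [sum(s) / n_models for s in zip(*per_model_scores)]
```

Each base model then solves a much smaller QP, and the models can be trained independently on different machines.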

Page 21: Large-Scale Image and Video Processing


Many approaches for scaling up

• Large number of models vs. large models
• Some applicable to other models (e.g. graph construction)
• Other issues: normalize input, imbalanced training data, normalize output?

[Figure: matrix-size diagrams (N x N, n x n, N x p) for the approaches of [Chang et. al. NIPS’07], [Yan et. al. KDD’07], and "Working set on GPU".]

Page 22: Large-Scale Image and Video Processing

Other parts of the ML pipeline

Page 23: Large-Scale Image and Video Processing

Example “Large” Processing Tasks

• Task 1: PASCAL VOC Challenge 2010
  – 20 target classes
  – 10,103 training images, 23K+ objects
• Task 2: NIST TREC Multimedia Event Detection (MED) Benchmark
  – 15 events
  – 13,000+ training videos, 32,000+ testing
    sampled video frames: 660K+ training, ~2M testing

Page 24: Large-Scale Image and Video Processing

Color Histogram

• Botticelli, da Vinci, Monet, or Miro?
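A minimal sketch of a joint RGB color histogram and a histogram-intersection similarity, the kind of global feature this slide alludes to. The bin count and the choice of intersection as the similarity are assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """L1-normalized joint RGB histogram.

    pixels: iterable of (r, g, b) tuples with integer values in 0..255.
    """
    b = bins_per_channel
    hist = [0.0] * (b ** 3)
    n = 0
    for r, g, bl in pixels:
        # Quantize each channel into b bins, then flatten the 3-D bin index.
        idx = (r * b // 256) * b * b + (g * b // 256) * b + (bl * b // 256)
        hist[idx] += 1
        n += 1
    return [h / n for h in hist] if n else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two L1-normalized histograms."""
    return sum(min(a, c) for a, c in zip(h1, h2))
```

Two paintings with similar palettes score close to 1 under `histogram_intersection`, regardless of where the colors appear in the image.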

Page 25: Large-Scale Image and Video Processing

Visual Words for Image Retrieval

• Vocabulary size 1K ~ 1M
• Local detectors
  – DoG, Hessian-Affine, Harris-Affine, MSER, Harris-Corner, etc.
• Local descriptors
  – SIFT, PCA-SIFT, GIST, SURF, etc.
• Different coding methods
  – K-means, hashing, sparse coding, etc

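As a sketch of the simplest coding method above (hard assignment to a K-means vocabulary), the following turns a set of local descriptors into a bag-of-words histogram. The vocabulary is assumed to be already learned; the function names are hypothetical, and real vocabularies of 1K to 1M words would need an approximate nearest-neighbor search rather than this linear scan.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the closest codeword under squared Euclidean distance."""
    best, best_d = 0, float("inf")
    for w, center in enumerate(vocabulary):
        d = sum((a - b) ** 2 for a, b in zip(descriptor, center))
        if d < best_d:
            best, best_d = w, d
    return best


def bow_histogram(descriptors, vocabulary):
    """L1-normalized count of descriptors assigned to each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(vocabulary)
```

The resulting fixed-length histogram is what gets fed to the SVM, whatever the number of local descriptors found in the image.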
Page 26: Large-Scale Image and Video Processing

Back-of-the-Envelope Planning

• Store raw data: ~600GB videos
• Feature extraction (e.g. BoW)
  – BoW: 2 sec / image
    • VOC: ~3 hours; MED: 60 CPU days
    • Storing raw SIFT features: ~1TB for MED training
  – Object features: 20 secs / image for 200+ objects ..
    • >1 year CPU time … 3 months on a quad-core PC
• Learn model
  – Store 1M x 2K dims in matlab: 16.4GB!
  – SVM ~1 min per QP × #CV(5) × # param (24) → 2 hours!
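Some of the figures on this slide can be reproduced with a few lines of arithmetic. The assumptions are labeled in the comments: the MED frame count is the sum of the training and testing frames quoted for Task 2, "2K dims" is taken as 2048, and values are stored as 8-byte doubles.

```python
SECONDS_PER_DAY = 24 * 3600


def cpu_days(n_images, sec_per_image):
    """Single-core processing time in days."""
    return n_images * sec_per_image / SECONDS_PER_DAY


def dense_matrix_gb(rows, cols, bytes_per_value=8):
    """Size of a dense matrix in GB (8-byte doubles assumed)."""
    return rows * cols * bytes_per_value / 1e9


# Assumption: 660K training + ~2M testing sampled frames (Task 2 figures).
med_frames = 660_000 + 2_000_000

bow_days = cpu_days(med_frames, 2)       # BoW at 2 sec / image
object_days = cpu_days(med_frames, 20)   # object features at 20 sec / image

# Assumption: "1M x 2K dims" means 1,000,000 x 2048.
feature_gb = dense_matrix_gb(1_000_000, 2048)

# ~1 min per QP x 5 CV folds x 24 parameter settings, in hours.
svm_hours = 1 * 5 * 24 / 60
```

Under these assumptions BoW extraction comes to roughly 62 CPU days, object features to well over a year, the feature matrix to about 16.4 GB, and the model search to 2 hours, in line with the slide.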

Page 27: Large-Scale Image and Video Processing

• The good news
  – Most of this computation is data- and task-parallelizable
  – Store intermediate results
  – many small files harmful (!)
• “tar cfz”; hashing helps
• What if …
  – PC dies?
  – Power outage?
  – Network timeout?

“machine learning” →
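One way to act on "store intermediate results" and "hashing helps" is sketched below: outputs are spread over a fixed number of hashed shard directories, and a restarted job skips anything already written. The layout, the `.feat` suffix, and the `extract` callable are hypothetical; this does not bundle files into archives the way tar would.

```python
import hashlib
import os
import tempfile


def shard_path(root, key, n_shards=256):
    """Map a key (e.g. a frame id) to root/<shard>/<key>.feat via a stable hash."""
    h = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return os.path.join(root, "%03d" % (h % n_shards), key + ".feat")


def process_all(keys, root, extract):
    """Run extract(key) -> bytes for every key without a stored result.

    Returns the number of newly written results, so a rerun after a crash
    only pays for the unfinished work.
    """
    done = 0
    for key in keys:
        path = shard_path(root, key)
        if os.path.exists(path):  # finished in an earlier run
            continue
        os.makedirs(os.path.dirname(path), exist_ok=True)
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(extract(key))
        os.replace(tmp, path)  # atomic rename: no half-written result files
        done += 1
    return done


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()
    frames = ["frame%04d" % i for i in range(5)]
    print(process_all(frames, demo_root, lambda k: k.encode("utf-8")))
```

Writing to a temporary name and renaming means a machine that dies mid-write leaves no result file behind, so the rerun redoes exactly that item.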

Page 28: Large-Scale Image and Video Processing

[Figure: Hadoop ecosystem components: Core, Avro, HDFS, MapReduce, HBase, ZooKeeper, Hive, Pig, Chukwa]

Page 29: Large-Scale Image and Video Processing
Page 30: Large-Scale Image and Video Processing

MapReduce

Page 31: Large-Scale Image and Video Processing

Facebook example: check-in times

http://www.facebook.com/notes/facebook-engineering/interning-at-facebook-who-goes-where-when-and-why-it-matters/10150282681968920
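To show the shape of a MapReduce job on check-in data, here is a single-process imitation that counts check-ins per hour of day. It uses no Hadoop API, and the record format (user, "HH:MM") is a hypothetical stand-in for real logs.

```python
from collections import defaultdict


def map_checkin(record):
    """Map: emit (hour_of_day, 1) for one (user, 'HH:MM') check-in record."""
    user, hhmm = record
    yield int(hhmm.split(":")[0]), 1


def reduce_count(key, values):
    """Reduce: total number of check-ins for one hour."""
    return key, sum(values)


def map_reduce(records, mapper, reducer):
    """Run map, group-by-key (shuffle), and reduce in one process."""
    groups = defaultdict(list)
    for rec in records:               # map phase
        for k, v in mapper(rec):
            groups[k].append(v)       # shuffle: group values by key
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


records = [("a", "09:15"), ("b", "09:40"), ("c", "18:05")]
hourly = map_reduce(records, map_checkin, reduce_count)
```

In a real cluster the map calls run on the machines holding the data and the framework performs the grouping; the mapper and reducer keep the same shape.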

Page 32: Large-Scale Image and Video Processing

Summary

• Large data is the trend
• ML algorithms → practice
• Application-specific planning is important
• Many parallel computing paradigms to help
• other topics …
  – Large amounts of sparse features
  – GMM, CRF and other models
  – Inverted index, hashing, indexing ..

Page 33: Large-Scale Image and Video Processing

Thanks! Questions?

• Slide credits
  – Hastie book “The Elements of Statistical Learning”
  – Winston Hsu (NTU Taiwan), Rong Yan (Facebook)
  – CIKM tutorial by Edward Chang (Google)
  – WWW’11 tutorial by Alex Smola “graphical models for the internet”
  – Flickr user @yuliyart
• Other pointers
  – John Langford large-scale tutorial@ KDD’11