EE 6882 Statistical Methods for Video Indexing and Analysis
Fall 2004
Prof. Shih-Fu Chang
http://www.ee.columbia.edu/~sfchang
Lecture 2 Part A (9/15/04)
EE6882-Chang
EE E6882 SVIA Lecture 2
Review: image features, color feature, similarity metrics
Additional distance metrics
Texture feature
Performance evaluation metrics
Review of statistical techniques
  Probability, distribution functions, and Matlab demos
  Entropy and mutual information
  Discriminant classifiers
  Bayesian classifiers, GMM estimation by Expectation Maximization
Readings
  Readings on the class web site about content-based image search
  Vittorio Castelli, Probability Refresher, notes for EE E6880, Statistical Pattern Recognition, Spring 2002.
  A. Jain et al., "Statistical Pattern Recognition: A Review," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, Jan. 2000.
  Digital image processing textbooks: image classification
    Gonzalez and Woods Chap. 12, Anil Jain Chap. 9.14
Review: Statistical Pattern Recognition
Image/video pre-processing: quality, resolution, etc.
Feature extraction
  Color, texture, motion, shape, layout, regions, parts, etc.
Feature representation
  Discrete vs. continuous, vectorization, dimension
  Invariance to scale, rotation, translation, ...
Feature selection
  PCA, MDS, Kernel PCA, etc.
Classification models
  Discriminative vs. generative
  Multi-modal fusion, early fusion vs. late fusion
Size of training/test data and manual supervision efforts
Validation and evaluation processes
[Figure: probabilistic view. Likelihood curves P(x|C=1) and P(x|C=2) for Class 1 and Class 2 over a feature x (height, income, ...); a query x0 is assigned a class C(x0) by checking whether P(x0|C=1) > or < P(x0|C=2).]
[Figure: discriminative view. Samples marked + and - in the (x1, x2) plane, separated by a decision boundary; f(x) is the discriminant function, with f(x) > 0 on one side and f(x) < 0 on the other.]
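The two views can be contrasted with a minimal Python sketch. It assumes one-dimensional Gaussian class-conditional likelihoods with equal priors for the probabilistic view and a hand-picked linear f(x) for the discriminative view; the function names, means, standard deviations, and weights are illustrative assumptions, not values from the lecture.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian; stands in for the likelihood P(x | C)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def classify_by_likelihood(x0, params1, params2):
    """Probabilistic view: pick the class whose likelihood of x0 is larger."""
    p1 = gaussian_pdf(x0, *params1)
    p2 = gaussian_pdf(x0, *params2)
    return 1 if p1 > p2 else 2

def classify_by_discriminant(point, weights, bias):
    """Discriminative view: the sign of f(x) = w . x + b decides the label."""
    f = sum(w * v for w, v in zip(weights, point)) + bias
    return "+" if f > 0 else "-"

# Hypothetical class models: (mean, std) for class 1 and class 2.
class1 = (160.0, 10.0)
class2 = (180.0, 10.0)
print(classify_by_likelihood(165.0, class1, class2))

# Hypothetical boundary x1 + x2 - 1 = 0 in the (x1, x2) plane.
print(classify_by_discriminant((0.9, 0.8), (1.0, 1.0), -1.0))
```

The first function needs a model of each class; the second only needs the boundary, which is the practical difference between the two families.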
Review: Feature-Based Image Matching
[Figure: system diagram. User, user interface, image thumbnails, images & videos, network, query server, image/video server, index, archive.]
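HSI-cone (cylindrical coordinates)
VisualSEEk system: 166 quantized bins in HSI space
\[ \begin{bmatrix} I \\ V_1 \\ V_2 \end{bmatrix} = \begin{bmatrix} 1/3 & 1/3 & 1/3 \\ -1/\sqrt{6} & -1/\sqrt{6} & 2/\sqrt{6} \\ 1/\sqrt{6} & -1/\sqrt{6} & 0 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \]
\[ H = \tan^{-1}(V_2 / V_1) \]
\[ S = (V_1^2 + V_2^2)^{1/2} \]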
Review: Similarity Metrics
L1 distance
\[ D_1(i, i+1) = \sum_j |H_i(j) - H_{i+1}(j)| \]
L2 distance
\[ D_2(i, i+1) = \sum_j (H_i(j) - H_{i+1}(j))^2 \]
Histogram intersection
\[ D_I(i, i+1) = 1 - \frac{\sum_j \min(H_i(j), H_{i+1}(j))}{\min(\sum_j H_i(j), \sum_j H_{i+1}(j))} \]
Mahalanobis distance
\[ D_{mah}^2 = (x_1 - x_2)^T C_x^{-1} (x_1 - x_2) \]
C_x: covariance matrix
[Figure: two scatter plots of samples in the (x_i, x_j) plane, labeled c_12 = -s_i s_j and c = 0.]
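The four measures above can be sketched in plain Python. This is a minimal illustration: histograms are plain lists, the Mahalanobis helper handles only two-dimensional vectors with an explicit 2x2 covariance, and all function names and sample values are assumptions made for the example.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def l2_distance(h1, h2):
    """Sum of squared bin differences (no square root taken)."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))

def intersection_distance(h1, h2):
    """1 minus the overlap, normalized by the smaller histogram mass."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return 1.0 - overlap / min(sum(h1), sum(h2))

def mahalanobis_sq_2d(x1, x2, cov):
    """Squared Mahalanobis distance for 2-D vectors and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))   # inverse of cov
    dx = (x1[0] - x2[0], x1[1] - x2[1])
    tmp = (inv[0][0] * dx[0] + inv[0][1] * dx[1],
           inv[1][0] * dx[0] + inv[1][1] * dx[1])
    return dx[0] * tmp[0] + dx[1] * tmp[1]

# Hypothetical 4-bin histograms of two consecutive frames.
h_a = [4, 2, 0, 2]
h_b = [2, 2, 2, 2]
print(l1_distance(h_a, h_b), l2_distance(h_a, h_b), intersection_distance(h_a, h_b))

# Variance 4 along the first axis shrinks a difference of 2 to a distance of 1.
print(mahalanobis_sq_2d((2.0, 0.0), (0.0, 0.0), ((4.0, 0.0), (0.0, 1.0))))
```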
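Earth Mover's Distance (EMD)
Rubner, Tomasi, Guibas '98
Transportation Problem [Dantzig '51]
[Figure: suppliers I connected to consumers J by edges with costs c_ij.]
I: set of suppliers
J: set of consumers
c_ij: cost of shipping a unit of supply from i to j
Problem: find the optimal set of flows f_ij to
\[ \text{minimize } \sum_{i \in I} \sum_{j \in J} c_{ij} f_{ij} \quad \text{s.t.} \]
\[ f_{ij} \ge 0,\ i \in I,\ j \in J \quad \text{(no reverse shipping)} \]
\[ \sum_{i \in I} f_{ij} = y_j,\ j \in J \quad \text{(satisfy each consumer's need/capacity)} \]
\[ \sum_{j \in J} f_{ij} \le x_i,\ i \in I \quad \text{(bounded by each supplier's limit)} \]
\[ \sum_{j \in J} y_j \le \sum_{i \in I} x_i \quad \text{(feasibility)} \]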
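The constraints above can be checked mechanically for a candidate flow. The Python sketch below does not solve the optimization (a solver such as the simplex method would search over flows); it only verifies the four constraints and evaluates the objective. The function name, tolerance, and example numbers are assumptions for illustration.

```python
def transport_cost(cost, flow, supply, demand, tol=1e-9):
    """Return the total cost of `flow` if it satisfies the constraints, else None."""
    n_sup, n_con = len(supply), len(demand)
    # Feasibility: total demand must not exceed total supply.
    if sum(demand) > sum(supply) + tol:
        return None
    for i in range(n_sup):
        # No reverse shipping: every flow is non-negative.
        if any(flow[i][j] < -tol for j in range(n_con)):
            return None
        # Each supplier ships at most its limit x_i.
        if sum(flow[i]) > supply[i] + tol:
            return None
    for j in range(n_con):
        # Each consumer receives exactly its need y_j.
        received = sum(flow[i][j] for i in range(n_sup))
        if abs(received - demand[j]) > tol:
            return None
    return sum(cost[i][j] * flow[i][j]
               for i in range(n_sup) for j in range(n_con))

# Hypothetical problem: two suppliers, two consumers.
supply = [3, 2]
demand = [2, 2]
cost = [[1, 4],
        [3, 1]]
print(transport_cost(cost, [[2, 0], [0, 2]], supply, demand))  # cheap flow
print(transport_cost(cost, [[2, 1], [0, 1]], supply, demand))  # valid but costlier
print(transport_cost(cost, [[2, 2], [0, 0]], supply, demand))  # exceeds supplier 0
```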
Advantage of EMD
Efficient implementations exist (simplex method)
Also supports partial matching (||I|| ≠ ||J||, e.g., histograms defined in different color spaces or scales)
If the masses of the two distributions are equal, then EMD is a true metric
Allows flexible structures, e.g., matching multiple regions in each image
  Multiple regions in one image, each region represented by an individual feature vector
  Region set: {R1, R2, R3}    Region set: {R1', R2', R3', R4'}
  C_ij = dist(Ri, Rj'), which can also be based on EMD
EMD of Color Histogram
\[ h = (h(1), h(2), \ldots, h(M)),\quad g = (g(1), g(2), \ldots, g(N)),\quad \text{assume } \sum_j g(j) \le \sum_i h(i) \]
h: earth; g: hole
\[ EMD(h, g) = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} C_{ij} f_{ij}}{\sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}} = \sum_{i=1}^{M} \sum_{j=1}^{N} C_{ij} f_{ij} \Big/ \sum_{j=1}^{N} g(j) \quad \text{(fill up each hole)} \]
C_ij: distance between color i in color space h and color j in color space g
f_ij: move f_ij units of mass from color i in h to color j in g
Normalization by the denominator term
  Avoids bias toward low-mass distributions (i.e., small images)
  What's the difference if both h and g are normalized first?
    Exact matching of sub-parts is changed.
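A special case makes the definition easy to compute by hand. The Python sketch below assumes both histograms use the same ordered bins, have equal total mass, and use ground distance C_ij = |i - j|; under those assumptions the optimal flow reduces to carrying surplus mass from bin to bin. It does not cover the partial-matching case (unequal masses or different color spaces), and the function name is illustrative.

```python
def emd_1d(h, g):
    """EMD between equal-mass 1-D histograms with ground distance |i - j|."""
    if len(h) != len(g):
        raise ValueError("this sketch needs histograms over the same bins")
    total = sum(h)
    if total <= 0 or abs(total - sum(g)) > 1e-9:
        raise ValueError("this sketch needs equal, positive total mass")
    work = 0.0
    carried = 0.0
    for h_i, g_i in zip(h, g):
        carried += h_i - g_i   # surplus earth passed on to the next bin
        work += abs(carried)   # moving it one bin further costs |carried|
    return work / total        # normalize by the total flow

print(emd_1d([1, 0, 0], [0, 0, 1]))  # one unit moved two bins
print(emd_1d([2, 0], [1, 1]))        # one of two units moved one bin
```

Dividing by the total flow is the normalization by the denominator term, so scaling both histograms by the same factor leaves the result unchanged.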
Texture
What is texture?
  Has structure or a repetitious pattern, e.g., checkered
  Has a statistical pattern, e.g., grass, sand, rocks
Why texture?
  Application to satellite images, medical images
  Describes contents of real-world images, e.g., clouds, fabrics, surfaces, wood, stone
Challenging issues
  Rotation and scale invariance (3D)
  Segmentation/extraction of texture regions from images
  Texture in noise
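Some approaches for texture features
Fourier Domain Energy Distribution
Angular features (directionality)
\[ V_{\theta_1 \theta_2} = \iint |F(u,v)|^2 \, du \, dv, \quad \text{where } \theta_1 \le \tan^{-1}(v/u) \le \theta_2 \]
Radial features (coarseness)
\[ V_{r_1 r_2} = \iint |F(u,v)|^2 \, du \, dv, \quad \text{where } r_1^2 \le u^2 + v^2 < r_2^2 \]
[Figure: an angular wedge (angle φ) and a radial ring (radius r) in the (ω_x, ω_y) frequency plane.]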
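As a rough illustration of the two feature types, the Python sketch below replaces the integrals with sums over the bins of a naive discrete Fourier transform. The folding of angles into [0, pi), the half-open angular interval, the helper names, and the 8x8 striped test image are all assumptions; a real implementation would use an FFT library rather than this O(N^4) loop.

```python
import cmath
import math

def power_spectrum(img):
    """Naive 2-D DFT power |F(u, v)|^2 with frequencies centred on zero."""
    n_rows, n_cols = len(img), len(img[0])
    spec = {}
    for u in range(-(n_cols // 2), (n_cols + 1) // 2):
        for v in range(-(n_rows // 2), (n_rows + 1) // 2):
            acc = 0j
            for y in range(n_rows):
                for x in range(n_cols):
                    phase = -2j * math.pi * (u * x / n_cols + v * y / n_rows)
                    acc += img[y][x] * cmath.exp(phase)
            spec[(u, v)] = abs(acc) ** 2
    return spec

def radial_energy(spec, r1, r2):
    """Sum of power over the ring r1^2 <= u^2 + v^2 < r2^2."""
    return sum(p for (u, v), p in spec.items()
               if r1 * r1 <= u * u + v * v < r2 * r2)

def angular_energy(spec, theta1, theta2):
    """Sum of power over the wedge theta1 <= angle < theta2 (DC excluded)."""
    total = 0.0
    for (u, v), p in spec.items():
        if u == 0 and v == 0:
            continue
        angle = math.atan2(v, u) % math.pi   # fold opposite directions together
        if theta1 <= angle < theta2:
            total += p
    return total

# Hypothetical fine vertical stripes: intensity alternates along x only.
stripes = [[x % 2 for x in range(8)] for _ in range(8)]
spec = power_spectrum(stripes)
print(radial_energy(spec, 3, 5))                 # energy sits at high frequency
print(angular_energy(spec, 0.0, math.pi / 4))    # and along the u axis
```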
12EE6882-Chang
Co-occurrence Matrix - (image with m levels)
Popular early texture approach
Approaches to texture
)cos( and )sin( and ],[ and ],[
NW'' ,north'' e.g., pixels, obetween twrelation ),(,
),()0,(
),0()0,0(),(
0101
1100
),(),(
),(),(
),(
θθ
θ
θθ
θθ
θ
dxxdyyjyxIiyxI
dRwhere
mmQmQ
mQQjiQ
dRdR
dRdR
dR
+=+===
=
=
0P
1Pdθ
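A small Python sketch of building such a matrix follows. It assumes the entries are raw pair counts, that the offset (d cos θ, d sin θ) is rounded to whole pixels, and that y grows with the row index (so a positive dy points down the image); the function name and the toy image are illustrative.

```python
import math

def cooccurrence(img, levels, d, theta):
    """Count pixel pairs with I[x0, y0] = i and I[x1, y1] = j at offset (d, theta)."""
    dx = round(d * math.cos(theta))
    dy = round(d * math.sin(theta))
    q = [[0] * levels for _ in range(levels)]
    for y0, row in enumerate(img):
        for x0, i in enumerate(row):
            x1, y1 = x0 + dx, y0 + dy
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= y1 < len(img) and 0 <= x1 < len(row):
                q[i][img[y1][x1]] += 1
    return q

# Hypothetical two-level image.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1]]
print(cooccurrence(img, 2, 1, 0.0))           # neighbor one pixel along x
print(cooccurrence(img, 2, 1, math.pi / 2))   # neighbor one pixel along y
```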
Co-occurrence Matrix (also called Spatial Grey-Level Dependence, SGLD)
Measures on \( Q_{R(d,\theta)}(i,j) \)
Energy
\[ E(d,\theta) = \sum_i \sum_j Q_{R(d,\theta)}(i,j) \]
Entropy
\[ H(d,\theta) = \sum_i \sum_j \frac{Q_R(i,j)}{E} \log\left( E / Q_R(i,j) \right) \]
Correlation
\[ C(d,\theta) = \frac{\sum_i \sum_j (i - \mu_x)(j - \mu_y) \cdot Q_R(i,j)}{\sigma_x \sigma_y} \]
Inertia
\[ I(d,\theta) = \sum_i \sum_j (i - j)^2 Q_R(i,j) \]
Local Homogeneity
\[ L(d,\theta) = \sum_i \sum_j \frac{1}{1 + (i - j)^2} Q_R(i,j) \]
Statistical measures: none corresponds to a visual component.
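The measures can be tried on a tiny matrix with the Python sketch below. It assumes the counts are first normalized to probabilities (so the values are independent of image size) and that the matrix has more than one occupied row and column, otherwise the correlation denominator is zero. The function name and dictionary keys are illustrative.

```python
import math

def glcm_measures(q):
    """Entropy, correlation, inertia, and local homogeneity of a count matrix."""
    total = sum(sum(row) for row in q)
    p = [[c / total for c in row] for row in q]   # normalize counts
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_x = sum(i * v for i, j, v in cells)
    mu_y = sum(j * v for i, j, v in cells)
    var_x = sum((i - mu_x) ** 2 * v for i, j, v in cells)
    var_y = sum((j - mu_y) ** 2 * v for i, j, v in cells)
    cov = sum((i - mu_x) * (j - mu_y) * v for i, j, v in cells)
    return {
        # Empty cells contribute nothing to the entropy.
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
        "correlation": cov / math.sqrt(var_x * var_y),
        "inertia": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }

# Hypothetical 2-level co-occurrence counts.
print(glcm_measures([[2, 2], [0, 2]]))
```

Mass far from the diagonal raises inertia and lowers local homogeneity, which is why the two are often read together.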
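Laws Filters [1980]
Non-Fourier type bases
Matched better to intuitive texture features
Examples of filters (out of total 12)
\[ \begin{bmatrix} -1 & -4 & -6 & -4 & -1 \\ -2 & -8 & -12 & -8 & -2 \\ 0 & 0 & 0 & 0 & 0 \\ 2 & 8 & 12 & 8 & 2 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} \quad \begin{bmatrix} 1 & 0 & -2 & 0 & 1 \\ 2 & 0 & -4 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ -2 & 0 & 4 & 0 & -2 \\ -1 & 0 & 2 & 0 & -1 \end{bmatrix} \quad \begin{bmatrix} 1 & -4 & 6 & -4 & 1 \\ -4 & 16 & -24 & 16 & -4 \\ 6 & -24 & 36 & -24 & 6 \\ -4 & 16 & -24 & 16 & -4 \\ 1 & -4 & 6 & -4 & 1 \end{bmatrix} \]
Measure energy of output from each filter
[Figure: image I_m passed through the filters, giving 12 outputs.]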
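Masks of this kind are commonly described as outer products of short 1-D kernels. The Python sketch below builds three such 5x5 masks from the usual level (L5), edge (E5), spot (S5), and ripple (R5) kernels and measures a simplified output energy (sum of squared responses where the mask fits inside the image). The kernel labels, the pairing of kernels, and the energy definition are assumptions for illustration, not the lecture's exact procedure.

```python
L5 = [1, 4, 6, 4, 1]       # level
E5 = [-1, -2, 0, 2, 1]     # edge
S5 = [-1, 0, 2, 0, -1]     # spot
R5 = [1, -4, 6, -4, 1]     # ripple

def outer(col, row):
    """5x5 mask whose entry (a, b) is col[a] * row[b]."""
    return [[c * r for r in row] for c in col]

def filter_energy(img, mask):
    """Sum of squared mask responses over positions where the mask fits."""
    energy = 0
    for y in range(len(img) - 4):
        for x in range(len(img[0]) - 4):
            resp = sum(mask[a][b] * img[y + a][x + b]
                       for a in range(5) for b in range(5))
            energy += resp * resp
    return energy

e5l5 = outer(E5, L5)
e5s5 = outer(E5, S5)
r5r5 = outer(R5, R5)
for row in e5l5:
    print(row)

# Hypothetical 6x6 images: a flat patch and a patch with a horizontal edge.
flat = [[1] * 6 for _ in range(6)]
edge = [[0] * 6 for _ in range(3)] + [[1] * 6 for _ in range(3)]
print(filter_energy(flat, e5l5), filter_energy(edge, e5l5))
```

The flat patch gives zero energy because the edge kernel sums to zero, while the patch with an intensity step across rows responds strongly.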
Tamura Texture
Methods for approximating intuitive texture features
Example: 'coarseness'; others: 'contrast', 'directionality'
Step 1: compute averages at different scales, 1x1, 2x2, 4x4 pixels
\[ \forall (x,y):\quad A_k(x,y) = \sum_{i=x-2^{k-1}}^{x+2^{k-1}-1} \ \sum_{j=y-2^{k-1}}^{y+2^{k-1}-1} f(i,j) / 2^{2k} \]
Step 2: compute neighborhood difference at each scale
\[ \forall (x,y):\quad E_{k,h}(x,y) = |A_k(x + 2^{k-1}, y) - A_k(x - 2^{k-1}, y)| \]
Step 3: select the scale with the largest variation
\[ \forall (x,y):\quad \text{determine } k \text{ such that } E_k = \max(E_1, E_2, \ldots, E_L), \quad S_{Best}(x,y) = 2^k \]
Step 4: compute the coarseness
\[ F_{CRS} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} S_{Best}(i,j) \]
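The four steps can be traced with the Python sketch below. Several details are assumptions: edge pixels are replicated outside the image, a vertical difference is computed alongside the horizontal one and the larger is kept, ties go to the smallest scale, and the number of scales `max_k` is a free parameter. The function name is illustrative.

```python
def coarseness(img, max_k=3):
    """Mean of S_Best = 2^k over all pixels, following steps 1 to 4."""
    rows, cols = len(img), len(img[0])

    def pixel(x, y):
        # Assumption: replicate edge pixels outside the image.
        return img[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    def average(k, x, y):
        # Step 1: mean over the 2^k x 2^k window around (x, y).
        half = 2 ** (k - 1)
        total = sum(pixel(i, j)
                    for i in range(x - half, x + half)
                    for j in range(y - half, y + half))
        return total / 2 ** (2 * k)

    s_best_sum = 0
    for y in range(rows):
        for x in range(cols):
            best_k, best_e = 1, -1.0
            for k in range(1, max_k + 1):
                half = 2 ** (k - 1)
                # Step 2: difference between windows on opposite sides.
                e_h = abs(average(k, x + half, y) - average(k, x - half, y))
                e_v = abs(average(k, x, y + half) - average(k, x, y - half))
                e = max(e_h, e_v)
                # Step 3: keep the scale with the largest difference.
                if e > best_e:
                    best_k, best_e = k, e
            s_best_sum += 2 ** best_k
    # Step 4: average the best window sizes over the image.
    return s_best_sum / (rows * cols)

# Hypothetical 8x8 images: a flat patch and stripes four pixels wide.
flat = [[5] * 8 for _ in range(8)]
stripes = [[(x // 4) % 2 for x in range(8)] for _ in range(8)]
print(coarseness(flat), coarseness(stripes))
```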
Content-based Image and Video Retrieval System
[Figure: system diagram. User, user interface, image thumbnails, images & videos, network, query server, image/video server, index, archive.]
What are the bottlenecks of the system?
What functionalities should each component have?
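Evaluation
N images in DB; K ranked results returned
\[ V_n = \begin{cases} 1 & \text{"relevant"} \\ 0 & \text{"irrelevant"} \end{cases} \qquad n = 0, \ldots, N-1 \]

                  Relevant (ground truth)    Irrelevant
  Returned        A (detection)              B (false alarm)
  Not returned    C (miss)                   D (correct dismissal)

Detections
\[ A = \sum_{n=0}^{K-1} V_n \]
False alarms
\[ B = \sum_{n=0}^{K-1} (1 - V_n) \]
Misses
\[ C = \left( \sum_{n=0}^{N-1} V_n \right) - A \]
Correct dismissals
\[ D = \left( \sum_{n=0}^{N-1} (1 - V_n) \right) - B \]
Recall
\[ R = A / (A + C) \]
Precision
\[ P = A / (A + B) \]
Fallout
\[ F = B / (B + D) \]
Combined
\[ F_1 = \frac{P \cdot R}{(P + R)/2} \]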
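These counts and ratios are simple to compute from a ranked relevance list. The Python sketch below assumes the list holds V_n in ranked order, that at least one item is relevant, one is irrelevant, and at least one relevant item is returned (otherwise a denominator is zero); the function name and sample data are illustrative.

```python
def retrieval_scores(relevance, k):
    """Counts A, B, C, D and the derived measures for the top-k results.

    relevance[n] is V_n: 1 for a relevant item, 0 for an irrelevant one,
    listed in ranked order over all N items in the database.
    """
    a = sum(relevance[:k])                        # detections
    b = k - a                                     # false alarms
    c = sum(relevance) - a                        # misses
    d = (len(relevance) - sum(relevance)) - b     # correct dismissals
    recall = a / (a + c)
    precision = a / (a + b)
    fallout = b / (b + d)
    f1 = precision * recall / ((precision + recall) / 2)
    return {"A": a, "B": b, "C": c, "D": d,
            "recall": recall, "precision": precision,
            "fallout": fallout, "f1": f1}

# Hypothetical database of N = 10 items, K = 4 returned.
ranked = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
print(retrieval_scores(ranked, 4))
```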
Evaluation Measures
1. Precision-Recall Curve (P vs. R)
2. Receiver Operating Characteristic (ROC Curve): A vs. B
3. Relative Operating Characteristic: A vs. F
4. P value: \( P_k = \frac{1}{k} \sum_{n=0}^{k-1} V_n \) at cut-off point k
5. 3-point P value: average of P at R = 0.2, 0.5, 0.8
[Figure: example curves with axes P vs. R and A (hit) vs. B (false).]
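Evaluation Metric: Average Precision
S: ranked list of data in response to a query, with s items

  Rank j          1     2     3     4     5     6     7
  Ground truth    1     0     1     1     0     0     0
  Precision       1/1   1/2   2/3   3/4   3/5   3/6   3/7

Average precision:
\[ AP = \frac{1}{R} \sum_{j=1}^{s} \frac{R_j}{j} I_j \]
R: total number of relevant data; R_j: number of relevant data among the top j; I_j = 1 if the j-th item is relevant, 0 otherwise.
For the list above, R = 3 and AP = (1/1 + 2/3 + 3/4) / 3 ≈ 0.81.
[Figure: precision vs. rank j, and R_j vs. rank j, for j = 0 to 7.]
AP measures the average of precision values at the R relevant data points.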
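The formula translates directly into a short loop. The Python sketch below assumes the ground truth is given as a 0/1 list in ranked order and, unless told otherwise, that every relevant item appears somewhere in that list; the function and parameter names are illustrative.

```python
def average_precision(ground_truth, total_relevant=None):
    """AP = (1/R) * sum over ranks j of (R_j / j) * I_j."""
    if total_relevant is None:
        # Assumption: all relevant items appear in the ranked list.
        total_relevant = sum(ground_truth)
    hits = 0      # R_j: relevant items seen in the top j
    acc = 0.0
    for j, relevant in enumerate(ground_truth, start=1):
        if relevant:                 # I_j = 1
            hits += 1
            acc += hits / j          # precision at rank j
    return acc / total_relevant

# The seven-item ranked list from the slide.
print(average_precision([1, 0, 1, 1, 0, 0, 0]))
```

Passing `total_relevant` explicitly covers the case where some relevant items are never returned, which lowers AP.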
Evaluation Metric: Average Precision
Alternative measure
  Ranked results are manually inspected to a depth of N1
  E.g., in TREC VIDEO 2003, N1 = 100; in TREC VIDEO 2004, N1 = 1000
Observations (AP)
  AP depends on the rankings of relevant data and the size of the relevant data set. E.g., R = 10
  Case I:    + + + + + + + + + + - - - - - -
             Pre: 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0          AP = 1
  Case II:   - + - + - + - + - + - + - + - + - + - +
             Pre: 1/2 at each relevant item                AP = 1/2
  Case III:  - - - - - - - - - - + + + + + + + + + +
             Pre: 1/11 2/12 ... 10/20                      AP ~ 0.3
Evaluation Metric: Average Precision
Observations (AP)
  E.g., R = 2
  Case I:   + + - - - -                  AP = 1
  Case II:  - - - - + - - - - +          Precision: 0.2, 0.2    AP = 0.2
AP is different from interpolated average of precision values
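The observation cases for R = 10 and R = 2 can be checked numerically. The Python sketch below encodes each ranking as a 0/1 list (1 for "+", 0 for "-") and assumes all relevant items appear in the list; the helper name and the encoding are illustrative.

```python
def ap(ranking):
    """Average precision of a 0/1 ranked list containing all relevant items."""
    hits, acc = 0, 0.0
    for j, relevant in enumerate(ranking, start=1):
        if relevant:
            hits += 1
            acc += hits / j     # precision at each relevant rank
    return acc / hits

cases = {
    "R=10, relevant first":       [1] * 10 + [0] * 6,
    "R=10, alternating":          [0, 1] * 10,
    "R=10, relevant last":        [0] * 10 + [1] * 10,
    "R=2, relevant first":        [1, 1, 0, 0, 0, 0],
    "R=2, at ranks 5 and 10":     [0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
}
for name, ranking in cases.items():
    print(name, round(ap(ranking), 3))
```

The same number of relevant items yields very different AP depending on where they sit in the ranking, which is the point of the observations.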
Readings available on the class site for content-based image retrieval
  Consider this topic for class presentation
How to get hands on ...
  Get the image content set from the TA
  Get familiar with programming tools, e.g., Matlab
    Introduction to Matlab basic commands
    http://www.ee.columbia.edu/~sfchang/tools/matlab.intro.html
    Introduction to basic image processing commands in Matlab
    http://www.ee.columbia.edu/~sfchang/tools/DIPtutorial.m
Paper List for Fall 2004
Updated paper list available at the course web site
Topics
  Content-based image search
  Web image search
  Media fingerprinting
  Image classification
    Bayesian, Boosting, SVM
  Relevance feedback
  Document clustering
  HMM and video classification
  Language models and applications in multimedia IR
Feel free to propose additional topics