Lecture 8 - Detection and Matched Filters

download Lecture 8 - Detection and Matched Filters

of 72

Transcript of Lecture 8 - Detection and Matched Filters

  • University of Illinois at Urbana-Champaign 07/16/13

    Detection and Template Matching

    CS 598 PS Fall 2013

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    2

    A change in thinking

    So far we did feature extraction Unsupervised learning

    Now well do detection and classificationSupervised learning

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    3

    Todays lecture

    Detection of elements of interest

    Template matching / matched filtersObtaining invariance/robustnessEfficient matchingUsage cases

    Probabilistic interpretation

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    4

    Detection

    Find the presence of a predefined signalFace/pedestrian/car detectionSpeech, heartbeat, gunshot detectionEarthquakes, quasars, aliens, ...

    Many ways to go about itWell start from the easy part

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    5

    A simple example

    Can we detect a known face in a collection?

    Query

    Search set

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    6

    Vectorize the data(and scale to unit norm)

    x = vec( ) =

    Y = vec( ) =

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    7

    And get their dot product

    x Y =0 2 4 6 8 10 12 14 16 18

    0.8

    0.85

    0.9

    0.95

    1

    1.05

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    8

    Dot product for matching

    For two unit length vectors x,yThen x,y are increasingly similar as tends to zeroIf the dot product is maximized we have a closer match

    x y = 1

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    9

    Interpreting the dot product

    Query

    Search set

    0 2 4 6 8 10 12 14 16 180.8

    0.85

    0.9

    0.95

    1

    1.05

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    And it works ok for noisy inputs too10

    Query

    Search set

    0 2 4 6 8 10 12 14 16 18

    0.65

    0.7

    0.75

    0.8

    0.85

    0.9

    0.95

    1

    1.05

    NoisyClean

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    11

    Template matching

    Our query is a templateUse dot products to compare it with inputs

    Simple and very fast detector/classifierBut it wont get you too far!

    Lets signal-ize it

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    12

    A tougher case

    Heartbeat data

    How do we detect the beats?It can be all over the placeHow do we do the dot product?

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    13

    Brute force approach

    Do all possible dot products

    d(t) = x(1) x(N)

    y(t +1)y(t +N)

    x

    yFriday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    14

    Sliding dot product

    d

    d(t) = x(1) x(N)

    y(t +1)y(t +N)

    x

    y

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    15

    A closer look at this

    Writing this as a summation it comes up to be a convolutionBut with the template time-reversed

    d(t) = x(1) x(N) y(t +1)

    y(t +N)

    =

    = x(i)y(t + i)i = x yx(t) = x(t)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    16

    Matched filter

    This is known as a matched filter

    Why filter? Because we do a convolution

    Also known as cross-correlationVery common tool in communications, sonar, radar, etc

    Template presenceat time t Template Input

    d(t) = x(k)y(tk)k

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    17

    Matched filters in sonar/radar

    Radar/sonar Transmitted signal Object

    Reflection

    Received signal

    Template

    Input

    Matched filter output d x y

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Some post-processing

    The matched filter output is a little messy We mostly care about rough peaks, and its energy

    So we rectify and lowpass filter

    18

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

    0.5

    0

    0.5

    Input echo recording

    Time (sec)

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

    0.5

    0

    0.5

    1Input echo recording (rectified/smoothed filter)

    Time (sec)

    InputFilter

    InputFilter

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Some post-processing

    The matched filter output is a little messy We mostly care about rough peaks, and its energy

    So we rectify and lowpass filter

    18

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

    0.5

    0

    0.5

    Input echo recording

    Time (sec)

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

    0.5

    0

    0.5

    1Input echo recording (rectified/smoothed filter)

    Time (sec)

    InputFilter

    InputFilter

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    And thankfully it works well with noise19

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 50.5

    0

    0.5

    1

    Clean echo recording

    Time (sec)

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 50.5

    0

    0.5

    1

    Noisy echo recording

    Time (sec)

    InputFilter

    InputFilter

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    And thankfully it works well with noise19

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 50.5

    0

    0.5

    1

    Clean echo recording

    Time (sec)

    0.5 1 1.5 2 2.5 3 3.5 4 4.5 50.5

    0

    0.5

    1

    Noisy echo recording

    Time (sec)

    InputFilter

    InputFilter

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    20

    Simple communications example

    You send 0/1s over the airBut you pick up some noise

    E.g.:

    How do you recover the original data?

    100 200 300 400 500 600 700 800 900 10000

    0.5

    1Sent data

    100 200 300 400 500 600 700 800 900 10002

    0

    2

    4Received data

    100 200 300 400 500 600 700 800 900 10000

    0.5

    1

    Recovered data

    ConvolutionThresholding

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    21

    Detecting the 1s

    Make a template for them:x = [1 1 1 1 1 1 1 1 1]This will detect a series of 1 symbols

    Convolve template with the inputAnd measure when the result is above threshold

    E.g. greater than 0.5

    Where it is we have a series of 1s

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    22

    Result

    100 200 300 400 500 600 700 800 900 10000

    0.5

    1Sent data

    100 200 300 400 500 600 700 800 900 10002

    0

    2

    4Received data

    100 200 300 400 500 600 700 800 900 10000

    0.5

    1

    Recovered data

    ConvolutionThresholding

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    An interesting connection

    What is this filter? x = [1 1 1 1 1 1 1 1 1] Lets look at its Fourier transform:

    23

    0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

    1

    2

    3

    4

    5

    6

    7

    8

    9

    Resp

    onse

    Normalized frequency

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    An interesting connection

    What is this filter? x = [1 1 1 1 1 1 1 1 1]

    Its a lowpass filter We effectively remove the high frequencies (i.e. the noise)

    Two ways to achieve the same result Template to match our signal Or a denoising filter

    24

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    25

    Computational considerations

    The sliding dot product is slowN multiplications/additions per input sample No fun with large templates

    Fortunately we can use the FFT!Remember that convolution can be performed

    efficiently with the FFT

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    FFT-based matched filtering

    Convolution in the time domain maps to element-wise multiplication in the frequency domain

    But we have three sizes here: template, input, output template is length N, input in M, output is N+M-1 What do we use?

    26

    d = x y F d = F x( ) F y( )

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Approach 1

    Zero pad everything to the largest size (M+N-1) Ensures that we we have enough space to store the output Doesnt change the result since we convolve with more zeros

    Slightly redundant though

    Example: N = 1,000, M = 10,000,000 Time domain takes about 10 GFLOPs FFT convolution takes about 200 MFLOPs

    27

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Approach 2

    Take small snippets of the input, convolve them with the template and add them together (convolution is linear!)

    How small? Same size as template (N) to avoid excessive zero padding But, the output will be 2N-1, so we need to zero pad as much

    Example speedup, N = 1,000, M = 10,000,000: Time: 10 GFLOPS FFT: 200 MFLOPs Fast convolution: 60 MFLOPs

    28

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Matched filtering for event detection29

    Input video

    Extract audio only

    Now we can process long inputs efficiently

    2 4 6 8 10 12 14 16 18 20 22

    0.50

    0.5

    Input scene audio

    Time (sec)

    0.2 1

    0.5

    0

    0.5

    Gunshot template

    Time (sec)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Matched filtering for event detection29

    Input video

    Extract audio only

    Now we can process long inputs efficiently

    2 4 6 8 10 12 14 16 18 20 22

    0.50

    0.5

    Input scene audio

    Time (sec)

    0.2 1

    0.5

    0

    0.5

    Gunshot template

    Time (sec)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Matched filtering for event detection29

    Input video

    Extract audio only

    Now we can process long inputs efficiently

    2 4 6 8 10 12 14 16 18 20 22

    0.50

    0.5

    Input scene audio

    Time (sec)

    0.2 1

    0.5

    0

    0.5

    Gunshot template

    Time (sec)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Matched filtering for event detection29

    Input video

    Extract audio only

    Now we can process long inputs efficiently

    2 4 6 8 10 12 14 16 18 20 22

    0.50

    0.5

    Input scene audio

    Time (sec)

    0.2 1

    0.5

    0

    0.5

    Gunshot template

    Time (sec)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Matched filter result

    Strong peaks where gunshots took place Processing: Time domain: 1.8 sec, Fast Conv: 0.06 sec

    30

    2 4 6 8 10 12 14 16 18 20 22

    0.5

    0

    0.5

    1

    1.5

    2Input scene audio / matches

    Time (sec)

    Input audioFilter output energy

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Going even faster31

    t s = F F t( ) F s( ) = F ft fs( )t s 2 = F ft fs( )( ) F ft fs( ) = ft fs( ) ft fs( ) = ft fs 2

    Freq

    uenc

    y (kH

    z)

    Gunshot template in Fourier space

    0

    2

    4

    6

    Input scene in Fourier space

    2 4 6 8 10 12 14 16 18 20

    2 4 6 8 10 12 14 16 18 20 22

    Filter responses

    Time (sec)

    Time filterSpectral norms

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    32

    Generalizing to two dimensions

    In 1D the template was time-reversedIn 2D we flip both dimensions

    We now slide across both dimensionsE.g. left-to-right for all heights

    FFT speedup still holds But we predictably use 2D FFTs Same caveats as before apply here as well

    d(i, j) = x(k,l)y(i +k, j +l)lk

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    33

    Digit recognition example

    Given the following input

    Find the digits by analyzing the image

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    34

    Collect templates

    Create a set of templates for all digits

    Filter flipped versions with input imageEach template will peak at digits position

    MATLAB demo

    x1 = x2 =

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    35

    A new problem

    What if we have a change in size?

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    36

    Broadening the search

    Resize the the templates to more sizes and run template matching againHow much resizing?

    This is N times more computationWhen considering N scales

    MATLAB demo

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    37

    Image pyramids

    Scaling the templates is not too efficientInstead resample the input at multiple scales

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    38

    Rotation?

    Same as beforeBrute force search over all potential

    rotations

    One more multiplicative factor in computation

    MATLAB demo

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    39

    Some relief

    Rotation and scale and , thats a lot of searching that needs to be done!

    There are some tricks we can pulle.g. radial filters for rotation

    Better featuresChoose, e.g., a rotation invariant feature

    Lots and lots of papers But some searching will be there

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    For example

    Treat significant pixels as points Get their cross-distances

    And normalize them

    The distances histogram is rotation- and scale- invariant Do matching on that instead

    40

    0 10 20

    5

    10

    15

    20

    25

    Input data Distance matrix

    20 40 60

    20

    40

    60

    0 0.5 10

    100

    200

    300

    400

    Distance histogram

    10 20 30 40

    15

    20

    25

    30

    35

    50 100 150

    50

    100

    150

    0 0.5 10

    1000

    2000

    0 10 20

    5

    10

    15

    20

    25

    50 100 150

    50

    100

    150

    0 0.5 10

    500

    1000

    1500

    2000

    10 20 30 40

    20

    30

    40

    100 200 300

    100

    200

    300

    0 0.5 10

    5000

    10000

    0

    0.5

    1

    0

    0.5

    1

    0

    0.5

    1

    0

    0.5

    1

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    41

    One more problem

    What happened to the normalization?!Remember that vectors were unit length?That doesnt hold with matched filtering

    Heres a challenging case

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    42

    Problems with scaling

    Regular cross-correlation is:

    Output is highly dependent on the local magnitude of the input

    d(t) = x(k)y(tk)kThis is high if this is high

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    43

    In fact, Ive been cheating all this time

    I used white-on-black imagesI made sure that high values were parts of interest

    Do you see a new problem?

    510

    15

    510

    1520

    25300

    0.05

    0.1

    "White on black"

    510

    15

    510

    1520

    25300

    0.05

    0.1

    "Black on white"

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    44

    Solution 1: Normalize

    We really want:

    Where the denominator ensures a unit-length dot product with the template

    Likewise we can generalize to 2-D, etc.But FFT speedup is a little trickier now

    d(t) =x(k)y(tk)

    xy(t)k

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    45

    Solution 2: Gradient filtering

    Detect what mattersRemember what your retina likes? (hint: edges)

    Perform detection on the gradient imageThis abstracts the color and local intensity

    Much more robust detector (for images at least)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Example46

    10 20 30

    10

    20

    30

    Template

    0

    0.05

    0.1

    10 20 30

    10

    20

    30

    Template gradient

    0.1

    0

    0.1

    10 20 30 40 50

    10203040

    Input

    0.10.20.3

    10 20 30 40 50

    10203040

    Input gradient

    0.200.2

    10 20 30 40 50

    10203040

    Input

    0.10.20.3

    10 20 30 40 50

    10203040

    Input gradient

    0.200.2

    10 20 30 40 50

    510152025303540

    | Matched filter output |, peak = 1.2886

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    10 20 30 40 50

    510152025303540

    | Matched filter output |, peak = 1.2886

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Movie example

    Track the ball movement Convolve in 4D space

    47

    Input movie Ball template

    =Color channel average power

    120 160 3 138

    28 29 3

    120 160 138

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Movie example

    Track the ball movement Convolve in 4D space

    47

    Input movie Ball template

    =Color channel average power

    120 160 3 138

    28 29 3

    120 160 138

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Each channel separately48

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Each channel separately48

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    49

    Template matching applications

    Popular tool in computer visionface/car/pedestrian/footballer detection

    Templates are example images of the above

    Good for event detection in audiogunshot detection, room measurements

    Bio/geological/space/underwater signals etc ...

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    50

    Matching without a template

    Sometime we dont know the templateBut we know it appears a lot

    We can use auto-correlation to find structureUse the full input as a template

    Peaks will denote repeating elements

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Back to the heartbeat example

    We dont know the exact template, but there is a pattern that is repeating in time We can still find structure regardless

    51

    1 2 3 4 5 6 7x 104

    0.2

    0

    0.2

    2000 4000 6000 8000 10000 12000 14000

    0.10

    0.10.2

    Time (samples)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Autocorrelation

    Use the entire input as a template Match the input on itself

    52

    Point where the two sequences alignwill yield the maximum output

    Correlation between successive beats Correlation between

    beats spaced by two

    0.5 1 1.5 2 2.5x 104

    5

    0

    5

    10

    15

    20

    Time (samples)

    Correlation between oscillations within beat

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    53

    An popular case

    Pitch trackingMusical instruments tend to repeat a similar waveform,

    the rate of the repeat is the pitch

    But it can change over time!!

    0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.0450.30.20.1

    00.10.20.3

    Pitched note

    Time (sec)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    So we cant really autocorrelate

    Many peaks corresponding to various elements No temporal information

    54

    1 2 3 4 5 6 7 8 9 10

    0.5

    0

    0.5

    Input music

    Time (sec)

    0.08 0.06 0.04 0.02 0 0.02 0.04 0.06 0.081

    0

    1

    2

    3x 104

    Time (sec)

    Zoomed in autocorrelation

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    So we cant really autocorrelate

    Many peaks corresponding to various elements No temporal information

    54

    1 2 3 4 5 6 7 8 9 10

    0.5

    0

    0.5

    Input music

    Time (sec)

    0.08 0.06 0.04 0.02 0 0.02 0.04 0.06 0.081

    0

    1

    2

    3x 104

    Time (sec)

    Zoomed in autocorrelation

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Use localized autocorrelations instead

    For each analysis window perform an autocorrelation Which you can also efficiently compute with the FFT

    55

    1 2 3 4 5 6 7 8 9 10

    0.5

    0

    0.5

    Time (sec)

    Input sound

    Lag

    (mse

    c)

    Time (sec)

    Sliding autocorrelation

    1 2 3 4 5 6 7 8 9

    12345

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    56

    Probing matching criteria

    Revisiting the starting pointSuppose we instead tried to find the minimal

    distance between the template and the input

    This is a simple Euclidean distance D(m,n) = x(im, j n)y(i, j)

    2ji

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    57

    From Euclidean to dot product

    Assume a locally constant input yMinimizing D, maximizes the last term

    Which happens to be the dot product

    With the dot product we are ultimately minimizing a Euclidean distance

    D(m,n) = y(i, j) 2 +ji x(i, j) 2ji 2 y(i, j)x(im, j n)ji

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    58

    Getting to a likelihood

    The Euclidean distance implies a Gaussian

    We can now get a likelihood of match Is Gaussian the right model though?

    We can use various other metrics to define distance, implying a different distributionMore on that later

    P(x; y) e12(xy)(xy)

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    59

    And dont forget features!

    We focused on detection on raw dataWe can instead convolve on the feature weights

    This can buy us noise robustness, invariance, and other desirable propertiesYou will rarely operate on raw data

    All of the previous applies here as well

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    60

    Recap

    Detection of interesting elementsMatched filtering1D to 2DScale/rotation invarianceNormalized cross-correlationDetection on gradientsAutocorrelation

    From filters to likelihoodsFriday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    61

    Next lecture

    Linear classifiers

    Discriminant models

    Multi-class recognition

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    But ...

    The next lecture is next Friday I will be at the IEEE MLSP workshop next Wednesday

    Does the jet-setting lifestyle appeal to you? Do a good final project, we had students from this class in MLSP before

    62

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    63

    Reading

    Fast normalized correlationhttp://scribblethink.org/Work/nvisionInterface/nip.pdf

    Transform invariant templateshttp://www.springerlink.com/content/h345462721557808/http://www.lps.usp.br/~hae/software/forapro/index.html

    Friday, September 20, 13

  • UN I V E R S I T Y O F I L L I N O I S AT UR B A N A -CH A M P A I G N - CS598PS - FA L L 2013

    Reminder

    There is a problem set handed out It is due on Friday Stuck? Tough luck, Ill be out of town :)

    Talk to your TA

    We also have a piazza page: http://piazza.com/illinois/fall2013/cs598psGood way to find project buddies

    64

    Friday, September 20, 13