Image Understanding


Page 1: Image Understanding

Image Understanding

Page 2: Image Understanding

Introduction

Computer vision
- Give machines the ability to see
- The goal is to duplicate the effect of human visual processing
- We live in a 3-D world, but camera sensors can only capture 2-D information.
- Computer vision is the “flip” side of computer graphics, but much harder!

Page 3: Image Understanding

Introduction

Computer vision is composed of:
- Image processing
- Image analysis
- Image understanding

Page 4: Image Understanding

Introduction

Image processing

The goal is to present the image to the system in a useful form:
- image capture and early processing
- remove noise
- detect luminance differences
- detect edges
- enhance image
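Two of the early-processing steps listed here, removing noise and detecting luminance differences, can be sketched on a single row of pixels. This is a minimal sketch under assumptions not in the slides: a moving-average filter stands in for noise removal, a thresholded neighbor difference stands in for edge detection, and the names `smooth` and `edges` are hypothetical.

```python
import numpy as np

def smooth(row, width=3):
    """Moving-average filter: a simple stand-in for noise removal."""
    kernel = np.ones(width) / width
    # Repeat the end pixels so the output has the same length as the input.
    padded = np.pad(row, width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def edges(row, threshold):
    """Indices i where the luminance jump between pixel i and i+1 exceeds threshold."""
    jumps = np.abs(np.diff(row))
    return np.flatnonzero(jumps > threshold)

step = np.array([10, 10, 10, 10, 50, 50, 50, 50])   # dark region, then bright region
spike = np.array([10, 10, 40, 10, 10])               # one noisy pixel

print(edges(step, threshold=20))   # the single boundary between pixels 3 and 4
print(smooth(spike))               # the spike is spread out and reduced
```

Smoothing also blurs real edges, so the filter width and the threshold have to be chosen together.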

Page 5: Image Understanding

Introduction

Image analysis

The goal is to extract useful information from the processed image:
- identify boundaries
- find connected components
- label regions
- segment parts of objects
- group parts together into whole objects
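Finding connected components and labeling regions can be illustrated on a small binary image. The sketch below assumes 4-connectivity (pixels touch only horizontally or vertically) and uses a breadth-first flood fill; the function name `label_regions` and the sample image are hypothetical.

```python
from collections import deque

def label_regions(binary):
    """Give every 4-connected group of nonzero pixels its own label (1, 2, ...)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c]:
                continue  # background pixel, or already labeled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_regions(image)
print(count)
for row in labels:
    print(row)
```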

Page 6: Image Understanding

Introduction

Image understanding

The goal is to make sense of the information. Draw qualitative, or semantic, conclusions from the quantitative information:
- make a decision about the quantitative information
- classify the parts
- recognize objects
- understand the objects’ usage and the meaning of the scene

Page 7: Image Understanding

Introduction

Image understanding uses techniques and methods from:
- Physics: models of the visual world
- Mathematics: statistics and differential calculus
- Spatial pattern recognition
- Artificial intelligence
- Psychophysics

Page 8: Image Understanding

Robot: Count all the Chairs

Source: Bülthoff, Max Planck Institute for Biological Cybernetics (MPIK), Tübingen, Germany
http://www.ercim.org/publication/Ercim_News/enw53/christensen.html

Page 9: Image Understanding

Robot: Which One is a Car?

Page 10: Image Understanding

The Importance of Context

Are these letters “A” or “H”?

Page 12: Image Understanding

The Importance of Context

Hwo nmya wrdso cna oyu erad ni ihts entsnece?

Page 13: Image Understanding

Low-level Representations

Low-level: little knowledge about the world
- The data that is manipulated usually resembles the data that is captured.
- For example, if the image is captured using a CCD camera (2-D), the representation can be described by an image function f(x, y) whose value is the brightness at the location with coordinates x and y.

Page 14: Image Understanding

High-level Representations

High-level: incorporate knowledge about the world external to the image
- Image may be mapped to a formalized model of the world (model may change dynamically as new information becomes available)
- Data to be processed is dramatically reduced: instead of dealing with pixel values, deal with features such as shape, size, relationships, etc.
- Usually expressed in symbolic form

Page 15: Image Understanding

Low-level Mechanisms

Low-level vision only takes us to the sophistication of a very expensive digital camera

Page 16: Image Understanding

High-level Mechanisms

High-level vision and perception require brain functions that we do not yet fully understand

Page 17: Image Understanding

High-level Mechanisms

Image from https://plus.google.com/107117483540235115863/posts/MBtyGRBvwkH

Page 18: Image Understanding

Bottom-up or Top-down?

[Figure: two diagrams labeled "Top-Down?" and "Bottom-up?", each showing the direction of information flow]

Page 19: Image Understanding

Visual Completion: Top-down Control

Page 23: Image Understanding

Expectation and Learning

From Palmer (1999)

Page 24: Image Understanding

Occlusion Illusion

Which semi-circle appears larger?

Page 26: Image Understanding

The Human Visual System

Optical information from the eyes is transmitted to the primary visual cortex in the occipital lobe at the back of the head.

Page 27: Image Understanding

The Human Visual System

- 20 mm focal length lens
- iris controls amount of light entering eye by changing the size of the pupil

Light enters the eye through the cornea, aqueous humor, lens, and vitreous humor before striking the light-sensitive receptors of the retina.

After striking the retina, light is converted into electrochemical signals that are carried to the brain via the optic nerve.

Page 28: Image Understanding

The Human Visual System

image from www.photo.net/photo/edscott/vis00010.htm

Page 29: Image Understanding

The Human Visual System

From Palmer (1999)

The distribution of rods and cones across the retina is highly uneven

The fovea contains the highest concentration of cones for high visual acuity

Page 30: Image Understanding

How much do we really see?

+

Page 31: Image Understanding

How much do we really see?

+ If you can read this you must be cheating

Page 32: Image Understanding

Change Blindness

Lack of attention to an object causes failure to perceive it

People find it difficult to detect major changes in a scene if those changes occur in objects that are not the focus of attention

Our impression that our visual capabilities give us a rich, complete, and detailed representation of the world around us is a grand illusion!

Page 33: Image Understanding

Center-Surround Organization

The receptive field of a neuron in the retina can be described as having a center-surround organization. When light covers the receptive field uniformly, the neuron fires a random pattern of action potentials. If light activates only the central part of the receptive field and not the surrounding area, the firing rate rises above this random (baseline) response, and the neuron is said to have an on-center/off-surround organization. In this case, light activating only the inhibitory surround causes a significant decrease in the firing rate. A neuron exhibiting the opposite pattern of activation is said to have an off-center/on-surround organization.
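The three cases described on this slide (uniform light, center only, surround only) can be reproduced with a toy model. This is a minimal sketch, not a physiological model: the 3x3 receptive field, the `baseline` and `gain` values, and the name `on_center_response` are all assumptions made for illustration.

```python
import numpy as np

def on_center_response(stimulus, baseline=10.0, gain=5.0):
    """Toy firing rate of an on-center/off-surround cell.

    stimulus: 3x3 array of light levels in [0, 1]. The middle element is the
    center; the other eight elements form the surround (an assumed layout).
    """
    stimulus = np.asarray(stimulus, dtype=float)
    center = stimulus[1, 1]
    surround = (stimulus.sum() - center) / 8.0  # mean of the eight neighbors
    # The center excites and the surround inhibits; a rate cannot be negative.
    return max(0.0, baseline + gain * (center - surround))

uniform = np.ones((3, 3))
center_only = np.zeros((3, 3))
center_only[1, 1] = 1.0
surround_only = np.ones((3, 3))
surround_only[1, 1] = 0.0

print(on_center_response(uniform))        # equals the baseline rate
print(on_center_response(center_only))    # above the baseline
print(on_center_response(surround_only))  # below the baseline
```

Passing a negative `gain` turns the same function into an off-center/on-surround cell.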

Page 34: Image Understanding

Center-Surround Organization

[Figure: a stimulus and the responses of an on-center/off-surround cell and an off-center/on-surround cell]

Page 35: Image Understanding

Center-Surround Organization and Contrast Sensitivity

[Figure: contrast (high to low) plotted against spatial frequency (1, 10, 100 cycles per degree)]

Page 38: Image Understanding

Lateral Inhibition

Page 39: Image Understanding

Lateral Inhibition

[Figure: input light level (values 10 and 5) and the resulting output perception]

Page 40: Image Understanding

Lateral Inhibition

A biological neural network in which neurons inhibit spatially neighboring neurons. Architecture of first few layers of retina.

Each receptor excites its own output cell with weight +1 and inhibits each neighboring output cell with weight -0.2.

Input light level (receptors):  10      10      10      5       5       5
Output cell computation:        10-2-2  10-2-2  10-2-1  5-2-1   5-1-1   5-1-1
Output perception:              6       6       7       2       3       3
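The arithmetic on this slide (+1 from a cell's own receptor, -0.2 from each neighboring receptor) is a convolution of the input with the kernel [-0.2, +1, -0.2]. The sketch below assumes the receptors beyond each end of the row see the same light level as the end receptor, which is what the slide's 10-2-2 and 5-1-1 terms imply; the function name `lateral_inhibition` is hypothetical.

```python
import numpy as np

def lateral_inhibition(levels, inhibition=0.2):
    """Output of each cell: its own input minus a fraction of each neighbor's input."""
    levels = np.asarray(levels, dtype=float)
    # Assumption: the row continues past both ends at the same light level.
    padded = np.pad(levels, 1, mode="edge")
    kernel = np.array([-inhibition, 1.0, -inhibition])
    return np.convolve(padded, kernel, mode="valid")

print(lateral_inhibition([10, 10, 10, 5, 5, 5]))  # approximately 6, 6, 7, 2, 3, 3
```

The output overshoots on the bright side of the step (7) and undershoots on the dark side (2), so the edge is exaggerated relative to the input.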

Page 41: Image Understanding

Lateral Inhibition

Page 42: Image Understanding

Lateral Inhibition

[Figure: two receptive fields, one labeled "Lots of inhibition" and one labeled "Not much inhibition"]

Less lateral inhibition in the fovea as compared to the periphery?

Page 43: Image Understanding

Simultaneous Contrast

Two regions that have identical spectra result in different color (lightness) perceptions due to the spectra of the surrounding regions

Background color can visibly affect the perceived color of the target

Page 44: Image Understanding

Simultaneous Contrast

Page 46: Image Understanding

Simultaneous Contrast

[Figure: profile of light intensity (axis marked 0, 5, 10) against horizontal position, across the left square and the right square]

Page 47: Image Understanding

Simultaneous Contrast

[Figure: light-intensity profile (axis marked 0, 5, 10) across the left square and the right square]

Light intensity:           5  10  10   5   5  10  10   0   0   5   5   0   0   5
Excitation (+1):           5  10  10   5   5  10  10   0   0   5   5   0   0   5
Left inhibition (-0.2):   -1  -1  -2  -2  -1  -1  -2  -2   0   0  -1  -1   0   0
Right inhibition (-0.2):  -2  -2  -1  -1  -2  -2   0   0  -1  -1   0   0  -1  -1
Output (Sum):              2   7   7   2   2   7   8  -2  -1   4   4  -1  -1   4
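The rows on this slide can be recomputed one at a time. The sketch below makes two assumptions that are not stated in the source: the profile continues at the same level past both ends, and the two gray squares sit at positions 3-4 and 9-10 (counting from 0), where intensity 5 is surrounded by 10 on the left and by 0 on the right. The name `inhibition_rows` is hypothetical.

```python
def inhibition_rows(levels, weight=0.2):
    """Return the left-inhibition, right-inhibition, and summed output rows."""
    # Assumption: the profile continues at the same level past both ends.
    padded = [levels[0]] + list(levels) + [levels[-1]]
    left = [-weight * padded[i] for i in range(len(levels))]       # from the left neighbor
    right = [-weight * padded[i + 2] for i in range(len(levels))]  # from the right neighbor
    output = [x + l + r for x, l, r in zip(levels, left, right)]
    return left, right, output

profile = [5, 10, 10, 5, 5, 10, 10, 0, 0, 5, 5, 0, 0, 5]
left, right, output = inhibition_rows(profile)

print([round(v) for v in output])
print("left square: ", [round(v) for v in output[3:5]])
print("right square:", [round(v) for v in output[9:11]])
```

Both squares have the same input intensity, yet the one on the bright background ends up with a lower output than the one on the dark background, which is the simultaneous contrast effect.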

Page 48: Image Understanding

Simultaneous Contrast?

According to simultaneous contrast theory, the gray cross on the left should appear lighter than the cross on the right, because it is surrounded by dark squares. Instead, it appears darker. Could it be because we prefer to see a gray square floating over a white (black) background, rather than a cross?

Page 49: Image Understanding

Lightness Constancy

Indoors - 100 units of light total. White paper reflects 90 units, and black ink reflects 10 units.

Outdoors - 10,000 units of light total. White paper reflects 9000 units, and black ink reflects 1000 units.

Why does the black ink outside (1000 units reflected) look darker than the white page does indoors (only 90 units reflected)?
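The slide leaves the question open. One common account, offered here as an assumption rather than as the slide's answer, is that perceived lightness follows the proportion of the incident light that a surface reflects, not the absolute amount. The arithmetic with the slide's numbers (the name `reflectance` is hypothetical):

```python
def reflectance(reflected, incident):
    """Fraction of the incident light that a surface reflects."""
    return reflected / incident

scenes = {
    "indoors": {"incident": 100, "paper": 90, "ink": 10},
    "outdoors": {"incident": 10_000, "paper": 9_000, "ink": 1_000},
}

for name, scene in scenes.items():
    paper = reflectance(scene["paper"], scene["incident"])
    ink = reflectance(scene["ink"], scene["incident"])
    print(f"{name}: paper reflects {paper:.0%}, ink reflects {ink:.0%}")
```

Under this account the ink reflects 10% of the light in both settings and the paper 90%, even though the absolute amounts differ by a factor of 100.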

Page 50: Image Understanding

Color Vision

The objective description of color is that it is the visible portion of the electromagnetic spectrum.

image from www.photo.net/photo/edscott/vis00010.htm

Page 51: Image Understanding

Color Vision

“The rays to speak properly are not colored ... Colors in the object are nothing but a disposition to reflect this or that sort of rays more copiously than the rest.”

- Sir Isaac Newton, 1666

Page 52: Image Understanding

Color Vision

The physical description of color is that it is the spectral response of three types of cones.

[Figure: spectral responses of the S-cones, M-cones, and L-cones across the spectrum from violet through blue, green, yellow, and orange to red]

image from www.photo.net/photo/edscott/vis00010.htm

Page 53: Image Understanding

Color Vision

The psychological description of color is that it is a point in a three-dimensional color space.

[Figure: color space with dimensions hue, lightness, and saturation]
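Treating a color as a point with hue, lightness, and saturation coordinates can be tried with Python's standard `colorsys` module. This is only an illustration: the RGB inputs and the HLS model used by `colorsys.rgb_to_hls` are assumptions of the sketch, not the perceptual color space shown on the slide, and the helper name `describe` is hypothetical.

```python
import colorsys

def describe(r, g, b):
    """Map an RGB triple in [0, 1] to (hue in degrees, lightness, saturation)."""
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return h * 360.0, l, s

for name, rgb in {"red": (1.0, 0.0, 0.0),
                  "green": (0.0, 1.0, 0.0),
                  "gray": (0.5, 0.5, 0.5)}.items():
    hue, lightness, saturation = describe(*rgb)
    print(f"{name}: hue={hue:.0f} lightness={lightness:.2f} saturation={saturation:.2f}")
```

Red and green differ only in hue, while gray has the same lightness but zero saturation.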

Page 54: Image Understanding

Theories of Color Vision

- Trichromatic theory: Palmer (1777), Sir Thomas Young (1802), Maxwell (1855), Helmholtz (1867/1927)
- Opponent Process theory: Hering (1867/1964)
- Dual Process theory: von Kries (1905), Müller and Schrödinger (1920s), Hurvich and Jameson (1957)

Page 55: Image Understanding

Trichromatic Theory

The pattern of activation across the three receptor types determines the perceived color

Evidence in support of theory
- 3 colors are sufficient to match any color
- Explains color blindness

Page 56: Image Understanding

Opponent Process Theory

The three receptor types define a polarity between red/green, blue/yellow, and black/white

Evidence in support of theory
- Color experiences are always lost in certain pairs: red/green or blue/yellow
- Yellow seems to be a primary color, not a mixture of other colors

Page 57: Image Understanding

Opponent Process Theory

Page 59: Image Understanding

Dual Process Theory

2 stages: a trichromatic stage, followed by an opponent process stage

Evidence in support of theory
- The amount of “blueness” in any given light can be measured by mixing it with enough “yellow” light to neutralize the blueness (the resulting light looks neither blue nor yellow)
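The two stages and the cancellation measurement can be sketched numerically. Everything specific here is an assumption made for illustration: the channel weights in `opponent_channels`, the cone responses chosen for the "blue" and "yellow" lights, and the function names. Only the structure (cone responses first, opponent signals second) comes from the slide.

```python
def opponent_channels(l, m, s):
    """Stage 2: recombine L, M, S cone responses into three opponent signals.

    The weights are illustrative, not measured values.
    """
    red_green = l - m                # positive = reddish, negative = greenish
    blue_yellow = s - (l + m) / 2.0  # positive = bluish, negative = yellowish
    black_white = l + m              # overall brightness signal
    return red_green, blue_yellow, black_white

def yellow_to_cancel(blue_light, yellow_light):
    """Amount of yellow light that makes the mixture neither blue nor yellow."""
    _, by_blue, _ = opponent_channels(*blue_light)
    _, by_yellow, _ = opponent_channels(*yellow_light)
    return by_blue / -by_yellow

# Stage 1: hypothetical (L, M, S) cone responses to two lights.
blue_light = (0.1, 0.2, 1.0)
yellow_light = (1.0, 0.8, 0.0)

amount = yellow_to_cancel(blue_light, yellow_light)
mixture = tuple(b + amount * y for b, y in zip(blue_light, yellow_light))
print(amount)
print(opponent_channels(*mixture)[1])  # blue/yellow signal of the mixture, close to 0
```

The amount of yellow needed is the sketch's measure of the "blueness" of the first light.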

Page 60: Image Understanding

Reparameterizing Color Space

[Figure: three parameterizations of color space, ordered from physical to psychological: Photoreceptor Responses (S, M, and L cones); Color-Opponent Space (Black/White, Yellow/Blue, Red/Green); Hue, Saturation, Lightness]

Page 61: Image Understanding

Spatial Frequency Analysis

http://www.billcasselman.com/sinewave.gif

Period T = 1 Cycle

Frequency f = 1/T

Amplitude

Page 62: Image Understanding

Spatial Frequency Analysis

Four Cycles of a Single Sinusoidal Wave in the Spatial Domain, Orientation = 0 degrees
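A grating like the one on this slide can be generated in a few lines. The sketch assumes that an orientation of 0 degrees means the brightness varies along x and is constant along y; the image size and the name `grating` are also assumptions.

```python
import numpy as np

def grating(width=256, height=64, cycles=4):
    """Sinusoidal grating with brightness in [0, 1], varying along x only."""
    x = np.arange(width) / width                        # position as a fraction of the width
    row = 0.5 + 0.5 * np.sin(2 * np.pi * cycles * x)   # period T = width / cycles pixels
    return np.tile(row, (height, 1))                    # every row is identical

image = grating()
period = image.shape[1] / 4   # T = 64 pixels per cycle
frequency = 1 / period        # f = 1/T cycles per pixel

# Cross-check: the strongest nonzero frequency in one row is 4 cycles per image width.
spectrum = np.abs(np.fft.rfft(image[0] - image[0].mean()))
print(period, frequency, spectrum.argmax())
```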

Page 63: Image Understanding

Spatial Frequency Analysis

An image consists of the summation of a very large number of sine waves of varying amplitude, frequency, orientation, and phase

Lena image from http://www.ece.rice.edu/~wakin/images/lenaTest3.jpg
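The same idea can be checked in one dimension, where orientation drops out. The sketch builds a signal from two cosines with known amplitude, frequency, and phase, then recovers those values with NumPy's FFT and rebuilds the signal from its spectrum. The signal length and the component values are arbitrary choices for illustration.

```python
import numpy as np

n = 128
x = np.arange(n) / n
# Two components: 3 cycles (amplitude 1.0, phase 0) and 10 cycles (amplitude 0.5, phase pi/4).
signal = 1.0 * np.cos(2 * np.pi * 3 * x) + 0.5 * np.cos(2 * np.pi * 10 * x + np.pi / 4)

spectrum = np.fft.rfft(signal)
amplitude = 2 * np.abs(spectrum) / n   # scale so a unit cosine reports amplitude 1
phase = np.angle(spectrum)
rebuilt = np.fft.irfft(spectrum, n)    # summing the sinusoids again restores the signal

print(amplitude[3], amplitude[10])     # close to 1.0 and 0.5
print(phase[10])                       # close to pi/4
print(np.allclose(rebuilt, signal))
```

For a 2-D image, `np.fft.fft2` plays the same role, and each coefficient also carries an orientation.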

Page 64: Image Understanding

Spatial Frequency Analysis

Many Sinusoidal Waves in the Frequency Domain

[Figure: amplitude spectrum, plotting amplitude against increasing frequency]

Page 65: Image Understanding

Spatial Frequency Analysis

Many Sinusoidal Waves in the Frequency Domain

[Figure: phase spectrum, plotting phase against increasing frequency]

Page 66: Image Understanding

Spatial Frequency Analysis

[Figure: images labeled Amplitude Lena, Amplitude Peppers, Phase Lena, and Phase Peppers]

Phase Wins!

Image by: Thomas Kinsman, CIS
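The "phase wins" experiment can be repeated numerically by combining the amplitude spectrum of one image with the phase spectrum of another. The sketch below does not load Lena or Peppers; two random arrays stand in for them, and the names `swap_phase` and `similarity` are hypothetical.

```python
import numpy as np

def swap_phase(amplitude_source, phase_source):
    """Build an image from one image's amplitude spectrum and another's phase spectrum."""
    amplitude = np.abs(np.fft.fft2(amplitude_source))
    phase = np.angle(np.fft.fft2(phase_source))
    # The combined spectrum still describes a real image; drop rounding residue.
    return np.fft.ifft2(amplitude * np.exp(1j * phase)).real

def similarity(a, b):
    """Correlation coefficient between two images of the same size."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

rng = np.random.default_rng(0)
image_a = rng.random((64, 64))   # stand-in for the first image
image_b = rng.random((64, 64))   # stand-in for the second image

hybrid = swap_phase(amplitude_source=image_a, phase_source=image_b)
print("similarity to amplitude source:", similarity(hybrid, image_a))
print("similarity to phase source:    ", similarity(hybrid, image_b))
```

The hybrid correlates much more strongly with the image that supplied the phase, which is the numerical counterpart of the hybrid looking like that image.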

Page 67: Image Understanding

Aliasing

- High-frequency information can be perceived as low-frequency information if the sampling rate is too low
- Applies to temporal as well as spatial frequencies

http://www.svi.nl/wikiimg/StFargeaux_kasteel_buiten1_aliased.jpg

Page 68: Image Understanding

Aliasing

A function in the spatial domain, f(x), is band-limited if it has a highest frequency s0 in the frequency domain, F(s).

[Figure: f(x) plotted against x in the spatial domain; F(s) plotted against s in the frequency (spectral) domain, extending from -s0 to s0]

Page 69: Image Understanding

Aliasing

If we sample f(x) at equal intervals Δx, we get multiple copies of the spectrum in the frequency domain:

Multiplying f(x) by a sampling (delta, or “spike”) function is equivalent to a convolution of F(s) with the Fourier transform of the sampling function. This is known as the Convolution Theorem.

[Figure: the sampled function g(x) plotted against x; its spectrum G(s) plotted against s, with ticks at -1/Δx, -s0, s0, and 1/Δx]

g(x) is the sampled function. G(s) is the sampled function in the frequency domain.

Page 70: Image Understanding

Aliasing

Can we recover the original function intact from the sample points? In other words, can we recover F(s) from G(s)?
- Yes, if we eliminate all of the replicas of F(s), except the central one.
- To do this, multiply G(s) by a window function, or convolve the sampled function g(x) with an interpolation function, sinc(x) = sin(x) / x
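Convolving the samples with a sinc function can be written as a weighted sum, one sinc per sample. The sketch below uses `np.sinc`, which computes sin(pi u) / (pi u), so it matches the slide's sin(x) / x with x = pi u. It uses a finite number of samples, so the result is only approximate; the test signal, the sample spacing, and the name `sinc_interpolate` are assumptions.

```python
import numpy as np

def sinc_interpolate(samples, spacing, t):
    """Estimate the continuous function at positions t from evenly spaced samples."""
    n = np.arange(len(samples))
    t = np.asarray(t, dtype=float)
    # One sinc per sample, centered on that sample's position n * spacing.
    weights = np.sinc((t[:, None] - n * spacing) / spacing)
    return weights @ samples

spacing = 0.1                                      # 10 samples per unit length
positions = np.arange(200) * spacing
samples = np.cos(2 * np.pi * 1.0 * positions)     # frequency 1, well below the limit of 5

t = np.array([10.0, 10.05])                        # a sample point and a point halfway between samples
estimate = sinc_interpolate(samples, spacing, t)
print(estimate)
print(np.cos(2 * np.pi * 1.0 * t))                 # true values, for comparison
```

The estimate is exact at sample positions and close in between; the small remaining error comes from cutting the infinite sum off at 200 samples.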

Page 71: Image Understanding

Aliasing

Restrictions:
- f(x) must be band-limited (have a highest frequency) at s0, and
- the relationship between the sampling interval Δx and the band-limit s0 must be: Δx <= 1 / (2 • s0)

A function sampled at a uniform spacing Δx can be completely recovered from the samples, provided that Δx <= 1 / (2 • s0), where the function is band-limited at s0.

Page 72: Image Understanding

Aliasing

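Under-sampling
- Suppose that the sampling interval Δx > 1 / (2 • s0). Then the replicas will overlap and sum together.

[Figure: the sampled function g(x) plotted against x; its spectrum G(s) plotted against s, with ticks at -1/Δx, -s0, s0, and 1/Δx]

Energy above half the sampling frequency, 1 / (2 • Δx), is “folded back” below it, making the high-frequency components appear to be low-frequency. This is known as aliasing.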
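A small numerical check of the folding. The sketch samples a cosine of frequency 9 at an interval of 0.1, which is too coarse because a band limit of 9 needs an interval of at most 1/18. The samples turn out to be identical to those of a cosine of frequency 1. The function names and the chosen frequencies are assumptions for illustration.

```python
import numpy as np

def max_sampling_interval(s0):
    """Largest sampling interval that still allows recovery for band limit s0."""
    return 1.0 / (2.0 * s0)

def apparent_frequency(frequency, interval):
    """Frequency that a sampled sinusoid appears to have after folding."""
    rate = 1.0 / interval
    return abs(frequency - rate * round(frequency / rate))

interval = 0.1                            # sampling rate of 10 samples per unit length
x = np.arange(50) * interval
high = np.cos(2 * np.pi * 9.0 * x)        # under-sampled: 0.1 > 1/18
low = np.cos(2 * np.pi * 1.0 * x)         # adequately sampled: 0.1 <= 1/2

print(max_sampling_interval(9.0))         # about 0.0556
print(apparent_frequency(9.0, interval))  # frequency 9 shows up as frequency 1
print(np.allclose(high, low))             # the two sets of samples cannot be told apart
```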