Color Image Segmentation
Image segmentation
Segmentation
• Segmentation means dividing up the image into a patchwork of regions, each of which is “homogeneous”, that is, the “same” in some sense: intensity, texture, colour, …
• The segmentation operation only subdivides an image; it does not attempt to recognize the segmented image parts!
Complete segmentation divides an image into non-overlapping regions that correspond to real-world objects. Cooperation with higher processing levels, which use specific knowledge of the problem domain, is necessary.
Complete vs. partial segmentation
Partial segmentation: regions do not correspond directly to image objects. The image is divided into separate regions that are homogeneous with respect to a chosen property such as brightness, color, or texture.
Gestalt (holistic) laws of perceptual organization
The emphasis in the Gestalt approach was on the configuration of the elements.
Proximity: Objects that are closer to one another tend to be grouped together.
Closure: Humans tend to enclose a space by completing a contour and ignoring gaps.
Gestalt laws of perceptual organization
Similarity: Elements that look similar will be perceived as part of the same form (color, shape, texture, and motion).
Continuation: Humans tend to continue contours whenever the elements of the pattern establish an implied direction.
Gestalt laws
A series of factors affect whether elements should be grouped together.
• Proximity: tokens that are nearby tend to be grouped.
• Similarity: similar tokens tend to be grouped together.
• Common fate: tokens that have coherent motion tend to be grouped together.
• Common region: tokens that lie inside the same closed region tend to be grouped together.
• Parallelism: parallel curves or tokens tend to be grouped together.
• Closure: tokens or curves that tend to lead to closed curves tend to be grouped together.
• Symmetry: curves that lead to symmetric groups are grouped together.
• Continuity: tokens that lead to “continuous” curves tend to be grouped.
• Familiar configuration: tokens that, when grouped, lead to a familiar object, tend to be grouped together.
Gestalt laws
Consequence: groupings by invisible completions
* Images from Steve Lehar’s Gestalt papers: http://cns-alumni.bu.edu/pub/slehar/Lehar.html
Stressing the invisible groupings:
Image segmentation
Segmentation criteria: a segmentation is a partition of an image $I$ into a set of regions $\{S_i\}$ satisfying:
1. $\bigcup_i S_i = S$ (the partition covers the whole image).
2. $S_i \cap S_j = \emptyset,\ i \neq j$ (no regions intersect).
3. $\forall S_i:\ P(S_i) = \text{true}$ (the homogeneity predicate is satisfied by each region).
4. $P(S_i \cup S_j) = \text{false}$ for adjacent $S_i, S_j$ (the union of adjacent regions does not satisfy it).
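To make criterion 3 concrete, here is a minimal homogeneity predicate P in Python. This is a sketch assuming grayscale regions; the function name and threshold are illustrative, not something the slides prescribe.

```python
import numpy as np

def homogeneity_predicate(region_pixels, max_std=10.0):
    """P(S): true if the region's intensity standard deviation is small.

    region_pixels: 1-D array of grayscale values belonging to one region S_i.
    max_std: illustrative threshold; in practice it is tuned per application.
    """
    return np.std(region_pixels) <= max_std

# A valid region must satisfy P(S_i) = true, while the union of two
# adjacent regions must give P(S_i u S_j) = false.
a = np.array([100, 102, 101, 99])   # homogeneous patch
b = np.array([200, 198, 201, 199])  # another homogeneous patch
print(homogeneity_predicate(a))                       # True
print(homogeneity_predicate(np.concatenate([a, b])))  # False
```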
Image segmentation
So, all we have to do is define and implement the similarity predicate. But what do we want to be similar in each region? Is there any property that will cause the regions to be meaningful objects?
Segmentation methods
• Pixel-based: histogram, clustering
• Region-based: region growing, split and merge
• Edge-based
• Model-based
• Physics-based
• Graph-based
Histogram-based segmentation
How many “orange” pixels are in this image? This type of question can be answered by looking at the histogram.
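For instance, one plausible way to count the “orange” pixels with OpenCV is to threshold the hue channel in HSV space; the file name and hue bounds below are illustrative assumptions, not values from the slides.

```python
import cv2

# Count "orange" pixels by thresholding hue in HSV space.
img = cv2.imread("image.png")                  # hypothetical input file (BGR)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# OpenCV stores hue in [0, 180); roughly 10..25 covers orange tones.
mask = cv2.inRange(hsv, (10, 100, 100), (25, 255, 255))
print("orange pixels:", cv2.countNonZero(mask))
```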
Histogram-based segmentation
How many modes are there? Solve this by reducing the number of colors to K and mapping each pixel to the closest color. Here is what it looks like if we use two colors.
Clustering-based segmentation
How do we choose the representative colors? This is a clustering problem!
The K-means algorithm can be used for clustering.
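A minimal sketch of that idea with scikit-learn's KMeans; the helper name and parameter choices are ours.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_color_segmentation(img, k=2):
    """Map every pixel to the nearest of k representative colors."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    quantized = km.cluster_centers_[km.labels_]   # nearest center per pixel
    return quantized.reshape(h, w, c).astype(img.dtype)

# Two representative colors, as in the two-color example above:
# seg = kmeans_color_segmentation(img, k=2)
```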
Clustering-based segmentation
K-means clustering of color.
K-means clustering using intensity alone and color alone
[Figure panels: image; clusters on intensity; clusters on color]
* From Marc Pollefeys COMP 256 2003
Results of K-Means Clustering:
Clustering-based segmentation
Clustering can also be used with other features (e.g., texture) in addition to color.
[Figure panels: original images; color regions; texture regions]
Clustering-based segmentation
K-means variants:
• Different ways to initialize the means.
• Different stopping criteria.
• Dynamic methods for determining the right number of clusters (K) for a given image.
Problem: histogram-based and clustering-based segmentation can produce messy regions. How can these be fixed?
Clustering-based segmentation
The Expectation-Maximization (EM) algorithm can be used as a probabilistic clustering method where each cluster is modeled using a Gaussian. The clusters are updated iteratively by recomputing the parameters of the Gaussians.
Example from UC Berkeley's Blobworld system.
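Blobworld's actual code is not reproduced here; as a stand-in, scikit-learn's GaussianMixture runs the same kind of EM loop over pixel colors (the function name and component count are illustrative).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_color_segmentation(img, n_components=3):
    """Probabilistic clustering of pixel colors, one Gaussian per cluster.

    EM alternates soft assignment of pixels to Gaussians (E-step) with
    re-estimation of each Gaussian's mean and covariance (M-step).
    """
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=0).fit(pixels)
    labels = gmm.predict(pixels)        # hard label per pixel
    probs = gmm.predict_proba(pixels)   # soft memberships, if needed
    return labels.reshape(h, w), probs
```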
Clustering-based segmentation
Examples from UC Berkeley's Blobworld system.
Region growing
Region-growing techniques start with one pixel of a potential region and try to grow it by adding adjacent pixels until the pixels being compared are too dissimilar. The first pixel selected can be just the first unlabeled pixel in the image, or a set of seed pixels can be chosen from the image. Usually a statistical test is used to decide which pixels can be added to a region: the region is a population with similar statistics, and the test checks whether a neighbor on the border fits into the region population.
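A minimal region-growing sketch; the running region mean plus a fixed tolerance stands in for a proper statistical test, and the names and tolerance are illustrative.

```python
import numpy as np
from collections import deque

def region_grow(gray, seed, tol=10.0):
    """Grow one region from a (row, col) seed on a grayscale image."""
    h, w = gray.shape
    in_region = np.zeros((h, w), dtype=bool)
    in_region[seed] = True
    total, count = float(gray[seed]), 1       # running sum and region size
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not in_region[ny, nx]:
                # Accept the neighbor while it fits the region's statistics.
                if abs(float(gray[ny, nx]) - total / count) <= tol:
                    in_region[ny, nx] = True
                    total += float(gray[ny, nx])
                    count += 1
                    frontier.append((ny, nx))
    return in_region
```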
Region growing
[Figure: an input image and the resulting segmentation]
Split-and-merge
1. Start with the whole image.
2. If the variance is too high, break into quadrants.
3. Merge any adjacent regions that are similar enough.
4. Repeat steps 2 and 3 iteratively until no more splitting or merging occurs.
Idea: good. Results: blocky.
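A compact sketch of the procedure, assuming a grayscale image and illustrative thresholds; a real implementation would maintain a proper region-adjacency graph instead of the single greedy merge pass used here.

```python
import numpy as np

def _split(gray, y, x, h, w, max_var, blocks):
    """Step 2: recursively quarter a block while its variance is too high."""
    if h <= 1 or w <= 1 or np.var(gray[y:y + h, x:x + w]) <= max_var:
        blocks.append((y, x, h, w))
        return
    h2, w2 = h // 2, w // 2
    _split(gray, y,      x,      h2,     w2,     max_var, blocks)
    _split(gray, y,      x + w2, h2,     w - w2, max_var, blocks)
    _split(gray, y + h2, x,      h - h2, w2,     max_var, blocks)
    _split(gray, y + h2, x + w2, h - h2, w - w2, max_var, blocks)

def split_and_merge(gray, max_var=100.0, merge_tol=10.0):
    blocks = []
    _split(gray, 0, 0, gray.shape[0], gray.shape[1], max_var, blocks)
    labels = np.zeros(gray.shape, dtype=int)
    means = []
    for i, (y, x, h, w) in enumerate(blocks):
        labels[y:y + h, x:x + w] = i
        means.append(gray[y:y + h, x:x + w].mean())
    # Step 3 (simplified): one sweep merging adjacent blocks with close means.
    for axis in (0, 1):
        border = np.diff(labels, axis=axis)
        for yy, xx in zip(*np.nonzero(border)):
            a = labels[yy, xx]
            b = labels[yy + 1, xx] if axis == 0 else labels[yy, xx + 1]
            if a != b and abs(means[a] - means[b]) <= merge_tol:
                labels[labels == b] = a
    return labels
```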
Split-and-merge
A satellite image. A large connected region formed by merging pixels labeled as residential after classification. More compact sub-regions after the split-and-merge procedure.
Mean Shift Segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean Shift Algorithm
1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location (centroid of the data) in the search window.
4. Center the search window at the mean location computed in step 3.
5. Repeat steps 3 and 4 until convergence.
The mean shift algorithm seeks the “mode” or point of highest density of a data distribution:
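A minimal flat-kernel version of steps 1 to 5, applied to data points rather than a full image; the window radius and tolerances are illustrative.

```python
import numpy as np

def mean_shift_mode(data, start, radius=1.0, tol=1e-3, max_iter=100):
    """Slide the window to a local mode of `data` (an N x d array)."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Step 3: mean of the points inside the current window.
        in_window = data[np.linalg.norm(data - center, axis=1) <= radius]
        if len(in_window) == 0:
            break
        new_center = in_window.mean(axis=0)
        # Step 5: stop when the window no longer moves.
        if np.linalg.norm(new_center - center) < tol:
            return new_center
        center = new_center               # step 4: re-center the window
    return center

# Two 1-D clusters; windows started near either cluster find its mode.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 0.3, (100, 1)),
                       rng.normal(5, 0.3, (100, 1))])
print(mean_shift_mode(data, [1.0]))   # converges near 0
print(mean_shift_mode(data, [4.0]))   # converges near 5
```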
Mean Shift Segmentation Algorithm
1. Convert the image into tokens (via color, gradients, texture measures, etc.).
2. Choose initial search window locations uniformly in the data.
3. Compute the mean shift window location for each initial position.
4. Merge windows that end up on the same “peak” or mode.
5. The data these merged windows traversed are clustered together.
*Image from: Dorin Comaniciu and Peter Meer, Distribution Free Decomposition of Multivariate Data, Pattern Analysis & Applications (1999) 2:22–30
Mean Shift Segmentation
Mean Shift Segmentation Extension
Gary Bradski's internally published agglomerative clustering extension: mean shift dendrograms.
1. Place a tiny mean shift window over each data point.
2. Grow the window and mean shift it.
3. Track windows that merge, along with the data they traversed.
4. Continue until everything is merged into one cluster.
This is sensitive to scale (the search window size). Solution: use all scales.
[Figure: best 4 clusters; best 2 clusters]
Advantage over agglomerative clustering: Highly parallelizable
Mean Shift Segmentation Results:
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Graph Cut
• First: select a region of interest
Graph Cut
• How to select the object automatically?
Graph Cut
• What are graphs?
• Nodes: usually pixels, sometimes samples.
• Edges: have associated weights W(i,j), e.g., the RGB value difference.
Graph Cut
• What are cuts?
• Each “cut” has a score determined by the weights W(i,j) of the edges it severs; finding the best cut is an optimization problem.
• W(i,j) = |RGB(i) − RGB(j)|
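A small sketch of these weights on a 4-connected pixel grid, reading |RGB(i) − RGB(j)| as the L1 norm of the color difference (one of several reasonable interpretations).

```python
import numpy as np

def edge_weights(img):
    """W(i,j) = |RGB(i) - RGB(j)| for 4-connected neighbors."""
    rgb = img.astype(np.int32)            # avoid uint8 wrap-around
    # Horizontal edges: each pixel vs. its right neighbor.
    w_h = np.abs(rgb[:, 1:] - rgb[:, :-1]).sum(axis=2)
    # Vertical edges: each pixel vs. the pixel below it.
    w_v = np.abs(rgb[1:, :] - rgb[:-1, :]).sum(axis=2)
    return w_h, w_v
```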
Graph Cut
• Go back to our selected region.
Graph Cut
• We want the highest sum of weights.
• These cuts give low points: W(i,j) = |RGB(i) − RGB(j)| is low.
• These cuts give high points: W(i,j) = |RGB(i) − RGB(j)| is high.
Graph-based segmentation
An image is represented by a graph whose nodes are pixels or small groups of pixels. The goal is to partition the nodes into disjoint sets so that the similarity within each set is high and the similarity across different sets is low.
http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
Graph-based segmentation
Let G = (V, E) be a graph. Each edge (u, v) has a weight w(u, v) that represents the similarity between u and v. Graph G can be broken into two disjoint graphs with node sets A and B by removing edges that connect these sets. Let
$$\mathrm{cut}(A,B) = \sum_{u \in A,\, v \in B} w(u,v)$$
One way to segment G is to find the minimal cut.
Graph-based segmentation
Graph-based segmentation
Minimal cut favors cutting off small node groups, so Shi and Malik proposed the normalized cut:
$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$$
where
$$\mathrm{assoc}(A,V) = \sum_{u \in A,\, t \in V} w(u,t)$$
measures how much A is connected to the graph as a whole.
Solve for the minimum penalty.
[Figure: partition A and partition B separated by the cut.]
Graph-based segmentation
[Figure: an example graph with small integer edge weights and a cut separating node sets A and B; for the cut shown, cut(A,B) = 3, assoc(A,V) = 21, and assoc(B,V) = 16, so]
$$\mathrm{Ncut}(A,B) = \frac{3}{21} + \frac{3}{16}$$
Graph Cut
• Optimization solver
Solver example (recursion):
1. Grow.
2. If W(i,j) is low, stop; otherwise continue.
Graph Cut
• Result: isolation
Image Segmentation and Graph Cuts
• Image Segmentation
• Graph Cuts
The Pipeline
1. Assign W(i,j).
2. Solve for the minimum penalty.
3. Cut into 2.
4. Subdivide? If yes, repeat on each piece; if no, stop.
• Input: image
• Output: segments
• Each iteration cuts into 2 pieces
Assign W(i,j)
• W(i,j) = |RGB(i) − RGB(j)| is noisy!
• Could use brightness and locality:
$$w(i,j) = \underbrace{e^{-\|F(i)-F(j)\|^2/\sigma_I^2}}_{\text{brightness term}} \times \underbrace{\begin{cases} e^{-\|X(i)-X(j)\|^2/\sigma_X^2} & \text{if } \|X(i)-X(j)\| < r \\ 0 & \text{otherwise} \end{cases}}_{\text{locality term}}$$
X(i) is the spatial location of node i; F(i) is the feature vector for node i, which can be intensity, color, texture, motion, etc. The formula is set up so that w(i,j) is 0 for nodes that are too far apart.
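A dense-matrix sketch of this affinity, practical only for small images; the σ values and radius r are illustrative, since Shi and Malik tune them per feature type.

```python
import numpy as np

def ncut_affinity(features, coords, sigma_i=0.1, sigma_x=4.0, r=5.0):
    """w(i,j) = brightness term * locality term, zero beyond radius r.

    features: N x d array of feature vectors F(i).
    coords:   N x 2 array of pixel locations X(i).
    """
    fdist2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    xdist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    w = np.exp(-fdist2 / sigma_i ** 2) * np.exp(-xdist2 / sigma_x ** 2)
    w[xdist2 >= r ** 2] = 0.0   # w(i,j) = 0 for nodes that are too far apart
    return w
```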
Graph-based segmentation
Shi and Malik turned graph cuts into an eigenvector/eigenvalue problem. Set up a weighted graph G = (V, E):
• V is the set of N pixels.
• E is a set of weighted edges (weight w_ij gives the similarity between nodes i and j).
• Length-N vector d: d_i is the sum of the weights from node i to all other nodes.
• N × N matrix D: a diagonal matrix with d on its diagonal.
• N × N symmetric matrix W: W_ij = w_ij.
Graph-based segmentation
Let x be a characteristic vector of a set A of nodes:
• x_i = 1 if node i is in A
• x_i = −1 otherwise
Let y be a continuous approximation to x. Solve the generalized eigensystem
$$(D - W)\,y = \lambda D y$$
for the eigenvectors y and eigenvalues λ. Use the eigenvector y with the second-smallest eigenvalue to bipartition the graph (threshold y to recover x, and hence A). If further subdivision is merited, repeat recursively.
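A sketch of this recipe with SciPy's sparse eigensolver, assuming every node has positive degree (so D is positive definite) and thresholding y at zero, one common choice the slides leave open.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh

def ncut_bipartition(W):
    """Bipartition a graph via the Shi-Malik spectral relaxation."""
    W = csr_matrix(W)
    d = np.asarray(W.sum(axis=1)).ravel()
    D = diags(d)
    L = D - W                                  # graph Laplacian D - W
    # Two smallest eigenpairs of (D - W) y = lambda D y; the smallest
    # eigenvalue is 0 with an uninformative constant eigenvector.
    vals, vecs = eigsh(L, k=2, M=D, which="SM")
    y = vecs[:, np.argsort(vals)[-1]]          # second-smallest eigenvalue
    return y >= 0                              # membership indicator for A
```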
Extensions: Edge Weights
• How do we calculate the edge weights? They can be based on:
• Intensity
• Color (HSV)
• Texture
Continued Work: Semantic Segmentation
• Incorporating top-down information into low-level segmentation
Interactive Graph Cuts: Yuri Boykov et al.
Graph-based segmentation
GrabCut
User Input
Result
Magic Wand (198?)
Intelligent Scissors: Mortensen and Barrett (1995)
GrabCut: Rother et al. (2004)
[Figure panels: regions; boundary; regions & boundary]
Slides: C. Rother et al., Microsoft Research, Cambridge
Data Term
Gaussian Mixture Model (typically 5-8 components)
[Figure: foreground & background pixel distributions in R-G color space, each modeled by a GMM.]
Slides: C. Rother et al., Microsoft Research, Cambridge
Smoothness term
An object is a coherent set of pixels
Iterate until convergence:
1. Compute a configuration given the mixture model (E-step).
2. Compute the model parameters given the configuration (M-step).
Slides: C. Rother et al., Microsoft Research, Cambridge
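OpenCV ships an implementation of this pipeline as cv2.grabCut; a minimal rectangle-initialized call might look as follows (the file name and rectangle are placeholders).

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg")                    # hypothetical input image
mask = np.zeros(img.shape[:2], dtype=np.uint8)
bgd_model = np.zeros((1, 65), dtype=np.float64)  # background GMM state
fgd_model = np.zeros((1, 65), dtype=np.float64)  # foreground GMM state
rect = (50, 50, 300, 400)                        # user rectangle: x, y, w, h

# Run 5 iterations of the E/M loop, initialized from the rectangle.
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Keep pixels labeled foreground or probably-foreground.
fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
result = img * fg[:, :, None]
```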
Moderately simple examples
… GrabCut completes automatically
Slides: C. Rother et al., Microsoft Research, Cambridge
Difficult Examples
[Figure panels: camouflage & low contrast; fine structure; no telepathy]
Initial Rectangle
Result
Slides: C. Rother et al., Microsoft Research, Cambridge
Markov Random Fields (MRF)
A graphical model for describing spatial consistency in images. Suppose you want to label image pixels with some labels {l1,…,lk}, e.g., segmentation, stereo disparity, foreground-background, etc.
Ref:
1. S. Z. Li. Markov Random Field Modeling in Image Analysis. Springer-Verlag, 1991.
2. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. PAMI, 6(6):721–741, 1984.
From Slides by S. Seitz - University of Washington
Definition
MRF components:
• A set of sites P = {1,…,m}: each pixel is a site.
• A neighborhood for each pixel: N = {N_p | p ∈ P}.
• A set of random variables (a random field), one for each site: F = {F_p | p ∈ P}, denoting the label at each pixel.
• Each random variable takes a value f_p from the set of labels L = {l1,…,lk}.
• We have a joint event {F1 = f1,…, F_m = f_m}, or a configuration, abbreviated F = f.
• The joint probability of such a configuration: Pr(F = f), or Pr(f).
From Slides by S. Seitz - University of Washington
Definition
MRF components:
• Pr(f_i) > 0 for all variables f_i.
• Markov property: each random variable depends on the other RVs only through its neighbors: Pr(f_p | f_{S−{p}}) = Pr(f_p | f_{N_p}), ∀p.
• So we need to define a neighborhood system N_p (the neighbors of site p). There are no strict rules for neighborhood definition.
[Figure: example neighborhoods and the cliques for this neighborhood.]
From Slides by S. Seitz - University of Washington
Definition
MRF components:
• The joint probability of a configuration: Pr(F = f), or Pr(f).
• Markov property: each random variable depends on the other RVs only through its neighbors: Pr(f_p | f_{S−{p}}) = Pr(f_p | f_{N_p}), ∀p.
• So we need to define a neighborhood system N_p (the neighbors of site p).
Hammersley-Clifford theorem:
$$\Pr(f) \propto \exp\Big(-\sum_{C} V_C(f)\Big)$$
The sum ranges over all cliques in the neighborhood system; V_C is the clique potential. We may decide:
1. NOT to include all cliques in a neighborhood; or
2. to use different V_C for different cliques in the same neighborhood.
From Slides by S. Seitz - University of Washington
Optimal Configuration
MRF components:
Hammersley-Clifford theorem:
$$\Pr(f) \propto \exp\Big(-\sum_{C} V_C(f)\Big)$$
Consider MRFs with arbitrary cliques among neighboring pixels:
$$\Pr(f) \propto \exp\Big(-\sum_{c \in C} V_c(f_{p_1}, f_{p_2}, \ldots, f_{p_c})\Big)$$
The sum ranges over all cliques in the neighborhood system; the clique potential V_C is the prior probability that the elements of clique C take certain values.
Typical potential, the Potts model:
$$V_{\{p,q\}}(f_p, f_q) = u_{\{p,q\}}\big(1 - \delta(f_p - f_q)\big)$$
From Slides by S. Seitz - University of Washington
Optimal Configuration
MRF components:
Hammersley-Clifford theorem:
$$\Pr(f) \propto \exp\Big(-\sum_{C} V_C(f)\Big)$$
Consider MRFs with clique potentials of pairs of neighboring pixels:
$$\Pr(f) \propto \exp\Big(-\Big[\sum_p V_p(f_p) + \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q)\Big]\Big)$$
This is the most commonly used form; it is very popular in vision.
Energy function:
$$E(f) = \sum_p V_p(f_p) + \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q)$$
There are two constraints to satisfy:
1. Data constraint: the labeling should reflect the observation.
2. Smoothness constraint: the labeling should reflect spatial consistency (pixels close to each other are most likely to have similar labels).
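A small sketch that evaluates this energy with a Potts smoothness term on a 4-connected grid; the array layout and the weight u are our choices.

```python
import numpy as np

def potts_energy(labels, data_cost, u=1.0):
    """E(f) = sum_p V_p(f_p) + u * (number of unequal neighbor pairs).

    labels:    H x W integer label image f.
    data_cost: H x W x K array with data_cost[p, l] = V_p(l).
    """
    h, w = labels.shape
    data = data_cost[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    smooth = u * ((labels[:, 1:] != labels[:, :-1]).sum()
                  + (labels[1:, :] != labels[:-1, :]).sum())
    return data + smooth
```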
Probabilistic interpretation
The problem is that we do not observe the labels; instead we observe something else that depends on these labels with some noise (e.g., intensity or disparity).
• At each site we have an observation i_p.
• The observed value at each site depends on its label: the probability of a certain observed value given a certain label at site p is g(i_p, f_p) = Pr(i_p | F_p = f_p).
• The overall observation probability given the labels:
$$\Pr(O \mid f) = \prod_p g(i_p, f_p)$$
• We need to infer the labels given the observation: Pr(f | O) ∝ Pr(O | f) Pr(f).
Using MRFs
How do we model different problems? Given observations y and the parameters of the MRF, how do we infer the hidden variables x? How do we learn the parameters of the MRF?
Modeling image pixel labels as MRF
MRF-based segmentation
[Figure: a one-layer MRF over the label image and the real image; label nodes x_i connect to neighboring labels x_j through ψ(x_i, x_j) and to observed pixels y_i through ψ(x_i, y_i).]
Slides by R. Huang – Rutgers University
MRF-based segmentation
Classifying image pixels into different regions under the constraint of both local observations and spatial relationships
Probabilistic interpretation:
$$(x^*, \theta^*) = \arg\max_{x,\theta} P(x, \theta \mid y)$$
where x are the region labels, θ are the model parameters, and y are the image pixels.
Slides by R. Huang – Rutgers University
Model joint probability:
$$P(x, y) = \frac{1}{Z} \prod_{(i,j)} \psi(x_i, x_j) \prod_i \psi(x_i, y_i)$$
where:
• x is the label image and y is the observed image;
• ψ(x_i, x_j) is the label-label compatibility function between neighboring label nodes, enforcing the smoothness constraint;
• ψ(x_i, y_i) is the image-label compatibility function between a label node and its local observation, enforcing the data constraint.
$$(x^*, \theta^*) = \arg\max_{x,\theta} P(x, \theta \mid y)$$
(x: region labels, y: image pixels, θ: model parameters)
How did we factorize?
Slides by R. Huang – Rutgers University
Probabilistic interpretation
We need to infer the labels given the observation: Pr(f | O) ∝ Pr(O | f) Pr(f). The MAP estimate of f should minimize the posterior energy
$$E(f) = \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q) - \sum_p \ln\big(g(i_p, f_p)\big)$$
Data (observation) term: data constraint.
Neighborhood term: smoothness constraint.
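The slides do not fix a minimizer for this energy; iterated conditional modes (ICM) is one classic greedy option, sketched here with a Potts smoothness term (the weight u and sweep count are illustrative).

```python
import numpy as np

def icm(obs_neglog, labels, u=1.0, sweeps=5):
    """Greedily relabel each pixel to minimize its local posterior energy:
    -ln g(i_p, f_p) + u * (number of disagreeing neighbors).

    obs_neglog: H x W x K array of data costs -ln g(i_p, l).
    labels:     H x W initial labeling (e.g., per-pixel ML labels).
    """
    h, w, k = obs_neglog.shape
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                costs = obs_neglog[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        costs += u * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(costs))
    return labels
```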
MRF-based segmentation
EM algorithm
E-step (inference):
$$x^* = \arg\max_x P(x \mid y, \theta), \qquad P(x \mid y, \theta) = \frac{1}{Z} P(y \mid x, \theta)\, P(x \mid \theta)$$
M-step (learning):
$$\theta^* = \arg\max_\theta E\big[P(x, y \mid \theta)\big] = \arg\max_\theta \sum_x P(x, y \mid \theta)\, P(x \mid y, \theta)$$
Pseudo-likelihood method.
Slides by R. Huang – Rutgers University
Applying and learning MRF: Example
$$x^* = \arg\max_x P(x \mid y) = \arg\max_x \frac{1}{Z} P(y \mid x)\, P(x) = \arg\max_x \frac{1}{Z} \prod_i \psi(x_i, y_i) \prod_{(i,j)} \psi(x_i, x_j)$$
with
$$\psi(x_i, y_i) = G(y_i;\, \mu_{x_i}, \sigma_{x_i}), \qquad \psi(x_i, x_j) = \exp\big(-(x_i - x_j)^2 / \sigma^2\big)$$
where G is a Gaussian density and the parameters are θ = [μ, σ, …].
Slides by R. Huang – Rutgers University