Color Image Segmentation
Image segmentation
Segmentation
• Segmentation means dividing up the image into a patchwork of regions, each of which is “homogeneous”, that is, the “same” in some sense: intensity, texture, colour, …
• The segmentation operation only subdivides an image; it does not attempt to recognize the segmented image parts!
Complete segmentation divides an image into non-overlapping regions that correspond to real-world objects. Cooperation with higher processing levels, which use specific knowledge of the problem domain, is necessary.
Complete vs. partial segmentation
Partial segmentation: regions do not correspond directly to image objects. The image is divided into separate regions that are homogeneous with respect to a chosen property such as brightness, color, or texture.
Gestalt (holistic) laws of perceptual organization
The emphasis in the Gestalt approach was on the configuration of the elements.
Proximity: Objects that are closer to one another tend to be grouped together.
Closure: Humans tend to enclose a space by completing a contour and ignoring gaps.
Gestalt laws of perceptual organization
Similarity: Elements that look similar will be perceived as part of the same form (color, shape, texture, and motion).
Continuation: Humans tend to continue contours whenever the elements of the pattern establish an implied direction.
Gestalt laws
A series of factors affect whether elements should be grouped together.
• Proximity: tokens that are nearby tend to be grouped.
• Similarity: similar tokens tend to be grouped together.
• Common fate: tokens that have coherent motion tend to be grouped together.
• Common region: tokens that lie inside the same closed region tend to be grouped together.
• Parallelism: parallel curves or tokens tend to be grouped together.
• Closure: tokens or curves that tend to lead to closed curves tend to be grouped together.
• Symmetry: curves that lead to symmetric groups are grouped together.
• Continuity: tokens that lead to “continuous” curves tend to be grouped.
• Familiar configuration: tokens that, when grouped, lead to a familiar object, tend to be grouped together.
Gestalt laws
Consequence: groupings by invisible completions
* Images from Steve Lehar’s Gestalt papers: http://cns-alumni.bu.edu/pub/slehar/Lehar.html
Stressing the invisible groupings:
Image segmentation
Segmentation criteria: a segmentation is a partition of an image $I$ into a set of regions $\{S_i\}$ satisfying:
1. $\bigcup_i S_i = S$ (the partition covers the whole image).
2. $S_i \cap S_j = \emptyset,\ i \neq j$ (no regions intersect).
3. $\forall S_i:\ P(S_i) = \text{true}$ (the homogeneity predicate is satisfied by each region).
4. $P(S_i \cup S_j) = \text{false}$ for adjacent $S_i, S_j$ (the union of adjacent regions does not satisfy it).
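To make criterion 3 concrete, here is a minimal homogeneity predicate P in Python. This is a sketch assuming grayscale regions; the function name and threshold are illustrative, not something the slides prescribe.

```python
import numpy as np

def homogeneity_predicate(region_pixels, max_std=10.0):
    """P(S): true if the region's intensity standard deviation is small.

    region_pixels: 1-D array of grayscale values belonging to one region S_i.
    max_std: illustrative threshold; in practice it is tuned per application.
    """
    return np.std(region_pixels) <= max_std

# A valid region must satisfy P(S_i) = true, while the union of two
# adjacent regions must give P(S_i u S_j) = false.
a = np.array([100, 102, 101, 99])   # homogeneous patch
b = np.array([200, 198, 201, 199])  # another homogeneous patch
print(homogeneity_predicate(a))                       # True
print(homogeneity_predicate(np.concatenate([a, b])))  # False
```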
Image segmentation
So, all we have to do is define and implement the similarity predicate. But what do we want to be similar in each region? Is there any property that will cause the regions to be meaningful objects?
Segmentation methods
• Pixel-based: histogram, clustering
• Region-based: region growing, split and merge
• Edge-based
• Model-based
• Physics-based
• Graph-based
Histogram-based segmentation
How many “orange” pixels are in this image? This type of question can be answered by looking at the histogram.
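For instance, one plausible way to count the “orange” pixels with OpenCV is to threshold the hue channel in HSV space; the file name and hue bounds below are illustrative assumptions, not values from the slides.

```python
import cv2

# Count "orange" pixels by thresholding hue in HSV space.
img = cv2.imread("image.png")                  # hypothetical input file (BGR)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# OpenCV stores hue in [0, 180); roughly 10..25 covers orange tones.
mask = cv2.inRange(hsv, (10, 100, 100), (25, 255, 255))
print("orange pixels:", cv2.countNonZero(mask))
```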
Histogram-based segmentation
How many modes are there? Solve this by reducing the number of colors to K and mapping each pixel to the closest color. Here is what it looks like if we use two colors.
Clustering-based segmentation
How do we choose the representative colors? This is a clustering problem!
The K-means algorithm can be used for clustering.
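A minimal sketch of that idea with scikit-learn's KMeans; the helper name and parameter choices are ours.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_color_segmentation(img, k=2):
    """Map every pixel to the nearest of k representative colors."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    quantized = km.cluster_centers_[km.labels_]   # nearest center per pixel
    return quantized.reshape(h, w, c).astype(img.dtype)

# Two representative colors, as in the two-color example above:
# seg = kmeans_color_segmentation(img, k=2)
```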
Clustering-based segmentation
K-means clustering of color.
K-means clustering using intensity alone and color alone
[Figure panels: image; clusters on intensity; clusters on color]
* From Marc Pollefeys COMP 256 2003
Results of K-Means Clustering:
Clustering-based segmentation
Clustering can also be used with other features (e.g., texture) in addition to color.
[Figure panels: original images; color regions; texture regions]
Clustering-based segmentation
K-means variants:
• Different ways to initialize the means.
• Different stopping criteria.
• Dynamic methods for determining the right number of clusters (K) for a given image.
Problem: histogram-based and clustering-based segmentation can produce messy regions. How can these be fixed?
Clustering-based segmentation
The Expectation-Maximization (EM) algorithm can be used as a probabilistic clustering method where each cluster is modeled using a Gaussian. The clusters are updated iteratively by recomputing the parameters of the Gaussians.
Example from UC Berkeley's Blobworld system.
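Blobworld's actual code is not reproduced here; as a stand-in, scikit-learn's GaussianMixture runs the same kind of EM loop over pixel colors (the function name and component count are illustrative).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def em_color_segmentation(img, n_components=3):
    """Probabilistic clustering of pixel colors, one Gaussian per cluster.

    EM alternates soft assignment of pixels to Gaussians (E-step) with
    re-estimation of each Gaussian's mean and covariance (M-step).
    """
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full", random_state=0).fit(pixels)
    labels = gmm.predict(pixels)        # hard label per pixel
    probs = gmm.predict_proba(pixels)   # soft memberships, if needed
    return labels.reshape(h, w), probs
```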
Clustering-based segmentation
Examples from UC Berkeley's Blobworld system.
Region growing
Region-growing techniques start with one pixel of a potential region and try to grow it by adding adjacent pixels until the pixels being compared are too dissimilar. The first pixel selected can be just the first unlabeled pixel in the image, or a set of seed pixels can be chosen from the image. Usually a statistical test is used to decide which pixels can be added to a region: the region is a population with similar statistics, and the test checks whether a neighbor on the border fits into the region population.
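A minimal region-growing sketch; the running region mean plus a fixed tolerance stands in for a proper statistical test, and the names and tolerance are illustrative.

```python
import numpy as np
from collections import deque

def region_grow(gray, seed, tol=10.0):
    """Grow one region from a (row, col) seed on a grayscale image."""
    h, w = gray.shape
    in_region = np.zeros((h, w), dtype=bool)
    in_region[seed] = True
    total, count = float(gray[seed]), 1       # running sum and region size
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not in_region[ny, nx]:
                # Accept the neighbor while it fits the region's statistics.
                if abs(float(gray[ny, nx]) - total / count) <= tol:
                    in_region[ny, nx] = True
                    total += float(gray[ny, nx])
                    count += 1
                    frontier.append((ny, nx))
    return in_region
```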
Region growing
[Figure: an input image and the resulting segmentation]
Split-and-merge
1. Start with the whole image.
2. If the variance is too high, break into quadrants.
3. Merge any adjacent regions that are similar enough.
4. Repeat steps 2 and 3 iteratively until no more splitting or merging occurs.
Idea: good. Results: blocky.
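A compact sketch of the procedure, assuming a grayscale image and illustrative thresholds; a real implementation would maintain a proper region-adjacency graph instead of the single greedy merge pass used here.

```python
import numpy as np

def _split(gray, y, x, h, w, max_var, blocks):
    """Step 2: recursively quarter a block while its variance is too high."""
    if h <= 1 or w <= 1 or np.var(gray[y:y + h, x:x + w]) <= max_var:
        blocks.append((y, x, h, w))
        return
    h2, w2 = h // 2, w // 2
    _split(gray, y,      x,      h2,     w2,     max_var, blocks)
    _split(gray, y,      x + w2, h2,     w - w2, max_var, blocks)
    _split(gray, y + h2, x,      h - h2, w2,     max_var, blocks)
    _split(gray, y + h2, x + w2, h - h2, w - w2, max_var, blocks)

def split_and_merge(gray, max_var=100.0, merge_tol=10.0):
    blocks = []
    _split(gray, 0, 0, gray.shape[0], gray.shape[1], max_var, blocks)
    labels = np.zeros(gray.shape, dtype=int)
    means = []
    for i, (y, x, h, w) in enumerate(blocks):
        labels[y:y + h, x:x + w] = i
        means.append(gray[y:y + h, x:x + w].mean())
    # Step 3 (simplified): one sweep merging adjacent blocks with close means.
    for axis in (0, 1):
        border = np.diff(labels, axis=axis)
        for yy, xx in zip(*np.nonzero(border)):
            a = labels[yy, xx]
            b = labels[yy + 1, xx] if axis == 0 else labels[yy, xx + 1]
            if a != b and abs(means[a] - means[b]) <= merge_tol:
                labels[labels == b] = a
    return labels
```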
Split-and-merge
A satellite image. A large connected region formed by merging pixels labeled as residential after classification. More compact sub-regions after the split-and-merge procedure.
Mean Shift Segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean Shift Algorithm
1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location (centroid of the data) in the search window.
4. Center the search window at the mean location computed in step 3.
5. Repeat steps 3 and 4 until convergence.
The mean shift algorithm seeks the “mode” or point of highest density of a data distribution:
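A minimal flat-kernel version of steps 1 to 5, applied to data points rather than a full image; the window radius and tolerances are illustrative.

```python
import numpy as np

def mean_shift_mode(data, start, radius=1.0, tol=1e-3, max_iter=100):
    """Slide the window to a local mode of `data` (an N x d array)."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Step 3: mean of the points inside the current window.
        in_window = data[np.linalg.norm(data - center, axis=1) <= radius]
        if len(in_window) == 0:
            break
        new_center = in_window.mean(axis=0)
        # Step 5: stop when the window no longer moves.
        if np.linalg.norm(new_center - center) < tol:
            return new_center
        center = new_center               # step 4: re-center the window
    return center

# Two 1-D clusters; windows started near either cluster find its mode.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 0.3, (100, 1)),
                       rng.normal(5, 0.3, (100, 1))])
print(mean_shift_mode(data, [1.0]))   # converges near 0
print(mean_shift_mode(data, [4.0]))   # converges near 5
```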
Mean Shift Segmentation Algorithm
1. Convert the image into tokens (via color, gradients, texture measures, etc.).
2. Choose initial search window locations uniformly in the data.
3. Compute the mean shift window location for each initial position.
4. Merge windows that end up on the same “peak” or mode.
5. The data these merged windows traversed are clustered together.
*Image from: Dorin Comaniciu and Peter Meer, Distribution Free Decomposition of Multivariate Data, Pattern Analysis & Applications (1999) 2:22–30
Mean Shift Segmentation
Mean Shift Segmentation Extension
Gary Bradski's internally published agglomerative clustering extension: mean shift dendrograms.
1. Place a tiny mean shift window over each data point.
2. Grow the window and mean shift it.
3. Track windows that merge, along with the data they traversed.
4. Continue until everything is merged into one cluster.
This is sensitive to scale (the search window size). Solution: use all scales.
[Figure: best 4 clusters; best 2 clusters]
Advantage over agglomerative clustering: Highly parallelizable
Mean Shift Segmentation Results:
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Graph Cut
• First: select a region of interest
Graph Cut
• How to select the object automatically?
Graph Cut
• What are graphs?
• Nodes: usually pixels, sometimes samples.
• Edges: have associated weights W(i,j), e.g., the RGB value difference.
Graph Cut
• What are cuts?
• Each “cut” has a score determined by the weights W(i,j) of the edges it severs; finding the best cut is an optimization problem.
• W(i,j) = |RGB(i) − RGB(j)|
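A small sketch of these weights on a 4-connected pixel grid, reading |RGB(i) − RGB(j)| as the L1 norm of the color difference (one of several reasonable interpretations).

```python
import numpy as np

def edge_weights(img):
    """W(i,j) = |RGB(i) - RGB(j)| for 4-connected neighbors."""
    rgb = img.astype(np.int32)            # avoid uint8 wrap-around
    # Horizontal edges: each pixel vs. its right neighbor.
    w_h = np.abs(rgb[:, 1:] - rgb[:, :-1]).sum(axis=2)
    # Vertical edges: each pixel vs. the pixel below it.
    w_v = np.abs(rgb[1:, :] - rgb[:-1, :]).sum(axis=2)
    return w_h, w_v
```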
Graph Cut
• Go back to our selected region.
Graph Cut
• We want the highest sum of weights.
• These cuts give low points: W(i,j) = |RGB(i) − RGB(j)| is low.
• These cuts give high points: W(i,j) = |RGB(i) − RGB(j)| is high.
Graph-based segmentation
An image is represented by a graph whose nodes are pixels or small groups of pixels. The goal is to partition the nodes into disjoint sets so that the similarity within each set is high and the similarity across different sets is low.
http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf
Graph-based segmentation
Let G = (V, E) be a graph. Each edge (u, v) has a weight w(u, v) that represents the similarity between u and v. Graph G can be broken into two disjoint graphs with node sets A and B by removing edges that connect these sets. Let
$$\mathrm{cut}(A,B) = \sum_{u \in A,\, v \in B} w(u,v)$$
One way to segment G is to find the minimal cut.
Graph-based segmentation
Graph-based segmentation
Minimal cut favors cutting off small node groups, so Shi and Malik proposed the normalized cut:
$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$$
where
$$\mathrm{assoc}(A,V) = \sum_{u \in A,\, t \in V} w(u,t)$$
measures how much A is connected to the graph as a whole.
Solve for the minimum penalty.
[Figure: partition A and partition B separated by the cut.]
Graph-based segmentation
[Figure: an example graph with small integer edge weights and a cut separating node sets A and B; for the cut shown, cut(A,B) = 3, assoc(A,V) = 21, and assoc(B,V) = 16, so]
$$\mathrm{Ncut}(A,B) = \frac{3}{21} + \frac{3}{16}$$
Graph Cut
• Optimization solver
Solver example (recursion):
1. Grow.
2. If W(i,j) is low, stop; otherwise continue.
Graph Cut
• Result: isolation
Image Segmentation and Graph Cuts
• Image Segmentation
• Graph Cuts
The Pipeline
1. Assign W(i,j).
2. Solve for the minimum penalty.
3. Cut into 2.
4. Subdivide? If yes, repeat on each piece; if no, stop.
• Input: image
• Output: segments
• Each iteration cuts into 2 pieces
Assign W(i,j)
• W(i,j) = |RGB(i) − RGB(j)| is noisy!
• Could use brightness and locality:
$$w(i,j) = \underbrace{e^{-\|F(i)-F(j)\|^2/\sigma_I^2}}_{\text{brightness term}} \times \underbrace{\begin{cases} e^{-\|X(i)-X(j)\|^2/\sigma_X^2} & \text{if } \|X(i)-X(j)\| < r \\ 0 & \text{otherwise} \end{cases}}_{\text{locality term}}$$
X(i) is the spatial location of node i; F(i) is the feature vector for node i, which can be intensity, color, texture, motion, etc. The formula is set up so that w(i,j) is 0 for nodes that are too far apart.
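A dense-matrix sketch of this affinity, practical only for small images; the σ values and radius r are illustrative, since Shi and Malik tune them per feature type.

```python
import numpy as np

def ncut_affinity(features, coords, sigma_i=0.1, sigma_x=4.0, r=5.0):
    """w(i,j) = brightness term * locality term, zero beyond radius r.

    features: N x d array of feature vectors F(i).
    coords:   N x 2 array of pixel locations X(i).
    """
    fdist2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    xdist2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    w = np.exp(-fdist2 / sigma_i ** 2) * np.exp(-xdist2 / sigma_x ** 2)
    w[xdist2 >= r ** 2] = 0.0   # w(i,j) = 0 for nodes that are too far apart
    return w
```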
Graph-based segmentation
Shi and Malik turned graph cuts into an eigenvector/eigenvalue problem. Set up a weighted graph G = (V, E):
• V is the set of N pixels.
• E is a set of weighted edges (weight w_ij gives the similarity between nodes i and j).
• Length-N vector d: d_i is the sum of the weights from node i to all other nodes.
• N × N matrix D: a diagonal matrix with d on its diagonal.
• N × N symmetric matrix W: W_ij = w_ij.
Graph-based segmentation
Let x be a characteristic vector of a set A of nodes:
• x_i = 1 if node i is in A
• x_i = −1 otherwise
Let y be a continuous approximation to x. Solve the generalized eigensystem
$$(D - W)\,y = \lambda D y$$
for the eigenvectors y and eigenvalues λ. Use the eigenvector y with the second-smallest eigenvalue to bipartition the graph (threshold y to recover x, and hence A). If further subdivision is merited, repeat recursively.
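A sketch of this recipe with SciPy's sparse eigensolver, assuming every node has positive degree (so D is positive definite) and thresholding y at zero, one common choice the slides leave open.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh

def ncut_bipartition(W):
    """Bipartition a graph via the Shi-Malik spectral relaxation."""
    W = csr_matrix(W)
    d = np.asarray(W.sum(axis=1)).ravel()
    D = diags(d)
    L = D - W                                  # graph Laplacian D - W
    # Two smallest eigenpairs of (D - W) y = lambda D y; the smallest
    # eigenvalue is 0 with an uninformative constant eigenvector.
    vals, vecs = eigsh(L, k=2, M=D, which="SM")
    y = vecs[:, np.argsort(vals)[-1]]          # second-smallest eigenvalue
    return y >= 0                              # membership indicator for A
```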
Extensions: Edge Weights
• How do we calculate the edge weights? They can be based on:
• Intensity
• Color (HSV)
• Texture
Continued Work: Semantic Segmentation
• Incorporating top-down information into low-level segmentation
Interactive Graph Cuts: Yuri Boykov et al.
Graph-based segmentation
GrabCut
User Input
Result
Magic Wand (198?)
Intelligent Scissors: Mortensen and Barrett (1995)
GrabCut: Rother et al. (2004)
[Figure panels: regions; boundary; regions & boundary]
Slides: C. Rother et al., Microsoft Research, Cambridge
Data Term
Gaussian Mixture Model (typically 5-8 components)
[Figure: foreground & background pixel distributions in R-G color space, each modeled by a GMM.]
Slides: C. Rother et al., Microsoft Research, Cambridge
Smoothness term
An object is a coherent set of pixels
Iterate until convergence:
1. Compute a configuration given the mixture model (E-step).
2. Compute the model parameters given the configuration (M-step).
Slides: C. Rother et al., Microsoft Research, Cambridge
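OpenCV ships an implementation of this pipeline as cv2.grabCut; a minimal rectangle-initialized call might look as follows (the file name and rectangle are placeholders).

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg")                    # hypothetical input image
mask = np.zeros(img.shape[:2], dtype=np.uint8)
bgd_model = np.zeros((1, 65), dtype=np.float64)  # background GMM state
fgd_model = np.zeros((1, 65), dtype=np.float64)  # foreground GMM state
rect = (50, 50, 300, 400)                        # user rectangle: x, y, w, h

# Run 5 iterations of the E/M loop, initialized from the rectangle.
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Keep pixels labeled foreground or probably-foreground.
fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
result = img * fg[:, :, None]
```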
Moderately simple examples
… GrabCut completes automatically
Slides: C. Rother et al., Microsoft Research, Cambridge
Difficult Examples
[Figure panels: camouflage & low contrast; fine structure; no telepathy]
Initial Rectangle
Result
Slides: C. Rother et al., Microsoft Research, Cambridge
Markov Random Fields (MRF)
A graphical model for describing spatial consistency in images. Suppose you want to label image pixels with some labels {l1,…,lk}, e.g., segmentation, stereo disparity, foreground-background, etc.
Ref:
1. S. Z. Li. Markov Random Field Modeling in Image Analysis. Springer-Verlag, 1991.
2. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. PAMI, 6(6):721–741, 1984.
From Slides by S. Seitz - University of Washington
Definition
MRF components:
• A set of sites P = {1,…,m}: each pixel is a site.
• A neighborhood for each pixel: N = {N_p | p ∈ P}.
• A set of random variables (a random field), one for each site: F = {F_p | p ∈ P}, denoting the label at each pixel.
• Each random variable takes a value f_p from the set of labels L = {l1,…,lk}.
• We have a joint event {F1 = f1,…, F_m = f_m}, or a configuration, abbreviated F = f.
• The joint probability of such a configuration: Pr(F = f), or Pr(f).
From Slides by S. Seitz - University of Washington
Definition
MRF components:
• Pr(f_i) > 0 for all variables f_i.
• Markov property: each random variable depends on the other RVs only through its neighbors: Pr(f_p | f_{S−{p}}) = Pr(f_p | f_{N_p}), ∀p.
• So we need to define a neighborhood system N_p (the neighbors of site p). There are no strict rules for neighborhood definition.
[Figure: example neighborhoods and the cliques for this neighborhood.]
From Slides by S. Seitz - University of Washington
Definition
MRF components:
• The joint probability of a configuration: Pr(F = f), or Pr(f).
• Markov property: each random variable depends on the other RVs only through its neighbors: Pr(f_p | f_{S−{p}}) = Pr(f_p | f_{N_p}), ∀p.
• So we need to define a neighborhood system N_p (the neighbors of site p).
Hammersley-Clifford theorem:
$$\Pr(f) \propto \exp\Big(-\sum_{C} V_C(f)\Big)$$
The sum ranges over all cliques in the neighborhood system; V_C is the clique potential. We may decide:
1. NOT to include all cliques in a neighborhood; or
2. to use different V_C for different cliques in the same neighborhood.
From Slides by S. Seitz - University of Washington
Optimal Configuration
MRF components:
Hammersley-Clifford theorem:
$$\Pr(f) \propto \exp\Big(-\sum_{C} V_C(f)\Big)$$
Consider MRFs with arbitrary cliques among neighboring pixels:
$$\Pr(f) \propto \exp\Big(-\sum_{c \in C} V_c(f_{p_1}, f_{p_2}, \ldots, f_{p_c})\Big)$$
The sum ranges over all cliques in the neighborhood system; the clique potential V_C is the prior probability that the elements of clique C take certain values.
Typical potential, the Potts model:
$$V_{\{p,q\}}(f_p, f_q) = u_{\{p,q\}}\big(1 - \delta(f_p - f_q)\big)$$
From Slides by S. Seitz - University of Washington
Optimal Configuration
MRF components:
Hammersley-Clifford theorem:
$$\Pr(f) \propto \exp\Big(-\sum_{C} V_C(f)\Big)$$
Consider MRFs with clique potentials of pairs of neighboring pixels:
$$\Pr(f) \propto \exp\Big(-\Big[\sum_p V_p(f_p) + \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q)\Big]\Big)$$
This is the most commonly used form; it is very popular in vision.
Energy function:
$$E(f) = \sum_p V_p(f_p) + \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q)$$
There are two constraints to satisfy:
1. Data constraint: the labeling should reflect the observation.
2. Smoothness constraint: the labeling should reflect spatial consistency (pixels close to each other are most likely to have similar labels).
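A small sketch that evaluates this energy with a Potts smoothness term on a 4-connected grid; the array layout and the weight u are our choices.

```python
import numpy as np

def potts_energy(labels, data_cost, u=1.0):
    """E(f) = sum_p V_p(f_p) + u * (number of unequal neighbor pairs).

    labels:    H x W integer label image f.
    data_cost: H x W x K array with data_cost[p, l] = V_p(l).
    """
    h, w = labels.shape
    data = data_cost[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    smooth = u * ((labels[:, 1:] != labels[:, :-1]).sum()
                  + (labels[1:, :] != labels[:-1, :]).sum())
    return data + smooth
```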
Probabilistic interpretation
The problem is that we do not observe the labels; instead we observe something else that depends on these labels with some noise (e.g., intensity or disparity).
• At each site we have an observation i_p.
• The observed value at each site depends on its label: the probability of a certain observed value given a certain label at site p is g(i_p, f_p) = Pr(i_p | F_p = f_p).
• The overall observation probability given the labels:
$$\Pr(O \mid f) = \prod_p g(i_p, f_p)$$
• We need to infer the labels given the observation: Pr(f | O) ∝ Pr(O | f) Pr(f).
Using MRFs
How do we model different problems? Given observations y and the parameters of the MRF, how do we infer the hidden variables x? How do we learn the parameters of the MRF?
Modeling image pixel labels as MRF
MRF-based segmentation
[Figure: a one-layer MRF over the label image and the real image; label nodes x_i connect to neighboring labels x_j through ψ(x_i, x_j) and to observed pixels y_i through ψ(x_i, y_i).]
Slides by R. Huang – Rutgers University
MRF-based segmentation
Classifying image pixels into different regions under the constraint of both local observations and spatial relationships
Probabilistic interpretation:
$$(x^*, \theta^*) = \arg\max_{x,\theta} P(x, \theta \mid y)$$
where x are the region labels, θ are the model parameters, and y are the image pixels.
Slides by R. Huang – Rutgers University
Model joint probability:
$$P(x, y) = \frac{1}{Z} \prod_{(i,j)} \psi(x_i, x_j) \prod_i \psi(x_i, y_i)$$
where:
• x is the label image and y is the observed image;
• ψ(x_i, x_j) is the label-label compatibility function between neighboring label nodes, enforcing the smoothness constraint;
• ψ(x_i, y_i) is the image-label compatibility function between a label node and its local observation, enforcing the data constraint.
$$(x^*, \theta^*) = \arg\max_{x,\theta} P(x, \theta \mid y)$$
(x: region labels, y: image pixels, θ: model parameters)
How did we factorize?
Slides by R. Huang – Rutgers University
Probabilistic interpretation
We need to infer the labels given the observation: Pr(f | O) ∝ Pr(O | f) Pr(f). The MAP estimate of f should minimize the posterior energy
$$E(f) = \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q) - \sum_p \ln\big(g(i_p, f_p)\big)$$
Data (observation) term: data constraint.
Neighborhood term: smoothness constraint.
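The slides do not fix a minimizer for this energy; iterated conditional modes (ICM) is one classic greedy option, sketched here with a Potts smoothness term (the weight u and sweep count are illustrative).

```python
import numpy as np

def icm(obs_neglog, labels, u=1.0, sweeps=5):
    """Greedily relabel each pixel to minimize its local posterior energy:
    -ln g(i_p, f_p) + u * (number of disagreeing neighbors).

    obs_neglog: H x W x K array of data costs -ln g(i_p, l).
    labels:     H x W initial labeling (e.g., per-pixel ML labels).
    """
    h, w, k = obs_neglog.shape
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                costs = obs_neglog[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        costs += u * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(costs))
    return labels
```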
MRF-based segmentation
EM algorithm
E-step (inference):
$$x^* = \arg\max_x P(x \mid y, \theta), \qquad P(x \mid y, \theta) = \frac{1}{Z} P(y \mid x, \theta)\, P(x \mid \theta)$$
M-step (learning):
$$\theta^* = \arg\max_\theta E\big[P(x, y \mid \theta)\big] = \arg\max_\theta \sum_x P(x, y \mid \theta)\, P(x \mid y, \theta)$$
Pseudo-likelihood method.
Slides by R. Huang – Rutgers University
Applying and learning MRF: Example
$$x^* = \arg\max_x P(x \mid y) = \arg\max_x \frac{1}{Z} P(y \mid x)\, P(x) = \arg\max_x \frac{1}{Z} \prod_i \psi(x_i, y_i) \prod_{(i,j)} \psi(x_i, x_j)$$
with
$$\psi(x_i, y_i) = G(y_i;\, \mu_{x_i}, \sigma_{x_i}), \qquad \psi(x_i, x_j) = \exp\big(-(x_i - x_j)^2 / \sigma^2\big)$$
where G is a Gaussian density and the parameters are θ = [μ, σ, …].
Slides by R. Huang – Rutgers University