Interactive Segmentation
Transcript of Interactive Segmentation
Oleg Kraz
“The human brain starts working the moment you are born and never stops until you stand up to speak in public.”
George Jessel
Manual segmentation is tedious, time-consuming, and lacking in precision.
Fully automated segmentation is very difficult due to the enormous variety of images.
Solution: semi-automated segmentation.◦ Interactive segmentation
Motivation
Automatic segmentation assisted by human “hints” for extracting objects from the background.
What is interactive segmentation?
Quick and accurate
Intuitive and easy to use – just with a simple gesture of a mouse
Allows the user to edit the computed boundary and make changes, without starting from scratch.
Advantages
Examples
Definitions and reminders◦ Dijkstra's algorithm solves the single-source (seed point) shortest
path problem. It creates a minimum-cost spanning tree.
Intelligent Scissors
The basic idea is to formulate the image as a weighted graph where pixels represent nodes, with weighted edges connecting each pixel with its 8 adjacent neighbors.
How does it work?
Image Feature Formulation◦ Laplacian Zero-Crossing fZ
◦ Gradient Magnitude fG
◦ Gradient Direction fD
Since we are looking for “shortest paths”, these image features should describe the local cost of the transition from pixel p to pixel q.
What are the weights?
If p and q are two neighboring pixels in the image then l(p, q) represents the local cost on the directed link (or edge) from p to q. The local cost function is a weighted sum of component cost functions.
l(p,q) = ωZ • fZ (q) + ωG • fG (q) + ωD • fD (p,q)
(Empirically, weights of ωZ = 0.43, ωD = 0.43, and ωG = 0.14 seem to work well in a wide range of images.)
Local Costs
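The weighted sum above can be sketched directly. A minimal sketch, assuming the three feature maps fZ, fG, fD have already been computed and normalized to [0, 1] (the function and argument names are illustrative, not from the paper's code):

```python
# Local cost l(p,q) as a weighted sum of feature costs.
# Weights from the slide: wZ = 0.43, wD = 0.43, wG = 0.14.
W_Z, W_D, W_G = 0.43, 0.43, 0.14

def local_cost(f_z_q, f_g_q, f_d_pq):
    """Cost of the directed link p -> q.

    f_z_q : Laplacian zero-crossing feature at q (0 or 1)
    f_g_q : scaled and inverted gradient magnitude at q, in [0, 1]
    f_d_pq: gradient-direction feature for the link (p, q), in [0, 1]
    """
    return W_Z * f_z_q + W_G * f_g_q + W_D * f_d_pq

# A pixel on a zero-crossing with a strong gradient is cheap to enter;
# a flat, off-edge pixel is expensive:
on_edge  = local_cost(f_z_q=0.0, f_g_q=0.1, f_d_pq=0.2)
off_edge = local_cost(f_z_q=1.0, f_g_q=0.9, f_d_pq=0.8)
```

Since the weights sum to 1, the local cost also stays in [0, 1] when the features do.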
Laplacian Zero-Crossing (fZ)
◦ Convolution with a Laplacian kernel approximates the 2nd partial derivative of the image.
◦ A binary edge feature used for edge localization.
◦ Finds maxima (or minima) of the gradient magnitude.
◦ Laplacian zero-crossings represent “good” edge properties and should therefore have a low local cost.
Laplacian Zero-Crossing (fZ)
fZ(q) = 0 if IL(q) = 0, and fZ(q) = 1 otherwise, where IL(q) is the Laplacian of an image I at pixel q.
Laplacian Zero-Crossing
However, applying a discrete Laplacian kernel to a digital image produces very few exactly zero-valued pixels. Instead, a zero-crossing is indicated by two neighboring pixels whose Laplacian values change sign; the pixel whose value is closer to zero represents the crossing.
The resulting feature cost contains cost “canyons” used for boundary localization.
This can be done with multiple kernel widths: smaller kernels are more sensitive to fine detail, while larger kernels suppress noise.
Laplacian Zero-Crossing
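The zero-crossing test described above can be sketched in a few lines. A minimal sketch with a 4-neighbor Laplacian, under the simplification that border pixels are left at zero (function names are illustrative):

```python
def laplacian(img):
    """4-neighbor discrete Laplacian of a 2D grid (list of lists).
    Border pixels are left at 0 for simplicity."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y-1][x] + img[y+1][x] +
                         img[y][x-1] + img[y][x+1] - 4 * img[y][x])
    return out

def f_z(lap):
    """Binary zero-crossing feature: fZ(q) = 0 if the Laplacian at q is
    zero, or if a neighbor across a sign change has larger magnitude
    (q is the pixel closer to the true zero-crossing); else fZ(q) = 1."""
    h, w = len(lap), len(lap[0])
    out = [[1] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            v = lap[y][x]
            if v == 0:
                out[y][x] = 0
                continue
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    n = lap[ny][nx]
                    # Sign change: the pixel closer to zero gets cost 0.
                    if v * n < 0 and abs(v) < abs(n):
                        out[y][x] = 0
    return out

# A vertical step edge: intensities ramp from 0 to 10 left to right.
img = [[0, 0, 9, 10, 10] for _ in range(4)]
fz = f_z(laplacian(img))
```

On this image the interior Laplacian changes sign across the step, so only the pixel nearest the true edge receives the low (zero) cost.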
[Figure: pairs of neighboring Laplacian values with opposite signs (e.g. -0.5 and 2, or -0.6 and 0.8); the pixel whose value is closer to zero is the zero-crossing and gets fZ = 0, its neighbor gets fZ = 1.]
Laplacian Zero-Crossing
With and without Laplacian zero-crossing:
fG = (max(G) − G) / max(G) = 1 − G / max(G)
Gradient Magnitude (fG)
We can use multiple kernel sizes and choose the right one for every pixel.
Gradient magnitude - “Edge Strength”
The higher the gradient magnitude, the lower the cost; thus the gradient is scaled and inverted.
Gradient Magnitude (fG)
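The scale-and-invert step is a one-liner per pixel. A minimal sketch, assuming a precomputed gradient-magnitude map with at least one nonzero value (names are illustrative):

```python
def f_g(grad_mag):
    """Scaled, inverted gradient magnitude: fG = 1 - G / max(G),
    so strong edges (large G) get a low cost."""
    g_max = max(max(row) for row in grad_mag)  # assumed > 0
    return [[1.0 - g / g_max for g in row] for row in grad_mag]

costs = f_g([[0.0, 2.0],
             [4.0, 8.0]])
```

The strongest edge pixel gets cost 0 and a flat pixel gets cost 1, matching the formula above.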
The gradient direction or orientation adds a smoothness constraint to the boundary by associating a relatively high cost for sharp changes in boundary direction.
Gradient Direction (fD)
D(p) - a unit vector of the gradient direction at a point p
D'(p) - the unit vector perpendicular (rotated 90° clockwise) to D(p)
L(p,q) is the unit vector of the link between p and q, oriented so that the angle between D'(p) and L(p,q) is at most 90° (i.e., D'(p) • L(p,q) ≥ 0).
Gradient Direction (fD)
[Figure: gradient direction D(p) and its perpendicular D'(p) at a pixel p.]
Gradient Direction (fD)
The main purpose of including the neighborhood link direction is to associate a high cost with an edge between two neighboring pixels that have similar gradient directions but are perpendicular, or near perpendicular, to the link between them. Therefore, the direction feature cost is low when the gradient directions of the two neighboring pixels are similar to each other and to the link between them.
Gradient Direction (fD)
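The direction feature can be sketched following the formulation in the Mortensen–Barrett paper, fD(p,q) = 2/(3π) · (cos⁻¹(dp) + cos⁻¹(dq)), where dp and dq are dot products of the link vector with the perpendiculars D'(p) and D'(q); the helper names here are illustrative:

```python
import math

def f_d(D_p, D_q, p, q):
    """Gradient-direction feature cost for the link p -> q.
    D_p, D_q are unit gradient vectors at p and q; p, q are (x, y)."""
    def perp(v):                      # rotate 90 degrees clockwise
        return (v[1], -v[0])
    dp_perp = perp(D_p)
    link = (q[0] - p[0], q[1] - p[1])
    norm = math.hypot(*link)
    link = (link[0] / norm, link[1] / norm)
    if dp_perp[0] * link[0] + dp_perp[1] * link[1] < 0:
        link = (-link[0], -link[1])   # flip so the angle is <= 90 degrees
    d_pq = dp_perp[0] * link[0] + dp_perp[1] * link[1]
    d_qp = link[0] * perp(D_q)[0] + link[1] * perp(D_q)[1]
    d_pq = max(-1.0, min(1.0, d_pq))  # clamp for acos
    d_qp = max(-1.0, min(1.0, d_qp))
    return (2.0 / (3.0 * math.pi)) * (math.acos(d_pq) + math.acos(d_qp))

# Two pixels with identical gradient direction and a link running along
# the edge (perpendicular to the gradient): the cost is zero.
aligned = f_d((0.0, 1.0), (0.0, 1.0), (0, 0), (1, 0))
```

When the link instead runs along the gradient (across the edge), both angle terms reach 90° and the cost rises, which is exactly the smoothness penalty described above.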
Finding the path with the lowest cost
The graph search algorithm is initialized by placing a start or seed point, s, with a cumulative cost of 0, on an empty list, L (called the active list).
After initialization, the graph search iteratively generates a minimum-cost spanning tree of the image, based on the local cost function (Dijkstra's algorithm with Nilsson's improvements).◦ The active list is kept sorted with linear complexity.
The algorithm
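The search above can be sketched with a standard priority-queue Dijkstra over the pixel grid. A minimal sketch, assuming a precomputed map of local costs for entering each pixel, with diagonal moves scaled by √2 as in the example that follows (names are illustrative):

```python
import heapq

def live_wire(cost, seed):
    """Dijkstra-style expansion from the seed over the pixel grid.
    cost[y][x] is the local cost of entering pixel (x, y).
    Returns (total, pointers): total[(x, y)] is the cumulative cost and
    pointers[(x, y)] the previous pixel on the optimal path to the seed."""
    h, w = len(cost), len(cost[0])
    total = {seed: 0.0}
    pointers = {}
    heap = [(0.0, seed)]            # the "active list", ordered by cost
    done = set()
    while heap:
        c, (x, y) = heapq.heappop(heap)
        if (x, y) in done:
            continue
        done.add((x, y))
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx or dy) and 0 <= nx < w and 0 <= ny < h:
                    scale = 2 ** 0.5 if dx and dy else 1.0
                    nc = c + scale * cost[ny][nx]
                    if nc < total.get((nx, ny), float("inf")):
                        total[(nx, ny)] = nc
                        pointers[(nx, ny)] = (x, y)
                        heapq.heappush(heap, (nc, (nx, ny)))
    return total, pointers

def trace(pointers, free_point, seed):
    """Follow optimal-path pointers from the free point back to the seed."""
    path = [free_point]
    while path[-1] != seed:
        path.append(pointers[path[-1]])
    return path[::-1]

# A row of expensive pixels (cost 9) blocks the direct route, so the
# optimal path detours around it.
cost = [[1, 1, 1],
        [9, 9, 1],
        [1, 1, 1]]
total, pointers = live_wire(cost, (0, 0))
path = trace(pointers, (0, 2), (0, 0))
```

The heap stands in for the sorted active list; the paper uses bucket sorting to keep insertion linear, which this sketch does not reproduce.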
Example with gradient magnitude (for simplicity):
Example
Initial local cost map with the seed point circled
Example (cont.)
Diagonal local costs have been scaled by Euclidean distance
Example (cont.)
47 points expanded
Example (cont.)
Finished cumulative cost and path matrix with two of many paths
Interactive movement of the free point by the mouse cursor causes the boundary to behave like a live-wire that follows the optimal path pointers from the free point back to the seed point.
The seed point can be “snapped” to the desired edge by placing the mouse pointer close to the edge; the seed moves to the point of maximum gradient magnitude within a specified neighborhood.
Interactive “Live-Wire”
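The cursor snap can be sketched as a neighborhood maximum search. A minimal sketch; the neighborhood radius and names are illustrative assumptions:

```python
def snap_to_edge(grad_mag, cursor, radius=2):
    """Snap the cursor to the pixel with maximum gradient magnitude
    inside a (2*radius+1)^2 neighborhood, clipped to the image."""
    h, w = len(grad_mag), len(grad_mag[0])
    cx, cy = cursor
    best, best_pt = -1.0, cursor
    for y in range(max(0, cy - radius), min(h, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(w, cx + radius + 1)):
            if grad_mag[y][x] > best:
                best, best_pt = grad_mag[y][x], (x, y)
    return best_pt

# A single strong edge pixel at (2, 1) attracts nearby clicks.
grad = [[0, 0, 0, 0],
        [0, 0, 8, 0],
        [0, 0, 0, 0]]
snapped = snap_to_edge(grad, (0, 0))
```

Clicks anywhere within the radius of the strong pixel land on it, so seed points sit precisely on the edge.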
More than two seed points are often required to accurately define an object's boundary.
Many paths “coalesce” and share portions of their optimal path with other paths from other pixels.
Using boundary cooling, seed points are automatically placed by finding a pixel on the active live-wire segment that has a “stable” history
Path Cooling
On occasion, a section of the desired object boundary may have a weak gradient magnitude relative to a nearby strong gradient edge.
Training exploits an object’s boundary segment that is already considered to be good and is performed dynamically as part of the boundary segmentation process.
Interactive Dynamic Training
Results
Lazy snapping
An interactive image cutout tool: a technique for removing an object (the foreground) from the background.
Lazy snapping
First, a reminder from a few weeks ago.
Cost function: the cost function provides a soft constraint for segmentation and includes both region and boundary properties.
Let A = (A1, ..., Ap, ..., A|P|) be a binary vector whose components Ap can be either “obj” or “bkg”; P is the set of nodes.
E(A) = λ • R(A) + B(A)
Intuition: R(A) = Σ p∈P Rp(Ap), where the terms Rp(•) can be seen as individual penalties for assigning pixel p to “object” or to “background”. For example, Rp may reflect how the intensity of pixel p fits into a known intensity model (e.g. a histogram) of the object and background.
B(A) = Σ {p,q}∈N B{p,q} • δ(Ap ≠ Aq) comprises the “boundary” properties of segmentation A. The coefficient B{p,q} is interpreted as a penalty for a discontinuity between p and q; B{p,q} is large when pixels p and q are similar. Costs may be based on the local intensity gradient or the Laplacian zero-crossing.
Implementation
Proceedings of the International Conference on Computer Vision, Vancouver, Canada, July 2001
The general workflow:
1. We create a graph with two terminals (source and sink).
2. The edge weights reflect the parameters in the regional and the boundary terms of the cost function,
3. as well as the known positions of seeds in the image. The seeds are O = {v} and B = {p}.
Lazy snapping consists of 2 steps:
◦ Foreground and background marking
◦ Boundary editing
Lazy snapping: suppose the image is a graph G = (V, A).
◦ V – the set of all pixels (nodes).
◦ A – the set of all arcs connecting adjacent nodes (4- or 8-connectivity).
We want to minimize the cost energy E(X), where each label xi ∈ {foreground = 1, background = 0}.
Lazy snapping
◦ E1 – likelihood energy, indicating whether a node belongs to the foreground or the background.
◦ E2 – prior energy, a penalty for assigning two adjacent nodes different labels.
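The two-term energy can be sketched as a plain evaluation function. A minimal sketch with illustrative names, using a 1/(1 + x) similarity penalty on the color difference for the pairwise term:

```python
def energy(labels, e1, neighbors, colors, lam=1.0):
    """Total energy E(X) = sum_i E1(x_i) + lam * sum_(i,j) E2(x_i, x_j).
    labels[i] in {0, 1}; e1[i] = (cost if background, cost if foreground);
    E2 penalizes a label change between similar-colored neighbors."""
    total = sum(e1[i][labels[i]] for i in labels)
    for i, j in neighbors:
        if labels[i] != labels[j]:
            diff = sum((a - b) ** 2
                       for a, b in zip(colors[i], colors[j])) ** 0.5
            total += lam * 1.0 / (1.0 + diff)   # similar colors cost more to cut
    return total

# Two adjacent nodes with different color and opposite preferences:
# cutting between them is cheaper than forcing a single label.
e1 = {0: (0.0, 1.0), 1: (1.0, 0.0)}
colors = {0: (0, 0, 0), 1: (9, 0, 0)}
e_cut = energy({0: 0, 1: 1}, e1, [(0, 1)], colors)
e_bg  = energy({0: 0, 1: 0}, e1, [(0, 1)], colors)
```

Graph cut searches over all labelings for the minimum of exactly this kind of sum; here the cut labeling wins because the color difference makes the boundary cheap.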
Likelihood energy: we have the marked foreground and background pixels. Now what?
The colors in F and B are clustered by K-means (remember?) into 64 clusters each.
Why 64?
Then, for every node i, compute the minimum distance from its color C(i) to the foreground and background clusters.
Likelihood energy:
◦ Guarantee the user constraints: for marked pixels the labels are fixed, E1(xi = 1) = 0 and E1(xi = 0) = ∞ for i ∈ F, and the reverse for i ∈ B.
◦ Encourage colors similar to F/B: for all other pixels, E1(xi = 1) = dF(i) / (dF(i) + dB(i)) and E1(xi = 0) = dB(i) / (dF(i) + dB(i)), where dF(i) and dB(i) are the minimum distances from C(i) to the foreground and background clusters.
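The likelihood term can be sketched directly from the cluster distances. A minimal sketch, assuming the K-means cluster centers for the marked strokes have already been computed (names and argument shapes are illustrative):

```python
def min_dist(color, centers):
    """Minimum Euclidean distance from a pixel color to a set of
    cluster centers (e.g. the 64 K-means centers of a marked stroke)."""
    return min(sum((a - b) ** 2 for a, b in zip(color, c)) ** 0.5
               for c in centers)

def likelihood(color, fg_centers, bg_centers):
    """E1 terms for an unmarked pixel, normalized so they sum to 1:
    returns (cost of labeling background, cost of labeling foreground).
    A small distance to the foreground clusters makes the foreground
    label cheap, and vice versa."""
    d_f = min_dist(color, fg_centers)
    d_b = min_dist(color, bg_centers)
    s = d_f + d_b
    return (d_b / s, d_f / s)

# A reddish pixel, with one red foreground center and one blue
# background center: labeling it foreground is cheap.
cost_bg, cost_fg = likelihood((250, 5, 5), [(255, 0, 0)], [(0, 0, 255)])
```

Only the nearest cluster in each set matters, so adding more centers can only lower a pixel's distance to that set.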
Prior energy
E2(xi, xj) = |xi − xj| • g(Cij), with g(x) = 1 / (1 + x),
where Cij is the L2 norm of the RGB color difference of pixels i and j.
Minimizing energy
To minimize the energy E(X), we use the graph-cut algorithm (Jad will be glad to answer any question on the subject).
At the pixel level, however, this algorithm fails to provide interactive visual feedback.
Minimizing energy
So what will we do? We will first use the watershed algorithm, which is good at building an over-segmentation.
This is an example of the graph we receive after watershed segmentation:
Now we minimize the energy exactly as before, only instead of pixels we have regions. The color of each region is its mean color.
Results
And a little demonstration
Intelligent Scissors or Lazy Snapping?
Both methods give great results. Intelligent Scissors often needs more user interaction; Lazy Snapping takes less time than Intelligent Scissors.
References
◦ E. N. Mortensen and W. A. Barrett, “Intelligent Scissors for Image Composition,” SIGGRAPH '95.
◦ E. N. Mortensen and W. A. Barrett, “Interactive Segmentation with Intelligent Scissors,” Graphical Models and Image Processing.
◦ IP course, Lesson 12: “Edge Detection” presentation, Prof. Hagit Hel-Or.
◦ Graph-Cut / Normalized Cut segmentation, Jad Silbak.
The End