Anu Document



1.1 Introduction to Image Processing

We present a new structure-based interest region detector, called Principal Curvature-Based Regions (PCBR), which we use for object-class recognition. The PCBR interest operator detects stable watershed regions within the multi-scale principal curvature image. To detect robust watershed regions, we "clean" a principal curvature image by combining a gray-scale morphological close with our new "eigenvector flow" hysteresis threshold. Robustness across scales is achieved by selecting the maximally stable regions across consecutive scales. PCBR typically detects distinctive patterns distributed evenly on the objects, and it shows significant robustness to local intensity perturbations and intra-class variations. We evaluate PCBR both qualitatively (through visual inspection) and quantitatively (by measuring repeatability and classification accuracy in real-world object-class recognition problems). Experiments on different benchmark datasets show that PCBR is comparable or superior to state-of-the-art detectors for both feature matching and object recognition. Moreover, we demonstrate the application of PCBR to symmetry detection.

In many object recognition tasks, within-class changes in pose, lighting, color, and texture can cause considerable variation in local intensities. Consequently, local intensity no longer provides a stable detection cue. As such, intensity-based interest operators (e.g., Harris, Kadir) and the object recognition systems based on them often fail to identify discriminative features. An alternative to local intensity cues is to capture semi-local structural cues such as edges and curvilinear shapes [25]. These structural cues tend to be more robust to intensity, color, and pose variations. As such, they provide the basis for a more stable interest operator, which in turn improves object recognition accuracy. This paper introduces a new detector that exploits curvilinear structures to reliably detect interesting regions. The detector, called the Principal Curvature-Based Region (PCBR) detector, identifies stable watershed regions within the multi-scale principal curvature image.

Curvilinear structures are lines (either curved or straight), such as roads in aerial or satellite images or blood vessels in medical scans. These curvilinear structures can be detected over a range of viewpoints, scales, and illumination changes. The PCBR detector employs the first steps of Steger's curvilinear detector algorithm [25]. It forms an image of the maximum or minimum eigenvalue of the Hessian matrix at each pixel. We call this the principal curvature image, as it measures the principal curvature of the image intensity surface. This process generates a single response for both lines and edges, producing a clearer structural sketch of an image than is usually provided by the gradient magnitude image. We develop a process that detects structural regions efficiently and robustly using the watershed transform of the principal curvature image across scale space. The watershed algorithm provides a more efficient mechanism for defining structural regions than previous methods that fit circles, ellipses, and parallelograms [8, 27]. To improve the watershed's robustness to noise and other small image perturbations, we first "clean" the principal curvature image with a gray-scale morphological close operation followed by a new hysteresis thresholding method based on local eigenvector flow. The watershed transform is then applied to the cleaned principal curvature image, and the resulting watershed regions (i.e., the catchment basins) define the PCBR regions. To achieve robust detections across multiple scales, the watershed is applied to the maxima of three consecutive images in the principal curvature scale space (similar to the local scale-space extrema used by Lowe [13], Mikolajczyk and Schmid [17], and others), and we further search for stable PCBR regions across consecutive scales, an idea adapted from the stable regions detected across multiple threshold levels by the MSER detector [15].

While PCBR shares similar ideas with previous detectors, it represents a very different approach to detecting interest regions. Many prior intensity-based detectors search for points with distinctive local differential geometry, such as corners, while ignoring image features such as lines and edges. Conversely, PCBR utilizes line and edge features to construct structural interest regions. Compared to MSER, PCBR differs in two important aspects. First, MSER does not analyze regions in scale space, so it does not provide different levels of region abstraction. Second, MSER's intensity-based threshold process cannot overcome local intensity variations within regions. PCBR, however, overcomes this difficulty by focusing on region boundaries rather than the appearance of region interiors. This work makes two contributions. First, we develop a new interest operator that utilizes principal curvature to extract robust and invariant region structures based on both edge and curvilinear features. Second, we introduce an enhanced principal-curvature-based watershed segmentation and robust region selection process that is robust to intra-class variations and is more efficient than previous structure-based detectors. We demonstrate the value of our PCBR detector by applying it to object-class recognition problems and symmetry detection.
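The principal curvature image described above (the maximum or minimum eigenvalue of the Hessian at each pixel) can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions (a finite-difference Hessian and no scale-space smoothing), not the authors' implementation:

```python
import numpy as np

def principal_curvature_image(image, use_max=True):
    """Keep the max (or min) eigenvalue of the 2x2 Hessian per pixel."""
    img = np.asarray(image, dtype=float)
    # First and second derivatives via finite differences.
    Iy, Ix = np.gradient(img)          # axis 0 = rows (y), axis 1 = cols (x)
    Ixy, Ixx = np.gradient(Ix)
    Iyy, _ = np.gradient(Iy)
    # Eigenvalues of the symmetric matrix [[Ixx, Ixy], [Ixy, Iyy]]:
    # lambda = (Ixx + Iyy)/2 +/- sqrt(((Ixx - Iyy)/2)^2 + Ixy^2)
    mean = (Ixx + Iyy) / 2.0
    root = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    return mean + root if use_max else mean - root
```

For the full detector, this image would be computed at multiple scales, cleaned morphologically, and passed to the watershed transform.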

Image processing is a form of signal processing in which images and their properties are used to gather and analyze information about the objects in the image. Digital image processing uses digital images and computer algorithms to enhance, manipulate, or transform images to obtain the necessary information and make decisions accordingly. Examples of digital image processing include the improvement and analysis of the images from the Surveyor missions to the moon [15], magnetic resonance imaging scans of the brain, and electronic face recognition packages. These techniques can be used to assist humans with complex tasks and make them easier. A detailed analysis of an X-ray can help a radiologist decide whether a bone is fractured or not. Digital image processing can increase the credibility of the decisions made by humans.

1.2 Introduction to Medical Imaging

Image processing techniques have developed and are applied to various fields, like space programs, aerial and satellite imagery, and medicine [15]. Medical imaging is the set of digital image processing techniques that create and analyze images of the human body to assist doctors and medical scientists. In medicine, imaging is used for planning surgeries, X-ray imaging of bones, magnetic resonance imaging, endoscopy, and many other useful applications [31]. Digital X-ray imaging is used in this thesis project. Figure 1.1 shows the applications of digital imaging in medical imaging. Since Wilhelm Roentgen discovered X-rays in 1895 [14], X-ray technology has improved considerably. In medicine, X-rays help doctors to see inside a patient's body without surgery or any physical damage. X-rays can pass through solid objects without altering the physical state of the object because they have a small wavelength. So when this radiation is passed through a patient's body, objects of different density cast shadows of different intensities, resulting in black-and-white images. Bone, for example, is shown in white, as it is opaque, and air is shown in black. The other tissues in the body appear in gray. A detailed analysis of the bone structure can be performed using X-rays, and any fractures can be detected. Conventionally, X-rays were taken using special photographic films using silver salts [28]. Digital X-rays can be taken using crystal photodiodes. Crystal photodiodes contain cadmium tungstate or bismuth germanate to capture light as electrical pulses. The signals are then converted from analogue to digital and can be viewed on computers.

Digital X-rays are very advantageous, as they are portable, require less energy than normal X-rays, are less expensive, and are environmentally friendly [28]. A radiologist would look at the X-rays and determine whether a bone was fractured or not. This system is time consuming and unreliable because the probability of a fractured bone is low. Some fractures are easy to detect, and a system can be developed to automatically detect fractures. This will assist the doctors and radiologists in their work and will improve the accuracy of the results [28]. According to the observations of [27], only 11 of the femur X-rays showed fractured bones. So the radiologist has to look at a lot of X-rays to find a fractured one. An algorithm to automatically detect bone fractures could help the radiologist to find the fractured bones, or at least confidently sort out the healthy ones. But no single algorithm can be used for the whole body, because of the complexity of different bone structures. Even though a lot of research has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to this problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person. So finding a general method to locate the bone and decide whether it is fractured or not is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are bone orientation in the X-ray, extraction of bone contour information, bone segmentation, and extraction of relevant features.

1.3 Description of the Problem

This thesis investigates the different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone that was used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.

2.1 Theory Development

A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding, and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis. The other processes are briefly described for completeness and to inform the reader of the processes in the whole system.

2.1.1 Image Segmentation

Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image; it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.

2 Segmentation of Images - An Overview

Image segmentation can proceed in three different ways:

• Manually
• Automatically
• Semiautomatically

2.1 Manual Segmentation

The pixels belonging to the same intensity range could manually be pointed out, but clearly this is a very time-consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.

2.2 Automatic Segmentation

Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level, or the objects having a special distribution.

2.3 Semiautomatic Segmentation

Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.

• Thresholding

If the distribution of intensities is known, thresholding divides the image into two regions, separated by a manually chosen threshold value a, as follows:

if B(i, j) ≥ a, then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all i, j over the image B [YGV].

This can be repeated for each region, dividing them by the threshold value, which results in four regions, etc. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region. These pixels could, for instance, appear from noise. The simplest way of choosing the threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.

The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean values of the image corresponding to the two segmented regions, and calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

and repeat until the threshold value does not change any more. Finally, choose this value for the threshold segmentation.

To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram, h_max, and the minimum value, h_min, and calculate the distance d between the line and the histogram. Increase h_min and repeat for all h until h = h_max. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak.
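The isodata iteration just described can be sketched as follows; this is a minimal NumPy sketch (the function names and stopping tolerance are our own choices, not from [YGV]):

```python
import numpy as np

def isodata_threshold(image, tol=0.5):
    """Iterative (isodata) threshold: start from the image mean, split
    the pixels into two groups, and update the threshold to the average
    of the two group means until it stabilises."""
    img = np.asarray(image, dtype=float)
    t = img.mean()                        # temporary starting threshold
    while True:
        lower = img[img < t]
        upper = img[img >= t]
        if lower.size == 0 or upper.size == 0:
            return t                      # degenerate split; stop
        t_new = (lower.mean() + upper.mean()) / 2.0
        if abs(t_new - t) < tol:          # threshold no longer changes
            return t_new
        t = t_new

def apply_threshold(image, t):
    """B(i, j) = 1 (object) where B(i, j) >= t, else 0 (background)."""
    return (np.asarray(image, dtype=float) >= t).astype(np.uint8)
```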

• Boundary tracking

Edge-finding by gradients is the method of selecting a boundary manually and automatically following this gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels, compared to the threshold method only.

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem, where Δ = ∂²/∂x². Assume the boundary is blurred, so that the gradient has a shape like in Figure 2.2. The Laplacian will change sign just around the assumed edge, at position = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.

• Clustering methods

Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector.

2.1.2 Feature Extraction

Feature extraction is the process of reducing the segmented image to a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally, a combination of features is used to generate a model for the images. Cross validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information. To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. Best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM [22] and found the feature values for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a certain direction that is very different from the gradient vector of a fractured long-bone X-ray. So, by observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

2.1.4 Thresholding and Error Classification

Thresholding and error classification form the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions of pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
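As a sketch of the automatic thresholding idea, Otsu's method can be implemented by exhaustively searching for the threshold that maximises the between-class variance of the histogram. This is a minimal NumPy version, assuming an 8-bit grayscale image (the function name is ours):

```python
import numpy as np

def otsu_threshold(image):
    """Otsu's method: pick the threshold maximising between-class variance."""
    img = np.asarray(image).ravel()
    hist = np.bincount(img, minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()     # class probabilities
        if w0 == 0 or w1 == 0:
            continue                                 # one class is empty
        mu0 = (np.arange(t) * prob[:t]).sum() / w0          # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2        # between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```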

Thresholding is used at different stages in this thesis. It is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image (b) The histogram of the image

Figure 2.3: Histogram of image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and wide variety of digital processes.

Contrast

Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image. The larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extremely black, and a value at the high end is assigned to extremely white. The remaining pixel values are distributed linearly between these extremes. The features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically, as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
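The linear stretch described above maps the observed grey-value range onto the full display range. A minimal sketch in NumPy, assuming an 8-bit display (the function name and clipping choices are ours):

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Map [band.min(), band.max()] linearly onto [out_min, out_max]."""
    band = np.asarray(band, dtype=float)
    lo, hi = band.min(), band.max()
    if hi == lo:                        # flat band: nothing to stretch
        return np.full(band.shape, out_min, dtype=np.uint8)
    stretched = (band - lo) / (hi - lo) * (out_max - out_min) + out_min
    return stretched.round().astype(np.uint8)
```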

Non-Linear Contrast Enhancement

In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. The non-linear contrast enhancement techniques have been found to be useful for enhancing the colour contrast between nearby classes and subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values. Thus, the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
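Histogram equalization as described above can be sketched with one common cumulative-histogram formulation; this is a minimal NumPy version for 8-bit images (our own formulation, not from a specific source in this thesis):

```python
import numpy as np

def histogram_equalize(image):
    """Remap grey levels through the cumulative distribution so that
    frequent grey values are spread out, giving a roughly uniform
    population density."""
    img = np.asarray(image, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    cdf = hist.cumsum() / hist.sum()            # cumulative distribution
    lut = np.round(cdf * 255).astype(np.uint8)  # grey-level mapping
    return lut[img]
```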

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask, or kernel (n), is usually 3×3, 5×5, 7×7, or 9×9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3×3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
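The mean low-pass filter, with the border handled by technique (1) above (repeating the border pixel values so the output keeps the original size), can be sketched as follows; a minimal NumPy version with a name of our choosing:

```python
import numpy as np

def mean_filter(image, n=3):
    """n x n mean low-pass filter with edge-replicated borders."""
    img = np.asarray(image, dtype=float)
    pad = n // 2
    padded = np.pad(img, pad, mode="edge")   # repeat border pixel values
    out = np.zeros_like(img)
    # Sum the n*n shifted copies of the padded image, then average.
    for dy in range(n):
        for dx in range(n):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (n * n)
```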

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.
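The directional first difference is simply the difference between adjacent pixels along the chosen direction. A minimal sketch (our own function, without the bias offset sometimes added for display):

```python
import numpy as np

def first_difference(image, direction="horizontal"):
    """Approximate the first derivative as the difference between
    adjacent pixels along the requested direction."""
    img = np.asarray(image, dtype=float)
    if direction == "horizontal":
        return img[:, 1:] - img[:, :-1]
    if direction == "vertical":
        return img[1:, :] - img[:-1, :]
    if direction == "diagonal":
        return img[1:, 1:] - img[:-1, :-1]
    raise ValueError(direction)
```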

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way. Hence, the use of this operation produces a more natural look than many of the other edge-enhanced images.

Band Ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
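The ratio transformation exploits the fact that illumination effects (slope, shadow) scale co-registered bands by roughly the same factor, so dividing one band by another largely cancels them. A minimal sketch, with an epsilon of our choosing to guard against division by zero:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered bands; illumination
    factors common to both bands largely cancel in the ratio."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    return a / (b + eps)
```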

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods discussed in this thesis that have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%.

Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object, hence the name bi-level thresholding, based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models (ASMs), introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application. [18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in the sections below.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables. It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models. The relationships between the size of the training set, computation time, and error are studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form the edge are generally of the same, or close to the same, intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2-D image is a 2-D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image from light to dark pixels (in the case of grayscale images) at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction, that approximate the derivatives in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2-D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
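The computation described above can be sketched in Python with NumPy (a toy 5×5 step-edge image; the naive filtering loop stands in for Equations 3.3 and 3.4):

```python
import numpy as np

def filter2d(img, k):
    """Naive 'same' correlation with zero padding (flipping the kernel would
    give true convolution; for Sobel this only changes the sign of the result)."""
    kh, kw = k.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * k)
    return out

# Sobel kernels: note the weight of 2 in the centre row/column
Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
Ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

img = np.zeros((5, 5))
img[:, 3:] = 1.0                 # vertical step edge, dark -> bright

Dx = filter2d(img, Kx)
Dy = filter2d(img, Ky)
mag = np.hypot(Dx, Dy)           # gradient magnitude (Eq. 3.1)
ang = np.arctan2(Dy, Dx)         # gradient direction (Eq. 3.2)

print(mag[2, 2], mag[2, 0])      # strong response on the edge, zero in flat areas
```

As the text states, the flat region yields a zero vector while the edge pixel yields a vector pointing across the edge from dark to bright.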

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.
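The kernel difference described above (Equations 3.5 and 3.6 versus 3.3 and 3.4) can be made explicit; the two operators differ only in the centre row/column weight:

```python
import numpy as np

# Prewitt: uniform weights in each column/row
Px = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])
Py = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]])

# Sobel: weight of 2 on the row/column through the centre pixel
Sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
Sy = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

# The difference is confined to the middle row/column
print(Sx - Px)
print(Sy - Py)
```

The extra centre weight in Sobel acts as a mild smoothing across the derivative direction, which is why Prewitt is somewhat more sensitive to noise.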

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. So, in simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
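A minimal sketch of the Roberts Cross on a toy image (the 2×2 kernels are the standard ones; anchoring each window at its top-left pixel is an implementation choice):

```python
import numpy as np

# Roberts Cross kernels: differences between diagonally adjacent pixels
Kx = np.array([[1, 0], [0, -1]], dtype=float)
Ky = np.array([[0, 1], [-1, 0]], dtype=float)

img = np.array([[0., 0., 1.],
                [0., 0., 1.],
                [0., 0., 1.]])

h, w = img.shape
mag = np.zeros((h - 1, w - 1))
for i in range(h - 1):
    for j in range(w - 1):
        win = img[i:i + 2, j:j + 2]
        # magnitude from the two diagonal differences
        mag[i, j] = np.hypot(np.sum(win * Kx), np.sum(win * Ky))

print(mag)
```

The edge column responds with magnitude √2 while the flat region stays at zero; with only 2×2 support, a single noisy pixel changes the response strongly, which matches the noise sensitivity noted above.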

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If the pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
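The double-threshold step can be sketched as follows (a simplified stand-in for Canny's hysteresis stage; the gradient magnitudes and threshold values below are invented):

```python
import numpy as np

def hysteresis(grad, lo, hi):
    """Double thresholding with 8-connected edge tracking (simplified sketch)."""
    strong = grad >= hi
    weak = (grad >= lo) & ~strong
    out = strong.copy()
    changed = True
    while changed:                    # grow strong edges into adjacent weak pixels
        changed = False
        for i in range(grad.shape[0]):
            for j in range(grad.shape[1]):
                if weak[i, j] and not out[i, j]:
                    if out[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2].any():
                        out[i, j] = True
                        changed = True
    return out.astype(int)

# One strong seed, a chain of weak pixels linked to it, and one sub-threshold pixel
grad = np.array([[0.9, 0.3, 0.3, 0.05],
                 [0.0, 0.0, 0.0, 0.3]])
edges = hysteresis(grad, lo=0.2, hi=0.5)
print(edges)
```

Weak pixels survive only when a chain of 8-connected neighbours links them back to a strong seed; the 0.05 pixel falls below the low threshold and is suppressed outright.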

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
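A range filter can be sketched directly (a 3×3 neighbourhood with edge replication; MATLAB's equivalent would be `rangefilt`, but this numpy loop makes the operation explicit):

```python
import numpy as np

def range_filter(img, size=3):
    """Local range (max - min) over a size x size neighbourhood (edge-replicated)."""
    r = size // 2
    p = np.pad(img, r, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = p[i:i + size, j:j + size]
            out[i, j] = win.max() - win.min()
    return out

img = np.array([[5., 5., 5.],
                [5., 5., 9.],
                [5., 5., 5.]])
tex = range_filter(img)
print(tex)
```

Flat regions give a range of zero, while any neighbourhood touching the bright pixel gives a high range, which is exactly the textured-versus-smooth distinction exploited for separating bone from mesh.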

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |      (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)      (2)

or

P(x) = min(λ2(x), 0)      (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
...
MPn2  MPn3  MPn4  MPn5      (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
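A sketch of Eq. 2 on a toy image (plain finite differences via `np.gradient` stand in for the Gaussian derivatives at scale σD; the test image is a dark horizontal line on a light background):

```python
import numpy as np

def principal_curvature(img):
    """P(x) = max(lambda_1(x), 0): the maximum Hessian eigenvalue (Eq. 2).
    Finite differences approximate the sigma_D Gaussian derivatives."""
    Iy, Ix = np.gradient(img)          # first derivatives (axis 0 is y)
    Ixy, Ixx = np.gradient(Ix)         # second derivatives
    Iyy, _ = np.gradient(Iy)
    # closed-form eigenvalues of the symmetric 2x2 Hessian at every pixel
    half_tr = (Ixx + Iyy) / 2.0
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam1 = half_tr + disc              # maximum eigenvalue
    return np.maximum(lam1, 0.0)

img = np.ones((5, 7))
img[2, :] = 0.0                        # dark line on a light background

P = principal_curvature(img)
print(P[2, 3], P[0, 3])                # high response on the line, zero off it
```

As the closed form shows, no iterative eigensolver is needed for the 2×2 case, consistent with the single-Jacobi-rotation remark above.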

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
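The pothole-filling effect of the closing can be seen in a small sketch (a flat 3×3 square structuring element is used here for brevity, rather than the 5×5 disk described above):

```python
import numpy as np

def _window_op(img, r, op):
    """Apply op (max for dilation, min for erosion) over each (2r+1)^2 window."""
    p = np.pad(img, r, mode='edge')
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = op(p[i:i + 2 * r + 1, j:j + 2 * r + 1])
    return out

def gray_close(img, r=1):
    # f . b = (f dilate b) erode b: fills dark "potholes" smaller than b
    dilated = _window_op(img, r, np.max)
    return _window_op(dilated, r, np.min)

mp = 5.0 * np.ones((5, 5))
mp[2, 2] = 0.0                  # single-pixel pothole (a noise-induced local minimum)
closed = gray_close(mp)
print(closed[2, 2])             # the pothole has been filled
```

After closing, the spurious local minimum is gone, so it can no longer seed an unwanted watershed catchment basin.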

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
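On rasterized region masks, the overlap error reduces to one minus the intersection-over-union of the two regions (a simplified sketch of the measure from [19], which is defined on the fitted ellipses):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - |A intersect B| / |A union B| over binary region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union

a = np.zeros((10, 10), dtype=bool); a[2:8, 2:8] = True   # 36-pixel region
b = np.zeros((10, 10), dtype=bool); b[4:8, 4:8] = True   # 16 pixels, inside a
err = overlap_error(a, b)
print(round(err, 4))
```

A region would then be kept only when its overlap error against detections at the neighbouring scales falls below a chosen tolerance.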

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]      (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
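Eq. (1) can be evaluated directly for any candidate labeling (a minimal sketch with unit edge lengths |e_pq| = 1 and a 4-neighbourhood; actually minimizing it would require a graph-cut solver, which is not shown):

```python
import numpy as np

def energy(labels, feats, centers, lam=1.0):
    """Data term + Potts smoothness term of Eq. (1), with |e_pq| = 1."""
    data = np.sum((feats - centers[labels]) ** 2)          # ||f_p - c_{L_p}||^2
    smooth = (np.sum(labels[:, 1:] != labels[:, :-1]) +    # horizontal neighbours
              np.sum(labels[1:, :] != labels[:-1, :]))     # vertical neighbours
    return data + lam * smooth

# Toy 2x2 image with 1-D colour features and two cluster centres
feats = np.array([[[0.0], [0.0]],
                  [[1.0], [1.0]]])
centers = np.array([[0.0], [1.0]])

coherent = np.array([[0, 0], [1, 1]])   # matches appearance, spatially smooth
scrambled = np.array([[0, 1], [1, 0]])  # violates both terms

print(energy(coherent, feats, centers), energy(scrambled, feats, centers))
```

The coherent labeling scores lower, illustrating how the two terms jointly prefer assignments that match the colour models and keep neighbouring pixels in the same layer.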

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7] that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, this approach models curvature in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) with an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B. Method

The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1. The data in hand consists of 250 2-D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is applied first, keeping the parts brighter than 700 HU. At the end of thresholding, the new images are of logical (binary) values:

Thresh = image > 700

In each of these new images, subsegmental vessels exist in the lung region. In the second step, the following method has been used to get rid of these vessels: first, each 2-D image has been considered one by one, and each of the components in the image has been labeled with a "connected component labeling algorithm". Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).
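The threshold-label-prune sequence can be sketched with a small BFS-based labeller (the image values are invented, and min_pixels is scaled down from the 1000 used on the 512×512 slices):

```python
import numpy as np
from collections import deque

def label_components(binary):
    """4-connected component labelling by BFS (a stand-in for the thesis's
    'connected component labeling algorithm')."""
    labels = np.zeros(binary.shape, dtype=int)
    n = 0
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue
        n += 1
        labels[seed] = n
        q = deque([seed])
        while q:
            i, j = q.popleft()
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if (0 <= ni < binary.shape[0] and 0 <= nj < binary.shape[1]
                        and binary[ni, nj] and not labels[ni, nj]):
                    labels[ni, nj] = n
                    q.append((ni, nj))
    return labels, n

def remove_small(binary, min_pixels):
    """Keep only components with at least min_pixels pixels."""
    labels, n = label_components(binary)
    keep = np.zeros_like(binary)
    for k in range(1, n + 1):
        if np.sum(labels == k) >= min_pixels:
            keep |= labels == k
    return keep

img = np.array([[900, 900, 100, 100],
                [900, 900, 100, 800],
                [100, 100, 100, 100]])
thresh = img > 700                          # Thresh = image > 700
clean = remove_small(thresh, min_pixels=2)  # drop the isolated vessel-like speck
print(clean.astype(int))
```

The isolated bright pixel (a stand-in for a small vessel cross-section) is removed, while the large connected component survives.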

Next, the image in Figure 3 has been labeled with the "connected component labeling algorithm". The biggest component, which is logical 1, is the patient's body. This biggest component has been kept and the other parts have been removed from the image. Then its complement has been taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch row or column 1 or 512, the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Due to the fact that the airway in Figure 5 is very small compared to the lung, each image has been labeled with the "connected component labeling algorithm", and the components whose pixel counts are below 1000 have been identified as airways and removed from the image. The last image in hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figure 3 a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
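The meshgrid behaviour described above can be sketched in a few lines of Python (the document itself uses MATLAB's `meshgrid`; this is an illustrative stand-in with the same row/column convention):

```python
def meshgrid(x, y):
    """Minimal analogue of MATLAB's meshgrid: the rows of X are copies of
    the vector x, and the columns of Y are copies of the vector y."""
    X = [list(x) for _ in y]
    Y = [[yi] * len(x) for yi in y]
    return X, Y

# Evaluate f(x, y) = x**2 + y**2 over a small grid
X, Y = meshgrid([0, 1, 2], [0, 1])
Z = [[xv ** 2 + yv ** 2 for xv, yv in zip(xrow, yrow)]
     for xrow, yrow in zip(X, Y)]
```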

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling ('imagesc' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D ribbon graph displays a matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array, where the n rows represent the number of points and the two columns represent the x and y coordinates of the points respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)

The centroid of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of a shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).

Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
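The steps of Algorithm 1 can be sketched as follows. This is a deliberately simplified illustration that normalizes translation and scale only; the full algorithm also aligns rotations (e.g., by Procrustes analysis), which is omitted here for brevity:

```python
import math

def center(shape):
    """Translate a shape [(x, y), ...] so its centroid lies at the origin."""
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    return [(x - cx, y - cy) for x, y in shape]

def scale_to_unit(shape):
    """Scale a centred shape so its RMS point distance (its size) is 1."""
    s = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / s, y / s) for x, y in shape]

def mean_shape(shapes):
    """Point-wise average of a list of shapes with corresponding points."""
    k = len(shapes)
    return [(sum(s[i][0] for s in shapes) / k,
             sum(s[i][1] for s in shapes) / k)
            for i in range(len(shapes[0]))]

def align_shapes(shapes, iters=10):
    """Algorithm 1 with translation and scale only (rotation omitted):
    normalise every shape, then iterate mean estimation / re-normalisation."""
    aligned = [scale_to_unit(center(s)) for s in shapes]
    mean = aligned[0]
    for _ in range(iters):
        mean = scale_to_unit(center(mean_shape(aligned)))
    return aligned, mean
```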

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the overall shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x_hat = x_bar + Phi b    (4.3)

where
x_hat is the shape vector generated by the model,
x_bar is the mean shape, the average of the aligned training shapes x_i,
Phi is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.
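The shape-generation equation above (x_hat = x_bar + Phi b) is a simple matrix-vector operation. A minimal Python sketch, with the mean shape stored as a 2n-vector and Phi as a list of rows (one row per coordinate, one column per retained eigenvector):

```python
def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi @ b for a 2n-vector mean, a 2n x k eigenvector
    matrix phi (given as a list of rows), and a k-vector of shape
    parameters b."""
    return [m + sum(phi_row[j] * b[j] for j in range(len(b)))
            for m, phi_row in zip(mean, phi)]

# Two points stored as (x1, x2, y1, y2); one hypothetical mode of variation
# that moves the first point right and the second point left.
mean = [0, 0, 1, 1]
phi = [[1], [-1], [0], [0]]
shape = generate_shape(mean, phi, [0.5])
```

Varying the entries of b within the limits learnt from training sweeps through the allowable shapes.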

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called 'whiskers', and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile g_bar and the covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile g_bar is calculated using the Mahalanobis distance, given by

d = (g - g_bar)^T Sg^-1 (g - g_bar)
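The Mahalanobis comparison just described can be sketched as follows. For brevity, this illustrative version assumes the profile covariance Sg is diagonal (so its inverse is just the element-wise reciprocal of the variances); the full ASM inverts the complete covariance matrix:

```python
def mahalanobis_diag(g, g_mean, variances):
    """(g - g_mean)^T Sg^-1 (g - g_mean), assuming for simplicity that the
    profile covariance Sg is diagonal with the given per-element variances."""
    return sum((gi - mi) ** 2 / v
               for gi, mi, v in zip(g, g_mean, variances))

# The candidate offset along the whisker with the smallest distance wins.
candidates = [[1.0, 2.0], [0.1, 0.2], [5.0, 5.0]]
g_mean, variances = [0.0, 0.0], [1.0, 4.0]
best = min(candidates, key=lambda g: mahalanobis_diag(g, g_mean, variances))
```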

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is repeated for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, not a bone X-ray, is shown for illustration.
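The image pyramid used by the multi-resolution search can be sketched in Python; this illustrative version halves the resolution at each level by averaging 2x2 blocks (the actual smoothing/subsampling scheme in the thesis code may differ):

```python
def downsample(img):
    """Halve an image (list of lists of numbers) by averaging 2x2 blocks;
    odd trailing rows/columns are dropped."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def image_pyramid(img, levels=3):
    """Build a multi-resolution pyramid; level 0 is the original image and
    each later level is half the size of the previous one."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

The search starts at the coarsest level, where the shape can lock on from further away, and the result is refined at each finer level.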

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone. It is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


principal curvature image across scale space. The watershed algorithm provides a more efficient mechanism for defining structural regions than previous methods that fit circles, ellipses, and parallelograms [8, 27]. To improve the watershed's robustness to noise and other small image perturbations, we first "clean" the principal curvature image with a gray scale morphological close operation, followed by a new hysteresis thresholding method based on local eigenvector flow. The watershed transform is then applied to the cleaned principal curvature image, and the resulting watershed regions (i.e., the catchment basins) define the PCBR regions. To achieve robust detections across multiple scales, the watershed is applied to the maxima of three consecutive images in the principal curvature scale space (similar to the local scale-space extrema used by Lowe [13], Mikolajczyk and Schmid [17], and others), and we further search for stable PCBR regions across consecutive scales, an idea adapted from the stable regions detected across multiple threshold levels used by the MSER detector [15].
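The principal curvature image at the heart of this pipeline comes from the eigenvalues of the per-pixel Hessian. The following Python sketch (not the authors' implementation, and omitting the Gaussian scale-space smoothing applied before differentiation) estimates the maximum Hessian eigenvalue at each pixel with central finite differences:

```python
import math

def principal_curvature(img):
    """Per-pixel maximum eigenvalue of the 2x2 Hessian, estimated with
    central finite differences; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
            iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
            ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
                   - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
            # eigenvalues of [[ixx, ixy], [ixy, iyy]] via trace/determinant
            tr, det = ixx + iyy, ixx * iyy - ixy * ixy
            disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
            out[y][x] = tr / 2 + disc
    return out
```

Dark curvilinear structures on a bright background produce large positive responses, which is what the subsequent cleaning and watershed steps operate on.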

While PCBR shares similar ideas with previous detectors, it represents a very different approach to detecting interest regions. Many prior intensity-based detectors search for points with distinctive local differential geometry, such as corners, while ignoring image features such as lines and edges. Conversely, PCBR utilizes line and edge features to construct structural interest regions. Compared to MSER, PCBR differs in two important aspects. First, MSER does not analyze regions in scale space, so it does not provide different levels of region abstraction. Second, MSER's intensity-based threshold process cannot overcome local intensity variations within regions. PCBR, however, overcomes this difficulty by focusing on region boundaries rather than on the appearance of region interiors. This work makes two contributions. First, we develop a new interest operator that utilizes principal curvature to extract robust and invariant region structures based on both edge and curvilinear features. Second, we introduce an enhanced principal-curvature-based watershed segmentation and robust region selection process that is robust to intra-class variations and is more efficient than previous structure-based detectors. We demonstrate the value of our PCBR detector by applying it to object-class recognition problems and symmetry detection.

Image processing is a form of signal processing in which images and their properties are used to gather and analyze information about the objects in the image. Digital image processing uses digital images and computer algorithms to enhance, manipulate, or transform images to obtain the necessary information and make decisions accordingly. Examples of digital image processing include the improvement and analysis of the images from the Surveyor missions to the moon [15], magnetic resonance imaging scans of the brain, and electronic face recognition packages. These techniques can be used to assist humans with complex tasks and make them easier. A detailed analysis of an X-ray can help a radiologist decide whether a bone is fractured or not. Digital image processing can increase the credibility of the decisions made by humans.

1.2 Introduction to Medical Imaging

Image processing techniques have developed and are applied to various fields like space programs, aerial and satellite imagery, and medicine [15]. Medical imaging is the set of digital image processing techniques that create and analyze images of the human body to assist doctors and medical scientists. In medicine, imaging is used for planning surgeries, X-ray imaging of bones, magnetic resonance imaging, endoscopy, and many other useful applications [31]. Digital X-ray imaging is used in this thesis project. Figure 1.1 shows the applications of digital imaging in medical imaging. Since Wilhelm Roentgen discovered X-rays in 1895 [14], X-ray technology has improved considerably. In medicine, X-rays help doctors to see inside a patient's body without surgery or any physical damage. X-rays can pass through solid objects without altering the physical state of the object because they have a small wavelength. When this radiation is passed through a patient's body, objects of different density cast shadows of different intensities, resulting in black-and-white images. Bone, for example, is shown in white, as it is opaque, and air is shown in black; the other tissues in the body appear in gray. A detailed analysis of the bone structure can be performed using X-rays, and any fractures can be detected. Conventionally, X-rays were taken on special photographic films using silver salts [28]. Digital X-rays can be taken using crystal photodiodes, which contain cadmium tungstate or bismuth germanate to capture light as electrical pulses. The signals are then converted from analogue to digital and can be viewed on computers.

Digital X-rays are very advantageous as they are portable, require less energy than conventional X-rays, are less expensive, and are environmentally friendly [28]. A radiologist looks at the X-rays and determines whether a bone is fractured or not. This process is time consuming and unreliable, because the probability of a fractured bone is low. Some fractures are easy to detect, and a system can be developed to detect them automatically. This will assist doctors and radiologists in their work and will improve the accuracy of the results [28]. According to the observations of [27], only 11% of the femur X-rays showed fractured bones, so the radiologist has to look at a lot of X-rays to find a fractured one. An algorithm to automatically detect bone fractures could help the radiologist find the fractured bones, or at least confidently sort out the healthy ones. However, no single algorithm can be used for the whole body, because of the complexity of different bone structures. Even though a lot of research has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to the problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person, so finding a general method to locate the bone and decide whether it is fractured is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are bone orientation in the X-ray, extracting bone contour information, bone segmentation, and extraction of relevant features.

1.3 Description of the Problem

This thesis investigates different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and it connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.

2.1 Theory Development

A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding, and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis; the other processes are briefly described for completeness and to inform the reader of the processes in the whole system.

2.1.1 Image Segmentation

Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contours of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.

2 Segmentation of Images - An Overview

Image segmentation can proceed in three different ways:

• Manually
• Automatically
• Semiautomatically

2.1 Manual Segmentation

The pixels belonging to the same intensity range could be pointed out manually, but clearly this is a very time consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.

2.2 Automatic Segmentation

Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level, or the intensities of the objects having a special distribution.

2.3 Semiautomatic Segmentation

Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.

• Thresholding

If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:

if B(i, j) >= a, then B(i, j) = 1 (object); else B(i, j) = 0 (background), for all i, j over the image B [YGV].

This can be repeated for each region, dividing them by the threshold value, which results in four regions, etc. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region; these pixels could, for instance, appear from noise. The simplest choice of threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.

The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image corresponding to the two segmented regions and compute a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

and repeat until the threshold value no longer changes. Finally, choose this value for the threshold segmentation.
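The iteration described above can be sketched in a few lines of NumPy. The starting guess (the global mean) and the 0.5 stopping tolerance are assumptions made for illustration; the text leaves both open:

```python
import numpy as np

def isodata_threshold(image, t0=None, tol=0.5):
    """Iteratively refine a threshold: split the pixels at t, average the
    means of the two resulting classes, and repeat until t stabilizes."""
    img = np.asarray(image, dtype=float)
    t = img.mean() if t0 is None else float(t0)
    while True:
        fg = img[img >= t]
        bg = img[img < t]
        # Guard against an empty class (e.g. a flat image)
        if fg.size == 0 or bg.size == 0:
            return t
        t_new = 0.5 * (fg.mean() + bg.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```

On a clearly bimodal image the iteration converges in a few steps to a value between the two modes.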

To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels as in Figure 2.1. Draw a line between the maximum value of the histogram, h_max, and the minimum value, h_min, and calculate the distance d between the line and the histogram. Increase h_min and repeat for all h until h = h_max. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek form a weak peak.
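A minimal sketch of the triangle method follows. It assumes the object tail lies to the right of the histogram peak and uses the vertical distance between the line and the histogram as d; both are simplifications of the geometric construction described above:

```python
import numpy as np

def triangle_threshold(image, bins=256):
    """Triangle method: draw a line from the histogram peak to the last
    non-empty bin and pick the bin whose count lies farthest below it."""
    img = np.asarray(image)
    hist, edges = np.histogram(img, bins=bins)
    peak = int(np.argmax(hist))                  # h_max
    tail = int(np.nonzero(hist)[0].max())        # far end of the tail
    if tail == peak:
        return edges[peak]                       # degenerate histogram
    # distance of each histogram bar below the peak-to-tail line
    x = np.arange(peak, tail + 1)
    line = hist[peak] + (hist[tail] - hist[peak]) * (x - peak) / (tail - peak)
    d = line - hist[peak:tail + 1]
    best = peak + int(np.argmax(d))
    return edges[best]
```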

• Boundary tracking

Edge-finding by gradients is the method of selecting a boundary point manually and automatically following the gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region and will encounter problems if the gradient specifying the boundary varies or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation; compared to the threshold method alone, this will exclude some wrongly included pixels.

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem where ∇² = ∂²/∂x². Assume the boundary is blurred so that the gradient has a shape like in Figure 2.2; the Laplacian then changes sign just around the assumed edge at position = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.

• Clustering Methods: Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector but

2.1.2 Feature Extraction

Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally a combination of features is used to generate a model for the images. Cross validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1 and define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.
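A toy sketch of such a weighted feature vector follows. The feature set (mean, standard deviation) and the weights 0.9 and 0.3 follow the example in the text, while the function name and dictionary layout are invented here for illustration:

```python
import numpy as np

def weighted_features(image, weights):
    """Build a weighted feature vector from simple image statistics.
    In practice both the features and their weights would be chosen by
    cross validation on a training set."""
    img = np.asarray(image, dtype=float)
    feats = {"mean": img.mean(), "std": img.std()}
    return {name: weights[name] * value for name, value in feats.items()}

# Example: weight the mean at 0.9 and the standard deviation at 0.3
scores = weighted_features([[0, 2], [4, 6]], {"mean": 0.9, "std": 0.3})
```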

To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD) are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of healthy femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM [22] and found the feature values for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a certain direction that is very different from the gradient vector of a fractured long-bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

2.1.4 Thresholding and Error Classification

Thresholding and error classification form the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value, and multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.

Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image; (b) the histogram of the image
Figure 2.3: Histogram of an image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality; contrast stretching, density slicing, edge enhancement, and spatial filtering are the more commonly used. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions, and enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.

Contrast

Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities, which enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white; the remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
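The stretch can be sketched as follows. Mapping the band's own minimum and maximum to the 0..255 display range is an assumption for illustration; in practice the endpoints are often chosen from histogram percentiles instead:

```python
import numpy as np

def linear_stretch(band, out_min=0.0, out_max=255.0):
    """Map the band's min..max range linearly onto the display range."""
    band = np.asarray(band, dtype=float)
    lo, hi = band.min(), band.max()
    if hi == lo:                       # flat band: nothing to stretch
        return np.full_like(band, out_min)
    return (band - lo) / (hi - lo) * (out_max - out_min) + out_min
```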

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram; it can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density, obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
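Histogram equalization via the cumulative distribution can be sketched as below; this is a minimal version for integer grey levels, and real implementations differ in how rounding and empty levels are handled:

```python
import numpy as np

def equalize(image, levels=256):
    """Remap grey values through the normalized cumulative histogram so
    the output population is roughly uniform; sparsely used adjacent
    levels end up merged, as described in the text."""
    img = np.asarray(image)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                          # normalise to 0..1
    lut = np.round(cdf * (levels - 1)).astype(img.dtype)
    return lut[img]                              # apply the lookup table
```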

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area; conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding it, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
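A minimal sketch of the n x n mean filter using border replication (option (1) above), so that the output keeps the input's size; the direct double loop trades speed for clarity:

```python
import numpy as np

def mean_filter(image, n=3):
    """Low-pass filter: replace each pixel by the mean of its n x n
    neighbourhood. The image is padded by repeating border pixel values
    so no lines or columns are lost."""
    img = np.asarray(image, dtype=float)
    pad = n // 2
    padded = np.pad(img, pad, mode="edge")       # repeat border values
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + n, j:j + n].mean()
    return out
```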

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or nonlinear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence images produced with this operation have a more natural look than many other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
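The idea can be sketched as a simple per-pixel ratio of two co-registered bands: a multiplicative illumination factor (slope, shadow) common to both bands cancels in the quotient. The epsilon guard is an implementation detail added here, not part of the text:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Per-pixel ratio of two co-registered bands; illumination factors
    that scale both bands equally cancel out."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    return a / (b + eps)              # eps avoids division by zero

# Same surface, one half shadowed: the ratio is (nearly) unchanged
sunlit = band_ratio(np.array([80.0]), np.array([40.0]))
shaded = band_ratio(np.array([40.0]), np.array([20.0]))
```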

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains the methods, discussed in this thesis, that researchers have used to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after edge detection is performed on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models (ASMs), introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs, also called classical ASMs by [24], by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained to the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.

The working mechanisms of the methods discussed above are explained in detail in

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no fully automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
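A direct (unoptimized) sketch of the operator follows. Border pixels are left at zero for simplicity, which is an implementation choice here, not part of the text:

```python
import numpy as np

# Sobel kernels approximating the horizontal and vertical derivatives
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel(image):
    """Return gradient magnitude and direction at each interior pixel."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    dx = np.zeros((h, w))
    dy = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = img[i - 1:i + 2, j - 1:j + 2]
            dx[i, j] = (window * KX).sum()   # elementwise products summed
            dy[i, j] = (window * KY).sum()   # (correlation form)
    magnitude = np.hypot(dx, dy)             # Eq. 3.1
    direction = np.arctan2(dy, dx)           # Eq. 3.2
    return magnitude, direction
```

On a vertical dark-to-light step edge the magnitude is constant along the edge and the direction points across it, from darker to brighter values.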

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image, but the convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt, using the same variables as in the Sobel case; only the kernels used to calculate the directional derivatives differ.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detecting technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter; the Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If the pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
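The two-level thresholding with neighbour linking described above can be sketched as follows. The iterative seed-growing loop is one simple way to realize the linking, not necessarily how [26] implements it:

```python
import numpy as np

def hysteresis(strength, low, high):
    """Double thresholding with edge linking: pixels above `high` act as
    seeds; pixels between `low` and `high` survive only if connected
    (8-neighbourhood) to a seed, directly or through other such pixels."""
    s = np.asarray(strength, dtype=float)
    strong = s >= high
    weak = s >= low
    out = strong.copy()
    changed = True
    while changed:                      # grow seeds into weak pixels
        changed = False
        for i, j in np.argwhere(weak & ~out):
            i0, i1 = max(i - 1, 0), min(i + 2, s.shape[0])
            j0, j1 = max(j - 1, 0), min(j + 2, s.shape[1])
            if out[i0:i1, j0:j1].any():   # touches an accepted pixel
                out[i, j] = True
                changed = True
    return out.astype(np.uint8)
```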

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it: it quantifies the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis; range filtering calculates the local range of an image

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ]
           [ Ixy(x, σD)  Iyy(x, σD) ]        (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them.

Rather than finding extremal "points", our detector applies the watershed algorithm to the ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)        (2)

or

P(x) = min(λ2(x), 0)        (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2..6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5        (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
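The per-pixel computation of Eq. 2 and Eq. 3 can be sketched as below. Plain central differences stand in for the Gaussian-scale derivatives, and the scale-space pyramid is omitted; both are simplifications made for illustration only:

```python
import numpy as np

def principal_curvature(image, positive=True):
    """Clamped extremal eigenvalue of the Hessian at each pixel:
    Eq. 2 (positive=True) responds to dark lines on a light background,
    Eq. 3 (positive=False) to light lines on a dark background."""
    I = np.asarray(image, dtype=float)
    # second-order partial derivatives via finite differences
    Ixx = np.gradient(np.gradient(I, axis=1), axis=1)
    Iyy = np.gradient(np.gradient(I, axis=0), axis=0)
    Ixy = np.gradient(np.gradient(I, axis=0), axis=1)
    # closed-form eigenvalues of the 2x2 symmetric Hessian
    tr = Ixx + Iyy
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam1 = tr / 2.0 + disc        # maximum eigenvalue
    lam2 = tr / 2.0 - disc        # minimum eigenvalue
    if positive:
        return np.maximum(lam1, 0.0)   # Eq. 2
    return np.minimum(lam2, 0.0)       # Eq. 3
```

A dark vertical line on a bright background produces a strong positive response along the line and zero response in flat regions, matching the behaviour Eq. 2 is designed for.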

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
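The cleaning steps above (closing, then eigenvector-flow hysteresis) can be sketched as follows. This is illustrative only: the 0.75 cutoff on the averaged dot product is an assumption (the text says only "high enough"), and the np.roll neighbor lookup wraps around at image borders, which a real implementation would handle explicitly:

```python
import numpy as np
from scipy import ndimage

def disk(radius=2):
    """Boolean disk footprint; radius 2 gives the 5x5 structuring element."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2

def eigenvector_flow_hysteresis(mp, vx, vy, high=0.04):
    """mp: max-curvature image; (vx, vy): unit major-eigenvector field."""
    # grayscale closing removes small "potholes" in the curvature terrain
    closed = ndimage.grey_closing(mp, footprint=disk(2))
    # average |dot product| of each pixel's eigenvector with its 8 neighbours
    support = np.zeros_like(mp)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            nvx = np.roll(np.roll(vx, dy, 0), dx, 1)
            nvy = np.roll(np.roll(vy, dy, 0), dx, 1)
            support += np.abs(vx * nvx + vy * nvy)
    support /= 8.0
    # coherent flow -> low-to-high ratio 0.2, otherwise 0.7
    low = high * np.where(support > 0.75, 0.2, 0.7)
    strong = closed >= high
    weak = closed >= low
    # grow strong seeds into connected weak pixels
    lbl, _ = ndimage.label(weak)
    keep = np.unique(lbl[strong])
    return np.isin(lbl, keep) & weak
```

A weak but coherently oriented ridge segment is kept because its pixels receive a low threshold of 0.008 rather than 0.028, so hysteresis can bridge it from a strong seed.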

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
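The scale-stability filter above can be sketched as follows, with regions represented as binary masks. The 0.3 overlap-error tolerance is an assumed value for illustration, not taken from the text (which defers to [19] for the exact overlap-error computation):

```python
import numpy as np

def overlap_error(a, b):
    """1 - intersection/union of two binary region masks (cf. [19])."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_regions(scales, max_err=0.3):
    """Keep regions re-detected (within max_err) in three consecutive scales.

    scales: list over scales, each entry a list of binary masks."""
    stable = []
    for s in range(1, len(scales) - 1):
        for r in scales[s]:
            if any(overlap_error(r, p) <= max_err for p in scales[s - 1]) and \
               any(overlap_error(r, n) <= max_err for n in scales[s + 1]):
                stable.append(r)
    return stable
```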

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated while more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes in the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
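The clustering step can be sketched as plain k-means on per-pixel chromatic vectors. This is a sketch under stated assumptions: a deterministic farthest-point initialization stands in for whatever seeding De-pict uses, and the complete-linkage refinement and k-NN inpainting steps are omitted:

```python
import numpy as np

def kmeans_colors(pixels, k, iters=20):
    """Plain k-means on chromatic feature vectors; pixels is (n, 3) float."""
    # deterministic farthest-point initialization (an assumption; the
    # original algorithm's seeding is not specified here)
    centers = [pixels[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(pixels - c, axis=1) for c in centers], axis=0)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        # assign every pixel to its nearest mean chromatic vector
        labels = np.linalg.norm(pixels[:, None] - centers[None], axis=2).argmin(1)
        # re-estimate each cluster's mean chromatic vector
        for i in range(k):
            if np.any(labels == i):
                centers[i] = pixels[labels == i].mean(0)
    return labels, centers
```

After convergence, each label's center is the mean chromatic feature vector describing one putative stroke layer.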

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{(p,q)∈N} |e_pq| · T[L_p ≠ L_q]    (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
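The E-M loop above can be sketched as follows. This is illustrative, not the authors' implementation: iterated conditional modes (ICM) stands in for the graph-cut E-step of [12], `lam` plays the role of λ, the pairwise penalty simply counts disagreeing 4-neighbors (unit edge lengths), and the per-layer Gaussians reduce to squared distances to the mean chromatic vectors:

```python
import numpy as np

def em_segment(features, centers, lam=1.0, iters=5):
    """features: (h, w, 3) float image; centers: (k, 3) float initial models."""
    h, w, _ = features.shape
    # initial labeling: nearest mean chromatic vector
    labels = np.linalg.norm(features[..., None, :] - centers, axis=-1).argmin(-1)
    for _ in range(iters):
        # E-step (ICM approximation): pick the label minimizing appearance
        # cost plus lam * (number of 4-neighbours with a different label)
        for y in range(h):
            for x in range(w):
                cost = np.linalg.norm(features[y, x] - centers, axis=1) ** 2
                for k in range(len(centers)):
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w and labels[yy, xx] != k:
                            cost[k] += lam
                labels[y, x] = cost.argmin()
        # M-step: re-estimate the k mean chromatic vectors
        for k in range(len(centers)):
            if np.any(labels == k):
                centers[k] = features[labels == k].mean(0)
    return labels, centers
```

ICM only finds a local minimum of Eq. (1), whereas graph cuts give strong optimality guarantees for this two-term energy; the sketch is meant to show the structure of the E- and M-steps, not to replace the solver.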

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail.

To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line-segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT scanner (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus; a bolus-tracking ROI is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec is injected with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.

The available CTA data consist of 250 2-D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding first keeps the parts whose values are greater than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700;

In each of these new images, sub-segmental vessels remain in the lung region. At the second step, the following method has been used to get rid of these vessels: each 2-D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled again with the connected-component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component is kept, the other parts are removed from the image, and the complement is then taken, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach pixel columns 1 or 512, the components that satisfy this condition are removed, and the lungs and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled once more with the connected-component labeling algorithm, and components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
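The per-slice pipeline described above can be sketched with scipy's labeling tools. This is a sketch under stated assumptions: the "touches column 1 or 512" rule is generalized to "touches any image border", and `min_size` corresponds to the 1000-pixel cutoff in the text:

```python
import numpy as np
from scipy import ndimage

def segment_lungs(slice_2d, body_thresh=700, min_size=1000):
    """Threshold / label / filter pipeline for one 2-D CTA slice."""
    # 1) threshold: body pixels are brighter than ~700
    mask = slice_2d > body_thresh
    # 2) remove small components (sub-segmental vessels)
    lbl, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, lbl, range(1, n + 1))
    mask = np.isin(lbl, 1 + np.flatnonzero(np.asarray(sizes) >= min_size))
    # 3) keep the largest component (the body), then take the complement
    lbl, n = ndimage.label(mask)
    if n:
        sizes = np.asarray(ndimage.sum(mask, lbl, range(1, n + 1)))
        mask = lbl == (1 + sizes.argmax())
    inv = ~mask
    # 4) drop components touching the image border (air outside the body)
    lbl, n = ndimage.label(inv)
    border = set(lbl[0]) | set(lbl[-1]) | set(lbl[:, 0]) | set(lbl[:, -1])
    lungs = inv & ~np.isin(lbl, list(border))
    # 5) remove small components again (airways)
    lbl, n = ndimage.label(lungs)
    sizes = np.asarray(ndimage.sum(lungs, lbl, range(1, n + 1))) if n else np.array([])
    return np.isin(lbl, 1 + np.flatnonzero(sizes >= min_size))
```

Multiplying the returned mask with the original slice would give the "original segmented lung image" of Figure 6(c).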

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions, implemented in the open MATLAB language, can also be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that is displayed, in extreme close-up view, in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram that represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data points. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The ''meshgrid'' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
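The meshgrid behaviour described above has a direct numpy analogue (a sketch; `np.meshgrid` with its default 'xy' indexing matches MATLAB's `meshgrid`):

```python
import numpy as np

# meshgrid turns two coordinate vectors into coordinate matrices:
# rows of X are copies of x, columns of Y are copies of y
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
# X = [[1, 2, 3],      Y = [[10, 10, 10],
#      [1, 2, 3]]           [20, 20, 20]]
Z = X ** 2 + Y  # evaluate f(x, y) = x^2 + y over the whole grid at once
```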

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. ''Lighting'' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The ''image'' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). ''Image'' with colormap scaling (the ''imagesc'' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. ''Jet'' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, which also describes the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. The shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the 1st shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

   (a) Align all shapes to the mean shape.

   (b) Recalculate the mean shape from the aligned shapes.

   (c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
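Algorithm 1 can be sketched as follows. This is a minimal sketch, not the thesis code: rotation alignment uses the standard SVD-based Procrustes solution, and reflection handling is omitted:

```python
import numpy as np

def center_and_scale(shape):
    """shape: (n, 2) point array; translate to origin, scale to unit size."""
    s = shape - shape.mean(0)
    return s / np.linalg.norm(s)

def align_to(shape, ref):
    """Optimal rotation of `shape` onto `ref` (Procrustes; no reflection check)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def align_shapes(shapes, iters=10):
    shapes = [center_and_scale(s) for s in shapes]
    mean = shapes[0]                       # reference: the 1st shape
    for _ in range(iters):
        shapes = [align_to(s, mean) for s in shapes]
        new_mean = center_and_scale(np.mean(shapes, axis=0))
        if np.allclose(new_mean, mean):    # convergence test of step 5
            break
        mean = new_mean
    return shapes, mean
```

Re-normalizing the mean at each pass corresponds to the constraint in step 4(c), which prevents the mean shape from drifting or shrinking.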

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the images were resized to the same dimensions; this ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the overall shape stays plausible. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb    (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), Φ is the matrix of eigenvectors of the shape covariance, and b is a vector of shape parameters.
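The linear shape model x̂ = x̄ + Φb can be sketched with PCA on the aligned shape vectors. This is a sketch under stated assumptions: `var_keep=0.98` is an assumed fraction of retained variance, not a value from the text:

```python
import numpy as np

def build_shape_model(shapes, var_keep=0.98):
    """shapes: (m, 2n) matrix of aligned training shape vectors.

    Returns the mean shape x_bar, the eigenvector matrix Phi, and the
    retained eigenvalues, so that x_hat = x_bar + Phi @ b."""
    xbar = shapes.mean(0)
    cov = np.cov(shapes - xbar, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]          # sort eigenpairs, largest first
    vals, vecs = vals[order], vecs[:, order]
    # keep enough modes to explain var_keep of the total variance
    t = int(np.searchsorted(np.cumsum(vals) / vals.sum(), var_keep)) + 1
    return xbar, vecs[:, :t], vals[:t]

def generate_shape(xbar, phi, b):
    """x_hat = x_bar + Phi b."""
    return xbar + phi @ b
```

Varying each entry of b (typically within ±3 standard deviations, i.e., ±3√eigenvalue) generates the permissible shape variations.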

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by (g − ḡ)ᵀ Sg⁻¹ (g − ḡ). If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the result is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile matches while the resulting shape could be completely different. So, as mentioned before, the two models restrict each other.

A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid; the resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the sizes of the images given relative to the first image. A general picture, rather than a bone X-ray, is shown.
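The profile-matching step can be sketched as follows. This is a minimal sketch: it assumes the candidate profiles along a whisker have already been sampled, and simply scores each against the model's mean profile with the Mahalanobis distance:

```python
import numpy as np

def mahalanobis(g, gbar, Sg_inv):
    """Mahalanobis distance of a test profile g from the mean profile gbar."""
    d = np.asarray(g, float) - gbar
    return float(d @ Sg_inv @ d)

def best_offset(profiles, gbar, Sg_inv):
    """Index of the whisker offset whose profile best matches the model."""
    return int(np.argmin([mahalanobis(g, gbar, Sg_inv) for g in profiles]))
```

In a full ASM search, `best_offset` would be evaluated for every landmark's whisker, after which the shape model constrains the collected landmark moves.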

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


make them easier. A detailed analysis of an X-ray can help a radiologist to decide whether a bone is fractured or not. Digital image processing can increase the credibility of the decisions made by humans.

1.2 Introduction to Medical Imaging

Image processing techniques have developed and are applied to various fields like space programs, aerial and satellite imagery and medicine [15]. Medical imaging is the set of digital image processing techniques that create and analyze images of the human body to assist doctors and medical scientists. In medicine, imaging is used for planning surgeries, X-ray imaging of bones, magnetic resonance imaging, endoscopy and many other useful applications [31]. Digital X-ray imaging is used in this thesis project. Figure 1.1 shows the applications of digital imaging in medical imaging. Since Wilhelm Roentgen discovered X-rays in 1895 [14], X-ray technology has improved considerably. In medicine, X-rays help doctors to see inside a patient's body without surgery or any physical damage. X-rays can pass through solid objects without altering the physical state of the object because they have a small wavelength. So when this radiation is passed through a patient's body, objects of different density cast shadows of different intensities, resulting in black-and-white images. The bone, for example, will be shown in white as it is opaque, and air will be shown in black; the other tissues in the body will be in gray. A detailed analysis of the bone structure can be performed using X-rays, and any fractures can be detected. Conventionally, X-rays were taken on special photographic films using silver salts [28]. Digital X-rays can be taken using crystal photodiodes, which contain cadmium tungstate or bismuth germanate to capture light as electrical pulses. The signals are then converted from analogue to digital and can be viewed on computers.

Digital X-rays are very advantageous as they are portable, require less energy than normal X-rays, are less expensive and are environmentally friendly [28]. A radiologist would look at the X-rays and determine whether a bone is fractured or not. This system is time consuming and unreliable because the probability of a fractured bone is low. Some fractures are easy to detect, and a system can be developed to automatically detect fractures. This will assist the doctors and radiologists in their work and will improve the accuracy of the results [28]. According to the observations of [27], only 11% of the femur X-rays showed fractured bones, so the radiologist has to look at a lot of X-rays to find a fractured one. An algorithm to automatically detect bone fractures could help the radiologist to find the fractured bones, or at least confidently sort out the healthy ones. But no single algorithm can be used for the whole body, because of the complexity of different bone structures. Even though a lot of research has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to the problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person, so finding a general method to locate the bone and decide whether it is fractured or not is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are bone orientation in the X-ray, extraction of bone contour information, bone segmentation and extraction of relevant features.

1.3 Description of the Problem

This thesis investigates the different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone that was used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.

2.1 Theory Development

A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 2.1 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis. The other processes are briefly described for completeness and to inform the reader of the processes in the whole system.

2.1.1 Image Segmentation

Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image; it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.

2 Segmentation of Images - An Overview

Image segmentation can proceed in three different ways:

• Manually
• Automatically
• Semiautomatically

2.1 Manual Segmentation

The pixels belonging to the same intensity range could be pointed out manually, but this is clearly a very time-consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. What the manual techniques all have in common is the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.

2.2 Automatic Segmentation

Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level, or knowledge of the objects having a special distribution.

2.3 Semiautomatic Segmentation

Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.

• Thresholding

If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a as follows:

if B(i, j) >= a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all (i, j) over the image B [YGV].

This can be repeated for each region, dividing them by the threshold value, which results in four regions, and so on. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region; these pixels could, for instance, appear from noise. The simplest way of choosing the threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.
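As an illustration, the thresholding rule above can be written as a short NumPy sketch (the image values and the threshold a are hypothetical, chosen only for demonstration):

```python
import numpy as np

def threshold_image(B, a):
    """Apply the rule above: B(i, j) >= a -> 1 (object), else 0 (background)."""
    return (B >= a).astype(np.uint8)

# Hypothetical 4x4 image and hand-chosen threshold, for illustration only
B = np.array([[10, 200, 30, 220],
              [15, 210, 25, 230],
              [12, 190, 35, 240],
              [11, 205, 28, 250]])
mask = threshold_image(B, a=100)
```

Applying the rule again inside each of the two regions, with a new threshold, yields four regions, as described above.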

The isodata algorithm is an iterative process for finding the threshold value [YGV]. First segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image corresponding to each of the two segmented regions, and calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

and repeat until the threshold value does not change any more. Finally, choose this value for the threshold segmentation.
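The iteration above can be sketched as follows (a minimal illustration on a synthetic bimodal "image"; the starting threshold and stopping tolerance are assumptions, since the text does not fix them):

```python
import numpy as np

def isodata_threshold(image, tol=0.5):
    """Iterate: split at t, recompute t as the average of the two region
    means, and stop when t no longer changes by more than tol."""
    img = np.asarray(image, dtype=float)
    t = img.mean()                        # temporary starting threshold
    while True:
        low, high = img[img < t], img[img >= t]
        if low.size == 0 or high.size == 0:
            return t
        t_new = (low.mean() + high.mean()) / 2
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

# Synthetic bimodal data: dark region around 20, bright region around 200
rng = np.random.default_rng(0)
img = np.concatenate([rng.normal(20, 5, 500), rng.normal(200, 5, 500)])
t = isodata_threshold(img)
```

For well-separated modes the threshold settles roughly midway between the two region means.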

To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram, hmax, and the minimum value, hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek form a weak peak in the histogram.
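One common variant of this procedure can be sketched as below: rather than looping over h explicitly, it computes the perpendicular distance of every histogram point to the peak-to-tail line in one pass (the choice of the last non-empty bin as hmin is an assumption of this sketch):

```python
import numpy as np

def triangle_threshold(hist):
    """Pick the bin whose perpendicular distance to the line from the
    histogram peak (hmax) to its far end (hmin) is greatest."""
    hist = np.asarray(hist, dtype=float)
    peak = int(np.argmax(hist))
    end = int(np.nonzero(hist)[0][-1])    # last non-empty bin after the peak
    if end <= peak:
        return peak
    x = np.arange(peak, end + 1)
    y = hist[peak:end + 1]
    x1, y1, x2, y2 = peak, hist[peak], end, hist[end]
    # Perpendicular distance from each histogram point to the line
    d = np.abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    d /= np.hypot(y2 - y1, x2 - x1)
    return int(x[np.argmax(d)])

# Linearly decaying histogram with a weak bump at grey level 60
hist = np.concatenate([np.linspace(100, 1, 100), np.zeros(28)])
hist[60] += 40
t = triangle_threshold(hist)  # the weak peak at 60 is selected
```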

• Boundary tracking

Edge-finding by gradients is the method of selecting a boundary point manually and automatically following the gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region, and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method alone.
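The gradient-then-threshold step described above can be sketched as follows (a toy image with a single vertical edge is used; the threshold value is an assumption for illustration):

```python
import numpy as np

def gradient_threshold(image, t):
    """Keep only pixels whose gradient magnitude is at least t as
    candidate boundary points."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    return np.hypot(gx, gy) >= t

# Toy image with a single strong vertical edge between columns 3 and 4
img = np.zeros((8, 8))
img[:, 4:] = 100.0
edges = gradient_threshold(img, t=25)
```

Only the pixels on either side of the intensity step survive the threshold; uniform regions are excluded.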

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1-D problem, where the Laplacian reduces to the second derivative ∂²/∂x². Assume the boundary is blurred, so that the gradient has a shape like that in Figure 2.2: the Laplacian changes sign just around the assumed edge, at position 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
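A minimal 1-D sketch of this procedure, assuming a simple moving-average as the smoothing filter and a sigmoid as the blurred step edge:

```python
import numpy as np

def zero_crossings_1d(signal, smooth=3):
    """Smooth the profile (noise creates spurious second derivatives near
    zero), take the discrete second derivative, and report sign changes."""
    s = np.convolve(signal, np.ones(smooth) / smooth, mode="same")
    lap = np.diff(s, n=2)                 # discrete d^2/dx^2
    return np.where(np.diff(np.sign(lap)) != 0)[0] + 1

# Blurred step edge (sigmoid) centred between samples 9 and 10
x = np.arange(20)
profile = 1.0 / (1.0 + np.exp(-(x - 9.5)))
edges = zero_crossings_1d(profile)
```

The reported sign change lands at the centre of the blurred edge, as the argument above predicts.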

• Clustering methods

Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3-D vector.

2.1.2 Feature Extraction

Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.

To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image such as Gabor orientation (GO), Markov Random Field (MRF) and intensity gradient direction (IGD) are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look specifically at femur fractures. The best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF and IGD that gave the best performance overall. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a certain direction that is very different to the gradient vector of a fractured long-bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

2.1.4 Thresholding and Error Classification

Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that has distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or lesser than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.

Thresholding is used at different stages in this thesis. It is a simple and useful tool in image processing, and the following figures show its effects. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image (b) The histogram of the image

Figure 2.3: Histogram of an image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality; contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and the wide variety of digital processes available.

Contrast

Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extremely black, and a value at the high end is assigned to extremely white. The remaining pixel values are distributed linearly between these extremes. The features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
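The linear mapping described above can be sketched in a few lines (the 2x2 low-contrast image and the 0–255 output range are illustrative assumptions):

```python
import numpy as np

def linear_stretch(image, out_min=0.0, out_max=255.0):
    """Map the darkest input grey value to out_min, the brightest to
    out_max, and distribute all remaining values linearly in between."""
    img = np.asarray(image, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:                          # flat image: nothing to stretch
        return np.full_like(img, out_min)
    return (img - lo) / (hi - lo) * (out_max - out_min) + out_min

# Low-contrast image occupying only grey values 90..110
img = np.array([[90.0, 100.0], [105.0, 110.0]])
stretched = linear_stretch(img)
```

The narrow 90–110 input range is expanded to the full 0–255 display range, pulling the density values farther apart as described.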

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. The non-linear contrast enhancement techniques have been found to be useful for enhancing the colour contrast between nearly similar classes and subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
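Both non-linear techniques described above can be sketched briefly (these are minimal illustrations, not the implementations used in any cited work; the 0–255 range is assumed):

```python
import numpy as np

def log_stretch(image, out_max=255.0):
    """Logarithmic scaling: expands the darker grey values and
    compresses the brighter ones."""
    img = np.asarray(image, dtype=float)
    return out_max * np.log1p(img) / np.log1p(img.max())

def equalize(image, levels=256):
    """Histogram equalisation: remap grey values through the normalised
    cumulative histogram so the output population is roughly uniform."""
    img = np.asarray(image, dtype=int)
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    cdf = np.cumsum(hist) / hist.sum()
    lut = np.round(cdf * (levels - 1)).astype(int)  # grouping of grey values
    return lut[img]

ls = log_stretch(np.array([0.0, 255.0]))
eq = equalize(np.array([[0, 64], [128, 255]]))
```

Because several input grey values can map to the same output value through the look-up table, the equalised image has fewer distinct grey levels than the original, as noted above.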

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, and (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
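A minimal sketch of the n x n mean filter, using border technique (1) from above (repeating the border pixel values) so the output does not shrink:

```python
import numpy as np

def mean_filter(image, n=3):
    """n x n mean (low-pass) filter. The border is handled by padding the
    image with copies of its border pixel values, so the output keeps the
    original size instead of losing n-1 rows and columns."""
    img = np.asarray(image, dtype=float)
    pad = n // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(n):                   # sum the n*n shifted copies
        for dx in range(n):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (n * n)

img = np.zeros((5, 5))
img[2, 2] = 9.0                           # single bright pixel
smoothed = mean_filter(img, n=3)
```

The single bright pixel is spread over its 3x3 neighbourhood, illustrating the blurring effect of the smoothing operation.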

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.

The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, hence images produced with this operator have a more natural look than many of the other edge-enhanced images.
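A sketch of Laplacian-based enhancement, assuming the common 4-neighbour discrete Laplacian and edge-replicated borders (both are choices of this illustration, not prescribed by the text):

```python
import numpy as np

def laplacian_enhance(image):
    """Sharpen by subtracting the 4-neighbour discrete Laplacian, which
    highlights points, lines and edges and is ~0 in uniform regions."""
    img = np.asarray(image, dtype=float)
    p = np.pad(img, 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] +
           p[1:-1, :-2] + p[1:-1, 2:] - 4 * img)
    return img - lap

img = np.zeros((5, 5))
img[:, 2:] = 10.0                         # vertical step edge
sharp = laplacian_enhance(img)
```

Uniform regions are left unchanged, while the pixels on either side of the step are overshot and undershot, which is what visually sharpens the edge.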

Band Ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
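The cancellation argument above can be illustrated with two hypothetical band values for the same material, one sunlit and one in 50% shadow (the band names and numbers are invented for the example):

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two bands; a multiplicative illumination factor
    (slope, shadow) scales both bands equally and so cancels out."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    return a / (b + eps)                  # eps guards against division by zero

# Same surface material: first pixel in full sun, second in 50% shadow.
# The raw values differ by a factor of two, but the ratio is (nearly) equal.
nir = np.array([80.0, 40.0])
red = np.array([20.0, 10.0])
r = band_ratio(nir, red)
```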

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs to separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in Section 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction and deformable models (discussed in Section 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31], after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained to the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays, even though one tibia bone differs from another.

The working mechanisms of the methods discussed above are explained in detail in the following sections.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeo_ between automizing the algorithm and the accuracy of the results using the

Active Shape and Active Contour Models is examined in [31] If the model is made fully

automatic by estimating the initial conditions the accuracy will be lower than when the

initial conditions of the model are defined by user inputs [31] implements both manual and

automatic approaches and identifies that automatically segmenting bone structures from noisy

X-ray images is a complex problem This thesis project tackles these limitations The manual

and automatic approaches

are tried using Active Shape Models The relationship between the size of the

training set computation time and error are studied

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be represented by a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in intensity, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant intensity is a zero vector, and at a point on an edge it is a vector pointing across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and the other for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
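As a sketch of the procedure above (the function names are illustrative, not from the text; in practice a library convolution would be used), the Sobel kernels, the convolution, and the magnitude/direction of equations 3.1 and 3.2 can be written as:

```python
import numpy as np

# Sobel kernels for the horizontal and vertical derivatives (the matrices
# referred to by equations 3.3 and 3.4).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def convolve2d(image, kernel):
    """Plain 'valid' 2D convolution (kernel flipped, as in true convolution)."""
    k = np.flipud(np.fliplr(kernel))
    kh, kw = k.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

def sobel(image):
    dx = convolve2d(image, SOBEL_X)   # Dx
    dy = convolve2d(image, SOBEL_Y)   # Dy
    magnitude = np.hypot(dx, dy)      # |grad D| = sqrt(Dx^2 + Dy^2)  (Eq. 3.1)
    direction = np.arctan2(dy, dx)    # angle of grad D               (Eq. 3.2)
    return magnitude, direction
```

On a constant image the magnitude is zero everywhere; on a step edge it is nonzero only across the edge, as the text describes.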

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector: it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt differ from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel when calculating the directional derivative at that point [15][26]. This is why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt; the same variables as in the Sobel case are used. Only the kernels used to calculate the directional derivatives are different.
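The kernel difference described here can be written out directly (horizontal-derivative kernels shown; the vertical ones are their transposes):

```python
import numpy as np

# The only difference between the two detectors is the centre weight:
# Sobel weights the centre row by 2, Prewitt by 1, which is why Prewitt
# is more sensitive to noise.
PREWITT_X = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]])
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
```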

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the square root of the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. The method is still used, however, as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
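A minimal sketch of the Roberts cross (the helper name is mine); the diagonal differences below are equivalent to convolving with the 2×2 kernels [[1,0],[0,-1]] and [[0,1],[-1,0]]:

```python
import numpy as np

def roberts(image):
    """Roberts cross: gradient magnitude from the differences between
    diagonally adjacent pixels, combined as the root of the sum of squares."""
    a = image[:-1, :-1] - image[1:, 1:]   # main-diagonal difference
    b = image[:-1, 1:] - image[1:, :-1]   # anti-diagonal difference
    return np.hypot(a, b)
```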

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering produces a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other operators like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then applied to the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-valued pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
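The double-threshold stage described above can be sketched on its own (this is only the final hysteresis step, not the full Canny pipeline; the function name is mine):

```python
import numpy as np
from collections import deque

def hysteresis_threshold(mag, low, high):
    """Canny-style double thresholding: pixels above `high` are kept as
    strong edges; pixels between `low` and `high` are kept only if they
    are 8-connected (directly or transitively) to a strong pixel."""
    strong = mag >= high
    weak = (mag >= low) & ~strong
    out = strong.copy()
    q = deque(zip(*np.nonzero(strong)))   # BFS seeded at the strong pixels
    h, w = mag.shape
    while q:
        i, j = q.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and weak[ni, nj] and not out[ni, nj]:
                    out[ni, nj] = True
                    q.append((ni, nj))
    return out
```

A weak pixel chain is kept only while it stays connected to a strong seed; a weak pixel isolated beyond a sub-threshold gap is discarded.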

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis uses the texture of an image to analyze it: it attempts to quantify visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image such as roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ]
           [ Ixy(x, σD)  Iyy(x, σD) ]        (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) - k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)        (2)

or

P(x) = min(λ2(x), 0)        (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.

Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j-1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue until a total of n = log2(min(w, h)) - 3 octaves are created, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image are computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
 ...
MPn2  MPn3  MPn4  MPn5        (4)

where MPij = max(Pi(j-1), Pij, Pi(j+1)).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
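Since a symmetric 2×2 matrix has closed-form eigenvalues, the per-pixel principal curvature image of Eq. 2 can be sketched as follows. This is a simplified stand-in: plain finite differences replace the Gaussian derivatives at scale σD, and the function name is mine. For a symmetric 2×2 Hessian the eigenvalues are (Ixx+Iyy)/2 ± sqrt(((Ixx-Iyy)/2)² + Ixy²).

```python
import numpy as np

def principal_curvature(image):
    """Per-pixel maximum eigenvalue of the 2x2 Hessian, clamped at zero
    as in Eq. 2: P(x) = max(lambda1(x), 0). Using min(lambda2, 0) instead
    would give the Eq. 3 variant for light lines on dark backgrounds."""
    Iy, Ix = np.gradient(image.astype(float))   # first derivatives
    Ixy, Ixx = np.gradient(Ix)                  # second derivatives of Ix
    Iyy, _ = np.gradient(Iy)
    half_trace = 0.5 * (Ixx + Iyy)
    root = np.sqrt((0.5 * (Ixx - Iyy)) ** 2 + Ixy ** 2)
    lam1 = half_trace + root                    # maximum eigenvalue lambda1
    return np.maximum(lam1, 0.0)
```

On the paraboloid z = x² + y², where Ixx = Iyy = 2 and Ixy = 0, the interior of the result is constant at 2, as the closed form predicts.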

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based adaptive thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
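The per-pixel low-threshold rule above can be sketched as follows. This is my own sketch under two assumptions: the "high enough" agreement cutoff is not given in the text, so `agree` is an assumed value, and the major-eigenvector angle of a symmetric 2×2 Hessian is taken in closed form as 0.5·atan2(2·Ixy, Ixx - Iyy).

```python
import numpy as np

def low_threshold_map(Ixx, Ixy, Iyy, high=0.04, agree=0.9):
    """Eigenvector-flow low thresholds: each pixel gets ratio 0.2 (low =
    0.008) when its major-eigenvector direction agrees, on average, with
    its 8 neighbours; otherwise ratio 0.7 (low = 0.028)."""
    theta = 0.5 * np.arctan2(2.0 * Ixy, Ixx - Iyy)
    v = np.stack([np.cos(theta), np.sin(theta)], axis=-1)  # unit eigenvectors
    h, w = theta.shape
    low = np.full((h, w), high * 0.7)                      # default: weak support
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            dots = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di or dj:
                        # |inner product| of normalized eigenvectors, as in the text
                        dots.append(abs(np.dot(v[i, j], v[i + di, j + dj])))
            if np.mean(dots) >= agree:
                low[i, j] = high * 0.2   # strong support: 0.04 * 0.2 = 0.008
    return low
```

With a perfectly uniform eigenvector field, every interior pixel receives the permissive 0.008 threshold while the untested border keeps 0.028.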

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p - c_{L_p}||_2^2  +  λ Σ_{{p,q}∈N} |e_pq| · δ[L_p ≠ L_q]        (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ[·] is the delta (indicator) function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes neighboring pixels that belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
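A minimal sketch of this E-M loop follows. Two loud simplifications: the true E-step is solved with graph cuts, which is replaced here by a much weaker ICM (iterated conditional modes) stand-in, and the neighborhood penalty uses 4-neighbors with unit edge lengths in place of |e_pq|. All names are mine.

```python
import numpy as np

def em_segment(features, k, lam=1.0, iters=5):
    """features: (h, w, d) per-pixel color vectors. Returns labels and the
    k mean chromatic vectors, alternating an ICM-style E-step (data cost
    plus lam * number of disagreeing 4-neighbours) with a mean M-step."""
    h, w, d = features.shape
    flat = features.reshape(-1, d).astype(float)
    idx = np.linspace(0, len(flat) - 1, k).astype(int)   # deterministic init
    centers = flat[idx].copy()
    labels = np.argmin(((flat[:, None] - centers[None]) ** 2).sum(-1),
                       axis=1).reshape(h, w)
    for _ in range(iters):
        # "E-step" (ICM stand-in for the graph cut): greedy per-pixel update.
        for i in range(h):
            for j in range(w):
                costs = ((features[i, j] - centers) ** 2).sum(-1)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        costs = costs + lam * (np.arange(k) != labels[ni, nj])
                labels[i, j] = int(np.argmin(costs))
        # M-step: re-estimate each cluster centre as the mean feature vector.
        for c in range(k):
            mask = labels == c
            if mask.any():
                centers[c] = features[mask].mean(axis=0)
    return labels, centers
```

On a two-color image the loop recovers the two "layers" as two spatially coherent label regions.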

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper, we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following, we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense. Specifically, we impose a discrete grid of a certain connectivity on the image (8-connectivity in Fig. 4, where a possible reconstruction of the level line with intensity 100 is shown). The edges constitute line segments, and line segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed at 1 mm and 5 mm thickness and evaluated in a mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B. Method

The stages followed in performing lung segmentation from the CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are of logical (binary) values:

Thresh = image > 700

In each of these new images, subsegment vessels remain in the lung region. In the second step, the following method was used to remove these vessels: each 2D image is considered one by one, and each component in the image is labeled with a connected component labeling algorithm. Then, based on the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The largest component with logical value 1 is the patient's body. This largest component is kept, the other parts are removed from the image, and the complement of the result is then taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel, the parts satisfying this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm, and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
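The label-and-filter step used repeatedly above (connected component labeling followed by removal of components under a pixel-count threshold) can be sketched as follows; the function name is mine, and 8-connectivity with a BFS is assumed:

```python
import numpy as np
from collections import deque

def remove_small_components(binary, min_size):
    """Label 8-connected components of a boolean image by breadth-first
    search, then clear every component smaller than `min_size` pixels
    (the 1000-pixel filter used for vessels and airways above)."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    out = binary.copy()
    current = 0
    for si in range(h):
        for sj in range(w):
            if binary[si, sj] and labels[si, sj] == 0:
                current += 1
                comp = [(si, sj)]           # pixels of this component
                labels[si, sj] = current
                q = deque(comp)
                while q:
                    i, j = q.popleft()
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            ni, nj = i + di, j + dj
                            if (0 <= ni < h and 0 <= nj < w
                                    and binary[ni, nj] and labels[ni, nj] == 0):
                                labels[ni, nj] = current
                                comp.append((ni, nj))
                                q.append((ni, nj))
                if len(comp) < min_size:    # size filter
                    for i, j in comp:
                        out[i, j] = False
    return out
```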

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.01 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b - Area graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3D surface plot of X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3D surface plot of X-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D surface plot of X-ray CT brain scan generated with histogram values, mesh

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. It can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface plot of X-ray CT brain scan generated with histogram values, lighting

'image' creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool, we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
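FVTool itself is a MATLAB GUI, but the magnitude and phase data it plots come from evaluating the filter's frequency response H(e^jw) = B(e^jw) / A(e^jw). A minimal NumPy sketch of that computation (the function name and grid size are our own choices):

```python
import numpy as np

def freq_response(b, a, n_points=512):
    """Evaluate H(e^jw) = B(e^jw) / A(e^jw) of the digital filter with
    numerator coefficients b and denominator coefficients a on [0, pi)."""
    w = np.linspace(0.0, np.pi, n_points, endpoint=False)
    z = np.exp(1j * w)
    B = sum(bk * z ** (-k) for k, bk in enumerate(b))
    A = sum(ak * z ** (-k) for k, ak in enumerate(a))
    H = B / A
    return w, np.abs(H), np.angle(H)

# Two-tap moving average: a low-pass filter with unity gain at DC.
w, mag, phase = freq_response([0.5, 0.5], [1.0])
```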

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 41 introduces shapes and shape models in general. Section 42 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 43, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 44. Section 45 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

41 Shape Models

A shape is a collection of points. As shown in Figure 41, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 41c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 41 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis the distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²) (41)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in 44). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which helps with the automatic initialization (discussed in 44).
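These definitions are straightforward to state in code; a NumPy sketch (the function names are ours) of the point distance, centroid, and shape size for an n × 2 point array:

```python
import numpy as np

def point_distance(p, q):
    # Euclidean distance between points (x1, y1) and (x2, y2), as in Equation 41.
    return float(np.hypot(q[0] - p[0], q[1] - p[1]))

def centroid(shape):
    # shape is an n x 2 array of landmark points; the centroid is the mean position
    return shape.mean(axis=0)

def shape_size(shape):
    # root mean square distance of the points from the centroid
    d = shape - centroid(shape)
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))
```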

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape
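Algorithm 1 can be sketched as follows in NumPy. This is a simplified illustration under our own assumptions: each shape is an n × 2 point array, the per-shape alignment step uses optimal rotation only (a Procrustes/Kabsch step; reflections are not handled), and a fixed number of iterations stands in for the convergence test:

```python
import numpy as np

def _center(s):
    return s - s.mean(axis=0)                # translate centroid to the origin

def _unit(s):
    return s / np.linalg.norm(s)             # scale to unit size

def _rotate_onto(s, ref):
    # Optimal rotation of centred shape s onto centred shape ref.
    u, _, vt = np.linalg.svd(s.T @ ref)
    return s @ (u @ vt)

def align_shapes(shapes, iters=10):
    shapes = [_center(s) for s in shapes]
    x0 = _unit(shapes[0])                    # reference: first shape, unit size
    mean = x0.copy()
    for _ in range(iters):
        shapes = [_rotate_onto(s, mean) for s in shapes]       # step (a)
        mean = np.mean(shapes, axis=0)                         # step (b)
        mean = _unit(_center(_rotate_onto(mean, x0)))          # step (c)
    return shapes, mean
```

Aligning a shape and a rotated copy of it should bring the two into agreement after the first pass.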

42 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 12) and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 43 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1 The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2 The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So as the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the shape stays consistent with the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models therefore correct each other until no further improvements in matching are possible.

421 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb, where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes xi,

Φ is the matrix of the leading eigenvectors of the covariance of the training shapes, and b is the vector of shape parameters.
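A compact NumPy sketch of this model (our own function names; each training shape is stored as a 2n-vector with the x co-ordinates followed by the y co-ordinates, as defined in Section 41):

```python
import numpy as np

def build_shape_model(aligned, n_modes=2):
    """Return the mean shape and the matrix of the n_modes leading
    eigenvectors of the covariance of the aligned training shapes."""
    x_bar = aligned.mean(axis=0)
    cov = np.cov(aligned - x_bar, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]           # largest eigenvalues first
    return x_bar, vecs[:, order[:n_modes]]

def generate_shape(x_bar, phi, b):
    # x-hat = x-bar + Phi b: new shapes come from varying the parameters b
    return x_bar + phi @ b
```

Setting b to zero returns the mean shape; projecting a training shape's deviation onto Φ and regenerating it recovers that shape when the modes span the training variation.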

422 Generating shapes from the model

As seen in Equation 43, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.

423 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ).
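Under the assumption stated above, that profiles are multivariate Gaussian with mean ḡ and covariance Sg, the profile distance can be sketched as (hypothetical function name):

```python
import numpy as np

def mahalanobis_sq(g, g_bar, S_g):
    """Squared Mahalanobis distance (g - g_bar)^T S_g^{-1} (g - g_bar)
    between a sampled profile g and the mean profile g_bar."""
    d = np.asarray(g, dtype=float) - np.asarray(g_bar, dtype=float)
    return float(d @ np.linalg.solve(S_g, d))
```

With Sg equal to the identity this reduces to the squared Euclidean distance; the candidate offset along the whisker with the lowest distance is the one chosen.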

If the model is initialized correctly (discussed in 44), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape remains consistent with the mean shape. The shape model assures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 45 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is shown for illustration.
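A multi-resolution pyramid of the kind described can be sketched by repeated 2×2 block averaging (a simplification: real implementations usually smooth before subsampling):

```python
import numpy as np

def image_pyramid(img, levels=3):
    """Return a list of images, each half the resolution of the previous one,
    built by averaging non-overlapping 2x2 blocks."""
    pyramid = [img]
    for _ in range(levels - 1):
        h, w = img.shape
        img = img[:h - h % 2, :w - w % 2]          # trim to even dimensions
        img = (img[0::2, 0::2] + img[1::2, 0::2] +
               img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
        pyramid.append(img)
    return pyramid
```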


43 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 45) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 46a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. The ASM starts off where the mean shape is located, but this may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 47a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


has been done in this field, there is no system that completely solves the problem [14]. This is because there are several complicated parts to the problem of fracture detection. Digital X-rays are very detailed and complicated to interpret. Bones have different sizes and can differ in characteristics from person to person, so finding a general method to locate the bone and decide if it is fractured or not is a complex problem. Some of the main aspects of the problem of automatic bone fracture detection are bone orientation in the X-ray, extracting bone contour information, bone segmentation, and extraction of relevant features.

13 Description of the Problem

This thesis investigates different ways of separating a bone from an X-ray. Methods like edge detection and Active Shape Models are experimented with. The aim of this thesis is to find an efficient and reasonably fast way of separating the bone from the rest of the X-ray. The bone used for the analysis is the tibia. The tibia, also known as the shinbone or shankbone, is the larger and stronger of the two bones in the leg below the knee in vertebrates, and connects the knee with the ankle bones. Details of the X-ray data used are provided in the next section.

21 Theory Development

A typical digital image processing system consists of image segmentation, feature extraction, pattern recognition, thresholding and error classification. Image processing aims at extracting the necessary information from the image. The image needs to be reduced to certain defining characteristics, and the analysis of these characteristics gives the relevant information. Figure 21 shows a process flow diagram of a typical digital image processing system, showing the sequence of the operations. Image segmentation is the main focus of this thesis; the other processes are briefly described for completeness and to inform the reader of the processes in the whole system.

211 Image Segmentation

Image segmentation is the process of extracting the regions of interest from an image. There are many operations to segment images, and their usage depends on the nature of the region to be extracted. For example, if an image has strong edges, edge detection techniques can be used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 22 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contour of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.

2 Segmentation of Images - An Overview

Image segmentation can proceed in three different ways:

• Manually

• Automatically

• Semi-automatically

21 Manual Segmentation

The pixels belonging to the same intensity range could be pointed out manually, but clearly this is a very time-consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent in tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.

22 Automatic Segmentation

Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level or the probability of the objects having a special distribution.

23 Semiautomatic Segmentation

Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.

• Thresholding

If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a as follows:

if B(i, j) ≥ a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all i, j over the image B [YGV]. This can be repeated for each region, dividing them by the threshold value, which results in four regions, etc. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but regionally do not belong to the selected region; these pixels could, for instance, appear from noise. The simplest way of choosing the threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.

The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions and compute a new threshold value from

thresholdnew = (meanregion1 + meanregion2) / 2

and repeat until the threshold value does not change any more. Finally, choose this value for the threshold segmentation.
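The isodata iteration described above can be sketched directly (plain Python; the starting value and stopping tolerance are our own choices):

```python
def isodata_threshold(pixels, t0=128.0, tol=0.5):
    """Iterate: split the pixels at t, take the mean of each region, and move
    t to the average of the two means until it stops changing."""
    t = float(t0)
    while True:
        low = [p for p in pixels if p < t]
        high = [p for p in pixels if p >= t]
        if not low or not high:
            return t                         # degenerate split: give up
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2.0
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```

On a clearly bimodal set of intensities the iteration settles midway between the two region means.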

To implement the triangle algorithm, construct a histogram of intensities vs. number of pixels as in Figure 21. Draw a line between the maximum value of the histogram hmax and the minimum value hmin, and calculate the distance d between the line and the histogram. Increase hmin and repeat for all h until h = hmax. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak.
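One common formulation of this triangle method works on the histogram directly: draw the line from the peak bin to the far tail bin and take the bin with the greatest perpendicular distance to that line (plain-Python sketch, our own function name):

```python
import math

def triangle_threshold(hist):
    peak = max(range(len(hist)), key=lambda i: hist[i])
    nonzero = [i for i, h in enumerate(hist) if h > 0]
    tail = nonzero[-1] if nonzero[-1] != peak else nonzero[0]
    x1, y1, x2, y2 = peak, hist[peak], tail, hist[tail]
    norm = math.hypot(x2 - x1, y2 - y1)
    lo, hi = sorted((peak, tail))
    best, best_d = peak, -1.0
    for i in range(lo, hi + 1):
        # perpendicular distance from (i, hist[i]) to the peak-tail line
        d = abs((y2 - y1) * i - (x2 - x1) * hist[i] + x2 * y1 - y2 * x1) / norm
        if d > best_d:
            best, best_d = i, d
    return best
```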

• Boundary tracking

Edge-finding by gradients is the method of selecting a boundary manually and automatically following the gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region, and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method only.

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1-D problem where ∇² = ∂²/∂x². Assume the boundary is blurred and the gradient has a shape like that in Figure 22. The Laplacian will change sign just around the assumed edge at position = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.

• Clustering Methods

Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3-D vector.

212 Feature Extraction

Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.

To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF) and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.

213 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF and IGD that gave the best performance overall. Clustering and nearest neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a certain direction that is very different to the gradient vector of a fractured long-bone X-ray. By observing this fact, the bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

214 Thresholding and Error Classification

Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that has distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold; the image can be separated into regions of pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
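Otsu's method picks the threshold that maximises the between-class variance of the two regions it creates; a plain-Python sketch over a grey-level histogram:

```python
def otsu_threshold(hist):
    total = sum(hist)
    grand_sum = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # number of pixels at or below the candidate threshold
    sum0 = 0.0      # sum of their grey values
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0                       # mean of the lower class
        m1 = (grand_sum - sum0) / w1         # mean of the upper class
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```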

Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 23 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image (b) The histogram of the image

Figure 23 Histogram of image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality; contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement, because of the precision and wide variety of digital processes.

Contrast

Contrast generally refers to the difference in luminance or grey level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extremely black, and a value at the high end is assigned to extremely white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
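The linear stretch described above maps the darkest observed grey value to the display minimum and the brightest to the display maximum, with everything else scaled linearly in between (plain-Python sketch):

```python
def linear_stretch(img, out_min=0.0, out_max=255.0):
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [[out_min for _ in row] for row in img]   # flat image
    scale = (out_max - out_min) / (hi - lo)
    return [[out_min + (p - lo) * scale for p in row] for row in img]
```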

Non-Linear Contrast Enhancement

In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. The non-linear contrast enhancement techniques have been found to be useful for enhancing the colour contrast between nearby classes and subclasses of a main class. One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values; thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
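Histogram equalisation can be sketched by remapping each grey value through the scaled cumulative histogram (plain Python, assuming integer grey levels in [0, levels)):

```python
def equalize(img, levels=256):
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)                  # cumulative histogram
    n = len(flat)
    lut = [round(c / n * (levels - 1)) for c in cdf]
    return [[lut[p] for p in row] for row in img]
```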

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
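The mean filter just described, with border extension by repeating the border pixel values (option (1) above), might be sketched as follows; this is a direct, unoptimized illustration:

```python
import numpy as np

def mean_filter(image, n=3):
    # n x n low-pass mean filter; 'edge' padding repeats the border pixels
    # so the output keeps the original size instead of shrinking by n-1
    # rows and columns
    pad = n // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    out = np.zeros_like(image, dtype=np.float64)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + n, j:j + n].mean()
    return out
```

A constant image passes through unchanged, while an isolated bright pixel is spread over the kernel footprint, which is exactly the blurring effect noted above.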

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
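One common way to realize such a filter, consistent with the description above, is to subtract a low-pass (mean) estimate from the original image and then linearly stretch the narrow result for display. A minimal sketch, assuming a simple box filter and a min-max display stretch:

```python
import numpy as np

def high_pass(image, n=3):
    # high-frequency filter: subtract an n x n mean (low-pass) estimate,
    # leaving only the local variations
    pad = n // 2
    p = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    # box filter built by summing the n*n shifted views of the padded image
    low = sum(p[i:i + h, j:j + w]
              for i in range(n) for j in range(n)) / (n * n)
    return image.astype(np.float64) - low

def percent_stretch(image, out_max=255.0):
    # linear stretch of the narrow high-pass histogram to the display range
    # (assumes the image is not constant, i.e. max > min)
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo) * out_max
```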

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.

The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence the use of this operation gives a more natural look than many of the other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
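A band ratio can be sketched in a few lines; the small epsilon guarding against division by zero is an implementation assumption, not part of the definition:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    # ratio of two co-registered bands; illumination differences from slope
    # and aspect scale both bands by roughly the same factor, so the ratio
    # largely cancels them
    return band_a.astype(np.float64) / (band_b.astype(np.float64) + eps)
```

For the same material, a sunlit pixel (say 80 and 40 in two bands) and a shadowed pixel (40 and 20) give nearly the same ratio, which is precisely how the transformation suppresses topographic effects.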

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods, discussed in this thesis, used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges in femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%.

Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on shape, size and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs using the aspects of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models. The relationship between the size of the training set, computation time and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or set of pixels, that form the edge are generally of the same, or close to the same, intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be seen as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image from light to dark pixels (in the case of grayscale images) at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point which is in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction in an image, that approximate the derivatives in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image and the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
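Since equations 3.1-3.4 are not reproduced here, the following sketch uses the standard 3x3 Sobel kernels and the magnitude/direction computation they describe (a direct, unoptimized illustration, not the thesis implementation):

```python
import numpy as np

# standard 3x3 Sobel kernels; note the weight of 2 on the centre
# row/column, which is what distinguishes Sobel from Prewitt
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=np.float64)
KY = KX.T

def sobel(image):
    # returns gradient magnitude and direction (radians) per pixel;
    # (win * K).sum() is cross-correlation, which for these kernels differs
    # from true convolution only in the sign of the response
    img = image.astype(np.float64)
    h, w = img.shape
    dx = np.zeros((h, w))
    dy = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = img[i - 1:i + 2, j - 1:j + 2]
            dx[i, j] = (win * KX).sum()
            dy[i, j] = (win * KY).sum()
    # zero vector in constant regions; on an edge the gradient points
    # across the edge from darker to brighter values
    return np.hypot(dx, dy), np.arctan2(dy, dx)
```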

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. So, in simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple and easy to implement, and it is faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
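A minimal version of the Roberts Cross operator, using the usual 2x2 kernels (assumed here, since the thesis equations are not reproduced):

```python
import numpy as np

# Roberts Cross kernels (2x2), acting on diagonally adjacent pixels
R1 = np.array([[1.0, 0.0], [0.0, -1.0]])
R2 = np.array([[0.0, 1.0], [-1.0, 0.0]])

def roberts(image):
    # magnitude = sqrt of the sum of squared diagonal differences
    img = image.astype(np.float64)
    h, w = img.shape
    mag = np.zeros((h, w))
    for i in range(h - 1):
        for j in range(w - 1):
            win = img[i:i + 2, j:j + 2]
            g1 = (win * R1).sum()  # difference along one diagonal
            g2 = (win * R2).sum()  # difference along the other diagonal
            mag[i, j] = np.hypot(g1, g2)
    return mag
```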

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, the same as the example in 2.4 involving two levels, is then used on the data. If the pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
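The double-threshold step just described (the last stage of Canny) can be sketched as follows; the Gaussian smoothing, gradient and non-maximal suppression stages are omitted, so this is only an illustration of the hysteresis rule:

```python
import numpy as np
from collections import deque

def hysteresis(grad, low, high):
    # pixels >= high are edges; pixels between low and high become edges
    # only if they connect (8-connectivity) to a strong pixel; all other
    # pixels are suppressed
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.copy()
    q = deque(zip(*np.nonzero(strong)))
    h, w = grad.shape
    while q:
        i, j = q.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w and weak[ni, nj] and not out[ni, nj]:
                    out[ni, nj] = True
                    q.append((ni, nj))
    return out.astype(np.uint8)
```

A chain of in-between pixels touching a strong pixel is kept in full, while an isolated in-between pixel is discarded, which is what lets Canny follow faint but connected edges.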

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
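The range filter just mentioned might be sketched as follows (a minimal NumPy version; the thesis presumably used a toolbox function, so this is only an illustration):

```python
import numpy as np

def range_filter(image, n=3):
    # local range texture measure: max - min over an n x n neighbourhood;
    # textured regions (bone) give high values, smooth regions low values
    pad = n // 2
    p = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = p[i:i + n, j:j + n]
            out[i, j] = win.max() - win.min()
    return out
```

A standard deviation filter is identical in structure, with `win.std()` replacing the max-min difference.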

3 Principal curvature-based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |     (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)     (2)

or

P(x) = min(λ2(x), 0)     (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13]

and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12, MP13, MP14, MP15
MP22, MP23, MP24, MP25
...
MPn2, MPn3, MPn4, MPn5     (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1)
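Eqs. 2-4 can be sketched directly. This illustration uses plain finite differences in place of the Gaussian derivatives at scale σD, so it approximates the paper's computation rather than reimplementing it:

```python
import numpy as np

def principal_curvature(image, dark_lines=True):
    # principal curvature image: P(x) = max(lam1(x), 0) (Eq. 2) for dark
    # lines on a light background, or P(x) = min(lam2(x), 0) (Eq. 3) for
    # light lines; finite differences stand in for Gaussian derivatives
    img = image.astype(np.float64)
    Ix = np.gradient(img, axis=1)
    Iy = np.gradient(img, axis=0)
    Ixx = np.gradient(Ix, axis=1)
    Iyy = np.gradient(Iy, axis=0)
    Ixy = np.gradient(Ix, axis=0)
    # closed-form eigenvalues of the symmetric 2x2 Hessian at each pixel
    # (the single Jacobi rotation mentioned in the text, in closed form)
    mean = (Ixx + Iyy) / 2.0
    root = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam1, lam2 = mean + root, mean - root
    return np.maximum(lam1, 0.0) if dark_lines else np.minimum(lam2, 0.0)

def max_curvature_images(P):
    # Eq. 4: MP_ij = max(P_i,j-1, P_ij, P_i,j+1) over the six principal
    # curvature images P[0..5] of one octave, giving four MP images
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, 5)]
```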

Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images, we find the stable regions via our watershed algorithm.

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that the small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first

apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of

influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal

strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and

low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise

weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
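The two cleaning steps above, the grayscale closing and the eigenvector-flow low-threshold selection, might be sketched as follows. A flat square structuring element stands in for the 5×5 disk, and the agreement cut-off (0.8) is an assumed value, since the text only says the average dot product must be "high enough":

```python
import numpy as np

def grey_close(f, size=5):
    # grayscale closing f . b = (f dilate b) erode b with a flat square
    # element: dilation = local max, erosion = local min; fills dark
    # "potholes" narrower than the element
    pad = size // 2
    h, w = f.shape
    def local(img, op):
        p = np.pad(img, pad, mode="edge")
        out = np.empty((h, w))
        for i in range(h):
            for j in range(w):
                out[i, j] = op(p[i:i + size, j:j + size])
        return out
    return local(local(f.astype(np.float64), np.max), np.min)

def low_threshold_map(ev, high=0.04, agree=0.8):
    # per-pixel low threshold for eigenvector-flow hysteresis: ev is an
    # (H, W, 2) array of unit major eigenvectors; where the average
    # absolute dot product with the 8 neighbours exceeds `agree` (assumed
    # cut-off), the low/high ratio is 0.2, otherwise 0.7
    h, w, _ = ev.shape
    low = np.full((h, w), high * 0.7)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            dots = [abs(float(np.dot(ev[i, j], ev[i + di, j + dj])))
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)
                    if (di, dj) != (0, 0)]
            if np.mean(dots) > agree:
                low[i, j] = high * 0.2
    return low
```

A one-pixel pothole is removed by the closing while a dark region wider than the element survives; a pixel whose eigenvector agrees with its neighbours receives the permissive 0.008 threshold, everything else the stricter 0.028.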

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
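The overlap-error test can be illustrated on rasterized region masks (the computation in [19] is on analytic ellipses, so this is only a sketch of the measure itself):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    # overlap error between two regions given as boolean masks:
    # 1 - |A intersect B| / |A union B|
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union
```

A region is kept when its overlap error against the corresponding detections at the two neighbouring scales falls below a chosen stability threshold.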

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated whenever more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
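The clustering step might be sketched with a plain k-means on the per-pixel chromatic feature vectors (the complete-linkage stage and the actual De-pict implementation are omitted; the deterministic initialization is an assumption made here for reproducibility):

```python
import numpy as np

def kmeans(features, k, iters=20):
    # plain k-means: each cluster's mean chromatic vector then stands for
    # one layer of brush strokes
    # deterministic init: k points spread evenly across the data
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centers = features[idx].astype(np.float64)
    for _ in range(iters):
        # assign every feature vector to its nearest centre
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each centre as the mean of its assigned vectors
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers
```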

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way.10,11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σp ||fp − cLp||²  +  λ Σ{p,q}∈N |epq| · T[Lp ≠ Lq]     (1)

where Lp ∈ {1, ..., k} is the cluster label of pixel p, fp is the color feature of pixel p, ci is the color model for cluster i, |epq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatial coherence refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or until a predefined number of iterations is reached.
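For a given labeling, the energy of Eq. (1) can be evaluated directly. Note that the exact placement of the edge-length weight |epq| is an assumption, since the equation's original typesetting is ambiguous; minimizing this energy (the graph-cut E-step) is a separate problem not shown here:

```python
import numpy as np

def energy(labels, features, centers, lam, edges):
    # Eq. (1) for a fixed labelling: appearance term plus a coherence
    # penalty lam * |e_pq| for each neighbouring pair (p, q) with different
    # labels; `edges` holds (p, q, edge_length) triples
    data = sum(float(np.sum((features[p] - centers[labels[p]]) ** 2))
               for p in range(len(labels)))
    smooth = sum(length for p, q, length in edges if labels[p] != labels[q])
    return data + lam * smooth
```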

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5,7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper, we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following, we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program in this

approach, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines, while the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and also about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of non-ionic contrast agent at a rate of 4 ml/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20 G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B. Method

The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding keeps the parts brighter than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700
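The threshold step can be sketched as follows (a minimal NumPy sketch; the toy slice values and the function name are illustrative, and only the 700 HU cutoff comes from the text):

```python
import numpy as np

def threshold_hu(image, cutoff=700):
    """Return a logical (boolean) mask: True where the pixel exceeds the cutoff."""
    return image > cutoff

# Toy 2D "CTA slice": body pixels are bright, lung/air pixels are dark.
slice_hu = np.array([[1000,  900,  -50],
                     [ 800, -700, -800],
                     [ 950,  850,  -60]])
mask = threshold_hu(slice_hu)
```

The result is a binary image, matching the "logical value" images described above.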

In each of these new images, subsegment vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: first, each 2D image is considered one by one and each component in the image is labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).
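A minimal sketch of this labelling-and-filtering step, assuming 4-connectivity; the toy grid and the size threshold of 2 are stand-ins for the real slices and the 1000-pixel cutoff:

```python
from collections import deque

def label_components(grid):
    """4-connected component labelling of a binary grid (list of lists of 0/1)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not labels[r][c]:
                current += 1                      # start a new component
                labels[r][c] = current
                queue = deque([(r, c)])
                while queue:                      # BFS flood fill
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and grid[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current

def remove_small(labels, n_labels, min_pixels):
    """Zero out components whose pixel count is below min_pixels."""
    sizes = [0] * (n_labels + 1)
    for row in labels:
        for v in row:
            sizes[v] += 1
    return [[v if v and sizes[v] >= min_pixels else 0 for v in row]
            for row in labels]

grid = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 0]]
labels, n = label_components(grid)
cleaned = remove_small(labels, n, min_pixels=2)   # 1000 in the real pipeline
```

The single-pixel blob is removed, while the larger component survives.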

Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image. Then its logical complement is taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch the image border (row or column 1 or 512) and are logical 1, the parts that satisfy this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm and the components whose pixel counts are below 1000 are determined to be airways and removed from the image. The image now at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6 (b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6 (c)).
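The keep-the-largest-component and inversion steps described above can be sketched as follows; the label array is a hypothetical stand-in for the output of a connected-component pass:

```python
import numpy as np

# Toy label image: label 1 is a small blob, label 2 is the large "body" component.
labels = np.array([[1, 0, 2, 2],
                   [0, 0, 2, 2],
                   [0, 2, 2, 2]])

sizes = np.bincount(labels.ravel())
sizes[0] = 0                        # label 0 is background; it does not compete
body = labels == sizes.argmax()     # mask of the biggest component (the body)

inverted = ~body                    # all 0s turn into 1s and vice versa
```

`inverted` now holds the lung, airway, and outside air, ready for the border-removal step.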

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3. PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The Histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of X-ray CT brain scan generated with histogram values, mesh
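NumPy's meshgrid mirrors the MATLAB function; a minimal illustration of the row/column copying described above, with toy coordinate vectors:

```python
import numpy as np

x = np.array([1, 2, 3])        # sample x-coordinates
y = np.array([10, 20])         # sample y-coordinates
X, Y = np.meshgrid(x, y)       # rows of X copy x; columns of Y copy y

Z = X + Y                      # evaluate a two-variable function on the grid
```

Each entry Z[i, j] is the function value at (x[j], y[i]), exactly what a surface plot needs.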

3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling ('imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
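The indexed-image/colormap relationship can be illustrated outside MATLAB; the 3-row map below is a toy stand-in for a real 256-row colormap such as jet:

```python
import numpy as np

# A colormap is an m-by-3 matrix of RGB values in [0.0, 1.0]; this toy 3-entry
# map runs blue -> green -> red (a real jet map has many more rows).
cmap = np.array([[0.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 0.0, 0.0]])

# An indexed image stores colormap row indices; display resolves them to RGB.
indexed = np.array([[0, 1],
                    [2, 1]])
rgb = cmap[indexed]            # shape (2, 2, 3): one RGB triple per pixel
```

Colormap scaling ('imagesc') amounts to rescaling the data range onto the row indices before this lookup.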

A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4. FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
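The magnitude response FVTool computes amounts to evaluating H(e^jw) = B(e^-jw)/A(e^-jw) on a frequency grid; a NumPy sketch, using an illustrative two-tap moving-average filter rather than any filter from the text:

```python
import numpy as np

def freq_response(b, a, n_points=8):
    """Evaluate H(e^jw) = sum_k b[k] e^(-jwk) / sum_k a[k] e^(-jwk) on [0, pi]."""
    w = np.linspace(0.0, np.pi, n_points)
    kb = np.arange(len(b))
    ka = np.arange(len(a))
    B = np.array([np.sum(b * np.exp(-1j * wi * kb)) for wi in w])
    A = np.array([np.sum(a * np.exp(-1j * wi * ka)) for wi in w])
    return w, B / A

b = np.array([0.5, 0.5])   # two-tap moving average (illustrative)
a = np.array([1.0])        # FIR filter: the denominator is 1
w, H = freq_response(b, a)
mag = np.abs(H)            # lowpass: 1 at DC, 0 at the Nyquist frequency
```

The phase response and group delay shown in Figures 11-13 are derived from the same H values (its angle and the derivative of the angle).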

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, basic ASM models work the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. The shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis the distance means the Euclidean distance:

d = √((x2 - x1)² + (y2 - y1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
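The centroid and size definitions above can be sketched directly from the 2n × 1 shape vector (the square is an illustrative shape):

```python
import numpy as np

def centroid(shape):
    """Mean point position of a 2n-by-1 shape vector [x1..xn, y1..yn]."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    n = len(shape) // 2
    cx, cy = centroid(shape)
    d2 = (shape[:n] - cx) ** 2 + (shape[n:] - cy) ** 2
    return np.sqrt(d2.mean())

# A 4-point square as a 2n-by-1 vector (x-coordinates first, then y).
square = np.array([0.0, 2.0, 2.0, 0.0,   # x1..x4
                   0.0, 0.0, 2.0, 2.0])  # y1..y4
```

For this square the centroid is (1, 1) and every point lies √2 away from it, so the size is √2.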

Algorithm 1 Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape
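Algorithm 1 can be sketched as follows; for brevity this sketch aligns by translation and scale only, omitting the rotation step a full Procrustes alignment would include, and the two squares are toy data:

```python
import numpy as np

def center(shape):
    """Translate an (n, 2) point array so its centroid is at the origin."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Scale so the RMS distance of the points from the centroid is 1."""
    return shape / np.sqrt((shape ** 2).sum(axis=1).mean())

def align_shapes(shapes, iterations=10):
    """Simplified Algorithm 1: translation and scale only (no rotation)."""
    shapes = [center(s) for s in shapes]
    mean = to_unit_size(shapes[0])                 # reference: the first shape
    for _ in range(iterations):
        # With rotation omitted, "align to the mean" reduces to normalising.
        aligned = [to_unit_size(s) for s in shapes]
        # Recalculate and constrain the mean shape (center, unit size).
        mean = to_unit_size(center(np.mean(aligned, axis=0)))
    return aligned, mean

shapes = [np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]]),
          np.array([[5.0, 5.0], [9.0, 5.0], [9.0, 9.0], [5.0, 9.0]])]
aligned, mean = align_shapes(shapes)
```

The two squares differ only in position and scale, so after alignment they coincide.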

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models therefore correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + P b        (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), P is the matrix of eigenvectors of the covariance of the training shapes, and b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks.
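Generating shapes from the model is a matter of computing x̂ = x̄ + P b; a small NumPy sketch in which the training shapes are toy data and only the first mode of variation is kept:

```python
import numpy as np

# Toy training set: each row is a 2n-by-1 shape vector (here n = 2 points).
shapes = np.array([[ 0.0, 1.0, 0.0, 1.0],
                   [ 0.1, 1.1, 0.0, 1.0],
                   [-0.1, 0.9, 0.0, 1.0]])

x_bar = shapes.mean(axis=0)                  # mean shape
cov = np.cov(shapes.T, bias=True)            # covariance of the training shapes
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # largest variance first
P = eigvecs[:, order[:1]]                    # keep only the top mode

def generate(b):
    """x_hat = x_bar + P b : a new shape from shape parameters b."""
    return x_bar + P @ b

new_shape = generate(np.array([0.05]))
```

Setting b = 0 reproduces the mean shape; in a real ASM, b is bounded (e.g. within ±3 standard deviations per mode) to keep the generated shapes plausible.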

Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix S_g.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset by up to 3 pixels along the whisker, which is perpendicular to the shape, to get the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g - ḡ)ᵀ S_g⁻¹ (g - ḡ)
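The Mahalanobis distance can be sketched as below (the profile vectors and the covariance matrix are toy values):

```python
import numpy as np

def mahalanobis2(g, g_bar, S_g):
    """Squared Mahalanobis distance (g - g_bar)^T S_g^{-1} (g - g_bar)."""
    d = g - g_bar
    return float(d @ np.linalg.solve(S_g, d))   # solve avoids an explicit inverse

g_bar = np.array([0.0, 0.0])          # mean profile
S_g = np.array([[2.0, 0.0],
                [0.0, 0.5]])          # toy profile covariance
g = np.array([1.0, 1.0])              # a candidate test profile
dist2 = mahalanobis2(g, g_bar, S_g)
```

With an identity covariance this reduces to the squared Euclidean distance; the covariance weighting is what lets directions of high training variance count for less.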

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is shown.


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone. It is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization; the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


used to partition the image into its components using those edges. Image segmentation is the central theme of this thesis and is done using several techniques. Figure 2.2 shows how one of the coins can be separated from the image: it shows the original image and highlights the boundary of one of the coins. These techniques are analyzed, and the best technique to separate bones from X-rays is suggested. When dealing with bone X-ray images, contour detection is an important step in image segmentation. According to [31], classical image segmentation and contour detection can be different: contour detection algorithms extract the contours of objects, whereas image segmentation separates homogeneous sections of the image. A detailed literature review and history of the image segmentation techniques used for different applications is given in Chapter 3.

2 Segmentation of Images - An Overview

Image segmentation can proceed in three different ways:

- Manually
- Automatically
- Semiautomatically

2.1 Manual Segmentation

The pixels belonging to the same intensity range could be pointed out manually, but clearly this is a very time-consuming method if the image is large. A better choice would be to mark the contours of the objects. This could be done discretely from the keyboard, giving high accuracy but low speed, or it could be done with the mouse, with higher speed but less accuracy. The manual techniques all have in common the amount of time spent tracing the objects, and human resources are expensive. Tracing algorithms can also make use of geometrical figures like ellipses to approximate the boundaries of the objects. This has been done a lot for medical purposes, but the approximations may not be very good.

2.2 Automatic Segmentation

Fully automatic segmentation is difficult to implement due to the high complexity and variation of images. Most algorithms need some a priori information to carry out the segmentation, and for a method to be automatic this a priori information must be available to the computer. The needed a priori information could, for instance, be the noise level or the probability of the objects having a special distribution.

2.3 Semiautomatic Segmentation

Semiautomatic segmentation combines the benefits of both manual and automatic segmentation. By giving some initial information about the structures, we can proceed with automatic methods.

- Thresholding

If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:

if B(i, j) ≥ a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all i, j over the image B [YGV].

This can be repeated for each region, dividing them by a threshold value, which results in four regions, etc. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions which correctly lie within the limits specified but do not belong to the selected region; these pixels could, for instance, appear from noise. The simplest way of choosing the threshold value would be a fixed value, for instance the mean value of the image. A better choice would be a histogram-derived threshold. This method includes some knowledge of the distribution of the image and will result in less misclassification.

The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions. Calculate a new threshold value from

threshold_new = (mean_region1 + mean_region2) / 2

and repeat until the threshold value does not change any more. Finally, choose this value for the threshold segmentation.
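The isodata iteration above can be sketched as follows; the starting threshold, tolerance, and pixel values are illustrative:

```python
import numpy as np

def isodata_threshold(image, t0=128.0, tol=0.5):
    """Iterate t = (mean below t + mean above t) / 2 until it stabilises."""
    t = t0
    while True:
        below, above = image[image < t], image[image >= t]
        if below.size == 0 or above.size == 0:
            return t                     # degenerate split: keep current t
        t_new = (below.mean() + above.mean()) / 2.0
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

# Two well-separated intensity groups: the threshold lands between them.
pixels = np.array([10.0, 12.0, 11.0, 200.0, 210.0, 205.0])
t = isodata_threshold(pixels)
```

Here the two group means are 11 and 205, so the iteration settles at (11 + 205) / 2 = 108.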

To implement the triangle algorithm, construct a histogram of intensity vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram, h_max, and the minimum value, h_min, and calculate the distance d between the line and the histogram. Increase h_min and repeat for all h until h = h_max. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek make a weak peak.

- Boundary tracking

Edge-finding by gradients is the method of selecting a boundary manually and automatically following its gradient until returning to the same point [YGV]. Returning to the same point can be a major problem of this method. Boundary tracking will wrongly include all interior holes in the region and will meet problems if the gradient specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method alone.

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem, where ∇² = ∂²/∂x². Assume the boundary is blurred; the gradient will then have a shape like that in Figure 2.2, and the Laplacian will change sign just around the assumed edge at position = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
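A 1D sketch of the zero-crossing procedure: the second difference stands in for the Laplacian, and a moving-average smoother is included for the noisy case described above (the tanh edge is an illustrative blurred boundary):

```python
import numpy as np

def zero_crossings(signal):
    """Indices (into the second-difference array) where its sign changes."""
    lap = np.diff(signal, n=2)               # second difference ~ d^2/dx^2
    sign = np.sign(lap)
    return np.where(sign[:-1] * sign[1:] < 0)[0]

def smooth(signal, width=3):
    """Moving-average smoothing to tame noise before differencing."""
    kernel = np.ones(width) / width
    return np.convolve(signal, kernel, mode="valid")

# A blurred step edge: the Laplacian changes sign at the edge position.
x = np.linspace(-3, 3, 60)
edge = np.tanh(2 * x)                        # smooth step centred at x = 0
crossings = zero_crossings(edge)             # one crossing, near the middle
smoothed = smooth(edge)                      # for noisy data, difference this
```

On a noisy signal, `zero_crossings(smooth(noisy))` suppresses the spurious sign changes that raw second differences would produce.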

- Clustering Methods

Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but

2.1.2 Feature Extraction

Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.

To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF) and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of healthy femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM [22] and found the feature values for GO, MRF and IGD that gave the best performance overall. Clustering and nearest neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long bone X-ray may point in a certain direction that is very different from the gradient vector of a fractured long bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.
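The nearest-neighbour idea above can be reduced to a toy nearest-mean rule on a single scalar feature; the feature values below are invented for the example, and this is only a stand-in for the Bayesian and SVM classifiers the text mentions:

```python
def classify_by_feature(value, healthy_mean, fractured_mean):
    """Nearest-mean rule on one scalar feature (e.g. a neck-shaft angle):
    assign the class whose training-set mean is closer to the value."""
    if abs(value - healthy_mean) < abs(value - fractured_mean):
        return "healthy"
    return "fractured"

# Hypothetical training means (degrees) and two unseen measurements.
label_a = classify_by_feature(128.0, healthy_mean=130.0, fractured_mean=110.0)
label_b = classify_by_feature(112.0, healthy_mean=130.0, fractured_mean=110.0)
```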

2.1.4 Thresholding and Error Classification

Thresholding and error classification form the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can then be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
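A minimal sketch of the multi-level thresholding just described (not Otsu's method, which chooses the threshold automatically): each sorted threshold adds one level, so k thresholds split the image into k + 1 regions.

```python
import numpy as np

def threshold_multilevel(image, thresholds):
    """Assign each pixel a region label by comparing it to sorted thresholds.

    With one threshold this is ordinary binary thresholding; with several,
    the image is split into len(thresholds) + 1 levels.
    """
    levels = np.zeros(image.shape, dtype=int)
    for t in sorted(thresholds):
        levels += (image > t).astype(int)  # each crossed threshold adds a level
    return levels

img = np.array([[10, 60],
                [120, 200]])
labels = threshold_multilevel(img, [50, 100])
```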

Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

Figure 2.3: Histogram of an image [23]. (a) The original image; (b) the histogram of the image.

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality; contrast stretching, density slicing, edge enhancement and spatial filtering are the more commonly used ones. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions, and enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.

Contrast

Contrast generally refers to the difference in luminance or grey level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities, which enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm; the grey values in the original image and the modified image follow a linear relation. A density number in the low range of the original histogram is assigned to extreme black and a value at the high end is assigned to extreme white, with the remaining pixel values distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
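The stretch described above can be sketched per band as follows; this is a generic min-max stretch, assuming an 8-bit (0-255) display range:

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Linearly map the band's min..max range onto the full display range."""
    lo, hi = band.min(), band.max()
    scaled = (band.astype(float) - lo) / (hi - lo)          # 0..1
    return (scaled * (out_max - out_min) + out_min).round().astype(np.uint8)

# A low-contrast band occupying only the 50..200 range.
band = np.array([[50, 100],
                 [150, 200]])
stretched = linear_stretch(band)
```

After the stretch the darkest input value becomes 0 and the brightest becomes 255, with the rest spread linearly in between.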

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of a non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and the subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram; it can be reversed to enhance values in the brighter part by scaling the input data with an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
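Histogram equalization as described above can be sketched with the cumulative histogram; this is a generic 8-bit version, not tied to any particular sensor:

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """Redistribute grey values so the cumulative histogram is roughly linear.

    Adjacent grey values with few pixels get merged into one output level,
    so the output usually has fewer distinct levels than the input, as
    noted in the text.
    """
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / image.size            # cumulative fraction of pixels
    mapping = np.round(cdf * (levels - 1)).astype(np.uint8)
    return mapping[image]

# Six dark pixels and two bright ones.
img = np.array([[0, 0, 0, 255],
                [0, 0, 0, 255]], dtype=np.uint8)
eq = equalize_histogram(img)
```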

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area of an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding it, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median and mode filters.
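A sketch of the 3x3 mean filter with border strategy (1) above, replicating edge pixels so the output keeps the input's size:

```python
import numpy as np

def mean_filter(image, size=3):
    """n x n mean filter; borders are handled by replicating edge pixels,
    the first of the two border strategies mentioned in the text."""
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="edge")  # replicate borders
    out = np.zeros_like(image, dtype=float)
    for dy in range(size):                  # sum the size*size shifted copies
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)

img = np.zeros((5, 5))
img[2, 2] = 9.0                             # a single bright (noisy) pixel
smooth = mean_filter(img)
```

The isolated bright pixel is spread over its 3x3 neighbourhood, which is exactly the blurring behaviour the text warns about at object edges.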

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images should be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.

The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, hence images produced with this operation have a more natural look than many other edge-enhanced images.
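The Laplacian behaviour described above can be verified with the standard 4-neighbour kernel; the loop-based convolution is a didactic sketch (the kernel is symmetric, so correlation and convolution coincide):

```python
import numpy as np

# Standard 4-neighbour Laplacian kernel.
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def convolve2d(image, kernel):
    """Plain 'valid' 2-D convolution (no padding); output shrinks by
    kernel_size - 1 in each dimension."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

flat = np.full((5, 5), 7.0)                 # uniform region -> zero response
resp_flat = convolve2d(flat, LAPLACIAN)

point = np.zeros((5, 5))
point[2, 2] = 1.0                           # isolated point -> strong response
resp_point = convolve2d(point, LAPLACIAN)
```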

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information not available in any single band that is useful for discriminating between soils and vegetation.
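A minimal sketch of why a band ratio suppresses illumination effects: if shading scales both bands by roughly the same factor, the ratio is nearly unchanged (the band values below are invented for the example).

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two bands; a small eps avoids division by zero.

    Because illumination scales both bands roughly equally, a shadowed
    pixel's ratio stays close to that of the same material in full sun.
    """
    return band_a.astype(float) / (band_b.astype(float) + eps)

sunlit = band_ratio(np.array([120.0]), np.array([60.0]))   # material in sun
shadow = band_ratio(np.array([40.0]), np.array([20.0]))    # same material, 1/3 light
```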

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains the methods discussed in this thesis that researchers have used to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. They include methods like edge detection, region extraction and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs, also called classical ASMs by [24], by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24] in that the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in the sections that follow.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems, and there are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
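The Sobel computation above can be sketched as follows. The kernels are the standard Sobel masks; the loop-based sliding-window product is a didactic sketch (strictly a cross-correlation — true convolution flips the kernels and only negates Dx and Dy):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_gradient(image):
    """Return gradient magnitude and direction from the two Sobel kernels."""
    pad = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    dx = np.zeros((h, w))
    dy = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            window = pad[y:y + 3, x:x + 3]
            dx[y, x] = np.sum(window * SOBEL_X)
            dy[y, x] = np.sum(window * SOBEL_Y)
    magnitude = np.hypot(dx, dy)            # eq. 3.1: sqrt(Dx^2 + Dy^2)
    direction = np.arctan2(dy, dx)          # eq. 3.2: atan2(Dy, Dx)
    return magnitude, direction

# Vertical step edge: dark left half, bright right half.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
mag, ang = sobel_gradient(img)
```

In the constant region the magnitude is zero, and on the edge the gradient points from the darker to the brighter side, as the text states.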

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used, but the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. The method is still used, however, as it is simple, easy to implement and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
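A sketch of the Roberts Cross using the standard 2x2 diagonal kernels; the array-slicing form below computes the same diagonal differences without an explicit convolution loop:

```python
import numpy as np

ROBERTS_A = np.array([[1, 0], [0, -1]], dtype=float)   # one diagonal
ROBERTS_B = np.array([[0, 1], [-1, 0]], dtype=float)   # the other diagonal

def roberts_magnitude(image):
    """Edge magnitude from differences of diagonally adjacent pixels."""
    img = image.astype(float)
    ga = img[:-1, :-1] - img[1:, 1:]      # response of ROBERTS_A
    gb = img[:-1, 1:] - img[1:, :-1]      # response of ROBERTS_B
    return np.sqrt(ga ** 2 + gb ** 2)

img = np.array([[0.0, 0.0, 1.0],
                [0.0, 0.0, 1.0],
                [0.0, 0.0, 1.0]])
mag = roberts_magnitude(img)
```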

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. The gradient of the image is then calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels below a certain threshold are suppressed. A multi-level thresholding technique involving two levels, like the example in 2.1.4, is then used on the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
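The final two-threshold step just described can be sketched as follows; the gradient values and thresholds are invented for the example, and the iterative region growing is a simple stand-in for the linking step:

```python
import numpy as np

def hysteresis_threshold(grad, low, high):
    """Two-level thresholding with neighbour linking, as in the last Canny step.

    Pixels above `high` are kept; pixels between `low` and `high` are kept
    only if they (transitively) touch a kept pixel (8-connectivity).
    """
    strong = grad > high
    weak = (grad > low) & ~strong
    kept = strong.copy()
    changed = True
    while changed:                          # propagate from strong into weak pixels
        changed = False
        grown = kept.copy()
        h, w = grad.shape
        for y in range(h):
            for x in range(w):
                if weak[y, x] and not kept[y, x]:
                    ys = slice(max(0, y - 1), y + 2)
                    xs = slice(max(0, x - 1), x + 2)
                    if kept[ys, xs].any():
                        grown[y, x] = True
                        changed = True
        kept = grown
    return kept.astype(np.uint8)

grad = np.array([[0.9, 0.4, 0.4, 0.1],
                 [0.1, 0.1, 0.1, 0.1]])
edges = hysteresis_threshold(grad, low=0.3, high=0.8)
```

The two weak pixels (0.4) survive because they chain back to the strong 0.9 pixel, while the 0.1 pixels are suppressed.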

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it, by quantifying the visual or other simple characteristics of the image so that it can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
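A sketch of the range filter mentioned above (local max minus local min in a small window); the window size of 3 is an assumption for the example:

```python
import numpy as np

def range_filter(image, size=3):
    """Local range (max - min) in a size x size neighbourhood.

    Textured areas such as bone produce large local ranges, while smooth
    background produces values near zero.
    """
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            window = padded[y:y + size, x:x + size]
            out[y, x] = window.max() - window.min()
    return out

img = np.zeros((5, 5))
img[2, 2] = 4.0                 # one "textured" spot in a smooth background
r = range_filter(img)
```

Standard deviation filtering works the same way with `window.std()` in place of the max-min difference.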

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
...
MPn2  MPn3  MPn4  MPn5    (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
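The per-pixel maximum-eigenvalue computation of Eq. 2 can be sketched as follows. This is a simplified illustration, not the authors' implementation: the Hessian entries are finite differences of the raw image (the Gaussian scale-space pyramid is omitted), and the closed-form eigenvalues of a symmetric 2x2 matrix replace any per-pixel eigendecomposition:

```python
import numpy as np

def principal_curvature_image(image):
    """Largest Hessian eigenvalue at each pixel, clamped at 0 (Eq. 2)."""
    img = image.astype(float)
    Iy, Ix = np.gradient(img)            # first derivatives (rows, cols)
    Ixy, Ixx = np.gradient(Ix)           # d(Ix)/dy, d(Ix)/dx
    Iyy, _ = np.gradient(Iy)             # d(Iy)/dy
    # Eigenvalues of [[Ixx, Ixy], [Ixy, Iyy]] in closed form.
    mean = 0.5 * (Ixx + Iyy)
    radius = np.sqrt(0.25 * (Ixx - Iyy) ** 2 + Ixy ** 2)
    lam1 = mean + radius                 # maximum eigenvalue
    return np.maximum(lam1, 0.0)         # Eq. 2: keep only positive curvature

# A dark horizontal line on a bright background: Eq. 2 responds on the line.
img = np.full((7, 7), 10.0)
img[3, :] = 0.0
P = principal_curvature_image(img)
```

The response is large along the dark line and zero in the uniform background, matching the stated behaviour of Eq. 2.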

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image; we instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations.

Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the direction of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions

(Fig 3(b)) The final step is to perform the watershed transform on the clean binary image

(Fig 2(c)) Since the image is binary all black (or 0-valued) pixels become catchment basins

and themidlines of the thresholded white ridge pixels become watershed lines if they separate

two distinct catchment basins To define the interest regions of the PCBR detector in one

scale the resulting segmented regions are fit with ellipses via PCA that have the same

second-moment as the watershed regions (Fig 2(e))
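The eigenvector-flow hysteresis step above can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the coherence cutoff `agree` is an assumption (the text states only the 0.2/0.7 ratios), and the wrap-around border handling of `np.roll` is a simplification.

```python
import numpy as np

def eigenvector_flow_hysteresis(P, vec, high=0.04, agree=0.9):
    """Sketch of eigenvector-flow hysteresis thresholding.

    P     -- principal-curvature magnitude image, shape (H, W)
    vec   -- unit major-eigenvector field, shape (H, W, 2)
    agree -- assumed cutoff on the mean |dot product| with the 8
             neighbours; the text does not state this value.
    """
    # Mean absolute inner product with the 8 neighbours (np.roll wraps
    # at the borders, which is acceptable for a sketch).
    support = np.zeros(P.shape)
    shifts = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    for dy, dx in shifts:
        neighbour = np.roll(np.roll(vec, dy, axis=0), dx, axis=1)
        support += np.abs((vec * neighbour).sum(axis=2))
    support /= len(shifts)

    # Coherent flow -> low-to-high ratio 0.2 (low = 0.008); else 0.7 (low = 0.028).
    low = high * np.where(support >= agree, 0.2, 0.7)
    strong, weak = P >= high, P >= low

    # Grow strong seeds into 8-connected weak pixels.
    H, W = P.shape
    out = strong.copy()
    stack = list(zip(*np.nonzero(strong)))
    while stack:
        y, x = stack.pop()
        for dy, dx in shifts:
            ny, nx = y + dy, x + dx
            if 0 <= ny < H and 0 <= nx < W and weak[ny, nx] and not out[ny, nx]:
                out[ny, nx] = True
                stack.append((ny, nx))
    return out
```

A coherent eigenvector field thus lets a weak but well-aligned ridge pixel survive with the lower threshold, while incoherent noise faces the stricter one.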

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
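The overlap error between two detected elliptical regions can be approximated numerically, as in the sketch below. This is a rasterized stand-in for the analytic measure of [19], written under the assumption that each region is described by its center and second-moment matrix.

```python
import numpy as np

def overlap_error(ellipse_a, ellipse_b, grid=400, extent=3.0):
    """Approximate 1 - |A intersect B| / |A union B| between two elliptical
    regions by rasterizing them on a grid. Each ellipse is (cx, cy, Sigma)
    with Sigma its 2x2 second-moment matrix."""
    xs = np.linspace(-extent, extent, grid)
    X, Y = np.meshgrid(xs, xs)

    def inside(e):
        cx, cy, S = e
        Si = np.linalg.inv(S)
        dx, dy = X - cx, Y - cy
        # (p - c)^T S^{-1} (p - c) <= 1 defines the ellipse interior
        return Si[0, 0] * dx * dx + 2 * Si[0, 1] * dx * dy + Si[1, 1] * dy * dy <= 1.0

    A, B = inside(ellipse_a), inside(ellipse_b)
    union = np.logical_or(A, B).sum()
    if union == 0:
        return 1.0
    return 1.0 - np.logical_and(A, B).sum() / union
```

A region would then be kept only if its overlap error against its counterparts at the two adjacent scales stays below some cutoff.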

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, one where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes in the same layer of the painting have similar colors, the regions in different clusters are good representatives of brush strokes at different layers, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues; ideally this step would be fully automatic, but that challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
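The chromatic clustering step can be sketched with a minimal k-means on per-pixel color features. This is a plain NumPy illustration of the idea only; De-pict additionally applies complete-linkage clustering, which is omitted here.

```python
import numpy as np

def kmeans_labels(pixels, k, iters=20):
    """Minimal k-means on chromatic feature vectors (one row per pixel).
    Initialization spreads the seeds over the data order; a real
    implementation would use a more careful scheme (e.g. k-means++)."""
    pixels = np.asarray(pixels, float)
    idx = np.linspace(0, len(pixels) - 1, k).astype(int)
    centers = pixels[idx].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest center ...
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # ... then move each center to the mean of its assigned pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers
```

Each resulting label then carries its mean chromatic vector, matching the description above.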

3.1 Spatially coherent segmentation

We improve the layer segmentation by combining k-means with spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume each layer is modeled as an independent Gaussian with a shared covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ‖f_p − c_{L_p}‖²₂  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]        (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, λ is the regularization weight, and T[·] is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes neighboring pixels that belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After this spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E- and M-steps until convergence or until a predefined number of iterations is reached.
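The objective of Eq. (1) and the M-step can be sketched directly; the graph-cut E-step itself is delegated to a solver such as [12] and is not reproduced here. Edge lengths and the neighbour structure below are illustrative assumptions.

```python
import numpy as np

def potts_energy(features, labels, centers, edges, lam=1.0):
    """Evaluate Eq. (1): unary appearance term plus a Potts smoothness
    term. `edges` is a list of (p, q, length) neighbour pairs, where
    `length` stands in for |e_pq|."""
    unary = sum(np.sum((features[p] - centers[labels[p]]) ** 2)
                for p in range(len(features)))
    # Pay the (length-weighted) penalty only where neighbour labels differ.
    pairwise = sum(w for p, q, w in edges if labels[p] != labels[q])
    return unary + lam * pairwise

def m_step(features, labels, k):
    """Re-estimate the k mean chromatic vectors from the current labeling."""
    return np.array([features[labels == j].mean(axis=0) for j in range(k)])
```

A full E-M loop would alternate a graph-cut minimization of this energy with `m_step` until the energy stops decreasing.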

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line-segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in a mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.

B Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.

The dataset at hand consists of 250 2D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding first keeps the parts brighter than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700
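The MATLAB expression above has a direct NumPy equivalent; this one-liner is only an illustrative translation of the thresholding step.

```python
import numpy as np

def threshold_hu(image, hu=700):
    """Keep pixels brighter than the HU cutoff, producing a logical
    (boolean) image - the NumPy analogue of Thresh = image > 700."""
    return np.asarray(image) > hu
```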

In each of these new images, subsegmental vessels remain in the lung region. In the second step, the following method is used to remove these vessels: each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled again with the connected-component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept, the other parts are removed from the image, and the result is then inverted, so all "0" pixels turn into "1" and all "1" pixels turn into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch the image border (columns 1 or 512) and are logical 1, the parts that meet this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung, each image is labeled with the connected-component labeling algorithm, and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The final image is the segmented target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and overlaid on the original image, so the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
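The pipeline described above can be sketched with `scipy.ndimage`. This is a simplified illustration of the described steps, not the authors' code: the size cutoff is a parameter, and the final airway-removal pass is indicated by a comment.

```python
import numpy as np
from scipy import ndimage

def segment_lung(mask, min_size=1000):
    """Sketch of the described pipeline on one thresholded slice:
    1) remove labelled components smaller than min_size pixels,
    2) keep the largest remaining component (the body) and invert it,
    3) drop components touching the image border (air outside the body),
    leaving the lung (and airways, which a final small-component
    removal would strip)."""
    # 1) small-object removal (vessels)
    lab, _ = ndimage.label(mask)
    sizes = np.bincount(lab.ravel()); sizes[0] = 0
    mask = np.isin(lab, np.nonzero(sizes >= min_size)[0])

    # 2) largest component = body; invert 0 <-> 1
    lab, _ = ndimage.label(mask)
    sizes = np.bincount(lab.ravel()); sizes[0] = 0
    inside = lab != sizes.argmax()

    # 3) remove inverted components that touch the border
    lab, n = ndimage.label(inside)
    border = set(lab[0]) | set(lab[-1]) | set(lab[:, 0]) | set(lab[:, -1])
    keep = [j for j in range(1, n + 1) if j not in border]
    lung = np.isin(lab, keep)
    # A final small-component removal here would strip the airways.
    return lung
```

Note that step 2's inversion makes the background part of `inside` as well, which is exactly why step 3's border test is needed.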

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it conveys more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data.

The Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b - Area Graph of X-ray CT brain scan

The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of X-ray CT brain scan generated with histogram values, mesh
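NumPy's `meshgrid` behaves the same way as MATLAB's, so the description above can be illustrated directly:

```python
import numpy as np

# meshgrid turns two coordinate vectors into coordinate matrices:
# rows of X copy x, columns of Y copy y, ready for evaluating f(x, y).
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
# X = [[1, 2, 3],    Y = [[10, 10, 10],
#      [1, 2, 3]]         [20, 20, 20]]
Z = X + Y  # any function of two variables evaluated on the grid
```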

3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8).

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph displays a matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).

Figure 11 - Magnitude and Phase Response - frequency scale: a) linear, b) log

Figure 12 - Group Delay Response - frequency scale: a) linear, b) log

Figure 13 - Phase Delay Response - frequency scale: a) linear, b) log

Figure 14 - (a) Impulse Response, (b) Pole/Zero Plot

Figure 15 - Step Response: (a) default, (b) specified length 50

Figure 16 - Magnitude Response Estimate - frequency scale: a) linear, b) log

Figure 17 - Magnitude Response and Round-off Noise Power Spectrum - frequency scale: a) linear, b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns represent their x and y coordinates, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root-mean-square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with automatic initialization (discussed in Section 4.4).
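The shape-vector layout, centroid, and size defined above can be sketched as follows; "root mean distance" is taken here as the root-mean-square distance from the centroid.

```python
import numpy as np

def centroid(shape):
    """Centroid of a shape stored as a 2n-vector (all x's, then all y's)."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def shape_size(shape):
    """Root-mean-square distance of the points from the centroid."""
    n = len(shape) // 2
    pts = np.stack([shape[:n], shape[n:]], axis=1)
    return np.sqrt(((pts - centroid(shape)) ** 2).sum(axis=1).mean())
```

For example, a unit square stored as [x1..x4, y1..y4] has centroid (0.5, 0.5), and every corner lies sqrt(0.5) away from it.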

Algorithm 1 - Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape
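Algorithm 1 can be sketched as below. For brevity the similarity alignment is reduced to translation and scaling; a full implementation would also solve for rotation (Procrustes alignment).

```python
import numpy as np

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1 with rotation omitted. Shapes are (n, 2) arrays."""
    shapes = [s - s.mean(axis=0) for s in shapes]      # step 2: center on origin
    unit = lambda s: s / np.sqrt((s ** 2).sum())       # scale to unit size
    mean = unit(shapes[0])                             # step 3: initial mean shape
    for _ in range(iters):                             # step 4: iterate
        aligned = [unit(s) for s in shapes]            # 4a: align (scale) to mean
        new_mean = unit(np.mean(aligned, axis=0))      # 4b + 4c: recompute, constrain
        if np.allclose(new_mean, mean):                # step 5: convergence
            break
        mean = new_mean
    return aligned, mean
```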

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. Training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area of the test image that fits the model, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x̂ = x̄ + Φb

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), Φ is the matrix of eigenvectors of the shape covariance, and b is the vector of shape parameters.
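The linear shape model x̂ = x̄ + Φb can be sketched numerically; here the eigenvectors are obtained from an SVD of the centered training shapes, which is one standard way to build them.

```python
import numpy as np

def shape_model(aligned_shapes, t):
    """Build the linear shape model from aligned training shapes stored
    as rows: the mean shape plus the first t principal components."""
    X = np.asarray(aligned_shapes, float)
    xbar = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - xbar, full_matrices=False)
    Phi = Vt[:t].T                       # 2n x t eigenvector matrix
    return xbar, Phi

def generate(xbar, Phi, b):
    """x_hat = x_bar + Phi b : generate a new shape from parameters b."""
    return xbar + Phi @ b
```

Setting b = 0 reproduces the mean shape, and projecting a training shape's deviation onto Φ recovers its parameters.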

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix S_g.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the shape stays consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results while the resulting shape is completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is used for illustration).
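The profile-matching step above amounts to evaluating the Mahalanobis distance at each candidate offset along the whisker and keeping the minimum, as in this sketch:

```python
import numpy as np

def mahalanobis(g, g_mean, S_inv):
    """Profile-matching cost f(g) = (g - g_mean)^T S^{-1} (g - g_mean)."""
    d = g - g_mean
    return float(d @ S_inv @ d)

def best_offset(profiles, g_mean, S):
    """Among candidate profiles sampled along the whisker, pick the index
    of the one with the lowest Mahalanobis distance to the mean profile."""
    S_inv = np.linalg.inv(S)
    costs = [mahalanobis(g, g_mean, S_inv) for g in profiles]
    return int(np.argmin(costs))
```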

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile and the model perform better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412-425, 2000.


Semiautomatic segmentation combines the benefits of both manual and automatic segmentation: by supplying some initial information about the structures, we can proceed with automatic methods.

• Thresholding

If the distribution of intensities is known, thresholding divides the image into two regions separated by a manually chosen threshold value a, as follows:

if B(i, j) ≥ a then B(i, j) = 1 (object), else B(i, j) = 0 (background), for all i, j over the image B [YGV].

This can be repeated for each region, dividing them again by a threshold value, which results in four regions, and so on. However, a successful segmentation requires that some properties of the image are known beforehand. This method has the drawback of including separated regions that correctly lie within the specified limits but regionally do not belong to the selected region; such pixels could, for instance, arise from noise. The simplest choice of threshold value is a fixed value, for instance the mean value of the image. A better choice is a histogram-derived threshold: this incorporates some knowledge of the distribution of the image and results in less misclassification.

The isodata algorithm is an iterative process for finding the threshold value [YGV]. First, segment the image into two regions according to a temporarily chosen threshold value. Then calculate the mean value of the image in each of the two segmented regions, and compute a new threshold as the mean of the two region means:

threshold_new = (mean_region1 + mean_region2) / 2

Repeat until the threshold value no longer changes, and finally use this value for the threshold segmentation.
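The isodata iteration above can be sketched in a few lines of numpy; the function name, the mean-valued initial threshold, and the convergence tolerance are illustrative choices, not prescribed by the text:

```python
import numpy as np

def isodata_threshold(image, tol=0.5):
    """Iteratively refine a threshold until it stabilises (isodata algorithm)."""
    t = image.mean()  # temporary initial threshold
    while True:
        lower = image[image <= t]
        upper = image[image > t]
        # New threshold = mean of the two region means
        t_new = 0.5 * (lower.mean() + upper.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```

On a clearly bimodal image the loop converges in a handful of iterations, landing between the two intensity populations.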

To implement the triangle algorithm, construct a histogram of intensity vs. number of pixels, as in Figure 2.1. Draw a line between the maximum value of the histogram h_max and the minimum value h_min, and calculate the distance d between the line and the histogram. Increase h_min and repeat for all h until h = h_max. The threshold value becomes the h for which the distance d is maximised. This method is particularly effective when the pixels of the object we seek form only a weak peak.
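A minimal numpy sketch of the triangle method follows; taking the tallest bin as h_max and the last non-empty bin as h_min is one common convention, assumed here rather than stated in the text:

```python
import numpy as np

def triangle_threshold(image, bins=256):
    """Triangle method: maximise the distance from the histogram to the peak-to-tail line."""
    hist, edges = np.histogram(image, bins=bins)
    peak = hist.argmax()             # h_max: tallest bin
    tail = np.nonzero(hist)[0][-1]   # h_min: last non-empty bin
    x1, y1, x2, y2 = peak, hist[peak], tail, hist[tail]
    h = np.arange(peak, tail + 1)
    # Perpendicular distance to the line through (x1, y1) and (x2, y2);
    # the constant normalisation factor is dropped since it does not move the argmax.
    d = np.abs((y2 - y1) * h - (x2 - x1) * hist[peak:tail + 1] + x2 * y1 - y2 * x1)
    best = h[d.argmax()]
    return edges[best]
```

The distance is zero at both endpoints of the line, so the maximiser lies strictly between the peak and the tail.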

• Boundary tracking

Edge-finding by gradients selects a boundary point manually and then automatically follows the gradient until returning to the same point [YGV]. Returning to the same point can be a major problem for this method. Boundary tracking will wrongly include all interior holes in the region, and runs into problems if the gradient specifying the boundary varies or is very small. A way to mitigate this is first to calculate the gradient and then apply a threshold segmentation; this excludes some wrongly included pixels compared to the threshold method alone.

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem where ∇² = ∂²/∂x². Assume the boundary is blurred, so the gradient has a shape like that in Figure 2.2; the Laplacian then changes sign just around the assumed edge at position = 0. For noisy images, the noise produces large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
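The 1-D case described above can be sketched as follows: smooth first (as the text requires for noisy data), take a discrete second derivative, and report sign changes. The kernel radius of 3σ is an illustrative choice:

```python
import numpy as np

def zero_crossings_1d(signal, sigma=2.0):
    """Locate edges as sign changes of the Laplacian of a smoothed 1-D signal."""
    # Gaussian smoothing kernel to suppress noise before differentiating
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    smooth = np.convolve(signal, g, mode="same")
    lap = np.diff(smooth, n=2)  # discrete second derivative
    signs = np.sign(lap)
    # Indices where consecutive Laplacian samples have opposite signs
    return np.nonzero(signs[:-1] * signs[1:] < 0)[0] + 1
```

For a blurred step edge, the reported index falls at the edge position, where the second derivative passes through zero.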

• Clustering Methods

Clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but

2.1.2 Feature Extraction

Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen so that they are a good representation of the image and encapsulate the necessary information. Examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally a combination of features is used to generate a model for the images. Cross validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify their relative importance. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1, and they define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.

To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, features like the neck-shaft angle in a femur X-ray image need to be plotted; the patterns can be recognized if the neck-shaft angles of healthy femurs differ from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM [22] and found the feature values for GO, MRF, and IGD that gave the best overall performance. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a direction very different from that of a fractured long-bone X-ray, so a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

2.1.4 Thresholding and Error Classification

Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process: it can be used at the start to reduce noise in the image, or to separate sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold; the image can then be separated into regions, or pixels, that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
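Otsu's method chooses the cut that maximises the between-class variance of the histogram. A self-contained numpy sketch (the cumulative-sum formulation is a standard implementation trick, not taken from this text):

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Otsu's method: pick the threshold that maximises between-class variance."""
    hist, edges = np.histogram(image, bins=bins)
    p = hist / hist.sum()                   # bin probabilities
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                       # background weight at each cut
    w1 = 1.0 - w0                           # foreground weight
    mu = np.cumsum(p * centers)             # cumulative first moment
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        # Between-class variance: w0 * w1 * (mu0 - mu1)^2, rearranged
        sigma_b = (mu_total * w0 - mu) ** 2 / (w0 * w1)
    sigma_b = np.nan_to_num(sigma_b)        # empty classes contribute nothing
    return centers[np.argmax(sigma_b)]
```

On a well-separated bimodal histogram the returned threshold lands in the gap between the two populations.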

Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing, and the following figures show its effects. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, with the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image (b) The histogram of the image

Figure 2.3: Histogram of image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. A wide variety of techniques exists for improving image quality; contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions, and enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.

Contrast

Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities, which enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm: the grey values in the original image and the modified image follow a linear relation. A density number in the low range of the original histogram is assigned to extreme black, a value at the high end is assigned to extreme white, and the remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image become clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically, as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
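The per-band stretch described above amounts to one linear mapping; a minimal sketch, assuming an 8-bit display range as the output (the function name and defaults are illustrative):

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Map the band's min..max range linearly onto the full display range."""
    lo, hi = float(band.min()), float(band.max())
    scaled = (band.astype(float) - lo) / (hi - lo)   # normalise to 0..1
    return np.rint(scaled * (out_max - out_min) + out_min).astype(np.uint8)
```

A band with values 50..200 is pulled apart so that 50 maps to black (0) and 200 to white (255).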

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques are useful for enhancing the colour contrast between nearby classes and subclasses of a main class.

One type of non-linear contrast stretch scales the input data logarithmically. This enhancement has the greatest impact on the brightness values in the darker part of the histogram; it can be reversed to enhance values in the brighter part of the histogram by scaling the input data with an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. Here the histogram of the original image is redistributed to produce a uniform population density, obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
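Histogram equalization as described above can be sketched via the normalised cumulative histogram, which serves as the grey-level remapping table (a standard formulation; the 256-level assumption is illustrative):

```python
import numpy as np

def equalize(image, levels=256):
    """Histogram equalisation: redistribute grey levels toward a uniform population."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    cdf = hist.cumsum()
    # Normalised cumulative distribution, stretched to span 0..1
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())
    # Lookup table mapping each old grey level to its new one
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[image]
```

Because several adjacent grey values can map to the same output level, the equalised image has at most as many grey levels as the original, matching the remark above.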

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area; conversely, if the brightness value changes dramatically over short distances, the area is one of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote-sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding it, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7, or 9x9. This simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques for dealing with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
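The mean filter with border handling (1) from the paragraph above can be sketched in numpy; the shift-and-accumulate loop is one simple way to realise the n x n mean convolution:

```python
import numpy as np

def mean_filter(image, n=3):
    """n x n mean convolution; borders handled by repeating edge pixels,
    so the output keeps the original image size."""
    pad = n // 2
    # Option (1) from the text: extend the image by replicating border values
    padded = np.pad(image.astype(float), pad, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(n):
        for dx in range(n):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (n * n)
```

Each output pixel is the mean of its n x n neighbourhood, so a constant image passes through unchanged while sharp detail is blurred.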

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous, and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Physiological research on human vision suggests that we see objects in much the same way, so the result of this operation has a more natural look than many of the other edge-enhanced images.

Band Ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.

Chapter 3

Literature Review and History

The first section in this chapter describes the work related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains the methods, discussed in this thesis, that researchers have used to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers; they include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant to the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], bone fracture detection is not well researched and published compared to other areas of medical imaging. Research has been done at the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow; according to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks the shape of the femur down into a couple of parallel, or roughly parallel, lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, hence the name bi-level thresholding, based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features; some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24] in that the same model is used for a different application. [18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly: the flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in
3.1.2 Common Limitations of the Previous Research
As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used: some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form the edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant intensity is a zero vector, and at a point on an edge it is a vector pointing across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks, or kernels, one for the horizontal direction and the other for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
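The Sobel computation above can be sketched directly: convolve with the two 3x3 kernels, then combine the directional derivatives into magnitude and angle. The nested-loop convolution is a deliberately simple, unoptimised formulation:

```python
import numpy as np

# Sobel kernels: horizontal (Dx) and vertical (Dy) derivative approximations
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution (kernel flipped, as in true convolution)."""
    k = np.flipud(np.fliplr(kernel))
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = (image[y:y + 3, x:x + 3] * k).sum()
    return out

def sobel(image):
    """Return gradient magnitude and direction at each interior pixel."""
    dx = convolve2d(image, KX)
    dy = convolve2d(image, KY)
    magnitude = np.hypot(dx, dy)
    direction = np.arctan2(dy, dx)  # angle of steepest intensity increase
    return magnitude, direction
```

As the text states, a region of constant intensity yields a zero gradient vector, while a step edge yields a nonzero magnitude pointing across the edge.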

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image; the convolution kernels used in Prewitt differ from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the row or column of the current pixel when calculating the directional derivative at that point [15][26]. This is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering produces a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied to the gradient so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then applied to the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
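The double-threshold rule in the last step above can be sketched as follows. For brevity this sketch makes a single promotion pass over the weak pixels, whereas a full implementation iterates until the edge map stops changing:

```python
import numpy as np

def hysteresis(gradient, low, high):
    """Canny-style double thresholding: strong pixels are kept, and weak
    pixels are kept only if (diagonally) adjacent to a strong pixel."""
    strong = gradient >= high
    weak = (gradient >= low) & ~strong
    out = strong.copy()
    padded = np.pad(strong, 1)  # zero (False) border for neighbour lookups
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = padded[1 + dy:1 + dy + strong.shape[0],
                             1 + dx:1 + dx + strong.shape[1]]
            out |= weak & shifted  # promote weak pixels touching a strong one
    return out.astype(np.uint8)
```

A weak pixel next to a strong pixel survives; an isolated weak pixel does not.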

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it: it quantifies the visual or other simple characteristics of the image so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P (x) =max(λ1(x) 0) (2)

or

P (x) =min(λ2(x) 0) (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues respectively of H at x

Eq 2 provides a high response only for dark lines on a light background (or on the dark side

of edges) while Eq 3 is used to detect light lines against a darker background Like SIFT [13]

and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I_11, and then produce increasingly Gaussian-smoothed images I_1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, …, 6. This set of images spans the first octave, consisting of six images I_11 to I_16. Image I_14 is downsampled to half its size to produce image I_21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image P_ij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP_12 MP_13 MP_14 MP_15
MP_22 MP_23 MP_24 MP_25
...
MP_n2 MP_n3 MP_n4 MP_n5   (4)

where MP_ij = max(P_{i,j−1}, P_ij, P_{i,j+1}).

Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
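As a concrete sketch of Eq. 2, the maximum Hessian eigenvalue at each pixel has a closed form for a symmetric 2×2 matrix, so no iterative eigensolver is needed. The Python fragment below (NumPy/SciPy are this sketch's assumption, not tools named by the text) computes one scale of the principal curvature image:

```python
import numpy as np
from scipy import ndimage

def principal_curvature(image, sigma):
    """Principal curvature image of Eq. 2: the maximum Hessian
    eigenvalue at each pixel, clamped at zero."""
    # Second-order Gaussian derivatives (the Hessian entries).
    Ixx = ndimage.gaussian_filter(image, sigma, order=(0, 2))
    Iyy = ndimage.gaussian_filter(image, sigma, order=(2, 0))
    Ixy = ndimage.gaussian_filter(image, sigma, order=(1, 1))
    # Closed-form eigenvalues of the symmetric matrix [[Ixx, Ixy], [Ixy, Iyy]].
    mean = 0.5 * (Ixx + Iyy)
    disc = np.sqrt(0.25 * (Ixx - Iyy) ** 2 + Ixy ** 2)
    lam_max = mean + disc
    # Eq. 2: respond only to dark lines on a light background.
    return np.maximum(lam_max, 0.0)

# A dark horizontal line on a light background should respond strongly.
img = np.full((32, 32), 200.0)
img[16, :] = 0.0
P = principal_curvature(img, sigma=2.0)
```

The same function with `mean - disc` and `np.minimum(..., 0)` would give the Eq. 3 response for light lines.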

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale

morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust

eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue

magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection

results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
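The cleaning step can be sketched as a grayscale closing followed by hysteresis thresholding. The fragment below uses a single global low-to-high ratio of 0.2; the paper's eigenvector-flow variant would instead choose 0.2 or 0.7 per pixel from neighboring eigenvector agreement, which is omitted here for brevity:

```python
import numpy as np
from scipy import ndimage

# 5x5 disk-shaped structuring element, as in the text.
yy, xx = np.mgrid[-2:3, -2:3]
DISK5 = (xx ** 2 + yy ** 2) <= 4

def clean_binarize(mp, high=0.04, low_ratio=0.2):
    """Grayscale closing followed by hysteresis thresholding.

    Uses one global low threshold (high * low_ratio); the eigenvector-flow
    variant picks low_ratio per pixel from eigenvector agreement.
    """
    # Closing fills small "potholes" that would become spurious basins.
    closed = ndimage.grey_closing(mp, footprint=DISK5)
    strong = closed >= high               # seed pixels
    weak = closed >= high * low_ratio     # candidate pixels
    # Keep weak pixels only if they connect to a strong seed.
    return ndimage.binary_propagation(strong, mask=weak)

# A ridge with a weak (but connected) gap, plus an isolated weak blip.
mp = np.zeros((15, 30))
mp[7, :10] = 0.05
mp[7, 10:14] = 0.012
mp[7, 14:24] = 0.05
mp[2, 27] = 0.012
out = clean_binarize(mp)
```

The gap pixels survive because they are linked to strong seeds, while the isolated weak response is discarded, which is exactly the behavior hysteresis thresholding is chosen for.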

3.3 Stable Regions Across Scales

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
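The overlap test can be sketched on rasterized region masks. The stability threshold `max_err` below is an illustrative assumption, since the excerpt does not state the value used:

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """Overlap error of two rasterized regions: 1 - |A∩B| / |A∪B|,
    the measure used for the repeatability test in [19]."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union

def stable_across_scales(masks, max_err=0.3):
    """Keep a region if its overlap error to the same region detected at
    the adjacent scales stays below max_err (an illustrative value)."""
    return all(overlap_error(a, b) <= max_err
               for a, b in zip(masks, masks[1:]))

a = np.zeros((10, 10), bool); a[2:8, 2:8] = True
b = np.zeros((10, 10), bool); b[3:8, 2:8] = True
err = overlap_error(a, b)
```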

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially Coherent Segmentation

We improve the layer segmentation by incorporating k-means and spatial-coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial-coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||²₂ + λ Σ_{{p,q} ∈ N} |e_pq| · T[L_p ≠ L_q]   (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function.

The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or a predefined number of iterations is reached.
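The scheme can be sketched as follows. This fragment only evaluates the energy of Eq. (1) and performs the M-step; the actual E-step minimization with graph cuts [12] is beyond a short sketch, and the toy data below is illustrative:

```python
import numpy as np

def potts_energy(labels, feats, centers, lam, edges):
    """Evaluate the energy of Eq. (1).

    labels: (n,) cluster label per pixel; feats: (n, d) color features;
    centers: (k, d) mean chromatic vectors; edges: (p, q, |e_pq|) pairs.
    """
    data = sum(np.sum((feats[p] - centers[labels[p]]) ** 2)
               for p in range(len(labels)))            # appearance term
    smooth = lam * sum(w for p, q, w in edges
                       if labels[p] != labels[q])       # Potts term
    return data + smooth

def m_step(labels, feats, k):
    """M-step: re-estimate the k mean chromatic vectors from the labeling."""
    return np.array([feats[labels == i].mean(axis=0) for i in range(k)])

feats = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
labels = np.array([0, 0, 1, 1])
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
centers = m_step(labels, feats, 2)
energy = potts_energy(labels, feats, centers, lam=0.5, edges=edges)
```

A spatially coherent labeling scores lower than one that scatters labels across neighbors, which is the behavior the second term of Eq. (1) enforces.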

3.2 Curvature-based Inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore such methods are well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7] that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (a possible reconstruction of a level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain

connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface-continuation constraints and boundary-continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
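The discrete curvature approximation can be illustrated on a polyline. This simplified sketch sums absolute angle changes at interior vertices and omits the edge-length weighting used in the actual linear program:

```python
import numpy as np

def polyline_curvature(points):
    """Total curvature of a discrete level line, approximated as the sum
    of absolute angle changes at its interior vertices."""
    pts = np.asarray(points, float)
    total = 0.0
    for i in range(1, len(pts) - 1):
        v1 = pts[i] - pts[i - 1]          # incoming segment
        v2 = pts[i + 1] - pts[i]          # outgoing segment
        cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        total += np.arccos(np.clip(cosang, -1.0, 1.0))
    return total

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]   # no turning
corner = [(0, 0), (1, 0), (1, 1)]             # one 90-degree turn
```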

II MATERIALS AND METHODS

A Data Retrieval

In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath-holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/s with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20 G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) with an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B Method

The stages which have been followed while doing lung segmentation from CTA images at

this work are shown in figure 1

The CTA images at hand are 250 2-D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding retains the parts brighter than 700 HU. At the end of thresholding, the new images take logical (binary) values:

Thresh = image > 700;

In each of these new images, sub-segmental vessels remain in the lung region. In the second step, to get rid of these vessels, each 2-D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, components whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image. Then its complement is taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Because the parts outside the body in the image shown in Figure 4 touch the image border (pixel coordinates 1 or 512), the parts that meet this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung size, each image is labeled once more with the connected-component labeling algorithm, and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The remaining image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
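The thresholding and component-filtering pipeline above can be sketched for one slice as follows. NumPy/SciPy stand in for the MATLAB routines; the 700 HU and 1000-pixel thresholds are taken from the text, and border handling is simplified:

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_pixels):
    """Drop connected components smaller than min_pixels (the
    connected-component labeling filtering step of the text)."""
    lab, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, lab, range(1, n + 1))
    return np.isin(lab, 1 + np.flatnonzero(sizes >= min_pixels))

def segment_lungs(slice_hu):
    """One 2-D slice through the Section B pipeline."""
    body = remove_small(slice_hu > 700, 1000)   # threshold, drop small vessels
    lab, n = ndimage.label(body)
    if n == 0:
        return np.zeros_like(body)
    sizes = ndimage.sum(body, lab, range(1, n + 1))
    body = lab == (1 + int(np.argmax(sizes)))   # largest component = body
    inv = ~body                                  # lungs + airways + outside air
    lab, n = ndimage.label(inv)
    edge = np.unique(np.concatenate([lab[0], lab[-1], lab[:, 0], lab[:, -1]]))
    lungs = inv & ~np.isin(lab, edge)            # drop air touching the border
    return remove_small(lungs, 1000)             # drop airways (< 1000 px)

# Synthetic slice: a bright body disk with two lung holes and a small airway.
yy, xx = np.mgrid[:200, :200]
hu = np.where((xx - 100) ** 2 + (yy - 100) ** 2 <= 80 ** 2, 1000, 0)
for cx, cy, r in [(70, 100, 30), (130, 100, 30), (100, 60, 10)]:
    hu[(xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2] = 0
lungs = segment_lungs(hu)
```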

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image-processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting functions. The image-processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image-analysis tasks, including edge-detection and image-segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
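NumPy's `meshgrid` mirrors MATLAB's behavior, so the description above can be checked directly. This small example is illustrative and not taken from the toolbox documentation:

```python
import numpy as np

# numpy.meshgrid mirrors MATLAB's meshgrid: rows of X are copies of x,
# columns of Y are copies of y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)      # both have shape (len(y), len(x))
Z = X ** 2 + Y                # evaluate f(x, y) = x^2 + y on the grid
```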

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling ('imagesc' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool we can display the phase response, group-delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis the distance means the Euclidean distance:

d = √((y2 − y1)² + (x2 − x1)²)   (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root-mean-square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape
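Algorithm 1 can be sketched in Python. The alignment step below uses a plain 2-D Procrustes rotation, which is one common choice, and a fixed iteration count stands in for the convergence test:

```python
import numpy as np

def normalize(shape):
    """Center an (n, 2) shape on the origin and scale it to unit size."""
    s = shape - shape.mean(axis=0)
    return s / np.linalg.norm(s)

def align_to(shape, ref):
    """Rotate a normalized shape to best fit the reference shape."""
    a = np.sum(shape * ref)
    b = np.sum(shape[:, 0] * ref[:, 1] - shape[:, 1] * ref[:, 0])
    t = np.arctan2(b, a)                 # optimal rotation angle
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    return shape @ R.T

def align_shapes(shapes, iters=10):
    """Algorithm 1: iteratively align all shapes to a constrained mean."""
    shapes = [normalize(s) for s in shapes]          # steps 2-3
    mean = shapes[0]                                 # reference shape
    for _ in range(iters):                           # step 4 (fixed count)
        shapes = [align_to(s, mean) for s in shapes]
        mean = normalize(np.mean(shapes, axis=0))    # recalc + constrain
    return shapes, mean

# Two copies of a square, one rotated by 30 degrees, align exactly.
sq = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
th = np.pi / 6
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
aligned, mean = align_shapes([sq, sq @ R.T])
```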

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits closely to the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape; it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb   (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes x_i,

Φ is the matrix of eigenvectors of the training-shape covariance, and b is the vector of shape parameters.
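A minimal sketch of the shape model: the modes Phi are taken as the leading eigenvectors of the training-shape covariance, as in the classical PCA-based ASM. The function names and random training data are illustrative, not from the thesis code:

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """Shape model x_hat = x_bar + Phi @ b: Phi holds the leading
    eigenvectors of the covariance of the aligned training shapes."""
    x_bar = shapes.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(shapes, rowvar=False))
    order = np.argsort(vals)[::-1][:n_modes]   # largest modes first
    return x_bar, vecs[:, order]

def generate_shape(x_bar, Phi, b):
    """Generate an allowable shape from shape parameters b."""
    return x_bar + Phi @ b

rng = np.random.default_rng(0)
train = rng.normal(size=(20, 8))   # 20 aligned shapes of 4 points (2n = 8)
x_bar, Phi = build_shape_model(train)
```

Setting b = 0 reproduces the mean shape, and varying the entries of b sweeps through the permissible variations.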

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points that are perpendicular to the model are called "whiskers", and they help the profile model in analyzing the area around the landmark points.

The shapes created by the landmark points are used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix S_g.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is shown).
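The profile-matching step can be sketched as follows. The squared Mahalanobis form is used for ranking the candidate offsets; the function names and toy profiles are illustrative:

```python
import numpy as np

def mahalanobis_sq(g, g_bar, S_g):
    """Squared Mahalanobis distance between a sampled profile g and the
    model's mean profile g_bar with covariance S_g."""
    d = g - g_bar
    return float(d @ np.linalg.solve(S_g, d))

def best_offset(candidates, g_bar, S_g):
    """Pick the candidate profile (one per offset along the whisker)
    closest to the trained profile model."""
    return min(range(len(candidates)),
               key=lambda i: mahalanobis_sq(candidates[i], g_bar, S_g))

g_bar = np.zeros(3)
S_g = np.eye(3)
candidates = [np.array([1.0, 1.0, 1.0]),
              np.array([0.1, 0.0, 0.0]),
              np.array([2.0, 0.0, 0.0])]
```

With an identity covariance this reduces to squared Euclidean distance; a full S_g additionally discounts directions in which the training profiles vary a lot.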

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis. The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile and the model perform better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images on to the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started somewhere close to the bone boundary, in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is therefore unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conference, pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conference on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


specifying the boundary is varying or is very small. A way to overcome this problem is first to calculate the gradient and then apply a threshold segmentation. This will exclude some wrongly included pixels compared to the threshold method alone.

The zero-crossing based procedure is a method based on the Laplacian. Assume the boundaries of an object have the property that the Laplacian changes sign across them. Consider a 1D problem, where the Laplacian is ∇² = ∂²/∂x². Assume the boundary is blurred, so the gradient has a shape like that in Figure 2.2. The Laplacian then changes sign just around the assumed edge at position x = 0. For noisy images, the noise will produce large second derivatives around zero crossings, and the zero-crossing based procedure needs a smoothing filter to produce satisfactory results.
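The 1D zero-crossing test described above can be sketched as follows. This is a minimal illustration, assuming a pre-smoothed signal (real images would first be convolved with a Gaussian, as the text notes):

```python
import numpy as np

def laplacian_zero_crossings(signal):
    """Locate zero crossings of the 1-D second derivative.

    The second derivative changes sign across a blurred step edge, so
    indices where adjacent second-derivative samples differ in sign mark
    candidate edge positions. Noisy data needs smoothing first.
    """
    second = np.diff(signal, n=2)          # discrete second derivative
    signs = np.sign(second)
    # A zero crossing lies between samples whose signs differ.
    crossings = np.where(signs[:-1] * signs[1:] < 0)[0] + 1
    return crossings

# A smoothed step edge: tanh ramps from -1 to 1 across the samples.
edge = np.tanh(np.linspace(-3, 3, 10))
idx = laplacian_zero_crossings(edge)
```

The single detected index sits where the second derivative flips from positive to negative, i.e. at the middle of the blurred edge.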

Clustering Methods: clustering methods group pixels into larger regions using colour codes. The colour code for each pixel is usually given as a 3D vector, but

2.1.2 Feature Extraction

Feature extraction is the process of reducing the segmented image into a few numbers, or sets of numbers, that define the relevant features of the image. These features must be carefully chosen in such a way that they are a good representation of the image and encapsulate the necessary information. Some examples of features are image properties like the mean, standard deviation, gradient, and edges. Generally, a combination of features is used to generate a model for the images. Cross-validation is done on the images to see which features represent the image well, and those features are used. Features can sometimes be assigned weights to signify the importance of certain features. For example, the mean in a certain image may be given a weight of 0.9 because it is more important than the standard deviation, which may have a weight of 0.3 assigned to it. Weights generally range from 0 to 1 and define how important the features are. These features and their respective weights are then used on a test image to get the relevant information.

To classify the bone as fractured or not, [27] measures the neck-shaft angle from the segmented femur contour as a feature. Texture features of the image, such as Gabor orientation (GO), Markov Random Field (MRF), and intensity gradient direction (IGD), are used by [22] to generate a combination of classifiers to detect fractures in bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, the features mentioned above, like the neck-shaft angle in a femur X-ray image, need to be plotted. The patterns can be recognized if the neck-shaft angles of good femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long bone X-ray may point in a certain direction that is very different from the gradient vector of a fractured long bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

2.1.4 Thresholding and Error Classification

Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections in an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can then be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
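Otsu's method can be sketched as an exhaustive search over candidate thresholds, keeping the one that maximizes the between-class variance. A minimal version over an 8-bit histogram (illustrative, not the thesis code):

```python
import numpy as np

def otsu_threshold(pixels):
    """Pick the threshold that maximizes between-class variance (Otsu).

    Every candidate threshold t splits the pixels into background (< t)
    and foreground (>= t); the best t is the one whose two classes are
    most separated, measured by w0 * w1 * (mu0 - mu1)^2.
    """
    hist = np.bincount(pixels.ravel(), minlength=256).astype(float)
    total = hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0 = hist[:t].sum() / total            # background weight
        w1 = 1.0 - w0                          # foreground weight
        if w0 == 0 or w1 == 0:
            continue                           # one class is empty
        mu0 = (np.arange(t) * hist[:t]).sum() / (w0 * total)
        mu1 = (np.arange(t, 256) * hist[t:]).sum() / (w1 * total)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Two well-separated intensity populations: the threshold lands between them.
img = np.array([20] * 50 + [200] * 50, dtype=np.int64)
t = otsu_threshold(img)
```

Applying `img >= t` then yields the binary segmentation; multi-level thresholding repeats the idea with several thresholds.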

Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing. The following figures show the effects of thresholding. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image (b) The histogram of the image

Figure 2.3: Histogram of image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are most useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes available.

Contrast

Contrast generally refers to the difference in luminance or grey level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extremely black, and a value at the high end is assigned to extremely white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically, as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
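The linear stretch described above is a single affine mapping per band. A minimal sketch, assuming the band minimum and maximum are used as the stretch endpoints:

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Linearly map a band's grey values onto the full display range.

    The lowest density number becomes extremely black, the highest
    becomes extremely white, and everything in between is spread
    linearly over [out_min, out_max].
    """
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    scaled = (band - lo) / (hi - lo) * (out_max - out_min) + out_min
    return np.round(scaled).astype(np.uint8)

# A low-contrast band occupying only [100, 140] fills [0, 255] afterwards.
band = np.array([[100, 110], [120, 140]])
stretched = linear_stretch(band)
```

Production code would typically clip to low/high percentiles rather than the absolute minimum and maximum, so a few outlier pixels do not compress the useful range.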

Non-Linear Contrast Enhancement

In these methods, the input and output data values follow a non-linear transformation. The general form of the non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and the subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed, to enhance values in the brighter part of the histogram, by scaling the input data using an inverse log function.

Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values; thus, the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
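Histogram equalization amounts to mapping each grey value through the normalized cumulative histogram. A minimal sketch for 8-bit data (illustrative; library implementations add interpolation refinements):

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """Redistribute grey values toward a uniform population density.

    Each grey value is mapped through the normalized cumulative
    histogram, which pulls frequently occurring values apart and groups
    rare adjacent ones together, reducing the number of distinct levels.
    """
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    cdf = hist.cumsum() / hist.sum()           # normalized cumulative histogram
    mapping = np.round(cdf * (levels - 1)).astype(np.uint8)
    return mapping[image]                       # look up each pixel's new value

img = np.array([[50, 50], [51, 200]], dtype=np.int64)
eq = equalize_histogram(img)
```

Note how the two pixels at grey level 50, half the population, are pushed to the middle of the output range, while the single brightest pixel maps to the top.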

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, the area is one of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask, or kernel (n), is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects; blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
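The mean filter and border handling option (1) above can be sketched directly. This is a naive illustrative loop, not an optimized implementation:

```python
import numpy as np

def mean_filter(image, n=3):
    """Replace each pixel with the mean of its n x n neighbourhood.

    The border is handled by artificially extending the image, repeating
    the border pixel values (option 1 in the text), so the output keeps
    the input size instead of shrinking by two lines and two columns.
    """
    pad = n // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    out = np.zeros_like(image, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            # BVout is the mean of the n x n window around BVin.
            out[i, j] = padded[i:i + n, j:j + n].mean()
    return out

flat = np.full((5, 5), 7.0)
smoothed = mean_filter(flat)
```

A uniform image passes through unchanged, which is exactly the low-pass property: only spatial variation is attenuated. Swapping `.mean()` for `np.median` gives the median filter mentioned above.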

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence the output of this operation has a more natural look than many of the other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel, or roughly parallel, lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31], after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs, also called classical ASMs by [24], by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays, even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in the sections below.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The trade-off between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; the gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks, or kernels, one for the horizontal direction and the other for the vertical direction, which approximate the derivatives in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
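The magnitude-and-angle computation described above can be sketched as follows. This is an illustrative cross-correlation with the standard Sobel kernels over interior pixels only (no border padding):

```python
import numpy as np

def sobel_gradient(image):
    """Compute Sobel gradient magnitude and direction at interior pixels.

    Two 3x3 kernels approximate the horizontal and vertical derivatives
    Dx and Dy; the gradient is then reported as a magnitude and an
    angle, as in Eqs. 3.1 and 3.2.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T                                    # vertical kernel
    img = image.astype(float)
    h, w = img.shape
    dx = np.zeros((h - 2, w - 2))
    dy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            dx[i, j] = (patch * kx).sum()        # horizontal derivative
            dy[i, j] = (patch * ky).sum()        # vertical derivative
    magnitude = np.hypot(dx, dy)                 # Eq. 3.1
    direction = np.arctan2(dy, dx)               # Eq. 3.2
    return magnitude, direction

# A vertical step edge: zero response in flat areas, strong on the edge.
step = np.hstack([np.zeros((4, 3)), np.full((4, 3), 10.0)])
mag, ang = sobel_gradient(step)
```

In the flat regions the response is the zero vector, and on the edge the gradient points across it from darker to brighter values, exactly as the text describes.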

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]; this is the reason why Sobel has a weight of 2 in the middle of its kernel columns where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. So, in simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy; but this method is still used, as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
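Because the kernels are only 2x2, the Roberts Cross reduces to two diagonal differences per pixel, combined as a root-sum-of-squares. A minimal vectorized sketch:

```python
import numpy as np

def roberts_edges(image):
    """Roberts Cross: gradient magnitude from diagonally adjacent pixels.

    The two diagonal differences are combined as the square root of the
    sum of their squares, giving an edge strength at each pixel. The
    output is one row and one column smaller than the input.
    """
    img = image.astype(float)
    d1 = img[:-1, :-1] - img[1:, 1:]     # difference along one diagonal
    d2 = img[:-1, 1:] - img[1:, :-1]     # difference along the other
    return np.sqrt(d1 ** 2 + d2 ** 2)

flat = np.full((3, 3), 5.0)
edges = roberts_edges(flat)              # zero response in a flat region

step = np.array([[0.0, 0.0], [0.0, 10.0]])
corner = roberts_edges(step)             # strong response at the jump
```

The diagonal orientation of the differences is why Roberts responds most strongly to edges at 45 degrees, and why, with no smoothing at all, a single noisy pixel produces a full-strength response.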

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise, it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
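The two-level (hysteresis) thresholding step at the end of the Canny pipeline can be sketched on its own. This is a simplified illustration operating on a precomputed gradient map, not the full detector:

```python
import numpy as np

def hysteresis_threshold(gradient, low, high):
    """Two-level thresholding as used in the final Canny step.

    Pixels above the high threshold are set to 1, pixels below the low
    threshold to 0, and in-between pixels are kept only if they touch
    (including diagonally) a pixel already accepted as an edge.
    """
    strong = gradient >= high
    weak = (gradient >= low) & ~strong
    out = strong.copy()
    changed = True
    while changed:                       # grow edges through weak pixels
        changed = False
        h, w = out.shape
        for i in range(h):
            for j in range(w):
                if weak[i, j] and not out[i, j]:
                    neigh = out[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
                    if neigh.any():      # adjacent to an accepted edge pixel
                        out[i, j] = True
                        changed = True
    return out.astype(int)

grad = np.array([[9, 5, 5, 1],
                 [1, 1, 5, 1]])
edges = hysteresis_threshold(grad, low=4, high=8)
```

The weak pixels in the example survive only because they form a chain back to the single strong pixel; an isolated weak pixel of the same gradient value would be discarded, which is what makes Canny robust to faint but connected edges.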

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it: it attempts to quantify the visual or other simple characteristics of the image so that it can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
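The range filter mentioned above is simply max minus min over a sliding window. A minimal sketch (illustrative loop; the border is edge-padded so the output keeps the input size):

```python
import numpy as np

def range_filter(image, n=3):
    """Local range filter: max minus min over each n x n neighbourhood.

    Smooth (mesh-like) regions produce small local ranges while textured
    regions such as bone produce large ones, so the output can be
    thresholded to separate the two.
    """
    pad = n // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = padded[i:i + n, j:j + n]
            out[i, j] = window.max() - window.min()
    return out

smooth = np.full((4, 4), 3.0)                       # textureless region
textured = np.array([[0, 9], [9, 0]], dtype=float)  # high local variation
```

Replacing `max - min` with the window's standard deviation gives the standard deviation filter, the other texture measure used in the thesis.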

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |        (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives.

We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)      (2)

or

P(x) = min(λ2(x), 0)      (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
 ...
MPn2  MPn3  MPn4  MPn5      (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1). Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
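A minimal sketch of Eq. 2 at a single scale follows; simple finite differences stand in for the Gaussian-derivative filtering at scale σD, and the octave bookkeeping is omitted:

```python
import numpy as np

def principal_curvature(img):
    """Maximum-eigenvalue principal curvature image, P(x) = max(lambda_1(x), 0).

    Second-order partials are taken with finite differences; in the detector
    proper they would be Gaussian derivatives at scale sigma_D.
    """
    Iy, Ix = np.gradient(img.astype(float))   # first-order partials
    Ixy, Ixx = np.gradient(Ix)                # second-order partials
    Iyy, _ = np.gradient(Iy)
    # Closed-form eigenvalues of the symmetric 2x2 Hessian at every pixel.
    half_trace = 0.5 * (Ixx + Iyy)
    disc = np.sqrt(0.25 * (Ixx - Iyy) ** 2 + Ixy ** 2)
    lam_max = half_trace + disc
    return np.maximum(lam_max, 0.0)
```

Replacing the last line with `np.minimum(half_trace - disc, 0.0)` gives Eq. 3 for light lines on a dark background.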

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
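The per-pixel low threshold described above can be sketched as follows. The 0.04 high threshold and the 0.2/0.7 ratios come from the text; the cutoff of 0.9 on the mean absolute dot product is an assumption, since the text gives no number for "high enough":

```python
import numpy as np

HIGH = 0.04           # strong principal-curvature response (from the text)
RATIO_ALIGNED = 0.2   # low/high ratio when the eigenvector flow agrees
RATIO_DEFAULT = 0.7

def low_threshold_map(vx, vy, agree=0.9):
    """Per-pixel low threshold for eigenvector-flow hysteresis thresholding.

    vx, vy hold the normalized major-eigenvector components at each pixel.
    The mean absolute dot product with the 8 neighbours measures the support
    each pixel's direction receives; `agree` is an assumed cutoff.
    """
    support = np.zeros_like(vx, dtype=float)
    n = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            nvx = np.roll(np.roll(vx, dy, axis=0), dx, axis=1)
            nvy = np.roll(np.roll(vy, dy, axis=0), dx, axis=1)
            support += np.abs(vx * nvx + vy * nvy)
            n += 1
    mean_dot = support / n
    ratio = np.where(mean_dot >= agree, RATIO_ALIGNED, RATIO_DEFAULT)
    return HIGH * ratio   # 0.008 where supported, 0.028 otherwise
```

Seed pixels above HIGH would then be grown into any connected pixel exceeding its own entry in this map, exactly as in ordinary hysteresis thresholding but with a spatially varying low threshold.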

3.3 Stable Regions Across Scales

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
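With regions represented as binary pixel masks, the stability test can be sketched as below; the 30% overlap-error tolerance is an assumed value, since the text defers to the definition in [19] without quoting a number:

```python
import numpy as np

def overlap_error(a, b):
    """1 - |intersection| / |union| of two binary region masks, as in [19]."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return 1.0 - np.logical_and(a, b).sum() / union

def stable_across_scales(masks, max_err=0.3):
    """True if the region reappears with low overlap error in at least
    three consecutive scales, in the spirit of MSER's stability criterion.
    `masks` holds the region's mask at each consecutive scale."""
    for i in range(len(masks) - 2):
        if (overlap_error(masks[i], masks[i + 1]) <= max_err and
                overlap_error(masks[i + 1], masks[i + 2]) <= max_err):
            return True
    return False
```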

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but that challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled by an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]      (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes neighboring pixels that belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]; the solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
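A toy sketch of the E-M loop follows. The exact E-step uses graph cuts [12]; here a simple iterated-conditional-modes (ICM) sweep with a constant Potts penalty stands in for it, and unit edge lengths are assumed:

```python
import numpy as np

def segment_em(features, k, lam=1.0, iters=5):
    """E-M layer segmentation sketch with a spatial smoothness penalty.

    features: (H, W, D) chromatic feature image. The paper's E-step is solved
    exactly with graph cuts [12]; the ICM sweep below is a simplifying
    assumption, as is the constant (unit edge length) Potts penalty lam.
    """
    h, w, d = features.shape
    flat = features.reshape(-1, d).astype(float)
    # Crude initialization: k evenly spaced feature vectors as cluster centers.
    centers = flat[np.linspace(0, len(flat) - 1, k).astype(int)].copy()
    dists = ((flat[:, None, :] - centers[None]) ** 2).sum(-1)
    labels = np.argmin(dists, axis=1).reshape(h, w)
    for _ in range(iters):
        # E-step (approximate): data term plus a penalty for disagreeing
        # with the 4-neighbourhood labels, minimized pixel by pixel.
        for y in range(h):
            for x in range(w):
                cost = ((features[y, x] - centers) ** 2).sum(-1)
                for c in range(k):
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] != c:
                            cost[c] += lam
                labels[y, x] = int(np.argmin(cost))
        # M-step: re-estimate each center as the mean chromatic vector.
        for c in range(k):
            sel = labels.reshape(-1) == c
            if sel.any():
                centers[c] = flat[sel].mean(axis=0)
    return labels, centers
```

On a two-tone image the loop separates the tones into two labels while the smoothness term discourages isolated label flips.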

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well-suited for inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. In van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, it is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT scanner (Somatom Sensation 16; Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus and the trigger was set to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent was injected at a rate of 4 ml/sec with an automated syringe (Optistat Contrast Delivery System; Liebel-Flarsheim, USA). When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20 G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard; Siemens AG, Erlangen, Germany) in the coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B. Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are binary (logical) images:

Thresh = image > 700

In each of these new images, subsegmental vessels are present in the lung region. In the second step, the following method was used to remove these vessels: each 2D image was considered one by one, and each component in the image was labeled with a connected-component labeling algorithm. Then, based on the size of each labeled piece, items with fewer than 1000 pixels were removed from the image (Figure 3).

Next, the image in Figure 3 was labeled with the connected-component labeling algorithm. The largest component of logical 1s is the patient's body. This largest component was kept, the other parts were removed from the image, and the result was then inverted, so that all 0s became 1s and all 1s became 0s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach row or column 1 or 512 and are logical 1, the parts meeting this condition were removed, leaving the lung and airway as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image was labeled with the connected-component labeling algorithm and the components with fewer than 1000 pixels were identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6(c)).
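The core steps above (thresholding at 700 HU, connected-component labelling, and removal of components under 1000 pixels) can be sketched in Python; the labelling routine below is a simple BFS stand-in for MATLAB's bwlabel:

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labelling via BFS (a stand-in for bwlabel)."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        count += 1
        labels[sy, sx] = count
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

def remove_small(mask, min_pixels=1000):
    """Drop connected components below the thesis's 1000-pixel cutoff."""
    labels, count = label_components(mask)
    keep = np.zeros_like(mask)
    for c in range(1, count + 1):
        component = labels == c
        if component.sum() >= min_pixels:
            keep |= component
    return keep

# Thresholding step from the text, keeping voxels brighter than 700 HU:
# thresh = ct_slice > 700
```

Applying `remove_small` once on the thresholded slice removes the vessel specks; the same routine with the mask inverted isolates the lung and airway regions.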

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing and geometric operations [4].

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1. Pixel Region of an X-ray CT scan and the Adjust Contrast tool.

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a. The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data. An area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b. Area graph of the X-ray CT brain scan.

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a. 3D surface plot of the X-ray CT brain scan generated with histogram values, alpha(0).

Figure 4b. 3D surface plot of the X-ray CT brain scan generated with histogram values, alpha(0.4).

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
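The NumPy analogue of MATLAB's meshgrid behaves the same way, which makes the row/column layout easy to check:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
# X repeats x along the rows; Y repeats y down the columns,
# so a function f(X, Y) of two variables can be evaluated on the grid.
Z = X + Y   # e.g. a simple surface over the grid
```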

Figure 4c. 3D surface plot of the X-ray CT brain scan generated with histogram values, mesh.

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see and can add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d. Surface plot of the X-ray CT brain scan generated with histogram values, lighting.

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6. Contour plot of the X-ray CT brain scan.

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7. Surfc on the X-ray CT brain scan.

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8).

Figure 8. Contour3 on the X-ray CT brain scan.

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse and specular lighting models (Figure 9).

Figure 9. 3D lit surface plot of the X-ray CT brain scan.

The 3-D Ribbon Graph of a matrix displays the matrix by graphing the columns as segmented strips (Figure 10).

Figure 10. The 3-D ribbon graph of the X-ray CT brain scan.

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).

Figure 11. Magnitude and phase response; frequency scale a) linear, b) log.

Figure 12. Group delay response; frequency scale a) linear, b) log.

Figure 13. Phase delay response; frequency scale a) linear, b) log.

Figure 14. (a) Impulse response; (b) pole/zero plot.

Figure 15. Step response: (a) default; (b) specified length 50.

Figure 16. Magnitude response estimate; frequency scale a) linear, b) log.

Figure 17. Magnitude response and round-off noise power spectrum; frequency scale a) linear, b) log.
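In Python, scipy.signal.freqz plays a similar role for a filter given by numerator b and denominator a; the 5-tap moving average used below is an illustrative choice, not a filter from the text:

```python
import numpy as np
from scipy.signal import freqz

# Filter defined by numerator b and denominator a (an FIR moving average).
b = np.ones(5) / 5.0
a = [1.0]

w, h = freqz(b, a, worN=512)          # frequency grid and complex response
magnitude_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))
phase = np.unwrap(np.angle(h))        # unwrapped phase response
```

Plotting `magnitude_db` and `phase` against `w` reproduces the first two views FVTool offers.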

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1. Example of a shape.

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 − x1)² + (y2 − y1)²)      (4.1)

The centroid of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
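Algorithm 1 can be sketched as follows, with shapes as n × 2 arrays. The orthogonal-Procrustes step used to align one shape to another is standard but its details are not spelled out in the text, so it is an assumption here:

```python
import numpy as np

def align(shape, ref):
    """Similarity-align a shape (n x 2) to a reference: translate to the
    origin, then apply the optimal Procrustes rotation and scale."""
    a = shape - shape.mean(axis=0)
    b = ref - ref.mean(axis=0)
    u, s, vt = np.linalg.svd(a.T @ b)
    rot = u @ vt                       # optimal rotation
    scale = s.sum() / (a ** 2).sum()   # optimal scale
    return scale * (a @ rot)

def align_set(shapes, iters=10):
    """Algorithm 1: iteratively align all shapes to the running mean shape."""
    mean = shapes[0] - shapes[0].mean(axis=0)
    mean /= np.linalg.norm(mean)               # step 3: unit-size reference
    for _ in range(iters):
        aligned = [align(s, mean) for s in shapes]   # step 4(a)
        mean = np.mean(aligned, axis=0)              # step 4(b)
        mean -= mean.mean(axis=0)                    # step 4(c):
        mean /= np.linalg.norm(mean)                 # re-constrain the mean
    return aligned, mean
```

A fixed iteration count stands in for the convergence test of step 5.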

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2), and those images were then resized to the same dimensions; this ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks on the images. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations [24]:

x̂ = x̄ + Φb      (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes xi,

Φ is the matrix of eigenvectors of the covariance of the training shapes, and b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks.
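The shape model of Eq. 4.3 can be sketched as a PCA over the aligned shape vectors. The 98% variance cutoff for choosing the number of modes is an assumption, since the text does not specify one:

```python
import numpy as np

def build_shape_model(shapes, var_keep=0.98):
    """PCA shape model behind Eq. 4.3: x_hat = x_bar + Phi b.

    shapes: (m, 2n) array of aligned training shape vectors.
    Returns the mean shape x_bar and the matrix Phi whose columns are the
    leading eigenvectors of the covariance, keeping `var_keep` of the
    variance (an assumed cutoff).
    """
    x_bar = shapes.mean(axis=0)
    cov = np.cov(shapes, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]          # largest eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    cum = np.cumsum(vals) / vals.sum()
    t = int(np.searchsorted(cum, var_keep)) + 1
    return x_bar, vecs[:, :t]

def generate_shape(x_bar, phi, b):
    """Generate a legal shape from the model parameters b (Eq. 4.3)."""
    return x_bar + phi @ b
```

Setting b to zero reproduces the mean shape; varying one entry of b sweeps the corresponding mode of variation.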

Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by f(g) = (g − ḡ)^T Sg^(−1) (g − ḡ).
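A minimal sketch of this profile search step, assuming profiles are NumPy vectors. The function and variable names are illustrative, not from the thesis code; the profile with the smallest Mahalanobis distance f(g) = (g − ḡ)^T Sg^(−1) (g − ḡ) becomes the suggested landmark position:

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S_g):
    """Squared Mahalanobis distance (g - g_mean)^T S_g^-1 (g - g_mean)."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))

# Candidate profiles sampled at different offsets along the whisker.
g_mean = np.array([0.1, 0.5, 0.9])          # learnt mean profile
S_g = np.eye(3)                             # learnt profile covariance
candidates = [np.array([0.9, 0.5, 0.1]),    # poor match
              np.array([0.1, 0.5, 0.8])]    # close match
best = min(range(len(candidates)),
           key=lambda i: mahalanobis_sq(candidates[i], g_mean, S_g))
```

Here the second candidate wins, since it deviates from the mean profile by far less under the learnt covariance.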

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is still consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile matches but the resulting shape could be completely different. So, as mentioned before, the two models constrain each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray); the sizes of the images are given relative to the first image.


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


bones. These techniques are also used in [20] to look at femur fractures specifically. The best parameter values for the features can be found using various techniques.

2.1.3 Classifiers and Pattern Recognition

After the feature extraction stage, the features have to be analyzed and a pattern needs to be recognized. For example, features such as the neck-shaft angle in a femur X-ray image need to be plotted. The patterns can be recognized if the neck-shaft angles of healthy femurs are different from those of fractured femurs. Classifiers like Bayesian classifiers and Support Vector Machines are used to classify features and find the best values for them. For example, [22] used a support vector machine called the Gini-SVM and found the feature values for GO, MRF, and IGD that gave the best performance overall. Clustering and nearest-neighbour approaches can also be used for pattern recognition and classification of images. For example, the gradient vector of a healthy long-bone X-ray may point in a direction that is very different from the gradient vector of a fractured long-bone X-ray. By observing this fact, a bone in an unknown X-ray image can be classified as healthy or fractured using the gradient vector of the image.

2.1.4 Thresholding and Error Classification

Thresholding and error classification is the final stage in the digital image processing system. Thresholding an image is a simple technique and can be done at any stage in the process. It can be used at the start to reduce the noise in the image, or it can be used to separate certain sections of an image that have distinct variations in pixel values. Thresholding is done by comparing the value of each pixel in an image to a threshold. The image can be separated into regions or pixels that are greater or less than the threshold value. Multiple thresholds can be used to achieve thresholding with many levels. Otsu's method [21] is a way of automatically thresholding any image.
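The two techniques can be sketched as follows. This is an illustrative NumPy implementation, not the code used in the thesis: a fixed binary threshold, plus Otsu's criterion of picking the threshold that maximizes the between-class variance of the histogram:

```python
import numpy as np

def threshold(image, t):
    """Binary threshold: pixels above t map to 1, the rest to 0."""
    return (image > t).astype(np.uint8)

def otsu_threshold(image, levels=256):
    """Otsu's method: pick the threshold maximizing between-class variance."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()       # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0          # class means
        mu1 = (np.arange(t, levels) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2        # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# A toy image with a clearly bimodal histogram (dark vs bright pixels).
img = np.array([[10, 12, 200], [11, 210, 205]], dtype=np.uint8)
t = otsu_threshold(img)
binary = threshold(img, t)
```

On a bimodal histogram like this, the automatically chosen threshold lands between the two intensity clusters.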

Thresholding is used at different stages in this thesis; it is a simple and useful tool in image processing, and the following figures show its effects. Thresholding of an image can be done manually by using the histogram of the intensities in the image. It is difficult to threshold noisy images, as the background intensity and the foreground intensity may not be distinctly separate. Figure 2.3 shows an example of an image and its histogram, which has the pixel intensities on the horizontal axis and the number of pixels on the vertical axis.

(a) The original image (b) The histogram of the image

Figure 2.3: Histogram of image [23]

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. A wide variety of techniques exists for improving image quality; contrast stretch, density slicing, edge enhancement, and spatial filtering are the more commonly used ones. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions, and the methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes.

Contrast

Contrast generally refers to the difference in luminance or grey-level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities, which enables the analyst to discriminate easily between areas initially having only a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white; the remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
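The linear stretch described above can be sketched per band as follows; this is an illustrative NumPy version (the function name and toy band values are assumptions), mapping the band's minimum to black and its maximum to white with a linear ramp in between:

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Map the band's min to out_min and its max to out_max, linearly."""
    lo, hi = float(band.min()), float(band.max())
    if hi == lo:                      # flat band: nothing to stretch
        return np.full_like(band, out_min, dtype=np.uint8)
    scaled = (band.astype(float) - lo) / (hi - lo)
    return (out_min + scaled * (out_max - out_min)).round().astype(np.uint8)

# A low-contrast band occupying only the 50..80 range.
band = np.array([[50, 60], [70, 80]], dtype=np.uint8)
stretched = linear_stretch(band)
```

The narrow 50-80 range is expanded to the full 0-255 display range, which is exactly the "pulling apart" of density values described above.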

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed, to enhance values in the brighter part of the histogram, by scaling the input data using an inverse log function. Histogram equalization is another non-linear contrast enhancement technique. In this technique the histogram of the original image is redistributed to produce a uniform population density, which is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
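Both non-linear transforms can be sketched in a few lines; this is an illustrative NumPy version under the assumption of 8-bit data, not a remote-sensing package's actual API. The log stretch expands dark values, and the equalization builds a lookup table from the cumulative histogram:

```python
import numpy as np

def log_stretch(band, levels=256):
    """Logarithmic stretch: expands dark values, compresses bright ones."""
    c = (levels - 1) / np.log(levels)   # scale so levels-1 maps to itself
    return (c * np.log1p(band.astype(float))).round().astype(np.uint8)

def equalize(band, levels=256):
    """Histogram equalization via the cumulative distribution function."""
    hist, _ = np.histogram(band, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / hist.sum()
    lut = ((levels - 1) * cdf).round().astype(np.uint8)
    return lut[band]                     # apply the lookup table per pixel

# A two-level image: equalization spreads the grey levels by the CDF.
dark = np.array([[0, 0], [255, 255]], dtype=np.uint8)
eq = equalize(dark)
```

Note that the equalized output reuses fewer grey levels than the input range offers, matching the grouping of adjacent grey values described above.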

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote-sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask, or kernel (n), is usually 3x3, 5x5, 7x7, or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
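One common way to realize a high-pass filter, shown here as a hedged NumPy sketch rather than the only formulation, is to subtract a low-pass (mean-filtered) version of the image from the original, leaving the rapidly varying component:

```python
import numpy as np

def high_pass(image, k=3):
    """High-pass sketch: original minus a k x k mean-filtered version."""
    pad = k // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    low = np.zeros_like(image, dtype=float)
    for dy in range(k):
        for dx in range(k):
            low += p[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    low /= k * k
    return image - low

flat = np.full((3, 3), 7.0)            # constant region: no high frequencies
edge = np.zeros((3, 3)); edge[:, 2] = 9.0   # vertical step: high frequencies
hp_flat = high_pass(flat)
hp_edge = high_pass(edge)
```

A constant region yields zero everywhere, while the step edge produces a non-zero response, matching the narrow histogram observation above.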

Edge Enhancement in the Spatial Domain

For many remote sensing earth-science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eye sees as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, hence images produced with this operation have a more natural look than many other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
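The cancellation effect can be sketched as follows. The band names and pixel values here are hypothetical: illumination effects scale co-registered bands by roughly the same factor, so their pixel-wise ratio stays (nearly) constant for the same material:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered bands.

    Shadow/slope scale both bands by roughly the same factor, so the
    ratio largely cancels them; eps guards against division by zero.
    """
    return band_a.astype(float) / (band_b.astype(float) + eps)

# Same material, one sunlit pixel and one shadowed pixel (half as bright
# in both bands): raw values differ, but the ratio is nearly identical.
nir = np.array([100.0, 50.0])   # hypothetical near-infrared band
red = np.array([50.0, 25.0])    # hypothetical red band
r = band_ratio(nir, red)
```

Both pixels ratio to about 2.0 despite the 2x illumination difference, which is precisely why ratio images help a classifier ignore shadow and slope.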

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves classification with an accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks down the shape of the femur into a couple of parallel, or roughly parallel, lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features; some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained by the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.

The working mechanisms of the methods discussed above are explained in detail in the following sections.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used: some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no fully automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with these masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
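A minimal NumPy sketch of the Sobel magnitude and direction computation; the helper uses cross-correlation (the usual form for applying gradient masks), and the step-edge test image is an illustrative assumption:

```python
import numpy as np

def filter2d(image, kernel):
    """'Valid' cross-correlation of image with a small kernel."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = (image[y:y + kh, x:x + kw] * kernel).sum()
    return out

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel(image):
    """Gradient magnitude and direction from the Sobel derivatives."""
    dx = filter2d(image, SOBEL_X)
    dy = filter2d(image, SOBEL_Y)
    return np.hypot(dx, dy), np.arctan2(dy, dx)

# A vertical step edge: the gradient points across it, dark to bright.
img = np.array([[0, 0, 9, 9]] * 3, dtype=float)
mag, ang = sobel(img)
```

As the text states, a uniform region gives a zero vector, while along the step edge the response has angle 0 (pointing in +x, from dark to bright).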

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector: it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel's row or column while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt, using the same variables as in the Sobel case; only the kernels used to calculate the directional derivatives differ.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.
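A hedged sketch of the Roberts cross in NumPy (the kernels are the standard pair of diagonal differences; the demo image is illustrative):

```python
import numpy as np

# Roberts cross kernels: differences between diagonally adjacent pixels.
ROBERTS_A = np.array([[1, 0], [0, -1]], dtype=float)
ROBERTS_B = np.array([[0, 1], [-1, 0]], dtype=float)

def roberts(image):
    """Edge magnitude from the two 2x2 Roberts cross responses."""
    h, w = image.shape
    out = np.zeros((h - 1, w - 1))
    for y in range(h - 1):
        for x in range(w - 1):
            win = image[y:y + 2, x:x + 2]
            ga = (win * ROBERTS_A).sum()   # one diagonal difference
            gb = (win * ROBERTS_B).sum()   # the other diagonal
            out[y, x] = np.hypot(ga, gb)   # combined magnitude
    return out

img = np.array([[0.0, 0.0], [0.0, 5.0]])   # one bright diagonal neighbour
edges = roberts(img)
```

Note the output is one row and column smaller than the input, the same border shrinkage discussed for small kernels earlier.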

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds, it is set to 1 only if it is adjacent or diagonally adjacent to a high-value pixel; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
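The double-threshold (hysteresis) step can be sketched as follows; this is an illustrative NumPy version operating on a precomputed gradient-magnitude array, with propagation along chains of weak pixels, not the full Canny pipeline:

```python
import numpy as np

def hysteresis(grad, low, high):
    """Canny's double threshold: strong pixels survive; weak pixels
    survive only if (diagonally) adjacent to a surviving pixel."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.copy()
    changed = True
    while changed:                    # propagate along weak chains
        changed = False
        ys, xs = np.nonzero(weak & ~out)
        for y, x in zip(ys, xs):
            window = out[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
            if window.any():          # touches a surviving pixel
                out[y, x] = True
                changed = True
    return out.astype(np.uint8)

# One strong seed at the top-left; weak pixels chain away from it, and an
# isolated faint pixel would be dropped.
grad = np.array([[0.9, 0.4, 0.1],
                 [0.1, 0.4, 0.1],
                 [0.1, 0.1, 0.4]])
edges = hysteresis(grad, low=0.3, high=0.8)
```

The faint pixels at (0,1), (1,1), and (2,2) survive only because they chain back to the strong seed, which is how Canny keeps faint but connected edges.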

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it. It attempts to quantify the visual or other simple characteristics of the image so that it can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis; range filtering calculates the local range of an image.
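A minimal sketch of range filtering, assuming a 3x3 neighbourhood with edge replication (the exact window size and border handling used in the thesis are not stated, so these are assumptions):

```python
import numpy as np

def range_filter(image, k=3):
    """Local range: max - min over a k x k neighbourhood (edge-replicated).
    Textured regions give large values; flat regions give ~0."""
    pad = k // 2
    p = np.pad(image.astype(float), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            win = p[y:y + k, x:x + k]
            out[y, x] = win.max() - win.min()
    return out

flat = np.zeros((3, 3))                                   # no texture
textured = np.array([[0, 9, 0], [9, 0, 9], [0, 9, 0]], dtype=float)
```

The flat region maps to zero everywhere, while the checkered region gets a uniformly high response, which is the separation cue exploited for bone versus mesh.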

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)        (2)

or

P(x) = min(λ2(x), 0)        (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13]

and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
...
MPn2  MPn3  MPn4  MPn5        (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).
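Eq. 4 reduces to a per-pixel maximum over consecutive scales. A minimal sketch (hypothetical variable names; one octave of six curvature images):

```python
# Sketch: maximum principal curvature over each triplet of consecutive
# scales within one octave (Eq. 4). `P` is a list of six curvature images
# P[0]..P[5] for scales j = 1..6.
import numpy as np

def max_curvature_images(P):
    # MP_j = max(P_{j-1}, P_j, P_{j+1}) for the four interior scales j = 2..5.
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, 5)]
```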

Figure 2(b) shows one of the maximum curvature images MP created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images, we find the stable regions via our watershed algorithm.
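The per-pixel computation behind Eqs. 1-3 can be sketched with Gaussian-derivative Hessians. This is an illustrative reimplementation, not the authors' code; the closed-form 2×2 eigenvalues stand in for the single Jacobi rotation mentioned above:

```python
# Sketch: principal curvature images from Hessian eigenvalues (Eqs. 1-3).
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(img, sigma=1.0):
    img = img.astype(float)
    # Second-order Gaussian derivatives approximate the Hessian entries.
    Ixx = gaussian_filter(img, sigma, order=(0, 2))
    Iyy = gaussian_filter(img, sigma, order=(2, 0))
    Ixy = gaussian_filter(img, sigma, order=(1, 1))
    # Closed-form eigenvalues of the symmetric 2x2 Hessian at each pixel.
    trace_half = (Ixx + Iyy) / 2.0
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam1 = trace_half + disc   # maximum eigenvalue
    lam2 = trace_half - disc   # minimum eigenvalue
    P_dark = np.maximum(lam1, 0.0)   # Eq. 2: dark lines on a light background
    P_light = np.minimum(lam2, 0.0)  # Eq. 3: light lines on a dark background
    return P_dark, P_light
```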

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first

apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would

otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove

perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and

low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the direction of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection

results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
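The cleaning and region-extraction steps of this section can be sketched as follows. This is a simplified stand-in, not the authors' implementation: the eigenvector-support cutoff (0.75) is an assumed value the text leaves open, and a connected-component pass stands in for the full watershed transform.

```python
# Sketch: grayscale closing with a 5x5 disk, eigenvector-flow hysteresis
# thresholding, and an ellipse fit matching the region's second moments.
import numpy as np
from scipy.ndimage import grey_closing, label

def clean_curvature(MP):
    y, x = np.ogrid[-2:3, -2:3]
    disk = (x * x + y * y) <= 4            # 5x5 disk structuring element b
    return grey_closing(MP, footprint=disk)  # f . b = (f dilate b) erode b

def ev_flow_hysteresis(P, ev, high=0.04, support_cut=0.75):
    # `ev`: unit major eigenvectors, shape (H, W, 2).
    support = np.zeros(P.shape)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                # Average |dot product| with the 8 neighbors' eigenvectors.
                shifted = np.roll(np.roll(ev, dy, axis=0), dx, axis=1)
                support += np.abs((ev * shifted).sum(-1)) / 8.0
    # Well-supported pixels get the permissive 0.2*high low threshold,
    # poorly supported ones the stricter 0.7*high.
    low = np.where(support >= support_cut, 0.2 * high, 0.7 * high)
    labels, _ = label(P >= low)
    keep = np.unique(labels[P >= high])    # components holding a strong seed
    return np.isin(labels, keep) & (labels > 0)

def fit_ellipse(ys, xs):
    # PCA ellipse with the same second moments as the region's pixels.
    pts = np.stack([xs, ys], 1).astype(float)
    center = pts.mean(0)
    evals, evecs = np.linalg.eigh(np.cov((pts - center).T))
    minor, major = 2.0 * np.sqrt(np.maximum(evals, 0.0))
    angle = np.arctan2(evecs[1, 1], evecs[0, 1])  # major-axis direction
    return center, major, minor, angle
```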

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
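The scale-stability test can be sketched with a pixel-mask overlap error. The paper computes it as in [19]; rasterized masks are a simplifying assumption here, standing in for the analytic elliptical-region computation:

```python
# Sketch: overlap error between two regions detected at consecutive scales.
import numpy as np

def overlap_error(mask_a, mask_b):
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 1.0
    return 1.0 - inter / union  # 0 = identical regions, 1 = disjoint
```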

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but that challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.
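The clustering step can be sketched with plain k-means on pixel colors. This is an illustrative reduction: the complete-linkage refinement of De-pict is omitted, and the deterministic center initialization is an assumption of the sketch:

```python
# Sketch: chromatic clustering of pixels into k layers via k-means.
import numpy as np

def kmeans_colors(img, k, iters=20):
    pix = img.reshape(-1, img.shape[-1]).astype(float)
    # Deterministic initialization: k evenly spaced pixels as centers.
    centers = pix[np.linspace(0, len(pix) - 1, k).astype(int)].copy()
    labels = np.zeros(len(pix), dtype=int)
    for _ in range(iters):
        d = ((pix[:, None] - centers[None]) ** 2).sum(-1)  # (N, k) distances
        labels = d.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = pix[labels == c].mean(axis=0)
    return labels.reshape(img.shape[:2]), centers
```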

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial-coherence regularity in an iterative E-M manner.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

minL Σp ||fp − cLp||² + λ Σ{p,q}∈N |epq| T[Lp ≠ Lq]        (1)

where Lp ∈ {1, ..., k} is the cluster label of pixel p, fp is the color feature of pixel p, ci is the color model for cluster i, |epq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or a predefined number of iterations is reached.
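The E-M loop can be sketched as below, with the graph-cut E-step of ref. 12 replaced by a simpler ICM-style local minimization of Eq. (1); this is a stand-in to show the loop structure, not the actual solver:

```python
# Sketch: iterative E-M layer segmentation with a spatial-coherence
# penalty (unit edge lengths assumed; ICM instead of graph cut).
import numpy as np

def em_segment(feats, k, lam=1.0, iters=5):
    feats = np.asarray(feats, float)
    H, W, d = feats.shape
    flat = feats.reshape(-1, d)
    # Deterministic initialization: k evenly spaced pixels as centers.
    centers = flat[np.linspace(0, len(flat) - 1, k).astype(int)].copy()
    labels = ((flat[:, None] - centers[None]) ** 2).sum(-1).argmin(1).reshape(H, W)
    for _ in range(iters):
        # E-step (ICM): data term plus penalty for disagreeing 4-neighbors.
        data = ((feats[:, :, None] - centers[None, None]) ** 2).sum(-1)  # (H,W,k)
        for y in range(H):
            for x in range(W):
                cost = data[y, x].copy()
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        cost += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = cost.argmin()
        # M-step: re-estimate each cluster's mean chromatic vector.
        for c in range(k):
            if (labels == c).any():
                centers[c] = feats[labels == c].mean(axis=0)
    return labels, centers
```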

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
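The per-channel independence can be illustrated with a much simpler harmonic (Laplace) fill in place of the curvature linear program. This is a stand-in only; it does not reproduce Schoenemann et al.'s LP formulation, but shows each chromatic channel being inpainted separately under the same boundary condition:

```python
# Sketch: channel-independent inpainting; each channel's hole is filled
# by Jacobi iterations of the Laplace equation (a simplified stand-in).
import numpy as np

def inpaint_per_channel(img, mask, iters=500):
    # img: (H, W, 3) float image; mask: True where pixels are missing.
    out = img.copy()
    for c in range(img.shape[2]):
        ch = out[:, :, c]
        for _ in range(iters):
            # Average of the 4 neighbors; update only the hole pixels.
            avg = (np.roll(ch, 1, 0) + np.roll(ch, -1, 0)
                   + np.roll(ch, 1, 1) + np.roll(ch, -1, 1)) / 4.0
            ch[mask] = avg[mask]
    return out
```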

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512×512 resolution.

B. Method

The stages followed in performing lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand number 250, in 2D. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are of logical (binary) values:

Thresh = image > 700

In each of these new images, subsegmental vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: firstly, each of the 2D images has been considered one by one, and each of the components in the image has been labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).

Next, the image in Figure 3 has been labeled with the connected-component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept and the other parts have been removed from the image. Then its complement has been taken, so all "0"s turn into "1" and all "1"s turn into "0" (Figure 4).

As the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel (the image border), the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung size, each image has been labeled with the connected-component labeling algorithm, the components whose pixel counts are below 1000 have been identified as airways, and these have been removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
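The steps above can be sketched as follows (hypothetical helper names; scipy stands in for the MATLAB toolbox calls, and a border-touching test replaces the 1st/512th-pixel check):

```python
# Sketch: threshold at 700, drop components under 1000 pixels (vessels),
# keep the largest component (body), invert, remove border-touching air,
# and drop small airway components.
import numpy as np
from scipy.ndimage import label

def drop_small(mask, min_px):
    labels, n = label(mask)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_px
    keep[0] = False                     # background is never kept
    return keep[labels]

def segment_lung(slice_img):
    body = drop_small(slice_img > 700, 1000)     # threshold + remove vessels
    labels, n = label(body)
    sizes = np.bincount(labels.ravel()); sizes[0] = 0
    body = labels == sizes.argmax()              # largest component = body
    inv = ~body                                  # complement: lungs + outside air
    labels, n = label(inv)
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    lung = inv & ~np.isin(labels, border)        # drop border-touching air
    return drop_small(lung, 1000)                # remove small airways
```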

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4].

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram that represents the dynamic range of the X-ray CT image (Figure 1).

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image-analysis tasks, including edge detection and image segmentation algorithms, image transformations, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves analogously to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface plot of X-ray CT brain scan generated with histogram values, with lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool, we can display the phase response, group-delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the 1st shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
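Algorithm 1 can be sketched as follows (hypothetical helper names; a least-squares similarity fit handles the scale and rotation in step (a)):

```python
# Sketch: iteratively align a set of (n, 2) shapes to their evolving mean.
import numpy as np

def align_to(shape, ref):
    # Least-squares similarity transform (scale + rotation) of `shape`
    # onto `ref`; both shapes are assumed centered on the origin.
    a = (shape * ref).sum() / (shape ** 2).sum()
    b = (shape[:, 0] * ref[:, 1] - shape[:, 1] * ref[:, 0]).sum() / (shape ** 2).sum()
    R = np.array([[a, -b], [b, a]])
    return shape @ R.T

def align_shapes(shapes, iters=10):
    shapes = [s - s.mean(axis=0) for s in shapes]       # center on origin
    x0 = shapes[0] / np.linalg.norm(shapes[0])          # unit-size reference
    mean = x0
    for _ in range(iters):
        shapes = [align_to(s, mean) for s in shapes]    # (a) align to mean
        mean = np.mean(shapes, axis=0)                  # (b) recompute mean
        mean = align_to(mean, x0)                       # (c) constrain mean
        mean /= np.linalg.norm(mean)
    return shapes, mean
```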

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb        (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes xi), Φ is the matrix of eigenvectors of the shape covariance, and b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
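The per-landmark profile statistics, and the Mahalanobis score used during the image search, can be sketched as follows (assumed array shapes):

```python
# Sketch: build a landmark's profile model (mean profile and covariance)
# from training profiles, and score a test profile with the Mahalanobis
# distance. `profiles` is an (m, p) array: m training profiles of length p.
import numpy as np

def profile_model(profiles):
    g_bar = profiles.mean(axis=0)            # mean profile
    S_g = np.cov(profiles, rowvar=False)     # profile covariance matrix
    return g_bar, S_g

def mahalanobis(g, g_bar, S_g):
    d = g - g_bar
    return float(d @ np.linalg.solve(S_g, d))
```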

423 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset by up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
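The profile search above can be sketched in a few lines of numpy. The profiles and covariance here are toy values for illustration; the thesis builds them from the training images.

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """Squared Mahalanobis distance (g - g_mean)^T S_g^{-1} (g - g_mean)."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))

# Pick the candidate profile along the whisker with the lowest distance.
g_mean = np.array([0.2, 0.5, 0.2])       # learnt mean profile (toy)
S_g = np.eye(3) * 0.01                   # learnt covariance (toy)
candidates = [np.array([0.9, 0.1, 0.3]), # offset positions along the whisker
              np.array([0.21, 0.52, 0.19])]
best = min(range(len(candidates)),
           key=lambda i: mahalanobis(candidates[i], g_mean, S_g))
```

Using `np.linalg.solve` rather than explicitly inverting S_g is the usual numerically safer choice.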

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. (A general picture, and not a bone X-ray, is shown.)
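An image pyramid of the kind described above can be built by repeatedly smoothing and halving the image. This is a sketch only; a cheap 3x3 box blur stands in for the Gaussian smoothing an ASM implementation would normally use.

```python
import numpy as np

def box_blur(img):
    """Cheap 3x3 box smoothing (a stand-in for Gaussian smoothing)."""
    p = np.pad(img, 1, mode='edge')
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return out / 9.0

def image_pyramid(img, levels):
    """Each level is the previous level smoothed and halved in size."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(box_blur(pyramid[-1])[::2, ::2])
    return pyramid

pyr = image_pyramid(np.random.rand(64, 64), 3)  # 64x64, 32x32, 16x16
```

The search starts at the coarsest level and the result is propagated down to finer levels, which is what lets the model "lock on" from further away.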


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone. It is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is a poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


Figure 2.3: Histogram of an image [23]. (a) The original image; (b) the histogram of the image.

IMAGE ENHANCEMENT TECHNIQUES

Image enhancement techniques improve the quality of an image as perceived by a human. These techniques are useful because many satellite images, when examined on a colour display, give inadequate information for image interpretation. There is no conscious effort to improve the fidelity of the image with regard to some ideal form of the image. There exists a wide variety of techniques for improving image quality. Contrast stretch, density slicing, edge enhancement and spatial filtering are the more commonly used techniques. Image enhancement is attempted after the image is corrected for geometric and radiometric distortions. Image enhancement methods are applied separately to each band of a multispectral image. Digital techniques have been found to be more satisfactory than photographic techniques for image enhancement because of the precision and wide variety of digital processes.

Contrast

Contrast generally refers to the difference in luminance or grey level values in an image and is an important characteristic. It can be defined as the ratio of the maximum intensity to the minimum intensity over an image. The contrast ratio has a strong bearing on the resolving power and detectability of an image: the larger this ratio, the easier it is to interpret the image. Satellite images often lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extreme black, and a value at the high end is assigned to extreme white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
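The linear stretch above can be sketched in numpy. This is a minimal illustration that stretches the observed minimum and maximum to the full display range; production code would typically clip a percentile range instead of the absolute extremes.

```python
import numpy as np

def linear_stretch(img, out_min=0.0, out_max=255.0):
    """Map the input grey-level range linearly onto [out_min, out_max]."""
    lo, hi = img.min(), img.max()
    return (img - lo) * (out_max - out_min) / (hi - lo) + out_min

# A low-contrast image occupying only [50, 200] is stretched to [0, 255].
stretched = linear_stretch(np.array([[50, 100], [150, 200]], dtype=float))
```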

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function.

Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values. Thus the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
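Histogram equalization as described above can be sketched via the cumulative histogram, which maps each grey level to its cumulative frequency share of the full range:

```python
import numpy as np

def hist_equalize(img, levels=256):
    """Redistribute grey levels via the cumulative histogram (CDF)."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                         # normalise to [0, 1]
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[img]                             # apply as a lookup table

# A low-contrast image (values crowded into [100, 140)) gets spread out.
img = np.random.randint(100, 140, size=(32, 32), dtype=np.uint8)
eq = hist_equalize(img)
```

Because the mapping rounds several input levels to the same output level, the equalized image has fewer distinct grey levels than the original, as the text notes.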

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low pass filters, band pass filters and high pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding it, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3x3, 5x5, 7x7 or 9x9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3x3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low pass filters are mean, median and mode filters.
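The mean filter with edge extension (option 1 above) can be sketched directly. Edge-padding keeps the output the same size as the input instead of losing two rows and columns:

```python
import numpy as np

def mean_filter(img, k=3):
    """Replace each pixel by the mean of its k x k neighbourhood.
    The image is edge-padded so the output keeps the input size."""
    r = k // 2
    p = np.pad(img.astype(float), r, mode='edge')  # repeat border pixels
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += p[r + dy:r + dy + h, r + dx:r + dx + w]
    return out / (k * k)

# A single bright pixel is spread over its 3x3 neighbourhood.
smoothed = mean_filter(np.array([[0, 0, 0], [0, 9, 0], [0, 0, 0]]))
```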

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window, so the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical and diagonal directions.

The Laplacian operator generally highlights points, lines and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, so the use of this operation produces a more natural look than many of the other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
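The illumination-cancelling property of band ratios can be shown in a tiny numpy sketch. The band values below are invented for illustration; the point is that a factor scaling both bands equally (e.g. topographic shading) divides out.

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Per-pixel ratio of two co-registered bands; eps avoids divide-by-zero."""
    return band_a / (band_b + eps)

sunlit = np.array([[80.0, 120.0]])   # one band, full illumination (toy values)
nir    = np.array([[40.0, 30.0]])    # a second band, same pixels (toy values)

# Halving the illumination scales both bands, so the ratio barely changes:
r_sunlit = band_ratio(sunlit, nir)
r_shaded = band_ratio(sunlit * 0.5, nir * 0.5)
```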

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods, discussed in this thesis, that researchers have used to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs using the aspects of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in the sections that follow.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time and error are studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels or sets of pixels that form an edge are generally of the same, or close to the same, intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values.

Mathematically, Sobel edge detection is implemented using two 3x3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
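The steps above can be sketched in numpy with the standard Sobel kernels. Zero padding is one of several border-handling choices; it means the outermost rows and columns pick up spurious responses, which is why only interior pixels are meaningful.

```python
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # d/dx kernel
KY = KX.T                                                         # d/dy kernel

def convolve2d(img, kernel):
    """Direct 2D convolution with zero padding (output same size as input)."""
    k = kernel[::-1, ::-1]          # flip the kernel for true convolution
    p = np.pad(img.astype(float), 1)
    out = np.zeros(img.shape)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = np.sum(p[y:y + 3, x:x + 3] * k)
    return out

def sobel(img):
    dx = convolve2d(img, KX)
    dy = convolve2d(img, KY)
    magnitude = np.hypot(dx, dy)    # Eq. 3.1: sqrt(Dx^2 + Dy^2)
    direction = np.arctan2(dy, dx)  # Eq. 3.2: atan2(Dy, Dx)
    return magnitude, direction

# A vertical step edge: the magnitude peaks along the boundary column and
# is zero in the constant regions on either side.
img = np.zeros((5, 6)); img[:, 3:] = 1.0
mag, ang = sobel(img)
```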

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple, easy to implement and faster than other methods. The implementation is done by convolving the input image with 2x2 kernels.

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detecting technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
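The two-level (hysteresis) thresholding rule described above can be sketched on its own. This is only the final stage of Canny, applied to an already-computed gradient magnitude; a single pass is shown, whereas full implementations propagate weak-to-strong connectivity until stable.

```python
import numpy as np

def hysteresis_threshold(grad, low, high):
    """>= high -> 1 (strong); < low -> 0; in between -> 1 only if the pixel
    is adjacent or diagonally adjacent to a strong pixel (single pass)."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.astype(int)
    h, w = grad.shape
    for y, x in zip(*np.nonzero(weak)):
        y0, y1 = max(0, y - 1), min(h, y + 2)
        x0, x1 = max(0, x - 1), min(w, x + 2)
        if strong[y0:y1, x0:x1].any():   # 8-connected neighbourhood
            out[y, x] = 1
    return out

grad = np.array([[0.1, 0.4, 0.9],
                 [0.1, 0.1, 0.4],
                 [0.1, 0.1, 0.1]])
# The two 0.4 pixels survive only because they touch the 0.9 strong pixel.
edges = hysteresis_threshold(grad, low=0.3, high=0.8)
```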

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
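Range filtering as described can be sketched as a local max-minus-min over a sliding window. The window size and edge-padding choice here are illustrative assumptions, not the thesis' exact settings.

```python
import numpy as np

def range_filter(img, k=3):
    """Local range (max - min) over a k x k neighbourhood, edge-padded.
    Textured regions give large values; flat regions give values near 0."""
    r = k // 2
    p = np.pad(img.astype(float), r, mode='edge')
    out = np.zeros(img.shape)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            win = p[y:y + k, x:x + k]
            out[y, x] = win.max() - win.min()
    return out

flat = np.full((4, 4), 7.0)               # no texture -> range 0 everywhere
noisy = np.arange(16.0).reshape(4, 4)     # varying values -> large local range
```

Standard deviation filtering is the same idea with `win.std()` in place of the max-minus-min.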

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |     (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys and cliffs can be detected over a range of viewpoints, scales and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2x2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x.

Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale.

Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

    MP12  MP13  MP14  MP15
    MP22  MP23  MP24  MP25
    ...
    MPn2  MPn3  MPn4  MPn5    (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
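The construction above can be sketched in a few lines of Python. This is a simplified illustration with our own function and parameter names: it computes each scale's Hessian directly from Gaussian-derivative filters (rather than incrementally, as the text describes) and builds the maximum-curvature images of Eq. 4 for a single octave.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(image, sigma):
    """Maximum eigenvalue of the Hessian at each pixel, clamped at 0 (Eq. 2)."""
    Ixx = gaussian_filter(image, sigma, order=(2, 0))
    Iyy = gaussian_filter(image, sigma, order=(0, 2))
    Ixy = gaussian_filter(image, sigma, order=(1, 1))
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[Ixx, Ixy], [Ixy, Iyy]].
    trace_half = (Ixx + Iyy) / 2.0
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lam1 = trace_half + disc                  # maximum eigenvalue
    return np.maximum(lam1, 0.0)              # Eq. 2: dark lines on light background

def max_curvature_images(image, k=2 ** (1 / 3), n_scales=6):
    """Pij for one octave, then MPij = max over consecutive triplets (Eq. 4)."""
    P = [principal_curvature(image, k ** (j - 1)) for j in range(1, n_scales + 1)]
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1])
            for j in range(1, n_scales - 1)]
```

Six scales per octave yield the four maximum-curvature images listed in Eq. 4; repeating the process on downsampled images produces the remaining octaves.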

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue

magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the eight adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
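The cleaning-and-segmentation pipeline can be sketched as follows. This is a reduced illustration, not the full method: it uses a square structuring element in place of a disk and a plain hysteresis threshold with a fixed low-to-high ratio in place of the eigenvector-flow rule, and it takes the catchment basins directly as the connected components of the non-ridge pixels.

```python
import numpy as np
from scipy import ndimage as ndi

def clean_and_binarize(mp, high=0.04, ratio=0.2, close_size=5):
    """Grayscale closing, then (plain) hysteresis thresholding of MP."""
    footprint = np.ones((close_size, close_size), bool)   # square stand-in for a disk
    closed = ndi.grey_closing(mp, footprint=footprint)    # f . b = (f dilate b) erode b
    strong = closed >= high                               # seed pixels
    weak = closed >= high * ratio
    # Keep weak components only if they contain at least one strong (seed) pixel.
    labels, _ = ndi.label(weak)
    keep = np.unique(labels[strong])
    return np.isin(labels, keep) & weak

def watershed_regions(binary_ridges):
    """Catchment basins of a binary ridge image: components of non-ridge pixels."""
    basins, n = ndi.label(~binary_ridges)
    return basins, n
```

Each resulting basin would then be fit with an ellipse of matching second moments, as described above.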

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
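The scale-stability test can be sketched as below. Note the hedges: [19] computes overlap error on fitted ellipses, whereas this sketch rasterizes regions to boolean masks, and the tolerance of 0.3 is an illustrative value of ours, not a figure from the text.

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - |A intersect B| / |A union B| for two rasterized region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_across_scales(masks_per_scale, max_error=0.3):
    """Keep a region only if its overlap error stays below max_error
    between every pair of consecutive scales in the triplet."""
    return all(overlap_error(a, b) <= max_error
               for a, b in zip(masks_per_scale, masks_per_scale[1:]))
```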

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes, corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
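The chromatic clustering step can be sketched with plain Lloyd's k-means over per-pixel feature vectors. This is only the k-means half of the step (De-pict additionally applies complete-linkage clustering), and the function name and the optional `init_idx` parameter are ours.

```python
import numpy as np

def kmeans_layers(pixels, k, iters=20, seed=0, init_idx=None):
    """Cluster chromatic feature vectors (n x d float array) into k layers.
    init_idx optionally fixes the initial center indices for reproducibility."""
    rng = np.random.default_rng(seed)
    idx = init_idx if init_idx is not None else rng.choice(len(pixels), k, replace=False)
    centers = pixels[idx].astype(float)
    for _ in range(iters):
        # Assignment step: each pixel goes to the nearest chromatic center.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: recompute the mean chromatic vector of each cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers
```

Each returned label then plays the role of a candidate layer, described by its mean chromatic vector.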

3.1 Spatially coherent segmentation

We improve the layer segmentation by combining k-means with a spatial-coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial-coherence priors by minimizing the following energy function (E-step):

    min_L  Σ_p ||fp − c_{Lp}||²  +  λ Σ_{(p,q)∈N} |epq| · T[Lp ≠ Lq]    (1)

where Lp ∈ {1, …, k} is the cluster label of pixel p, fp is the color feature of pixel p, ci is the color model for cluster i, |epq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or a predefined number of iterations is reached.
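The E-M loop can be sketched as follows, with two loudly flagged simplifications: the exact E-step uses graph cuts, whereas here a single ICM (iterated conditional modes) sweep stands in for it, and the edge lengths |epq| are taken as 1 on a 4-neighborhood. The function and parameter names are ours.

```python
import numpy as np

def em_spatial_segment(image, centers, lam=1.0, iters=5):
    """Approximate E-M refinement of Eq. (1) on an (h, w, d) color image.
    centers: (k, d) float array of mean chromatic vectors, updated in place."""
    h, w, _ = image.shape
    k = len(centers)
    labels = np.linalg.norm(image[:, :, None, :] - centers[None, None, :, :],
                            axis=3).argmin(axis=2)
    for _ in range(iters):
        # Approximate E-step: one ICM sweep minimizing data + smoothness cost.
        for y in range(h):
            for x in range(w):
                cost = np.linalg.norm(image[y, x] - centers, axis=1) ** 2
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        # Penalize every label that disagrees with this neighbor.
                        cost += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = cost.argmin()
        # M-step: re-estimate each cluster's mean chromatic vector.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = image[labels == j].mean(axis=0)
    return labels, centers
```

With the smoothness weight lam > 0, isolated noisy pixels are absorbed into their surrounding region, which is the intended effect of the second term of Eq. (1).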

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line-segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
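The discrete curvature measure at the heart of the formulation (sum of angle changes at the vertices of a level line, weighted by edge length) can be illustrated on a fixed polyline. This sketch only evaluates the measure for a given curve; Schoenemann et al. embed such per-vertex terms as coefficients inside a linear program rather than evaluating a fixed curve, and the squared-angle weighting here is one common choice, not necessarily theirs.

```python
import numpy as np

def discrete_curvature(vertices):
    """Total discrete curvature of a polyline: the turning angle at each
    interior vertex, squared and weighted by the mean incident edge length."""
    v = np.asarray(vertices, float)
    total = 0.0
    for i in range(1, len(v) - 1):
        e1, e2 = v[i] - v[i - 1], v[i + 1] - v[i]
        l1, l2 = np.linalg.norm(e1), np.linalg.norm(e2)
        cosang = np.clip(np.dot(e1, e2) / (l1 * l2), -1.0, 1.0)
        angle = np.arccos(cosang)                  # turning angle at vertex i
        total += angle ** 2 / (0.5 * (l1 + l2))    # length-weighted penalty
    return total
```

A straight polyline incurs zero cost, while sharp turns are penalized heavily, which is what drives the reconstruction toward smooth level lines.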

II MATERIALS AND METHODS

A Data Retrieval

In this study, data were collected from Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT scanner (Somatom Sensation 16; Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is set to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec, delivered with an automated syringe (Optistat Contrast Delivery System; Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, and pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard; Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.

B Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1. The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to retain the parts brighter than 700 HU. At the end of thresholding, the new images are binary (logical):

Thresh = image > 700

In each of these new images, sub-segment vessels exist in the lung region. In the second step, the following method is used to remove these vessels: first, each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled again with the connected-component labeling algorithm. The biggest component of logical value 1 is the patient's body. This biggest component is kept and the other parts are removed from the image; the complement is then taken, so all "0" pixels turn into "1" and all "1" pixels turn into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel (the image border), the components satisfying this condition are removed, and the lungs and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected-component labeling algorithm and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and overlaid on the original image, so the boundaries of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
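The staged procedure above (threshold, connected-component labeling, size filtering, taking the largest component, inversion, border clearing, airway removal) can be sketched with scipy.ndimage. The helper names are ours, and the text's parameters (threshold 700, 1000-pixel cutoff) are passed as arguments so the synthetic test below can use smaller values.

```python
import numpy as np
from scipy import ndimage as ndi

def remove_small(mask, min_size):
    """Drop connected components with fewer than min_size pixels."""
    labels, _ = ndi.label(mask)
    sizes = np.bincount(labels.ravel())
    keep = np.flatnonzero(sizes >= min_size)
    keep = keep[keep != 0]                     # label 0 is background
    return np.isin(labels, keep)

def largest_component(mask):
    """Keep only the biggest connected component (the patient's body)."""
    labels, n = ndi.label(mask)
    if n == 0:
        return np.zeros_like(mask)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0
    return labels == sizes.argmax()

def clear_border(mask):
    """Remove components touching the image border (air outside the body)."""
    labels, _ = ndi.label(mask)
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    return mask & ~np.isin(labels, border)

def segment_lungs(slice_hu, thresh=700, min_size=1000):
    binary = slice_hu > thresh                 # Thresh = image > 700
    binary = remove_small(binary, min_size)    # drop small vessels in the lung
    body = largest_component(binary)           # biggest blob = body
    inv = ~body                                # complement: lungs + outside air
    inv = clear_border(inv)                    # drop air outside the body
    return remove_small(inv, min_size)         # drop small airways
```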

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image-processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image-processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions, implemented in the open MATLAB language, can also be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image-analysis tasks, including edge-detection and image-segmentation algorithms, image transformations, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The histogram of an X-ray CT image and the plot fits (2 significant digits). A cubic fitting function is the best-fit model for the histogram data; the fitted curve is plotted as a magenta line through the data.

Figure 3b - Area graph of an X-ray CT brain scan, which displays the elements in a variable as one or more curves and fills the area beneath each curve

The 3-D surface plot renders a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3-D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3-D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or by two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
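The same idiom exists outside MATLAB; as a small illustration, NumPy's meshgrid (with its default 'xy' indexing) expands two vectors into coordinate matrices the same way, so a function of two variables can be evaluated over the whole grid at once:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])
y = np.array([10.0, 20.0])
X, Y = np.meshgrid(x, y)   # X: rows are copies of x; Y: columns are copies of y
Z = X ** 2 + Y             # f(x, y) evaluated on the whole grid at once
```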

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

The 3-D surface plot with contour (the surfc function) displays a matrix as a surface with a contour plot below it. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface plot of an X-ray CT brain scan generated with histogram values, with lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group-delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

    d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root-mean-square distance between the points and the centroid. This can be used to estimate the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

   (a) Align all shapes to the mean shape.

   (b) Recalculate the mean shape from the aligned shapes.

   (c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
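The translation and scaling steps of Algorithm 1 can be sketched as below. This is a deliberately reduced, rotation-free version with our own function names: a full implementation would also rotate each shape onto the running mean (Procrustes alignment) and iterate steps 4(a)-(c) until convergence.

```python
import numpy as np

def center_and_scale(shape_pts):
    """Translate a shape (n x 2 array) to the origin and scale it to unit
    size, where size = RMS distance of the points from the centroid."""
    centered = shape_pts - shape_pts.mean(axis=0)
    size = np.sqrt((centered ** 2).sum(axis=1).mean())
    return centered / size

def align_shapes(shapes):
    """Rotation-free sketch of Algorithm 1: normalize every shape, then
    take the normalized average as the mean shape."""
    aligned = [center_and_scale(s) for s in shapes]
    mean = center_and_scale(np.mean(aligned, axis=0))
    return aligned, mean
```

After alignment, every shape is centered on the origin with unit size, so two shapes that differ only by translation and scale become identical.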

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, introducing a constraint on the shape. As the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations [24]:

    x̂ = x̄ + Φb    (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes xi), Φ is the matrix of eigenvectors of the covariance of the training shapes, and b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
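Given ḡ and Sg for a landmark, the profile-matching cost used in the search is a Mahalanobis distance. A minimal sketch, assuming the profiles are passed as plain arrays:

```python
import numpy as np

def mahalanobis_profile_distance(g, g_mean, S_g):
    """Squared Mahalanobis distance (g - gbar)^T Sg^-1 (g - gbar) between a
    sampled profile g and the landmark's mean profile gbar."""
    diff = np.asarray(g, float) - np.asarray(g_mean, float)
    # Solve Sg x = diff instead of explicitly inverting the covariance.
    return float(diff @ np.linalg.solve(S_g, diff))
```

Directions in which the training profiles vary little (small variance in Sg) are penalized more heavily, which is exactly why this distance is preferred over a plain Euclidean one here.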

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

    d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile matches, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone, is shown).


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters on which it depends. The number of landmark points and the number of training images are investigated in this thesis. The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, alongside the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


of an image. The larger this ratio, the easier it is to interpret the image. Satellite images lack adequate contrast and require contrast improvement.

Contrast Enhancement

Contrast enhancement techniques expand the range of brightness values in an image so that the image can be efficiently displayed in a manner desired by the analyst. The density values in a scene are literally pulled farther apart, that is, expanded over a greater range. The effect is to increase the visual contrast between two areas of different uniform densities. This enables the analyst to discriminate easily between areas initially having a small difference in density.

Linear Contrast Stretch

This is the simplest contrast stretch algorithm. The grey values in the original image and the modified image follow a linear relation in this algorithm. A density number in the low range of the original histogram is assigned to extremely black, and a value at the high end is assigned to extremely white. The remaining pixel values are distributed linearly between these extremes. Features or details that were obscure in the original image will be clear in the contrast-stretched image. The linear contrast stretch operation can be represented graphically as shown in Fig. 4. To provide optimal contrast and colour variation in colour composites, the small range of grey values in each band is stretched to the full brightness range of the output or display unit.
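The stretch described above can be sketched in a few lines of Python. This is an illustrative implementation, not code from the document; the 0..255 output range and the rounding are assumptions:

```python
def linear_stretch(img, out_min=0, out_max=255):
    """Map the image's min..max density range linearly onto out_min..out_max."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:                      # flat image: nothing to stretch
        return [[out_min for _ in row] for row in img]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row] for row in img]

# A low-contrast image occupying only the 100..140 density range
img = [[100, 110, 120], [130, 140, 125]]
stretched = linear_stretch(img)       # now spans the full 0..255 range
```

The lowest input density (100) maps to extreme black and the highest (140) to extreme white, with the remaining values distributed linearly in between, exactly as the text describes.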

Non-Linear Contrast Enhancement

In these methods the input and output data values follow a non-linear transformation. The general form of non-linear contrast enhancement is defined by y = f(x), where x is the input data value and y is the output data value. Non-linear contrast enhancement techniques have been found useful for enhancing the colour contrast between nearby classes and the subclasses of a main class.

One type of non-linear contrast stretch involves scaling the input data logarithmically. This enhancement has the greatest impact on the brightness values found in the darker part of the histogram. It can be reversed to enhance values in the brighter part of the histogram by scaling the input data using an inverse log function.

Histogram equalization is another non-linear contrast enhancement technique. In this technique, the histogram of the original image is redistributed to produce a uniform population density. This is obtained by grouping certain adjacent grey values; thus, the number of grey levels in the enhanced image is less than the number of grey levels in the original image.
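A hedged sketch of histogram equalization, using one common CDF-based formulation (the document does not give a formula, and the degenerate all-one-level case is ignored here for brevity):

```python
def equalize(img, levels=256):
    """Redistribute grey levels so the cumulative histogram becomes
    approximately linear (a standard CDF-based equalization)."""
    flat = [v for row in img for v in row]
    n = len(flat)
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, run = [], 0
    for count in hist:                       # cumulative distribution
        run += count
        cdf.append(run)
    cdf_min = next(c for c in cdf if c > 0)  # first occupied bin
    return [[round((cdf[v] - cdf_min) / (n - cdf_min) * (levels - 1))
             for v in row] for row in img]

# Grey level 1 holds half the pixels, so it is pushed up the grey scale
out = equalize([[0, 1], [1, 3]], levels=4)
```

Note that several input levels can map to the same output level, which is why the enhanced image typically has fewer distinct grey levels than the original, as the text states.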

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value across a given area in an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3×3, 5×5, 7×7, or 9×9. The simple smoothing operation will, however, blur the image, especially at the edges of objects, and blurring becomes more severe as the size of the kernel increases. Using a 3×3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are the mean, median, and mode filters.
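The 3×3 mean filter with border extension (option 1 above) can be sketched as follows; this is an illustration, not code from the document, and the clamping of indices implements the "repeat the border pixel" idea:

```python
def mean_filter(img):
    """3x3 mean (low-pass) filter; borders handled by replicating edge pixels."""
    h, w = len(img), len(img[0])

    def px(r, c):
        # clamp indices: artificially extends the image beyond its border
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    return [[sum(px(r + dr, c + dc)
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 9.0
             for c in range(w)] for r in range(h)]

# A single bright pixel is smeared across its neighbourhood
img = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
smoothed = mean_filter(img)
```

Because the border is extended, the output has the same dimensions as the input, avoiding the two-lines/two-columns shrinkage mentioned in the text.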

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated within a nine-element window; thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or non-linear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way, so images produced with this operation have a more natural look than many other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information, not available in any single band, that is useful for discriminating between soils and vegetation.
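Band ratioing can be sketched as a pixel-wise division of two co-registered bands. The sketch below is illustrative; the "nir"/"red" band names and the sample values are hypothetical, chosen to show how a shadow (a roughly multiplicative dimming of both bands) largely cancels in the ratio:

```python
def band_ratio(band_a, band_b, eps=1e-6):
    """Pixel-wise ratio of two co-registered bands; eps guards divide-by-zero."""
    return [[a / (b + eps) for a, b in zip(ra, rb)]
            for ra, rb in zip(band_a, band_b)]

# Same material, sunlit (left pixel) vs shadowed (right pixel):
# both bands are dimmed by the same factor, so the ratio is nearly unchanged.
nir = [[80.0, 40.0]]
red = [[40.0, 20.0]]
ratio = band_ratio(nir, red)
```

Both pixels yield a ratio close to 2, even though their absolute brightness values differ by a factor of two, which is the illumination-suppressing property the text describes.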

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs in order to separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in Section 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or a part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in Section 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with grey-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs with respect to the definition of the shape and the grey-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and grey-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.

The working mechanisms of the methods discussed above are explained in detail in the following sections.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays; if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or sets of pixels, that form an edge are generally of the same or nearly the same intensity. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be seen as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge, from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
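The Sobel computation described above can be sketched as follows. This is an illustration, not the thesis's code; the kernels are applied as cross-correlation (a true convolution would flip them, changing only the sign of the derivatives), and only the image interior is processed:

```python
import math

KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-derivative kernel (Dx)
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-derivative kernel (Dy)

def apply_at(img, ker, r, c):
    """Apply a 3x3 kernel centred on interior pixel (r, c)."""
    return sum(ker[i][j] * img[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def sobel(img):
    """Return gradient magnitude and direction for the image interior."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            dx = apply_at(img, KX, r, c)
            dy = apply_at(img, KY, r, c)
            mag[r][c] = math.hypot(dx, dy)   # |grad D| = sqrt(Dx^2 + Dy^2)
            ang[r][c] = math.atan2(dy, dx)   # gradient direction
    return mag, ang

# Vertical step edge: the gradient points from the dark side to the bright side
mag, ang = sobel([[0, 0, 10, 10] for _ in range(4)])
```

On the uniform halves of the test image the operator returns a zero vector, while on the step it returns a strong response pointing across the edge, matching the behaviour described in the text.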

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the pixels nearest the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt; the same variables as in the Sobel case are used, but the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detecting technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels below a certain threshold are suppressed. A multi-level thresholding technique, similar to the two-level example in Section 2.4, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
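The two-level (hysteresis) thresholding step just described can be sketched as follows. This is an illustrative sketch, not the thesis's implementation; it iterates until no more weak pixels can be attached to an edge:

```python
def hysteresis(mag, high, low):
    """Two-level thresholding with connectivity: pixels above `high` seed
    edges; pixels between `low` and `high` survive only if (diagonally)
    adjacent to an edge pixel; pixels below `low` are suppressed."""
    h, w = len(mag), len(mag[0])
    out = [[1 if mag[r][c] >= high else 0 for c in range(w)] for r in range(h)]
    changed = True
    while changed:                     # grow edges through weak pixels
        changed = False
        for r in range(h):
            for c in range(w):
                if out[r][c] == 0 and low <= mag[r][c] < high:
                    if any(out[r + dr][c + dc]
                           for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                           if 0 <= r + dr < h and 0 <= c + dc < w):
                        out[r][c] = 1
                        changed = True
    return out

# A strong pixel (10) pulls in the chain of weak pixels (4, 4);
# the isolated value 1 falls below the low threshold and is suppressed.
edges = hysteresis([[10, 4, 4, 1]], high=8, low=3)
```

This connectivity rule is what lets Canny keep faint edge segments that are attached to strong ones while discarding isolated noise responses.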

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it: it quantifies the visual or other simple characteristics of an image so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]   (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or, at least, approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)   (2)

or

P(x) = min(λ2(x), 0)   (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5   (4)

where MPij = max(Pij−1, Pij, Pij+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
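The per-pixel computation behind Eq. 2 can be sketched as follows. This is an illustrative sketch, not the paper's code: plain central finite differences stand in for the Gaussian-derivative filtering at scale σD, and the 2×2 eigenvalues are computed in closed form (the single-rotation shortcut noted in the text):

```python
import math

def hessian(img, r, c):
    """Second-order partial derivatives via central finite differences
    (a crude stand-in for Gaussian-derivative filters at scale sigma_D)."""
    ixx = img[r][c + 1] - 2 * img[r][c] + img[r][c - 1]
    iyy = img[r + 1][c] - 2 * img[r][c] + img[r - 1][c]
    ixy = (img[r + 1][c + 1] - img[r + 1][c - 1]
           - img[r - 1][c + 1] + img[r - 1][c - 1]) / 4.0
    return ixx, ixy, iyy

def principal_curvature(img):
    """P(x) = max(lambda1(x), 0): the clamped maximum Hessian eigenvalue (Eq. 2)."""
    h, w = len(img), len(img[0])
    P = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ixx, ixy, iyy = hessian(img, r, c)
            # closed-form eigenvalues of a symmetric 2x2 matrix
            mean = (ixx + iyy) / 2.0
            d = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
            P[r][c] = max(mean + d, 0.0)
    return P

# A dark vertical line (value 0) on a light background (value 10)
img = [[10, 10, 0, 10, 10] for _ in range(5)]
P = principal_curvature(img)
```

As the text states for Eq. 2, the response is high on the dark line itself and zero on the flat background beside it.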

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image

segmentation It is normally applied either to an intensity image directly or to the gradient

magnitude of an image We instead apply the watershed transform to the principal curvature

image However the watershed transform is sensitive to noise (and other small perturbations)

in the intensity image A consequence of this is that the small image variations form local

minima that result in many small watershed regions Figure 3(a) shows the over

segmentation results when the watershed algorithm is applied directly to the principal

curvature image in Figure 2(b)) To achieve a more stable watershed segmentation we first

apply a grayscale morphological closing followed by hysteresis thresholding The grayscale

morphological closing operation is defined as f bull b = (f oplus b) ordf b where f is the image MP from

Eq 4 b is a 5 times 5 disk-shaped structuring element and oplus and ordf are the grayscale dilation and

erosion respectively The closing operation removes small ldquopotholesrdquo in the principal

curvature terrain thus eliminating many local minima that result from noise and that would

otherwise produce watershed catchment basins Beyond the small (in terms of area of

influence) local minima there are other variations that have larger zones of influence and that

are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold or even hysteresis thresholding--both of which can still miss weak image structures--we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the direction of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
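The adaptive low-threshold rule above can be sketched as follows. This is an illustrative NumPy implementation: the function and parameter names are ours, and the eigenvector-agreement cutoff `agree` is an assumed value, since the text does not state how high the average dot product must be.

```python
import numpy as np
from collections import deque

def eigenvector_flow_hysteresis(pc, vx, vy, high=0.04, agree=0.75):
    """Hysteresis threshold where each pixel's low threshold depends on how
    well its unit major eigenvector (vx, vy) agrees with its 8 neighbours'."""
    h, w = pc.shape
    support = np.zeros((h, w))
    count = np.zeros((h, w))
    # Average |inner product| of each pixel's eigenvector with its 8 neighbours.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            sy, sx = slice(max(dy, 0), h + min(dy, 0)), slice(max(dx, 0), w + min(dx, 0))
            ty, tx = slice(max(-dy, 0), h + min(-dy, 0)), slice(max(-dx, 0), w + min(-dx, 0))
            support[ty, tx] += np.abs(vx[ty, tx] * vx[sy, sx] + vy[ty, tx] * vy[sy, sx])
            count[ty, tx] += 1
    support /= np.maximum(count, 1)
    # Strong eigenvector flow -> permissive ratio 0.2, else conservative 0.7.
    low = np.where(support > agree, 0.2 * high, 0.7 * high)
    strong = pc >= high
    out = strong.copy()
    q = deque(zip(*np.nonzero(strong)))
    while q:  # grow seeds into connected pixels above their own low threshold
        y, x = q.popleft()
        for ny in range(max(y - 1, 0), min(y + 2, h)):
            for nx in range(max(x - 1, 0), min(x + 2, w)):
                if not out[ny, nx] and pc[ny, nx] >= low[ny, nx]:
                    out[ny, nx] = True
                    q.append((ny, nx))
    return out
```

With uniform eigenvector directions a weak pixel (e.g., response 0.01) survives under the 0.008 threshold; with incoherent directions the same pixel is rejected under the 0.028 threshold.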

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
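A minimal sketch of the triplet-stability test, assuming regions are available as boolean masks per scale (the detector fits ellipses; masks keep the example self-contained). The overlap-error threshold `max_err` is an assumed value not given in the text.

```python
import numpy as np

def overlap_error(a, b):
    """1 - intersection/union of two boolean region masks (cf. [19])."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return 1.0 - inter / union if union else 1.0

def stable_regions(regions_per_scale, max_err=0.3):
    """Keep regions at the middle scale of each consecutive triplet whose
    overlap error w.r.t. some region at the previous AND next scale is small."""
    stable = []
    for s in range(1, len(regions_per_scale) - 1):
        for r in regions_per_scale[s]:
            ok_prev = any(overlap_error(r, q) <= max_err for q in regions_per_scale[s - 1])
            ok_next = any(overlap_error(r, q) <= max_err for q in regions_per_scale[s + 1])
            if ok_prev and ok_next:
                stable.append((s, r))
    return stable
```

A region that appears at only one scale has overlap error 1.0 against both neighbouring scales and is discarded, which is exactly the filtering behaviour described above.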

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated while more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L Σ_p ||f_p − c_{L_p}||² + λ Σ_{{p,q}∈N} |e_{pq}| · T[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, λ weights the smoothness term, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or until a predefined number of iterations is reached.
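The E-M loop might be sketched as below. Note one deliberate substitution: to keep the example self-contained, the E-step uses ICM (iterated conditional modes) with a Potts smoothness penalty instead of the graph-cut solver [12] used in the text; all function and parameter names are ours.

```python
import numpy as np

def em_segment(features, init_centers, lam=0.1, iters=5):
    """E-M layer segmentation sketch. The E-step of Eq. (1) is approximated
    by ICM with a Potts penalty (not the graph cut of the paper); the M-step
    re-estimates the k mean chromatic vectors."""
    h, w, d = features.shape
    centers = np.asarray(init_centers, dtype=float).copy()
    k = len(centers)
    labels = np.zeros((h, w), dtype=int)
    for _ in range(iters):
        # E-step (approximate): per-pixel label minimising data + smoothness cost
        for y in range(h):
            for x in range(w):
                cost = ((features[y, x] - centers) ** 2).sum(axis=1)
                for c in range(k):
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] != c:
                            cost[c] += lam  # neighbour disagrees with label c
                labels[y, x] = int(np.argmin(cost))
        # M-step: re-estimate each cluster's mean chromatic vector
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers
```

On a toy two-color image the loop recovers one label per half and the centers converge to the two true mean colors; the real system would swap the inner loops for a graph cut to get the globally optimal E-step labeling.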

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well-suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, this approach models curvature in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
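The discrete curvature approximation can be illustrated as a sum of angle changes along a polyline representing a level line. This is a simplification: Schoenemann et al. additionally weight the angle changes by edge length, which this sketch omits, and the names are ours.

```python
import numpy as np

def polyline_curvature(points):
    """Approximate total curvature of a discrete level line as the sum of
    exterior angles at interior vertices. (The full method also weights
    each angle change by edge length; omitted here for clarity.)"""
    pts = np.asarray(points, dtype=float)
    total = 0.0
    for i in range(1, len(pts) - 1):
        a = pts[i] - pts[i - 1]        # incoming segment
        b = pts[i + 1] - pts[i]        # outgoing segment
        cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        total += np.arccos(np.clip(cosang, -1.0, 1.0))
    return total
```

A straight level line accumulates zero curvature, while a 90-degree turn contributes pi/2; the linear program chooses the level-line reconstruction with the smallest such total, subject to the continuation constraints.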

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and also about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B. Method

The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to keep the parts greater than 700 HU. At the end of thresholding, the new images are of logical (binary) type:

Thresh = image > 700

In each of these new images, sub-segmental vessels remain in the lung region. In the second step, the following method has been used to get rid of these vessels: firstly, each of the 2D images has been considered one by one, and each of the components in the image has been labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).

Next, the image in Figure 3 has been labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept, the other parts have been removed from the image, and then its complement has been taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach pixel 1 or 512 (i.e., touch the image border) and are logical 1, the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image has been labeled with the connected component labeling algorithm, and components whose pixel counts are below 1000 have been identified as airways and removed from the image. The remaining image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
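A hypothetical Python rendering of the per-slice pipeline described above (threshold at 700 HU, remove small bright components, keep the largest component as the body, invert, and discard border-touching air). The helper names are ours, and `min_size` is lowered in the test so a toy image works.

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labelling (a tiny stand-in for MATLAB's bwlabel)."""
    lbl = np.zeros(mask.shape, dtype=int)
    cur = 0
    for y, x in zip(*np.nonzero(mask)):
        if lbl[y, x]:
            continue
        cur += 1
        lbl[y, x] = cur
        q = deque([(y, x)])
        while q:
            cy, cx = q.popleft()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not lbl[ny, nx]):
                    lbl[ny, nx] = cur
                    q.append((ny, nx))
    return lbl, cur

def segment_lungs(slice_hu, thresh=700, min_size=1000):
    """Sketch of the described pipeline on one 2D slice; names are ours."""
    body = slice_hu > thresh                          # threshold at 700 HU
    lbl, n = label_components(body)
    sizes = np.bincount(lbl.ravel())[1:]
    body = np.isin(lbl, 1 + np.flatnonzero(sizes >= min_size))  # drop small vessels
    lbl, n = label_components(body)
    if n:                                             # keep largest component = body
        sizes = np.bincount(lbl.ravel())[1:]
        body = lbl == 1 + int(np.argmax(sizes))
    inv = ~body                                       # invert: lungs + outside air
    lbl, n = label_components(inv)
    edge = np.unique(np.r_[lbl[0], lbl[-1], lbl[:, 0], lbl[:, -1]])
    return inv & ~np.isin(lbl, edge)                  # remove border-touching air
```

On a synthetic slice consisting of a bright body ring with a dark cavity, the cavity is returned as the "lung" while the outside air and an isolated bright "vessel" pixel inside the cavity are correctly discarded.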

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1. Pixel Region of an X-ray CT scan and the Adjust Contrast tool.

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3. PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a. The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve was plotted as a magenta line through the data plot. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b. Area graph of an X-ray CT brain scan.

The 3-D Surface Plot displays a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a. 3D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0).

Figure 4b. 3D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0.4).

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
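NumPy's `meshgrid` follows the same convention as MATLAB's, so the idea can be tried outside MATLAB:

```python
import numpy as np

# Rows of X are copies of x; columns of Y are copies of y (MATLAB convention).
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
Z = X ** 2 + Y  # evaluate a function of two variables on the grid
```

Here `Z[i, j]` holds the function value at coordinate `(x[j], y[i])`, which is exactly the grid layout a surface plot expects.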

Figure 4c. 3D surface plot of an X-ray CT brain scan generated with histogram values, mesh.

3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d. Surface plot of an X-ray CT brain scan generated with histogram values, lighting.

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). Image with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4. FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14. (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
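What FVTool shows for the magnitude and phase response can be reproduced directly from the filter coefficients by evaluating the transfer function on the unit circle; a small NumPy sketch (function and parameter names are ours):

```python
import numpy as np

def freq_response(b, a, n=512):
    """Magnitude and phase response of the digital filter defined by
    numerator b and denominator a, evaluated at n frequencies on [0, pi)."""
    w = np.linspace(0, np.pi, n, endpoint=False)
    t = np.exp(-1j * w)  # z^-1 on the unit circle
    # H(e^jw) = sum_k b_k t^k / sum_k a_k t^k
    h = np.polyval(b[::-1], t) / np.polyval(a[::-1], t)
    return w, np.abs(h), np.unwrap(np.angle(h))
```

For a two-tap moving average (b = [0.5, 0.5], a = [1]) this yields unit gain at DC and a magnitude that rolls off toward the Nyquist frequency, matching the low-pass shape FVTool would plot.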

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments that are performed in this thesis to improve the performance of the model are also described in that section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((y2 − y1)² + (x2 − x1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
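Algorithm 1 can be sketched as follows. For brevity this version aligns using translation and scale only (the classical ASM also removes rotation), and all names are ours.

```python
import numpy as np

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1: shapes are (n, 2) point arrays. Alignment here
    uses translation and scale only, enough to show the iterate-and-constrain
    structure of the algorithm."""
    shapes = [np.asarray(s, float) for s in shapes]
    shapes = [s - s.mean(axis=0) for s in shapes]       # centre on origin
    mean = shapes[0] / np.linalg.norm(shapes[0])        # reference, unit size
    for _ in range(iters):
        aligned = []
        for s in shapes:
            scale = np.sum(s * mean) / np.sum(s * s)    # least-squares scale to mean
            aligned.append(s * scale)
        new_mean = np.mean(aligned, axis=0)
        new_mean -= new_mean.mean(axis=0)               # constrain: centre on origin
        new_mean /= np.linalg.norm(new_mean)            # constrain: unit size
        if np.allclose(new_mean, mean):                 # convergence check
            break
        mean = new_mean
    return aligned, mean
```

Two shapes that differ only by translation and scale collapse onto the same aligned point set, and the returned mean shape has unit size as step 4(c) requires.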

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. While performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + P b

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), P is the matrix of the principal modes of variation, and b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called "whiskers", and they help the profile model in analyzing the area around the landmark points.
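A minimal sketch of building the shape model and generating shapes by varying b, using PCA via SVD. The names are ours, and a real ASM would additionally truncate the modes by explained variance and limit each b component to about ±3 standard deviations.

```python
import numpy as np

def build_shape_model(shapes):
    """PCA shape model: shapes are aligned 2n-vectors stacked row-wise.
    Returns the mean shape and the modes matrix P of x_hat = x_bar + P b."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt.T  # columns of P are the principal modes of variation

def generate_shape(mean, P, b):
    """Generate a new shape by varying the shape-parameter vector b."""
    b = np.asarray(b, dtype=float)
    return mean + P[:, :len(b)] @ b
```

Setting b = 0 returns the mean shape, and projecting a training shape onto the first mode recovers that shape, illustrating how the model both constrains and spans the training variation.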

The shapes created by the landmark points are used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix S_g.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ).

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape stays close to the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is used).


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


number of grey levels in the enhanced image is less than the number of grey levels in the original image.

SPATIAL FILTERING

A characteristic of remotely sensed images is a parameter called spatial frequency, defined as the number of changes in brightness value per unit distance for any particular part of an image. If there are very few changes in brightness value over a given area of an image, this is referred to as a low-frequency area. Conversely, if the brightness value changes dramatically over short distances, this is an area of high frequency. Spatial filtering is the process of dividing the image into its constituent spatial frequencies and selectively altering certain spatial frequencies to emphasize some image features. This technique increases the analyst's ability to discriminate detail. The three types of spatial filters used in remote sensor data processing are low-pass filters, band-pass filters, and high-pass filters.

Low-Frequency Filtering in the Spatial Domain

Image enhancements that de-emphasize or block the high spatial frequency detail are low-frequency or low-pass filters. The simplest low-frequency filter evaluates a particular input pixel brightness value, BVin, and the pixels surrounding the input pixel, and outputs a new brightness value, BVout, that is the mean of this convolution. The size of the neighbourhood convolution mask or kernel (n) is usually 3×3, 5×5, 7×7, or 9×9. The simple smoothing operation will, however, blur the image, especially at the edges of objects. Blurring becomes more severe as the size of the kernel increases. Using a 3×3 kernel can result in the low-pass image being two lines and two columns smaller than the original image. Techniques that can be applied to deal with this problem include (1) artificially extending the original image beyond its border by repeating the original border pixel brightness values, or (2) replicating the averaged brightness values near the borders based on the image behaviour within a few pixels of the border. The most commonly used low-pass filters are mean, median, and mode filters.
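A minimal sketch of such a mean filter, using border replication (option 1 above) so the output keeps the input size. The function name and the use of NumPy are illustrative assumptions, not part of the text:

```python
import numpy as np

def mean_filter(image, n=3):
    """Low-pass (mean) filter with an n x n kernel.

    Border pixels are handled by replicating the original border
    values, so the output has the same size as the input instead of
    shrinking by two lines and two columns.
    """
    pad = n // 2
    padded = np.pad(image, pad, mode="edge")  # replicate border pixels
    out = np.empty(image.shape, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            # BVout is the mean of the n x n neighbourhood around BVin
            out[i, j] = padded[i:i + n, j:j + n].mean()
    return out
```

A constant image passes through unchanged, while a sharp step edge is blurred, which is exactly the trade-off the text describes.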

High-Frequency Filtering in the Spatial Domain

High-pass filtering is applied to imagery to remove the slowly varying components and enhance the high-frequency local variations. Brightness values tend to be highly correlated in a nine-element window. Thus, the high-frequency filtered image will have a relatively narrow intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.
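As a sketch, a high-pass image can be formed as the original minus its local mean, followed by the linear contrast stretch the text calls for. Function names and the NumPy implementation are assumptions:

```python
import numpy as np

def high_pass(image, n=3):
    """High-pass filter: original minus the local mean (low-pass)."""
    pad = n // 2
    padded = np.pad(image, pad, mode="edge")
    low = np.empty(image.shape, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            low[i, j] = padded[i:i + n, j:j + n].mean()
    return image - low

def contrast_stretch(image, out_min=0.0, out_max=255.0):
    """Linearly stretch the narrow high-pass histogram to a full range."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.full(image.shape, out_min, dtype=float)
    return (image - lo) / (hi - lo) * (out_max - out_min) + out_min
```

A flat region produces a zero high-pass response; the stretch then maps whatever narrow range remains onto the full display range for visual analysis.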

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or nonlinear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way. Hence, the use of this operation produces a more natural look than many of the other edge-enhanced images.

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by topographic slope and aspect, shadows, or seasonal changes in sunlight illumination angle and intensity. These conditions may hamper the ability of an interpreter or classification algorithm to correctly identify surface materials or land use in a remotely sensed image. Fortunately, ratio transformations of the remotely sensed data can, in certain instances, be applied to reduce the effects of such environmental conditions. In addition to minimizing the effects of environmental factors, ratios may also provide unique information not available in any single band that is useful for discriminating between soils and vegetation.
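A per-pixel band ratio is simple to sketch, and the sketch also shows why it suppresses illumination effects: a shading factor that scales both bands equally leaves the ratio essentially unchanged. The function name and the small epsilon guard are assumptions:

```python
import numpy as np

def band_ratio(band_a, band_b, eps=1e-6):
    """Per-pixel ratio of two co-registered spectral bands.

    Topographic shading multiplies both bands by roughly the same
    factor on a given slope, so the ratio cancels the shading while
    preserving spectral differences between surface materials.
    """
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    return a / (b + eps)  # eps avoids division by zero
```

For example, halving both bands (a shadowed pixel) gives nearly the same ratio as the fully sunlit pixel.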

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems. This section explains the methods discussed in this thesis that were used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour (Snake) Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges in femurs to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy. This method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either a part of the background or as a part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are used widely as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying it. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs using the aspects of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in the remainder of this chapter.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models. The relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels or sets of pixels that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be seen as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representations of the gradient vector ∇D. It is a measure of the rate of change in an image from light to dark pixels (in the case of grayscale images) at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point which is in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks or kernels, one for the horizontal direction and the other for the vertical direction in an image, that approximate the derivative in the horizontal and vertical directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
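The procedure above can be sketched directly. The kernel values are the standard Sobel masks; the helper names and the NumPy-based convolution loop are illustrative assumptions:

```python
import numpy as np

# Sobel kernels; note the weight of 2 on the centre row/column,
# which is what distinguishes Sobel from Prewitt.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution (kernel flipped, per the definition)."""
    k = np.flipud(np.fliplr(kernel))
    kh, kw = k.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * k).sum()
    return out

def sobel(image):
    """Return gradient magnitude and direction at each valid pixel."""
    dx = convolve2d(image, KX)   # horizontal derivative
    dy = convolve2d(image, KY)   # vertical derivative
    magnitude = np.hypot(dx, dy)
    direction = np.arctan2(dy, dx)
    return magnitude, direction
```

On a flat region the response is a zero vector; on a vertical step edge the magnitude is large and the direction points across the edge, as the text states.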

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector because it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. So, in simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
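A minimal sketch of the Roberts Cross with its two 2×2 kernels; names are illustrative assumptions:

```python
import numpy as np

# Roberts Cross kernels: the two diagonal differences of a 2x2 window
R1 = np.array([[1, 0],
               [0, -1]], dtype=float)
R2 = np.array([[0, 1],
               [-1, 0]], dtype=float)

def roberts(image):
    """Edge magnitude from the Roberts Cross operator.

    Each output pixel combines the two diagonal differences of its
    2x2 neighbourhood; the square root of the sum of squares gives
    the edge magnitude.
    """
    h, w = image.shape
    out = np.zeros((h - 1, w - 1))
    for i in range(h - 1):
        for j in range(w - 1):
            win = image[i:i + 2, j:j + 2]
            g1 = (win * R1).sum()  # top-left minus bottom-right
            g2 = (win * R2).sum()  # top-right minus bottom-left
            out[i, j] = np.hypot(g1, g2)
    return out
```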

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, the same as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If the pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1. Otherwise, it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
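The two-level (hysteresis) thresholding step can be sketched on its own; the function name and the iterative region-growing implementation are assumptions:

```python
import numpy as np

def hysteresis(grad, low, high):
    """Two-level (hysteresis) thresholding as used in the Canny detector.

    Pixels above `high` are strong edges (set to 1); pixels between
    `low` and `high` are kept only if they connect, directly or
    diagonally (8-neighbourhood), to a strong pixel; everything below
    `low` is suppressed (set to 0).
    """
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.copy()
    # Grow strong edges into adjacent weak pixels until nothing changes.
    changed = True
    while changed:
        changed = False
        for i, j in np.argwhere(weak & ~out):
            i0, i1 = max(i - 1, 0), min(i + 2, grad.shape[0])
            j0, j1 = max(j - 1, 0), min(j + 2, grad.shape[1])
            if out[i0:i1, j0:j1].any():
                out[i, j] = True
                changed = True
    return out.astype(np.uint8)
```

A chain of in-between pixels touching a strong pixel survives in full, while an isolated weak pixel is suppressed, which is the behaviour the paragraph describes.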

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix,

\[
H(\mathbf{x}, \sigma_D) =
\begin{bmatrix}
I_{xx}(\mathbf{x}, \sigma_D) & I_{xy}(\mathbf{x}, \sigma_D) \\
I_{xy}(\mathbf{x}, \sigma_D) & I_{yy}(\mathbf{x}, \sigma_D)
\end{bmatrix}
\tag{1}
\]

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

\[
P(\mathbf{x}) = \max(\lambda_1(\mathbf{x}), 0) \tag{2}
\]
or
\[
P(\mathbf{x}) = \min(\lambda_2(\mathbf{x}), 0) \tag{3}
\]

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, …, 6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

\[
\begin{matrix}
MP_{12} & MP_{13} & MP_{14} & MP_{15} \\
MP_{22} & MP_{23} & MP_{24} & MP_{25} \\
\vdots \\
MP_{n2} & MP_{n3} & MP_{n4} & MP_{n5}
\end{matrix}
\tag{4}
\]

where MP_{ij} = max(P_{i,j−1}, P_{ij}, P_{i,j+1}).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
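The per-pixel eigenvalue computation of Eq. 2 can be sketched without an explicit eigendecomposition, since a 2×2 symmetric matrix has a closed-form maximum eigenvalue. Finite differences stand in for the Gaussian-derivative Hessian here, so this is a sketch under that assumption, not the paper's exact implementation:

```python
import numpy as np

def principal_curvature(image):
    """Principal curvature image P(x) = max(lambda_1(x), 0)  (Eq. 2).

    The Hessian is estimated with finite differences (assumes `image`
    has already been Gaussian-smoothed to the desired scale sigma_D).
    For a 2x2 symmetric matrix the largest eigenvalue is
    (Ixx + Iyy)/2 + sqrt(((Ixx - Iyy)/2)^2 + Ixy^2).
    """
    iy, ix = np.gradient(image)       # first derivatives (rows, cols)
    ixy, ixx = np.gradient(ix)        # d(ix)/dy = Ixy, d(ix)/dx = Ixx
    iyy, _ = np.gradient(iy)          # d(iy)/dy = Iyy
    mean = 0.5 * (ixx + iyy)
    root = np.sqrt((0.5 * (ixx - iyy)) ** 2 + ixy ** 2)
    lam_max = mean + root             # largest Hessian eigenvalue
    return np.maximum(lam_max, 0.0)   # responds to dark lines on light bg
```

A dark line on a light background (an intensity valley) gives a positive second derivative and hence a positive response, while flat regions give zero, matching the behaviour described for Eq. 2.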

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that the small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are the grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the direction of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise, the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector in one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
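The per-pixel low-threshold rule of the eigenvector-flow hysteresis can be sketched as follows. The threshold constants (0.04, 0.2, 0.7) come from the text; the `agree` cutoff is an assumption, since the text only says the average dot product must be "high enough":

```python
import numpy as np

HIGH = 0.04  # strong principal-curvature response (from the text)

def low_threshold_map(evx, evy, agree=0.9):
    """Per-pixel low threshold for eigenvector-flow hysteresis.

    evx, evy: components of each pixel's normalised major eigenvector.
    Where the average absolute dot product with the 8 neighbours is at
    least `agree` (uniform flow), the low/high ratio drops to 0.2, so
    weak but well-aligned ridge pixels survive; elsewhere it is 0.7.
    Border pixels keep the conservative 0.7 ratio in this sketch.
    """
    h, w = evx.shape
    ratio = np.full((h, w), 0.7)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            dots = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    dots.append(abs(evx[i, j] * evx[i + di, j + dj]
                                    + evy[i, j] * evy[i + di, j + dj]))
            if np.mean(dots) >= agree:
                ratio[i, j] = 0.2
    return HIGH * ratio  # 0.008 where flow is uniform, 0.028 elsewhere
```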

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated whenever there are more than two layers remaining.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

\[
\min_{L} \; \sum_{p} \left\| f_p - c_{L_p} \right\|_2^2 \;+\; \lambda \sum_{\{p,q\} \in N} |e_{pq}| \, \delta[L_p \neq L_q] \tag{1}
\]

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ[·] is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After the spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or until a predefined number of iterations is reached.
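The E-M alternation can be sketched in a simplified form. The true E-step minimises Eq. (1) with graph cuts; in this sketch the smoothness term is dropped (λ = 0), so the E-step reduces to nearest-centre assignment. Function and variable names are assumptions:

```python
import numpy as np

def em_segment(features, k, iters=10, seed=0):
    """Simplified E-M loop for layer segmentation.

    features: (num_pixels, 3) chromatic feature vectors.
    This is a sketch of the alternation only: the spatial-coherence
    term of Eq. (1) is omitted, so the E-step is a plain
    nearest-centre labeling rather than a graph cut.
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    centres = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        # E-step: assign each pixel to the closest colour model
        d = ((features[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # M-step: re-estimate each cluster's mean chromatic vector
        for c in range(k):
            if (labels == c).any():
                centres[c] = features[labels == c].mean(0)
    return labels, centres
```

Two well-separated colour populations end up in two distinct clusters after a few iterations; the full method would additionally smooth the labels spatially at each E-step.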

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Level lines are contours that connect pixels of the same gray or chromatic intensity in an image. Such methods are therefore well suited to inpainting images with little or no texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless, so curvature-based inpainting can be superior to exemplar-based methods (for instance, Depict and Criminisi et al. [3]) for recovering the structures of the underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method. To formulate the problem as a linear program, curvature is modeled in a discrete sense (a possible reconstruction of a level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute the line segments and line segment pairs used to represent level lines, and the basic regions represent the pixels. For each potential discrete level line, the curvature is then approximated by the sum of angle changes at all vertices along the level line, with proper weighting by edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
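Schoenemann et al.'s linear-programming formulation is too long to reproduce here. As a much simpler stand-in that still illustrates the per-channel treatment of color images, the sketch below fills the damaged region by harmonic interpolation (repeated neighbourhood averaging), run independently on each chromatic channel; the function name and parameters are hypothetical, and this is not the LP method itself.

```python
import numpy as np

def harmonic_inpaint(img, mask, iters=500):
    """Per-channel inpainting sketch (harmonic stand-in, not the LP method).

    img:  (H, W, C) float array.
    mask: (H, W) bool array, True = damaged pixel.
    Repeated 4-neighbour averaging drives the masked region toward the
    harmonic interpolant of its boundary values, channel by channel.
    """
    out = img.copy()
    out[mask] = 0.0
    for _ in range(iters):
        avg = 0.25 * (np.roll(out, 1, 0) + np.roll(out, -1, 0)
                      + np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[mask] = avg[mask]   # update only the damaged pixels
    return out
```

For a hole inside a constant region, the fill converges to the surrounding value; real paintings would of course need the curvature-aware formulation described in the text.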

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data were collected from Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice was taken at the level of the pulmonary truncus. A bolus-tracking region was placed at the pulmonary truncus and the trigger was adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec was delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reached the pre-adjusted level, the exam was performed from the supraclavicular region to the diaphragms. Contrast injection was performed via an 18-20 G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B. Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.

The dataset at hand consists of 250 2D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding first keeps the parts greater than 700 HU. At the end of thresholding, the new images are binary (logical):

Thresh = image > 700;

In each of these new images, sub-segmental vessels are present in the lung region. The second step removes these vessels: each 2D image is considered one by one, and every component in the image is labeled with a connected component labeling algorithm. Then, based on the size of each labeled piece, components with fewer than 1000 pixels are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The largest component of logical 1s is the patient's body; this component is kept and the other parts are removed from the image. The image is then inverted, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).

In the image shown in Figure 4, the parts outside the body reach row or column 1 or 512 as logical 1; the parts satisfying this border condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm, and components with fewer than 1000 pixels are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so the borders of the lung and airway regions are shown in the original image (Figure 6(b)). Also, by multiplying the detected lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
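The thresholding and connected-component steps above can be sketched as follows. This is a simplified illustration of the described pipeline (it omits the largest-component body selection and the border-touching test), with a tiny stack-based routine standing in for the connected component labeling algorithm; the function names are illustrative, and the size thresholds follow the text.

```python
import numpy as np

def label_components(bw):
    """4-connectivity connected-component labelling (tiny stand-in for
    the labelling algorithm used in the text)."""
    H, W = bw.shape
    labels = np.zeros((H, W), dtype=int)
    cur = 0
    for i in range(H):
        for j in range(W):
            if bw[i, j] and labels[i, j] == 0:
                cur += 1
                stack = [(i, j)]
                labels[i, j] = cur
                while stack:
                    y, x = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        v, u = y + dy, x + dx
                        if 0 <= v < H and 0 <= u < W and bw[v, u] and labels[v, u] == 0:
                            labels[v, u] = cur
                            stack.append((v, u))
    return labels, cur

def remove_small(bw, min_pixels):
    """Drop every component smaller than min_pixels (the vessel and
    airway removal steps)."""
    labels, n = label_components(bw)
    keep = np.zeros_like(bw)
    for s in range(1, n + 1):
        comp = labels == s
        if comp.sum() >= min_pixels:
            keep |= comp
    return keep

def segment_lung(slice_hu, thr=700, min_body=1000, min_lung=1000):
    """Simplified pipeline sketch: threshold at the text's HU cut-off,
    drop sub-segmental vessels, invert, drop airway-sized parts."""
    thresh = slice_hu > thr                   # step 1: thresholding
    body = remove_small(thresh, min_body)     # step 2: remove small vessels
    lung = ~body                              # step 3: invert 0 <-> 1
    return remove_small(lung, min_lung)       # step 4: remove airways
```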

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. Toolbox functions implemented in the open MATLAB language were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a ''voxel'' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3. PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figure 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The ''meshgrid'' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
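numpy's meshgrid mirrors the MATLAB function described above, so its behaviour can be illustrated without MATLAB; a small sketch:

```python
import numpy as np

# Turn two coordinate vectors into matrices X and Y for evaluating
# a function of two variables, exactly as the text describes.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # rows of X are copies of x; columns of Y are copies of y
Z = X + Y                  # e.g. a surface f(x, y) = x + y sampled on the grid
```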

3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. ''Lighting'' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The ''image'' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The ''image'' with colormap scaling (the ''imagesc'' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. ''Jet'' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string representing a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4. FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
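The magnitude response that FVTool plots can be reproduced by evaluating b(z)/a(z) on the unit circle; the sketch below does this with numpy (the function name and the sampling grid are assumptions, not FVTool's API):

```python
import numpy as np

def magnitude_response(b, a, n=512):
    """Magnitude response |H(e^{jw})| of the digital filter b(z)/a(z),
    evaluated at n frequencies in [0, pi) -- the quantity FVTool plots.

    b, a: coefficient arrays [b0, b1, ...], [a0, a1, ...] in powers of z^-1.
    """
    w = np.linspace(0, np.pi, n, endpoint=False)
    z = np.exp(-1j * w)                   # z^{-1} on the unit circle
    num = np.polyval(b[::-1], z)          # b0 + b1 z^-1 + b2 z^-2 + ...
    den = np.polyval(a[::-1], z)
    return w, np.abs(num / den)
```

For the two-tap moving average b = [0.5, 0.5], a = [1], the response is the familiar low-pass |cos(w/2)| curve.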

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array where the n rows represent the points and the two columns represent their x and y co-ordinates respectively. In this thesis and in the code used, a shape is defined as a 2n x 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)        (4.1)

The centroid x-bar of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in 4.4).
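The 2n x 1 shape representation, centroid, and size defined above translate directly into code; a minimal sketch (function names are illustrative):

```python
import numpy as np

def centroid(shape_vec):
    """shape_vec is the 2n x 1 form used in the thesis: the n x
    co-ordinates first, then the n y co-ordinates. Returns (x, y)."""
    n = len(shape_vec) // 2
    return shape_vec[:n].mean(), shape_vec[n:].mean()

def shape_size(shape_vec):
    """Root-mean-square distance of the points from the centroid."""
    n = len(shape_vec) // 2
    cx, cy = centroid(shape_vec)
    d2 = (shape_vec[:n] - cx) ** 2 + (shape_vec[n:] - cy) ** 2
    return np.sqrt(d2.mean())
```

For the unit-square-like shape with corners (0,0), (2,0), (2,2), (0,2), the centroid is (1, 1) and the size is sqrt(2).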

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
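Algorithm 1 can be sketched as below, with the per-shape alignment step solved by the standard SVD (Procrustes) solution for rotation and scale. This is an illustrative sketch, not the thesis code; for brevity, step (c) only rescales the mean rather than re-aligning it to x0.

```python
import numpy as np

def align_to(shape, target):
    """Least-squares similarity alignment (rotation + scale) of a centred
    (n, 2) shape onto a centred target, via the SVD Procrustes solution."""
    u, s, vt = np.linalg.svd(target.T @ shape)
    R = u @ vt                            # optimal rotation
    scale = s.sum() / (shape ** 2).sum()  # optimal scale
    return scale * shape @ R.T

def align_set(shapes, iters=10):
    """Algorithm 1: iterative alignment of a set of (n, 2) shapes."""
    centred = [s - s.mean(0) for s in shapes]           # step 2
    x0 = centred[0] / np.linalg.norm(centred[0])        # step 3: unit size
    mean = x0
    for _ in range(iters):                              # step 4
        aligned = [align_to(s, mean) for s in centred]  # (a)
        mean = np.mean(aligned, 0)                      # (b)
        mean = mean / np.linalg.norm(mean)              # (c) rescale only
    return aligned, mean
```

Two shapes that differ only by rotation, scale, and translation end up coinciding after alignment.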

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were re-sized to the same dimensions, which ensured uniformity in the quality of the data being used. The training was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the result remains an allowable deformation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvement in the match is possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated along with the permissible variations [24]:

x_hat = x_bar + P b        (4.3)

where x_hat is the shape vector generated by the model, x_bar is the mean shape (the average of the aligned training shapes x_i), P is the matrix of eigenvectors of the shape covariance, and b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model boundary are called whiskers, and they help the profile model analyze the area around the landmark points.
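Building the shape model and generating shapes with Equation 4.3 amounts to a PCA over the aligned training shapes; a hedged sketch (the helper names and the number of retained modes are assumptions):

```python
import numpy as np

def build_shape_model(X, n_modes=2):
    """X: rows are aligned training shapes in 2n-vector form.
    Returns the mean shape x_bar and the matrix P of the leading
    eigenvectors of the shape covariance, so that new shapes are
    generated as x_hat = x_bar + P @ b."""
    x_bar = X.mean(0)
    cov = np.cov(X - x_bar, rowvar=False)
    w, v = np.linalg.eigh(cov)
    order = np.argsort(w)[::-1]           # largest eigenvalues first
    P = v[:, order[:n_modes]]
    return x_bar, P

def generate_shape(x_bar, P, b):
    """Equation 4.3: x_hat = x_bar + P b."""
    return x_bar + P @ b
```

Setting b = 0 recovers the mean shape; varying each entry of b sweeps the corresponding mode of variation.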

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile g-bar and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is placed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the model [24]. The distance between a test profile g and the mean profile g-bar is calculated using the Mahalanobis distance,

d = (g - g_bar)^T Sg^{-1} (g - g_bar).
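The Mahalanobis profile cost follows directly from the mean profile and covariance defined earlier; a one-function sketch (the name is illustrative):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """Profile match cost used in the search:
    (g - g_bar)^T S_g^{-1} (g - g_bar).
    Uses a linear solve instead of an explicit matrix inverse."""
    d = g - g_bar
    return float(d @ np.linalg.solve(S_g, d))
```

With an identity covariance this reduces to the squared Euclidean distance, which is a convenient sanity check.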

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model verifies that the resulting shape is still an allowable shape. The shape model ensures that the profile model has not distorted the shape: without it, the profile model might give the best profile matches, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile and the model perform better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and the aligned shapes are also displayed.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


intensity histogram. This suggests that the output from most high-frequency filtered images must be contrast stretched prior to visual analysis.

Edge Enhancement in the Spatial Domain

For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding the various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. Edges may be enhanced using either linear or nonlinear edge enhancement techniques.

Linear Edge Enhancement

A straightforward method of extracting edges in remotely sensed imagery is the application of a directional first-difference algorithm, which approximates the first derivative between two adjacent pixels. The algorithm produces the first difference of the image input in the horizontal, vertical, and diagonal directions.

The Laplacian operator generally highlights points, lines, and edges in the image and suppresses uniform and smoothly varying regions. Human vision physiological research suggests that we see objects in much the same way; hence images enhanced with this operation have a more natural look than many other edge-enhanced images.
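The directional first-difference and Laplacian operators described above can be sketched as follows (a minimal illustration; the zero-padded border handling is a choice of this sketch, not prescribed by the text):

```python
import numpy as np

def first_difference(img, axis=1):
    """Directional first difference: approximates the first derivative
    between two adjacent pixels (horizontal by default)."""
    out = np.zeros_like(img, dtype=float)
    d = np.diff(img.astype(float), axis=axis)
    if axis == 1:
        out[:, 1:] = d
    else:
        out[1:, :] = d
    return out

def laplacian(img):
    """4-neighbour Laplacian: highlights points, lines, and edges and
    is exactly zero on uniform regions."""
    f = img.astype(float)
    out = np.zeros_like(f)
    out[1:-1, 1:-1] = (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2]
                       + f[1:-1, 2:] - 4.0 * f[1:-1, 1:-1])
    return out
```

On a step image, the first difference responds only at the step column, and the Laplacian vanishes everywhere on a constant image.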

Band ratioing

Sometimes differences in brightness values from identical surface materials are caused by

topographic slope and aspect shadows or seasonal changes in sunlight illumination angle

and intensity These conditions may hamper the ability of an interpreter or classification

algorithm to identify correctly surface materials or land use in a remotely sensed image

Fortunately ratio transformations of the remotely sensed data can in certain instances be

applied to reduce the effects of such environmental conditions In addition to minimizing the

effects of environmental factors ratios may also provide unique information not available in

any single band that is useful for discriminating between soils and vegetation

Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems, and this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs in order to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves classification with an accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks the shape of the femur down into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used by [31] to extract femur contours in X-ray images after edge detection with a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained to the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly; the flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another.

The working mechanisms of the methods discussed above are explained in detail later in this chapter.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems, and there are many limitations in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no fully automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays: if the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: both manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; the gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector pointing across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt, however, are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the pixels closest to the current pixel while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
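Because the kernels are only 2×2, the Roberts cross reduces to differences of diagonally adjacent pixels. A minimal Python/NumPy sketch (illustrative, not the thesis implementation):

```python
import numpy as np

def roberts(image):
    """Roberts cross edge magnitude: square root of the sum of squared
    differences between diagonally adjacent pixels."""
    a = image[:-1, :-1] - image[1:, 1:]   # kernel [[1, 0], [0, -1]]
    b = image[:-1, 1:] - image[1:, :-1]   # kernel [[0, 1], [-1, 0]]
    return np.sqrt(a ** 2 + b ** 2)
```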

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter. The Gaussian filtering yields a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels which are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique involving two levels, like the example in 2.4, is then used on the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
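The two-level (hysteresis) thresholding step described above can be sketched as follows. This Python/NumPy version grows strong seeds into 8-connected weak pixels; it illustrates only the thresholding stage, not the full Canny pipeline:

```python
import numpy as np

def hysteresis(grad, low, high):
    """Keep pixels >= high, plus pixels >= low that are (transitively)
    8-connected to a kept pixel; suppress everything else."""
    strong = grad >= high
    weak = grad >= low
    out = strong.copy()
    changed = True
    while changed:
        changed = False
        # Dilate the kept set by one pixel in all 8 directions.
        grown = out.copy()
        grown[1:, :] |= out[:-1, :];  grown[:-1, :] |= out[1:, :]
        grown[:, 1:] |= out[:, :-1];  grown[:, :-1] |= out[:, 1:]
        grown[1:, 1:] |= out[:-1, :-1];  grown[:-1, :-1] |= out[1:, 1:]
        grown[1:, :-1] |= out[:-1, 1:];  grown[:-1, 1:] |= out[1:, :-1]
        new = grown & weak
        if (new & ~out).any():
            out |= new
            changed = True
    return out.astype(np.uint8)
```

For example, a weak pixel adjacent to a strong one is kept, while an isolated weak pixel is suppressed.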

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it: it quantifies visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image such as roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
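Range filtering as described can be sketched in a few lines. This Python/NumPy version (mirroring the behavior of MATLAB's rangefilt; replicate padding at the borders is an assumption) computes the maximum minus the minimum over each local neighborhood:

```python
import numpy as np

def range_filter(image, size=3):
    """Local range: max minus min over a size x size neighborhood,
    with edge pixels handled by replicate padding."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + size, j:j + size]
            out[i, j] = window.max() - window.min()
    return out
```

Flat (texture-less) regions produce zero response, while textured regions such as bone produce large values.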

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD) ]
           [ Ixy(x, σD)  Iyy(x, σD) ]    (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5    (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1). Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image; we instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible; nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
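The low-threshold rule can be sketched as follows. This Python/NumPy fragment uses the constants from the text (high threshold 0.04, ratios 0.2 and 0.7); the agreement cutoff `agree` is an assumed parameter, since the text only says the average dot product must be high enough:

```python
import numpy as np

HIGH = 0.04  # strong principal-curvature response (value from the text)

def low_thresholds(evx, evy, agree=0.9):
    """Per-pixel low threshold for eigenvector-flow hysteresis.
    evx, evy: components of each pixel's normalized major eigenvector.
    Uniform eigenvector flow around a pixel (mean |dot product| with the
    8 neighbors >= agree) gives ratio 0.2 (low = 0.008); otherwise 0.7
    (low = 0.028). Border pixels keep the conservative ratio."""
    h, w = evx.shape
    ratios = np.full((h, w), 0.7)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            dots = [abs(evx[i, j] * evx[i + di, j + dj]
                        + evy[i, j] * evy[i + di, j + dj])
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)
                    if (di, dj) != (0, 0)]
            if np.mean(dots) >= agree:
                ratios[i, j] = 0.2
    return HIGH * ratios
```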

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
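The stability test can be sketched as follows; here the overlap error of [19] is computed on binary region masks rather than fitted ellipses, and the maximum allowed error `max_err` is an assumed value:

```python
import numpy as np

def overlap_error(a, b):
    """1 - |intersection| / |union| of two binary region masks."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return 1.0 - np.logical_and(a, b).sum() / union

def stable_in_triplet(prev_regions, mid_regions, next_regions, max_err=0.3):
    """Indices of middle-scale regions that reappear (low overlap error)
    at both adjacent scales in the triplet."""
    keep = []
    for i, m in enumerate(mid_regions):
        if (any(overlap_error(m, p) <= max_err for p in prev_regions)
                and any(overlap_error(m, n) <= max_err for n in next_regions)):
            keep.append(i)
    return keep
```

A region present at all three scales survives; a region detected only at the middle scale is discarded.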

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues, etc. Ideally this step would be fully automatic, but this challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M fashion.10,11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L Σ_p ||f_p − c_{L_p}||²₂ + λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes neighboring pixels belonging to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm;12 the solution gives, under spatial regularization, the optimal labeling of pixels to clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
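The energy of Eq. (1) can be sketched as follows. This Python/NumPy version assumes a 4-connected pixel grid with unit edge lengths |e_pq| = 1, so the smoothness term reduces to a Potts penalty; the graph-cut minimization itself is not shown:

```python
import numpy as np

def segmentation_energy(features, labels, centers, lam=1.0):
    """Eq. (1) on a pixel grid: data term ||f_p - c_{L_p}||^2 plus a
    Potts smoothness term paying `lam` for each 4-connected neighbor
    pair with different labels (unit edge lengths assumed).
    features: (h, w, d); labels: (h, w) ints; centers: (k, d)."""
    data = np.sum((features - centers[labels]) ** 2)
    potts = (np.sum(labels[:, 1:] != labels[:, :-1])
             + np.sum(labels[1:, :] != labels[:-1, :]))
    return data + lam * potts
```

In the E-step, graph cuts would search for the labeling minimizing this energy; in the M-step, each center is re-estimated as the mean feature of its assigned pixels.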

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5,7 Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. In van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek thoracic and cardiovascular surgery training and research hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. The bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 ml/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) with an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images with 512×512 resolution.

B. Method

The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to keep the parts brighter than 700 HU; at the end of thresholding the new images are logical (binary):

Thresh = image > 700

In each of these new images, subsegmental vessels remain in the lung region. In the second step, the following method was used to remove these vessels: each 2D image was considered one by one, and each component in the image was labeled with the connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).

Next, the image in Figure 3 was labeled with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body; this biggest component was kept and the other parts removed from the image. Then the complement was taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 (those touching the first or 512th pixel) are logical 1, the parts satisfying this condition were removed, leaving the lung and airway as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image was labeled with the connected component labeling algorithm, and components with fewer than 1000 pixels were determined to be airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6(c)).
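The vessel/airway removal step can be sketched as follows. This Python/NumPy version (illustrative; the work above uses MATLAB's labeling tools) applies flood-fill connected component labeling and drops components below a pixel-count threshold, e.g. after `lung_mask = image > 700`:

```python
import numpy as np

def label_components(mask):
    """4-connected component labeling by flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for si in range(mask.shape[0]):
        for sj in range(mask.shape[1]):
            if mask[si, sj] and labels[si, sj] == 0:
                current += 1
                stack = [(si, sj)]
                while stack:
                    i, j = stack.pop()
                    if (0 <= i < mask.shape[0] and 0 <= j < mask.shape[1]
                            and mask[i, j] and labels[i, j] == 0):
                        labels[i, j] = current
                        stack += [(i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)]
    return labels, current

def remove_small(mask, min_pixels=1000):
    """Drop components below min_pixels (the 1000-pixel rule in the text)."""
    labels, n = label_components(mask)
    keep = np.zeros_like(mask)
    for lab in range(1, n + 1):
        comp = labels == lab
        if comp.sum() >= min_pixels:
            keep |= comp
    return keep
```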

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4].

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed, in extreme close-up view, in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve was plotted as a magenta line through the data. The area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c).
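The same behaviour is easy to see with NumPy's meshgrid, which mirrors the MATLAB function (a Python sketch with toy vectors of our own choosing):

```python
import numpy as np

# NumPy analogue of MATLAB's meshgrid: expand two coordinate vectors
# x and y into matrices X and Y so that a function of two variables
# can be evaluated over the whole grid at once.
x = np.array([1, 2, 3])
y = np.array([10, 20])

X, Y = np.meshgrid(x, y)  # default 'xy' indexing matches MATLAB

# Each row of X is a copy of x; each column of Y is a copy of y.
Z = X ** 2 + Y            # evaluate f(x, y) = x^2 + y on the grid
```

Because X and Y carry the full grid, any vectorized expression in them evaluates the function at every (x, y) pair without explicit loops.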

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with Colormap Scaling ('imagesc' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
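The indexed-image lookup and imagesc-style scaling described here can be sketched with a toy colormap (Python/NumPy; the 4-entry map below is our own invention, not MATLAB's jet):

```python
import numpy as np

# A colormap is an m-by-3 matrix of RGB values in [0.0, 1.0]. An
# indexed image is rendered by using each pixel value as a row index
# into this matrix.
cmap = np.array([
    [0.0, 0.0, 1.0],   # blue
    [0.0, 1.0, 1.0],   # cyan
    [1.0, 1.0, 0.0],   # yellow
    [1.0, 0.0, 0.0],   # red
])

img = np.array([[0, 1],
                [2, 3]])        # indexed image

rgb = cmap[img]                 # direct lookup: shape (2, 2, 3)

# 'imagesc'-style scaling: stretch arbitrary data so that it spans
# the full colormap before the lookup.
data = np.array([[5.0, 10.0], [15.0, 20.0]])
idx = np.round((data - data.min()) / (data.max() - data.min())
               * (len(cmap) - 1)).astype(int)
scaled_rgb = cmap[idx]
```

The scaling step is what distinguishes imagesc from image: data outside 0..m-1 still maps onto the whole colormap.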

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).
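The magnitude response FVTool displays can be computed directly from b and a by evaluating H(z) = B(z)/A(z) on the unit circle. A Python/NumPy sketch follows (freq_response is our own name, and the two-tap averaging filter is a toy example, not from the document):

```python
import numpy as np

def freq_response(b, a, n=512):
    """Frequency response of a digital filter with numerator b and
    denominator a: evaluate H(e^{jw}) = sum(b_k e^{-jwk}) / sum(a_k e^{-jwk})
    at n points w in [0, pi)."""
    w = np.linspace(0.0, np.pi, n, endpoint=False)
    x = np.exp(-1j * w)
    # polyval expects highest power first, so reverse the coefficients.
    H = np.polyval(b[::-1], x) / np.polyval(a[::-1], x)
    return w, H

# Toy example: a two-tap moving-average (low-pass) FIR filter.
w, H = freq_response([0.5, 0.5], [1.0])
mag = np.abs(H)
```

For this filter the magnitude is 1 at DC and falls monotonically toward zero at the Nyquist frequency, the classic low-pass shape FVTool would plot.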

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array, where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis and in the code used, a shape will be defined as a 2n × 1 vector, where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points more clear [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
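The point and shape distances above can be written directly in code (a small Python sketch; the function names are ours, not from the thesis code):

```python
import math

def euclidean_distance(p1, p2):
    """Distance between two points (x1, y1) and (x2, y2), per Eq. 4.1."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

def shape_distance(shape_a, shape_b):
    """Distance between two shapes: the sum of distances between
    corresponding points (each shape is a list of (x, y) tuples)."""
    return sum(euclidean_distance(a, b) for a, b in zip(shape_a, shape_b))
```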

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).

Algorithm 1 - Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
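Algorithm 1 can be sketched as follows (Python/NumPy; as a simplifying assumption of this sketch, the rotation step of the alignment is omitted, so only translation and scale are normalised):

```python
import numpy as np

def center(shape):
    """Translate a shape (n x 2 array) so its centroid is at the origin."""
    return shape - shape.mean(axis=0)

def unit_size(shape):
    """Scale a centred shape to unit size (RMS point-to-centroid distance)."""
    return shape / np.sqrt((shape ** 2).sum(axis=1).mean())

def align_shapes(shapes, iterations=10):
    """Sketch of Algorithm 1: iteratively align shapes to a mean shape.
    Rotation alignment is deliberately omitted for brevity."""
    shapes = [center(s.astype(float)) for s in shapes]
    mean = unit_size(shapes[0])                 # reference = first shape
    for _ in range(iterations):
        aligned = [unit_size(s) for s in shapes]              # step 4(a)
        new_mean = unit_size(center(np.mean(aligned, axis=0)))  # 4(b), 4(c)
        if np.allclose(new_mean, mean):         # step 5: convergence
            break
        mean = new_mean
    return aligned, mean

# Two axis-aligned squares of different sizes and positions.
shapes = [np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]]),
          np.array([[1.0, 1.0], [5.0, 1.0], [5.0, 5.0], [1.0, 5.0]])]
aligned, mean = align_shapes(shapes)
```

After alignment both squares collapse onto the same centred, unit-size shape, which is exactly what the mean-shape computation of step 4(b) relies on.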

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image for training. While performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits closely to the profile model. The tentative location of the landmarks is obtained from the suggested shape.

2. The shape model defines the permissible relative positions of landmarks, which introduces a constraint on the shape. So as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant. The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated, together with the permissible variations in it [24]:

x̂ = x̄ + P b    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi, and
P and b describe the permissible variations: P holds the principal modes of variation of the training shapes and b is a vector of shape parameters that weights them.
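The linear shape model x̂ = x̄ + Pb can be illustrated with a toy principal-component computation (a Python/NumPy sketch; the three 4-element training shapes are invented for illustration and are not thesis data):

```python
import numpy as np

# Toy training set: rows are shapes stored as (x1, x2, y1, y2), i.e.
# 2n-vectors with n = 2 points, as in the thesis convention.
X = np.array([[0.0, 1.0, 0.0, 1.0],
              [0.2, 1.2, 0.0, 1.0],
              [-0.2, 0.8, 0.0, 1.0]])

x_bar = X.mean(axis=0)                 # mean shape
C = np.cov(X, rowvar=False)            # covariance of the training shapes
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
P = eigvecs[:, ::-1][:, :1]            # keep the largest mode of variation
b = np.array([0.1])                    # shape parameter vector

x_hat = x_bar + P @ b                  # a new, allowable shape
```

In this toy set only the x co-ordinates vary, so the leading mode P has (numerically) zero weight on the y components; varying b moves the generated shape along that learned direction only.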

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
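A minimal implementation of this profile distance can be sketched as follows (Python/NumPy; the function name and the identity-covariance example are our own):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """Mahalanobis distance between a sampled profile g and the mean
    profile g_bar with profile covariance matrix S_g:
    f(g) = (g - g_bar)^T S_g^{-1} (g - g_bar)."""
    d = g - g_bar
    return float(d @ np.linalg.inv(S_g) @ d)

# With an identity covariance the measure reduces to the squared
# Euclidean distance; a non-identity covariance down-weights
# directions in which the training profiles varied a lot.
g_bar = np.array([0.0, 0.0])
S_g = np.eye(2)
```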

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is the same as the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


Chapter 3

Literature Review and History

The first section in this chapter describes the work that is related to the topic. Many papers use the same image segmentation techniques for different problems; this section explains how the methods discussed in this thesis have been used by researchers to solve similar problems. The subsequent section describes the workings of the common methods of image segmentation. These methods were investigated in this thesis and are also used in other papers. They include techniques like Active Shape Models, Active Contour/Snake Models, texture analysis, edge detection, and some methods that are only relevant for the X-ray data.

3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched and published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs in order to separate them from the X-ray. The X-rays were also segmented using Snakes or Active Contour Models (discussed in 3.4) and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays. [31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; it tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. This method breaks down the shape of the femur into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object, based on a specified threshold, hence the name bi-level thresholding. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms. Global segmentation algorithms include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are used widely as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31], after doing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application. [18] and [1] analyzed the performance of ASMs using aspects of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays even though one tibia bone differs from another tibia bone. The working mechanisms of the methods discussed above are explained in detail in the sections that follow.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, computation time, and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator used to do the edge detection calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector which points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give weighting to the current pixel while calculating the directional derivative at that point [15][26]; this is the reason why Sobel has a weight of 2 in the middle column and Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used, but the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but the method is still used as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because at the beginning of the process the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that the pixels that are below a certain threshold are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
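The double-threshold rule described above can be sketched as follows (Python/NumPy; double_threshold and the toy gradient array are our own illustrative names, and the Gaussian smoothing and non-maximal suppression stages are omitted here):

```python
import numpy as np

def double_threshold(grad, low, high):
    """Canny-style two-level thresholding: strong pixels (>= high)
    become 1, weak pixels (< low) become 0, and in-between pixels are
    kept only if they touch a strong pixel (8-connected neighbour)."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.astype(int)
    h, w = grad.shape
    for i in range(h):
        for j in range(w):
            if weak[i, j]:
                i0, i1 = max(i - 1, 0), min(i + 2, h)
                j0, j1 = max(j - 1, 0), min(j + 2, w)
                if strong[i0:i1, j0:j1].any():
                    out[i, j] = 1
    return out

# Toy gradient magnitudes: one strong pixel, one weak pixel next to
# it, and one isolated weak pixel far from any strong pixel.
grad = np.array([[0.1, 0.5, 0.9],
                 [0.1, 0.1, 0.1],
                 [0.5, 0.1, 0.1]])
edges = double_threshold(grad, low=0.3, high=0.8)
```

The weak pixel adjacent to the strong one survives, while the isolated weak pixel is discarded, which is exactly the behaviour the text describes.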

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it. It attempts to quantify the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, the visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, image processing using texture can be used, as bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
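The range-filtering operation can be sketched as follows (Python/NumPy; range_filter and the toy images are our own, not the toolbox function):

```python
import numpy as np

def range_filter(img, size=3):
    """Local range filter: each output pixel is max - min over its
    size x size neighbourhood (clipped at the borders). Textured
    regions give larger values than flat background."""
    h, w = img.shape
    r = size // 2
    out = np.zeros_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            win = img[max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1]
            out[i, j] = win.max() - win.min()
    return out

flat = np.full((5, 5), 7.0)   # no texture -> local range is 0 everywhere
noisy = flat.copy()
noisy[2, 2] = 10.0            # one bright pixel raises the local range
r = range_filter(noisy)
```

A standard-deviation filter, the other technique mentioned, has the same structure with `win.std()` in place of the max-minus-min.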

3 Principal curvature-based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)        (2)

or

P(x) = min(λ2(x), 0)        (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2..6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
...
MPn2  MPn3  MPn4  MPn5        (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
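As an illustration, the per-pixel eigenvalue computation of Eq. 2 can be sketched in Python with Gaussian derivative filters. This is a simplified sketch using SciPy: the incremental-scale optimization described above is omitted, and the function name is chosen here for illustration only.

```python
import numpy as np
from scipy import ndimage

def principal_curvature(image, sigma):
    """Maximum Hessian eigenvalue at every pixel, clamped at zero (Eq. 2).

    The second-order partial derivatives are computed with Gaussian
    derivative filters at scale sigma (axis 0 = y/rows, axis 1 = x/cols).
    """
    Ixx = ndimage.gaussian_filter(image, sigma, order=(0, 2))
    Iyy = ndimage.gaussian_filter(image, sigma, order=(2, 0))
    Ixy = ndimage.gaussian_filter(image, sigma, order=(1, 1))
    # Closed-form eigenvalues of the symmetric 2x2 Hessian:
    # lambda = (Ixx + Iyy)/2 +- sqrt(((Ixx - Iyy)/2)^2 + Ixy^2)
    half_trace = 0.5 * (Ixx + Iyy)
    root = np.sqrt((0.5 * (Ixx - Iyy)) ** 2 + Ixy ** 2)
    lam1 = half_trace + root           # maximum eigenvalue
    return np.maximum(lam1, 0.0)       # P(x) = max(lambda1(x), 0)
```

A dark line on a light background produces an intensity valley, so the second derivative across the line is positive and the response is high along the line, as Eq. 2 requires.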

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even ordinary hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue

magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors, and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
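The cleaning stage can be sketched in Python as follows. This is a simplified sketch: the grayscale closing uses a 5 × 5 disk as in the text, but the hysteresis step uses one fixed low-to-high ratio for every pixel, whereas the thesis adapts the ratio (0.2 or 0.7) per pixel from eigenvector-flow support; the function name is illustrative.

```python
import numpy as np
from scipy import ndimage

def clean_binary_curvature(mp, high=0.04, low_ratio=0.2):
    """Grayscale closing then plain hysteresis thresholding.

    mp is a maximum principal curvature image (Eq. 4). The
    eigenvector-flow adaptation of the low threshold is omitted here.
    """
    # 5x5 disk-shaped structuring element
    y, x = np.ogrid[-2:3, -2:3]
    disk = (x * x + y * y <= 4)
    closed = ndimage.grey_closing(mp, footprint=disk)   # fill "potholes"
    low = high * low_ratio
    weak = closed >= low
    strong = closed >= high
    # keep a weak connected component only if it contains a strong seed
    labels, _ = ndimage.label(weak)
    keep = np.unique(labels[strong])
    return np.isin(labels, keep[keep > 0])
```

A weak ridge connected to at least one strong-response pixel survives, while an equally weak but unseeded ridge is discarded.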

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
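The scale-stability test can be sketched as follows. Two assumptions in this sketch go beyond the text: regions are represented as binary pixel masks rather than the fitted ellipses of [19], and the overlap-error threshold of 0.3 is an assumed value, not one taken from the thesis.

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - intersection/union of two binary region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union

def stable_across_scales(masks, max_error=0.3):
    """Indices of regions detected stably in three consecutive scales.

    masks[i] is the region's mask at scale i; the middle region of a
    triplet is kept when its overlap error with both neighbours is low.
    """
    stable = []
    for i in range(1, len(masks) - 1):
        if (overlap_error(masks[i - 1], masks[i]) < max_error and
                overlap_error(masks[i], masks[i + 1]) < max_error):
            stable.append(i)
    return stable
```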

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way.10,11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ‖f_p − c_{L_p}‖²₂ + λ Σ_{{p,q}∈N} |e_{pq}| · T[L_p ≠ L_q]        (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and T is the delta function.

The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
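The E-M loop above can be sketched as follows. This is a simplified sketch under two stated substitutions: the exact graph-cut E-step is replaced by a single ICM sweep (each pixel greedily picks the label minimizing appearance cost plus λ times the number of disagreeing 4-neighbours, with unit edge lengths), and the function name is illustrative.

```python
import numpy as np

def spatially_coherent_segment(img, centers, lam=1.0, iters=5):
    """Simplified E-M for Eq. (1): ICM E-step, mean-vector M-step.

    img: h x w x c float image; centers: initial k x c chromatic means.
    """
    h, w = img.shape[:2]
    centers = np.asarray(centers, dtype=float)
    d = ((img[..., None, :] - centers) ** 2).sum(-1)   # h x w x k costs
    labels = d.argmin(-1)                              # plain k-means init
    for _ in range(iters):
        # E-step: one ICM sweep under the current centers
        d = ((img[..., None, :] - centers) ** 2).sum(-1)
        for y in range(h):
            for x in range(w):
                cost = d[y, x].copy()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # penalize labels that disagree with this neighbour
                        cost += lam * (np.arange(len(centers)) != labels[ny, nx])
                labels[y, x] = cost.argmin()
        # M-step: refit the k mean chromatic vectors
        for c in range(len(centers)):
            if (labels == c).any():
                centers[c] = img[labels == c].mean(axis=0)
    return labels, centers
```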

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5,7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, owing to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, this approach models curvature in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute the line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT scanner (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. Bolus tracking is set at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent is injected at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragm. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.

B. Method

The stages followed while performing lung segmentation from the CTA images in this work are shown in Figure 1. The CTA images in hand are 250 2-D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding retains the parts brighter than 700 HU; at the end of thresholding the new images have logical values:

Thresh = image > 700
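The same logical thresholding can be written with NumPy; the HU values below are synthetic, purely to show that the result is a logical (boolean) image.

```python
import numpy as np

# Hounsfield-unit image; body pixels exceed the 700 HU threshold
image = np.array([[1200,  650],
                  [ 800, -500]])
thresh = image > 700   # logical image, equivalent to Thresh = image > 700
```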

In each of these new images, subsegmental vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: first, each 2-D image has been considered one by one, and each component in the image has been labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3). Next, the image in Figure 3 has been labeled again with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept, the other parts have been removed from the image, and then its complement has been taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach column 1 or column 512 and are logical 1, the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image has been labeled with the connected component labeling algorithm, and the components whose pixel counts are below 1000 have been identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
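The pipeline above (threshold, remove small components, keep the largest component as the body, invert, discard border-connected air) can be sketched for a single slice as follows. This is a simplified sketch: function names are illustrative, 4-connectivity is assumed for the labeling, and the final airway-removal pass (dropping components under 1000 pixels) would be one more call to the same small-component filter.

```python
import numpy as np
from scipy import ndimage

def remove_small_components(mask, min_pixels=1000):
    """Drop connected components with fewer than min_pixels pixels."""
    labels, _ = ndimage.label(mask)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_pixels
    keep[0] = False                       # background stays background
    return keep[labels]

def segment_lungs(slice_hu, min_pixels=1000):
    """Threshold at 700 HU, keep the largest component (the body),
    invert it, and discard regions touching the image border."""
    body = remove_small_components(slice_hu > 700, min_pixels)
    labels, n = ndimage.label(body)
    if n == 0:
        return np.zeros_like(body)
    largest = np.bincount(labels.ravel())[1:].argmax() + 1
    inside = labels == largest
    lungs = ~inside                       # holes in the body + outside air
    border = np.zeros_like(lungs)
    border[0, :] = border[-1, :] = border[:, 0] = border[:, -1] = True
    labels2, _ = ndimage.label(lungs)
    outside = np.unique(labels2[border & lungs])
    lungs &= ~np.isin(labels2, outside[outside > 0])
    return lungs
```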

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4].

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformations, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot; the fit curve was plotted as a magenta line through the data plot. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b - Area graph of an X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3-D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3-D surface plot of an X-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors, x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3-D surface plot of an X-ray CT brain scan generated with histogram values, mesh
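NumPy's meshgrid function mirrors MATLAB's behavior described above; a minimal sketch:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # rows of X copy x; columns of Y copy y
Z = X + Y                  # evaluate f(x, y) = x + y over the whole grid
```

Here `Z[i, j]` holds `f(x[j], y[i])`, which is exactly the layout surface-plot functions expect.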

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface plot of an X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting; the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array, where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector, where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
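These two quantities can be sketched directly, taking shapes as n × 2 point arrays and the size as the root-mean-square distance to the centroid:

```python
import numpy as np

def centroid(shape):
    """Mean of the point positions of an n x 2 shape."""
    return shape.mean(axis=0)

def shape_size(shape):
    """Root-mean-square distance from the points to the centroid."""
    d = shape - centroid(shape)
    return np.sqrt((d ** 2).sum(axis=1).mean())
```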

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape
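Algorithm 1 can be sketched in Python as a plain Procrustes-style alignment. This is a simplified sketch: the similarity fit uses the standard SVD (Kabsch) solution for rotation and a least-squares scale, reflections are not specially handled, and the function names are illustrative.

```python
import numpy as np

def align_to(shape, ref):
    """Least-squares similarity alignment (scale + rotation) of a shape
    to a reference; both are n x 2 arrays and are centered first."""
    a = shape - shape.mean(axis=0)
    b = ref - ref.mean(axis=0)
    u, sv, vt = np.linalg.svd(a.T @ b)
    rot = u @ vt                       # optimal rotation (Kabsch)
    scale = sv.sum() / (a ** 2).sum()  # optimal least-squares scale
    return scale * a @ rot

def align_shapes(shapes, iters=10):
    """Algorithm 1: iteratively align shapes to an evolving mean."""
    centered = [s - s.mean(axis=0) for s in shapes]      # step 2
    x0 = centered[0] / np.linalg.norm(centered[0])       # steps 1 and 3
    mean = x0
    for _ in range(iters):                               # step 4
        aligned = [align_to(s, mean) for s in centered]  # 4(a)
        mean = np.mean(aligned, axis=0)                  # 4(b)
        mean = align_to(mean, x0)                        # 4(c): align to x0
        mean = mean / np.linalg.norm(mean)               #       unit size
    return aligned, mean
```

Translated, rotated, and scaled copies of the same shape all collapse onto the same aligned shape.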

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the result remains an allowable shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations about it [24]:

x̂ = x̄ + Φb

where
x̂ is the shape vector generated by the model;
x̄ is the mean shape, i.e., the average of the aligned training shapes xi
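The shape model can be sketched with PCA over the aligned training shapes. This is a simplified sketch: Φ is taken as the leading eigenvectors of the sample covariance of the 2n-vectors (x coordinates followed by y coordinates), and the function names are illustrative.

```python
import numpy as np

def build_shape_model(aligned, n_modes=1):
    """PCA shape model: mean shape plus the top eigenvectors of the
    covariance of the aligned training shapes (each a 2n-vector)."""
    X = np.asarray(aligned, dtype=float)          # m x 2n
    x_bar = X.mean(axis=0)
    cov = np.cov(X - x_bar, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)              # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_modes]      # largest modes first
    return x_bar, vecs[:, order], vals[order]

def generate_shape(x_bar, phi, b):
    """x_hat = x_bar + Phi b: vary b to generate allowable shapes."""
    return x_bar + phi @ b
```

Setting b = 0 reproduces the mean shape; varying each component of b moves the shape along one mode of variation.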

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by
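A minimal numpy sketch of this profile-matching step follows; the covariance matrix and the candidate profiles are toy values, and the standard squared Mahalanobis form (g − ḡ)ᵀ S⁻¹ (g − ḡ) is assumed.

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S):
    """Squared Mahalanobis distance f(g) = (g - g_mean)^T S^{-1} (g - g_mean).
    The candidate profile with the smallest distance is the best match."""
    d = np.asarray(g, dtype=float) - np.asarray(g_mean, dtype=float)
    return float(d @ np.linalg.solve(S, d))   # solve avoids an explicit inverse

g_mean = np.array([0.2, 0.5, 0.3])            # learnt mean profile (toy values)
S = np.eye(3)                                 # identity covariance for illustration
profiles = [np.array([0.9, 0.1, 0.4]),        # candidate profiles along the whisker
            np.array([0.2, 0.5, 0.3])]
dists = [mahalanobis_sq(g, g_mean, S) for g in profiles]
best = int(np.argmin(dists))                  # index of the best-matching profile
```

With an identity covariance the distance reduces to the squared Euclidean distance; a real covariance learnt from training profiles down-weights directions of high natural variation.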

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the lowest distance. This procedure is repeated for every landmark point, and the shape model then confirms that the result is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile matches while the resulting shape is completely different. So, as mentioned before, the two models restrict each other.

A multi-resolution search is performed to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


3.1 Previous Research

3.1.1 Summary of Previous Research

According to [14], compared to other areas in medical imaging, bone fracture detection is not well researched or published. Research has been done by the National University of Singapore to segment and detect fractures in femurs (the thigh bone). [27] uses a modified Canny edge detector to detect the edges of femurs and separate them from the X-ray. The X-rays were also segmented using Snakes, or Active Contour Models (discussed in 3.4), and Gradient Vector Flow. According to the experiments done by [27], their algorithm achieves a classification accuracy of 94.5%. Canny edge detectors and Gradient Vector Flow are also used by [29] to find bones in X-rays.

[31] proposes two methods to extract femur contours from X-rays. The first is a semi-automatic method which gives priority to reliability and accuracy; this method tries to fit a model of the femur contour to a femur in the X-ray. The second method is automatic and uses active contour models. It breaks the shape of the femur down into a couple of parallel or roughly parallel lines and a circle at the top representing the head of the femur. The method detects the strong edges in the circle and locates the turning point using the point of inflection in the second derivative of the image. Finally, it optimizes the femur contour by applying shape constraints to the model.

Hough and Radon transforms are used by [14] to approximate the edges of long bones. [14] also uses clustering-based algorithms, also known as bi-level or localized thresholding methods, and global segmentation algorithms to segment X-rays. Clustering-based algorithms categorize each pixel of the image as either part of the background or part of the object (hence the name bi-level thresholding), based on a specified threshold. Global segmentation algorithms take the whole image into consideration and sometimes work better than the clustering-based algorithms; they include methods like edge detection, region extraction, and deformable models (discussed in 3.4).

Active Contour Models, initially proposed by [19], fall under the class of deformable models and are widely used as an image segmentation tool. Active Contour Models are used to extract femur contours in X-ray images by [31] after performing edge detection on the image using a modified Canny filter. Gradient Vector Flow is also used by [31] to extract contours, and the results are compared to those of the Active Contour Model. [3] uses an Active Contour Model with curvature constraints to detect femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size, and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes, Taylor, and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs (also called classical ASMs by [24]) by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features; some modifications to the original model were suggested and experimented with, and the relationships between landmark points, computing time, and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]; the work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs in terms of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained to the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model is useful for separating bones from X-rays even though one tibia bone differs from another.

The working mechanisms of the methods discussed above are explained in detail in the sections that follow.

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used: some methods and models are too limited or constrained to match the bone accurately, and accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user input. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationships between the size of the training set, computation time, and error are studied.

3.2 Edge Detection

Edge detection falls under the category of image feature detection, which includes other methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in intensity, while its magnitude denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at a point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge, from darker to brighter values.

Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels (one for the horizontal direction and the other for the vertical direction), that approximate the derivatives in the horizontal and vertical directions. The derivatives are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are a representation of the convolution kernels that are used.
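The Sobel computation described above can be sketched with numpy; this is a minimal illustration (plain loops, zero padding), not an optimized implementation, and the test image is synthetic.

```python
import numpy as np

def convolve2d(img, kernel):
    """'Same'-size 2D convolution with zero padding (the kernel is
    flipped, as in true convolution)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(np.asarray(img, dtype=float), ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

# Sobel kernels approximating the horizontal and vertical derivatives;
# note the weight of 2 in the centre (this is what distinguishes Sobel
# from Prewitt, as discussed in Section 3.2.2).
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

img = np.zeros((8, 8)); img[:, 4:] = 1.0      # vertical step edge
dx, dy = convolve2d(img, KX), convolve2d(img, KY)
magnitude = np.hypot(dx, dy)                  # Eq. 3.1: gradient magnitude
direction = np.arctan2(dy, dx)                # Eq. 3.2: gradient direction
```

As the text states, the magnitude is zero in regions of constant intensity and peaks on the step edge.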

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image, but the convolution kernels used are different. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel when calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of its kernels where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt, using the same variables as in the Sobel case; only the kernels used to calculate the directional derivatives differ.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude of the difference between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
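The diagonal differences can be written directly with array slicing instead of explicit 2×2 convolutions; this is a compact sketch on a synthetic step edge, not a reference implementation.

```python
import numpy as np

def roberts_magnitude(img):
    """Roberts Cross: gradient magnitude from the two diagonal
    differences (equivalent to correlating with the 2x2 kernels
    [[1,0],[0,-1]] and [[0,1],[-1,0]])."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    g1 = img[:h-1, :w-1] - img[1:, 1:]     # difference along one diagonal
    g2 = img[:h-1, 1:] - img[1:, :w-1]     # difference along the other
    return np.hypot(g1, g2)                # output is (h-1) x (w-1)

img = np.zeros((5, 5)); img[:, 2:] = 1.0   # vertical step edge
mag = roberts_magnitude(img)
```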

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel (an outlier). The gradient of the image is then calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied next, so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique involving two levels, like the example in 2.4, is then applied to the data: if a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-valued pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
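The double-thresholding rule described above can be sketched as follows. This is a toy illustration of the hysteresis step only (the Gaussian smoothing, gradient, and non-maximal suppression stages are assumed to have run already), with made-up gradient values and thresholds.

```python
import numpy as np

def dilate8(mask):
    """Binary dilation by one pixel over the 8-neighbourhood (zero-padded)."""
    h, w = mask.shape
    p = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + h, dx:dx + w]
    return out

def hysteresis_threshold(grad, low, high):
    """Canny-style double thresholding: pixels above `high` become edges (1);
    pixels between `low` and `high` are kept only if (transitively) adjacent
    or diagonally adjacent to a strong pixel; everything else becomes 0."""
    strong = grad > high
    weak = (grad >= low) & ~strong
    edges = strong.copy()
    while True:                    # grow edges into connected weak pixels
        new_edges = edges | (weak & dilate8(edges))
        if np.array_equal(new_edges, edges):
            return edges.astype(np.uint8)
        edges = new_edges

grad = np.array([[0.9, 0.5, 0.5, 0.1],
                 [0.1, 0.1, 0.5, 0.1],
                 [0.4, 0.1, 0.1, 0.1]])
edges = hysteresis_threshold(grad, low=0.3, high=0.7)
```

In this example the chain of weak pixels connected to the strong 0.9 pixel survives, while the isolated weak 0.4 pixel is suppressed.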

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it: it quantifies visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image, i.e., the difference between the maximum and minimum values in the neighborhood around each pixel.
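A range filter is straightforward to sketch; the window size and the synthetic "textured" and "flat" patches below are illustrative, not taken from the thesis.

```python
import numpy as np

def range_filter(img, size=3):
    """Local range: max - min over a size x size neighbourhood
    (edge-replicated). Textured areas give large values, flat areas 0."""
    img = np.asarray(img, dtype=float)
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = p[i:i + size, j:j + size]
            out[i, j] = win.max() - win.min()
    return out

flat = np.full((5, 5), 7.0)                    # untextured region
textured = np.arange(25.0).reshape(5, 5) % 4   # repeating intensity pattern
r_flat, r_tex = range_filter(flat), range_filter(textured)
```

The flat patch maps to zero everywhere, while the textured patch produces large local ranges; thresholding such a map is one simple way to separate textured bone from the smoother background.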

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σ_D) = [ I_xx(x, σ_D)  I_xy(x, σ_D)
              I_xy(x, σ_D)  I_yy(x, σ_D) ]    (1)

where I_xx, I_xy, and I_yy are the second-order partial derivatives of the image evaluated at the point x, and σ_D is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues; however, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term I_xy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ₁(x), 0)    (2)

or

P(x) = min(λ₂(x), 0)    (3)

where λ₁(x) and λ₂(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background.

Like SIFT [13]

and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I_{1,1}, and then produce increasingly Gaussian-smoothed images I_{1,j} with scales σ = k^{j−1}, where k = 2^{1/3} and j = 2, …, 6. This set of images spans the first octave, consisting of the six images I_{1,1} to I_{1,6}. Image I_{1,4} is downsampled to half its size to produce image I_{2,1}, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log₂(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image P_{i,j} for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image are computed from the previous smoothed image using an incremental Gaussian scale.

Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP_{1,2}  MP_{1,3}  MP_{1,4}  MP_{1,5}
MP_{2,2}  MP_{2,3}  MP_{2,4}  MP_{2,5}
  ⋮
MP_{n,2}  MP_{n,3}  MP_{n,4}  MP_{n,5}    (4)

where MP_{i,j} = max(P_{i,j−1}, P_{i,j}, P_{i,j+1}).

Figure 2(b) shows one of the maximum curvature images MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
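The construction of one octave of principal curvature images and their MP maxima can be sketched as follows. This is a simplified numpy sketch, not the authors' implementation: the initial image doubling and the incremental smoothing optimization are omitted, and the closed-form 2×2 eigenvalue replaces a Jacobi rotation.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing via explicit 1-D kernels (reflect-padded)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2)); k /= k.sum()
    p = np.pad(img, ((0, 0), (radius, radius)), mode='reflect')
    img = np.stack([np.convolve(row, k, mode='valid') for row in p])
    p = np.pad(img, ((radius, radius), (0, 0)), mode='reflect')
    return np.stack([np.convolve(col, k, mode='valid') for col in p.T]).T

def principal_curvature(img):
    """P(x) = max(lambda1(x), 0) (Eq. 2): largest Hessian eigenvalue per pixel."""
    Iy, Ix = np.gradient(img)
    Iyy, Iyx = np.gradient(Iy)
    Ixy, Ixx = np.gradient(Ix)
    # Closed-form largest eigenvalue of the symmetric 2x2 Hessian.
    lam1 = 0.5 * (Ixx + Iyy) + np.sqrt(0.25 * (Ixx - Iyy)**2 + Ixy**2)
    return np.maximum(lam1, 0.0)

def octave_mp_images(img, k=2**(1 / 3)):
    """One octave: principal curvature images at sigma = k^(j-1), j = 1..6,
    then the four MP_j = max(P_{j-1}, P_j, P_{j+1}) of Eq. 4."""
    P = [principal_curvature(gaussian_blur(img, k**(j - 1))) for j in range(1, 7)]
    return [np.maximum(np.maximum(P[j - 1], P[j]), P[j + 1]) for j in range(1, 5)]

# Dark horizontal line on a light background: exactly the structure Eq. 2 targets.
img = np.ones((32, 32)); img[16, :] = 0.0
mps = octave_mp_images(img)
```

The MP response is large along the dark line and near zero in the flat background, which is what the watershed stage of Section 3.2 relies on.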

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image; we instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image: small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing, followed by hysteresis thresholding.

The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins. Beyond the small (in terms of area of

influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than applying a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations.

Since the eigenvalues of the Hessian matrix are directly related to signal strength (i.e., line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments can cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue

magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise

weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
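The per-pixel low-threshold selection of the eigenvector-flow scheme can be sketched as follows. The orientation field and the 0.9 agreement cutoff are assumptions for illustration (the text only requires the average dot product to be "high enough"); the 0.2/0.7 ratios are the values stated above.

```python
import numpy as np

def eigvec_flow_low_ratio(theta, agree=0.9, supported=0.2, unsupported=0.7):
    """Per-pixel low/high threshold ratio for eigenvector-flow hysteresis.
    `theta` holds the major-eigenvector orientation at each pixel; support is
    the mean |inner product| with the 8 neighbours' unit eigenvectors."""
    h, w = theta.shape
    vx, vy = np.cos(theta), np.sin(theta)       # unit eigenvectors
    support = np.zeros((h, w))
    count = np.zeros((h, w))
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # slices pairing each pixel with its in-bounds (dy, dx) neighbour
            ys = slice(max(dy, 0), h + min(dy, 0))
            xs = slice(max(dx, 0), w + min(dx, 0))
            ys0 = slice(max(-dy, 0), h + min(-dy, 0))
            xs0 = slice(max(-dx, 0), w + min(-dx, 0))
            support[ys0, xs0] += np.abs(vx[ys0, xs0] * vx[ys, xs]
                                        + vy[ys0, xs0] * vy[ys, xs])
            count[ys0, xs0] += 1
    mean_dot = support / count
    return np.where(mean_dot >= agree, supported, unsupported)

theta = np.zeros((5, 5))        # uniform horizontal eigenvector flow
theta[2, 2] = np.pi / 2         # one orthogonal outlier pixel
ratios = eigvec_flow_low_ratio(theta)
```

Pixels embedded in a coherent flow get the permissive 0.2 ratio (low threshold 0.008), so weak but well-supported ridge pixels survive; the outlier, disagreeing with all of its neighbours, gets the strict 0.7 ratio.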

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition, because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
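The stability test can be sketched with a pixel-mask approximation of the overlap error of [19]; the `max_err=0.3` tolerance, the pairwise check over the triplet, and the disk regions are illustrative assumptions, not values from the paper.

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """Overlap error between two regions given as boolean pixel masks:
    1 - |A ∩ B| / |A ∪ B|  (0 = identical regions, 1 = disjoint)."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union

def stable_across_scales(masks, max_err=0.3):
    """Keep a detection only if consecutive scales overlap well,
    in the spirit of MSER-style stability across a scale triplet."""
    return all(overlap_error(masks[i], masks[i + 1]) <= max_err
               for i in range(len(masks) - 1))

yy, xx = np.mgrid[0:20, 0:20]

def disk(r):
    """Boolean mask of a disk of radius r centred in a 20x20 grid."""
    return (yy - 10.0)**2 + (xx - 10.0)**2 <= r * r

stable = stable_across_scales([disk(5.0), disk(5.3), disk(5.6)])    # slow drift
unstable = stable_across_scales([disk(5.0), disk(2.0), disk(5.0)])  # jumpy detection
```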

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by a mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues, etc.; ideally this step should be fully automatic, but automating it is challenging and is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means); in other words, we assume that each layer is modeled as an independent Gaussian, with the same covariance for all layers and differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ‖f_p − c_{L_p}‖²  +  λ Σ_{{p,q}∈N} |e_{pq}| · T[L_p ≠ L_q]    (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and T is the delta function.

The first term in Eq(1) measures the appearance similarity between the pixels and the clusters

they are assigned to And the second term penalizes the situation where pixels in the

neighborhood belong to different clusters By _xing the k appearance models the

minimization problem can be solved with graph-cut algorithm12 The solution gives us under

spatial regularization the optimal labeling of pixels to different clusters After spatial coherent

refinement we can re-estimate k models as the mean chromatic vectors (M-step) Then we

iterate the E and M step until convergence or a predefined number of iterations is reached
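A minimal sketch of this E-M loop, assuming RGB pixel features and substituting a greedy ICM sweep for the graph-cut E-step (the function and parameter names are ours, not from the paper):

```python
import numpy as np

def em_segment(image, k, lam=1.0, iters=5, init_centers=None, seed=0):
    """E-M layer segmentation sketch with a spatial coherence prior.

    E-step: assign each pixel the label minimizing its appearance cost plus
    a Potts penalty over the 4-neighborhood (a simple ICM sweep standing in
    for the graph-cut solver used in the text).
    M-step: re-estimate each cluster center as the mean feature of its pixels.
    """
    h, w, c = image.shape
    feats = image.reshape(-1, c).astype(float)
    if init_centers is None:
        rng = np.random.default_rng(seed)
        centers = feats[rng.choice(len(feats), k, replace=False)]
    else:
        centers = np.asarray(init_centers, dtype=float)
    labels = np.argmin(((feats[:, None, :] - centers) ** 2).sum(-1), 1).reshape(h, w)

    for _ in range(iters):
        for y in range(h):                       # E-step (ICM sweep)
            for x in range(w):
                cost = ((image[y, x] - centers) ** 2).sum(-1)   # data term
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        cost += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(cost))
        for i in range(k):                       # M-step: mean chromatic vectors
            mask = labels == i
            if mask.any():
                centers[i] = image[mask].mean(axis=0)
    return labels, centers
```

ICM only finds a local optimum per sweep, which is why the paper prefers a graph cut; the sketch nevertheless shows the alternation between assignment and mean re-estimation.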

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well-suited for inpainting on images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, Depict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7] that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense. Specifically, we impose a discrete grid of certain connectivity on the image (8-connectivity in Fig. 4, where a possible reconstruction of the level line with intensity 100 is shown). The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting of the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
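The full curvature linear program is beyond a short sketch. As a hedged illustration of the per-channel formulation only, the stand-in below fills each chromatic channel independently by harmonic (Laplace) interpolation, with the known boundary pixels acting as constraints; the function names are ours:

```python
import numpy as np

def inpaint_channel(channel, mask, iters=500):
    """Fill masked pixels of one channel by harmonic (Laplace) interpolation.

    A much simpler stand-in for the curvature LP described in the text; it
    only illustrates how each channel forms an independent sub-problem with
    the known pixels fixed as boundary conditions.
    """
    out = channel.astype(float).copy()
    out[mask] = out[~mask].mean()           # rough initialization
    for _ in range(iters):                  # Jacobi-style relaxation sweeps
        avg = 0.25 * (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
                      np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[mask] = avg[mask]               # only unknown pixels change
    return out

def inpaint_color(image, mask, iters=500):
    # One independent sub-problem per chromatic channel, as in the text.
    return np.dstack([inpaint_channel(image[..., c], mask, iters)
                      for c in range(image.shape[-1])])
```

Harmonic filling propagates smooth structure but, unlike the curvature LP, does not preserve level-line curvature across the hole.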

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and also about breath holding. Imaging was performed with a bolus tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracker is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of nonionic contrast agent at the rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) with an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B. Method

The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding has been applied first so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are in logical (binary) form:

Thresh = image > 700

In each of these new images, subsegment vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: firstly, each of the 2D images has been considered one by one, and each of the components in the image has been labeled with the connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3).

Next, the image in Figure 3 has been labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept and the other parts have been removed from the image. Then the image has been inverted, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

As the parts outside the body in the image shown in Figure 4 touch the 1st or 512th pixel column and are logical 1, the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Due to the fact that the airway in Figure 5 is very small compared to the lung, each of the images has been labeled with the connected component labeling algorithm, and the components whose pixel counts are below 1000 have been determined as airways and then removed from the image. The final image at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
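The slice-wise pipeline above (700 HU threshold, connected-component labeling, small-component removal, inversion, removal of components touching the image border) can be sketched with scipy.ndimage; the helper names are ours, and the thresholds mirror the values stated in the text:

```python
import numpy as np
from scipy import ndimage

def remove_small(binary, min_pixels=1000):
    """Drop connected components smaller than min_pixels (vessel removal step)."""
    labels, n = ndimage.label(binary)
    sizes = ndimage.sum(binary, labels, range(1, n + 1))
    return np.isin(labels, 1 + np.flatnonzero(sizes >= min_pixels))

def segment_lung(slice_hu):
    """Sketch of the slice-wise segmentation described in the text."""
    thresh = slice_hu > 700                  # step 1: binary threshold
    body = remove_small(thresh, 1000)        # step 2: remove small vessel specks
    labels, n = ndimage.label(body)
    if n:                                    # keep only the largest component (body)
        sizes = ndimage.sum(body, labels, range(1, n + 1))
        body = labels == (1 + int(np.argmax(sizes)))
    inv = ~body                              # invert: lungs and outside air become 1
    labels, n = ndimage.label(inv)
    # drop components touching the left/right image border (outside air)
    border = np.unique(np.concatenate([labels[:, 0], labels[:, -1]]))
    return inv & ~np.isin(labels, border[border > 0])
```

On real CTA data the airway components would then be removed with another small-component pass, exactly as in the text.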

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon Transform. Making graphics objects semitransparent is a useful technique in 3-D visualization which furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop the customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data plot. The fit curve was plotted as a magenta line through the data plot.

Figure 3b - Area Graph of X-ray CT brain scan. An area graph displays the elements in a variable as one or more curves and fills the area beneath each curve.

The 3-D Surface Plot displays a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(4)

The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
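NumPy's meshgrid behaves like MATLAB's; a small sketch of the coordinate expansion described above:

```python
import numpy as np

# np.meshgrid expands two coordinate vectors into matrices X and Y for
# evaluating a function of two variables, exactly as described for MATLAB:
# rows of X are copies of x, columns of Y are copies of y.
x = np.arange(4)          # column coordinates
y = np.arange(3)          # row coordinates
X, Y = np.meshgrid(x, y)
Z = X ** 2 + Y ** 2       # a surface z = f(x, y), ready for a surface plot
```

The Z matrix is what a surface-plot call (MATLAB surf, or matplotlib plot_surface) would render.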

3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The "image" with colormap scaling ("imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0. Each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through the colors cyan, yellow and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined with numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16 and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
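FVTool itself is MATLAB tooling; as a rough Python analogue, scipy.signal.freqz evaluates the same magnitude and phase response for a filter given its numerator b and denominator a. The moving-average filter below is an assumed example, not one from the text:

```python
import numpy as np
from scipy import signal

# freqz evaluates H(e^jw) = B(e^jw) / A(e^jw) on a grid of frequencies,
# which is what FVTool's magnitude/phase display is built from.
b = np.ones(4) / 4          # numerator: 4-tap moving-average (assumed example)
a = np.array([1.0])         # denominator: FIR filter
w, h = signal.freqz(b, a, worN=512)
magnitude_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))  # dB magnitude
phase = np.unwrap(np.angle(h))                              # unwrapped phase
```

Plotting magnitude_db and phase against w reproduces the linear-frequency panels of Figures 11-13.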

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications made, the basic ASM works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3. The experiments that are performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)² + (y2 - y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
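The centroid and size definitions above can be written directly for the 2n × 1 shape vector used in this thesis (the helper name is ours):

```python
import numpy as np

def centroid_and_size(shape_vec):
    """Centroid and size of a shape stored as a 2n x 1 vector (x's then y's).

    The size is the root-mean-square distance from the points to the
    centroid, matching the definition in the text.
    """
    n = len(shape_vec) // 2
    pts = np.column_stack([shape_vec[:n], shape_vec[n:]])  # n x 2 point array
    centroid = pts.mean(axis=0)
    size = np.sqrt(((pts - centroid) ** 2).sum(axis=1).mean())
    return centroid, size
```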

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the 1st shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
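Algorithm 1 can be sketched as follows; this version aligns translation and scale only and omits the rotation alignment of the full procedure, so it is a simplified illustration rather than a complete Procrustes alignment:

```python
import numpy as np

def align_shapes(shapes, iters=10, tol=1e-8):
    """Simplified sketch of Algorithm 1 (translation and scale only).

    shapes: (m, n, 2) array - m shapes of n 2-D points each.
    """
    centered = shapes - shapes.mean(axis=1, keepdims=True)   # step 2: center
    unit = lambda s: s / np.linalg.norm(s)                   # scale to unit size
    mean = unit(centered[0])                                 # step 3: initial mean
    for _ in range(iters):                                   # step 4
        # (a) align each shape to the mean with its least-squares scale
        scales = (centered * mean).sum(axis=(1, 2), keepdims=True)
        aligned = centered * scales / (centered ** 2).sum(axis=(1, 2), keepdims=True)
        # (b) + (c) recompute the mean and constrain it to unit size
        new_mean = unit(aligned.mean(axis=0))
        if np.linalg.norm(new_mean - mean) < tol:            # step 5: converged
            break
        mean = new_mean
    return aligned, mean
```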

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and then those images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around the landmark points. So, during training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits closely to the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result remains an allowable variation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. So both models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual profiles into an allowable shape. So it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated with the permissible variations in it [24]:

x̂ = x̄ + P b    (4.3)

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), P is the matrix of eigenvectors of the covariance of the training shapes, and b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
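A linear shape model of the form x̂ = x̄ + P b can be built from aligned training shapes with a standard PCA; the function names are ours, and the number of retained modes is an assumption:

```python
import numpy as np

def build_shape_model(shapes, num_modes=2):
    """Build a linear shape model x_hat = x_bar + P b from aligned shapes.

    shapes: (m, 2n) array of aligned training shape vectors.
    Returns the mean shape and the first num_modes principal eigenvectors P.
    """
    x_bar = shapes.mean(axis=0)
    cov = np.cov(shapes - x_bar, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # eigh: ascending eigenvalues
    P = vecs[:, ::-1][:, :num_modes]          # largest-variance modes first
    return x_bar, P

def generate_shape(x_bar, P, b):
    """Generate a new shape from the parameter vector b."""
    return x_bar + P @ b
```

Varying each component of b within a few standard deviations of its mode's variance produces the allowable height/width variations discussed above.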

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix S_g.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model assures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. So the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (of a general picture, not a bone); the sizes of the images are given relative to the first image.
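Scoring candidate profiles against the mean profile with the Mahalanobis distance can be sketched as follows (the function names are ours):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """Mahalanobis distance between a sampled profile g and the mean
    profile g_bar with profile covariance S_g, used to score candidate
    landmark positions along a whisker."""
    diff = np.asarray(g, float) - np.asarray(g_bar, float)
    return float(diff @ np.linalg.solve(S_g, diff))

def best_offset(profiles, g_bar, S_g):
    """Among profiles sampled at different offsets along the whisker,
    return the index of the one closest to the mean profile."""
    return int(np.argmin([mahalanobis(g, g_bar, S_g) for g in profiles]))
```

In a full ASM search this scoring is repeated for every landmark and at every level of the image pyramid.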


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles. So the position of landmark points is as important as the number of landmark points. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started somewhere close to the bone boundary, in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, which means that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in the regions away from the bone. So it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


femur fractures, as the original Active Contour Model is susceptible to noise and other undesired edges. This method successfully extracts the femur contour with a small restriction on the shape, size and orientation of the image.

Active Shape Models, introduced by Cootes and Taylor [9], are another widely used statistical model for image segmentation. Cootes and Taylor and their colleagues [5, 6, 7, 11, 12, 10] released a series of papers that completed the definition of the original ASMs, also called classical ASMs by [24], by modifying them. These papers investigated the performance of the model with gray-level variation and different resolutions, and made the model more flexible and adaptable. ASMs are used by [24] to detect facial features. Some modifications to the original model were suggested and experimented with. The relationships between landmark points, computing time and the number of images in the training data were observed for different sets of data. The results in this thesis are compared to the results in [24]. The work done in this thesis is similar to [24], as the same model is used for a different application.

[18] and [1] analyzed the performance of ASMs using the aspects of the definition of the shape and the gray-level analysis of grayscale images. The data used was facial data from a face database, and it was concluded that ASMs are an accurate way of modeling the shape and gray-level appearance. It was observed that the model allows for flexibility while being constrained on the shape of the object to be segmented. This is relevant for the problem of bone segmentation, as X-rays are grayscale and the structure and shape of bones can differ slightly. The flexibility of the model will be useful for separating bones from X-rays, even though one tibia bone differs from another tibia bone.

The working mechanisms of the methods discussed above are explained in detail in

3.1.2 Common Limitations of the Previous Research

As mentioned in previous chapters, bone segmentation and fracture detection are both complicated problems. There are many limitations and problems in the segmentation methods used. Some methods and models are too limited or constrained to match the bone accurately. Accuracy of results and computing time are conflicting variables.

It is observed in [14] that there is no automatic method of segmenting bones. [14] also recognizes the need for good initial conditions for Active Contour Models to produce a good segmentation of bones from X-rays. If the initial conditions are not good, the final results will be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations. The manual and automatic approaches are tried using Active Shape Models. The relationship between the size of the training set, computation time and error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which includes other methods like ridge detection, blob detection, interest point detection and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions in an image where the image intensity changes, formally known as discontinuities [15]. The pixels, or set of pixels, that form the edge are generally of the same or close to the same intensities. Edge detection can be used to segment images with respect to these edges and display the edges separately [26][15]. Edge detection can be used in separating tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2D image is a 2D vector with the partial horizontal and vertical derivatives as its components. The gradient vector can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. It is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
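The Sobel computation described above can be sketched in a few lines. This is an illustrative Python/NumPy version (not the thesis code, which uses MATLAB); the 3×3 kernels are the standard Sobel masks.

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_gradients(image):
    """Approximate Dx and Dy with the standard 3x3 Sobel kernels, then
    return the gradient magnitude and angle at each pixel."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # horizontal derivative Dx
    ky = kx.T                                  # vertical derivative Dy
    dx = convolve(image.astype(float), kx)
    dy = convolve(image.astype(float), ky)
    magnitude = np.hypot(dx, dy)               # sqrt(Dx^2 + Dy^2)
    angle = np.arctan2(dy, dx)                 # gradient direction
    return magnitude, angle

# A vertical step edge: the gradient is zero in the flat regions and
# points across the edge where the intensity changes.
img = np.zeros((5, 5))
img[:, 3:] = 1.0
mag, ang = sobel_gradients(img)
```

In the flat regions the operator returns a zero vector, as stated above, while the magnitude peaks along the step.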

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector in that it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt are different from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weighting to the current pixel while calculating the directional derivative at that point [15][26]. This is the reason why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt; the same variables as in the Sobel case are used. Only the kernels used to calculate the directional derivatives are different.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the square root of the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the gradient magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy. But this method is still used, as it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.
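A minimal sketch of the Roberts cross in Python/NumPy (illustrative, not the thesis code): the two 2×2 kernels reduce to diagonal differences, which are combined into a magnitude.

```python
import numpy as np

def roberts_magnitude(image):
    """Roberts cross: combine the two diagonal differences (the 2x2
    kernels) into a gradient magnitude. Output is (H-1, W-1) because
    each response needs a full 2x2 neighbourhood."""
    a = image.astype(float)
    g1 = a[:-1, :-1] - a[1:, 1:]   # main-diagonal difference
    g2 = a[:-1, 1:] - a[1:, :-1]   # anti-diagonal difference
    return np.hypot(g1, g2)

# A horizontal step edge responds along the boundary only.
img = np.zeros((4, 4))
img[2:, :] = 1.0
mag = roberts_magnitude(img)
```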

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient so that pixels that are not local maxima along the gradient direction are suppressed. A multi-level thresholding technique, like the example in 2.4 involving two levels, is then used on the data. If a pixel value is less than the lower threshold, it is set to 0, and if it is greater than the higher threshold, it is set to 1. If a pixel falls in between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
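The double-threshold step described above can be sketched as follows (Python/SciPy, illustrative): the connectivity test to a high-value pixel is done here with connected-component labelling rather than explicit region growing, and the threshold values in the demonstration are arbitrary.

```python
import numpy as np
from scipy import ndimage

def hysteresis_threshold(grad, low, high):
    """Canny-style double thresholding: keep pixels above `high`, plus
    pixels above `low` that are 8-connected to a pixel above `high`."""
    strong = grad > high
    weak = grad > low                      # includes all strong pixels
    # Label 8-connected components of the weak mask and keep only the
    # components that contain at least one strong pixel.
    labels, n = ndimage.label(weak, structure=np.ones((3, 3)))
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False                        # background label
    return keep[labels]

# The weak pixel adjacent to a strong one survives; the isolated weak
# pixel in the corner is discarded.
grad = np.array([[0.9, 0.4, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.4]])
edges = hysteresis_threshold(grad, low=0.3, high=0.8)
```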

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it, quantifying the visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image like roughness or smoothness can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
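The two texture measures named above can be sketched directly with SciPy's sliding-window filters (an illustrative Python stand-in for the MATLAB filters used in the thesis; the window size of 3 is an assumption):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter, generic_filter

def range_filter(image, size=3):
    """Local range: max minus min intensity in each size-by-size window.
    Textured areas (bone) score high; smooth areas (mesh) score low."""
    a = image.astype(float)
    return maximum_filter(a, size) - minimum_filter(a, size)

def std_filter(image, size=3):
    """Local standard deviation over the same sliding window."""
    return generic_filter(image.astype(float), np.std, size)

# A single bright pixel in a flat image: high local range near it,
# zero range in the untextured surroundings.
img = np.zeros((5, 5))
img[2, 2] = 10.0
r = range_filter(img)
s = std_filter(img)
```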

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)   Ixy(x, σD)
             Ixy(x, σD)   Iyy(x, σD) ]     (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19] and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)     (2)

or

P(x) = min(λ2(x), 0)     (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2..6. This set of images spans the first octave, consisting of six images, I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
  ...
MPn2  MPn3  MPn4  MPn5     (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
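The per-pixel principal curvature computation (Eq. 2) can be sketched as follows. This is an illustrative Python/SciPy version, with the Hessian built from Gaussian derivative filters and the largest eigenvalue of the symmetric 2×2 matrix taken in closed form (the single Jacobi rotation mentioned above):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature(image, sigma=2.0):
    """P(x) = max(lambda1(x), 0): the largest Hessian eigenvalue at each
    pixel, with second derivatives taken at Gaussian scale sigma."""
    a = image.astype(float)
    Ixx = gaussian_filter(a, sigma, order=(0, 2))  # d2/dx2 (axis 1 = x)
    Iyy = gaussian_filter(a, sigma, order=(2, 0))  # d2/dy2 (axis 0 = y)
    Ixy = gaussian_filter(a, sigma, order=(1, 1))
    # Largest eigenvalue of [[Ixx, Ixy], [Ixy, Iyy]] in closed form.
    half_trace = 0.5 * (Ixx + Iyy)
    root = np.sqrt(0.25 * (Ixx - Iyy) ** 2 + Ixy ** 2)
    return np.maximum(half_trace + root, 0.0)

# A dark horizontal line on a light background: high response on the
# line (Eq. 2), near-zero response in the flat regions.
img = np.ones((21, 21))
img[10, :] = 0.0
P = principal_curvature(img, sigma=2.0)
```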

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the oversegmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
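The cleaning-and-segmentation stage can be sketched as below (Python/SciPy, illustrative). Two simplifications are assumed: a plain threshold stands in for the eigenvector-guided hysteresis, and, since on a binary image the 0-valued pixels become catchment basins, the watershed reduces to labelling the connected 0-valued regions between ridge pixels. The 0.04 high threshold and 0.2 ratio are the values quoted in the text.

```python
import numpy as np
from scipy import ndimage

def clean_and_segment(mp, high=0.04, low_ratio=0.2):
    """Grayscale closing with a 5x5 disk (f . b), a plain threshold in
    place of the eigenvector-guided hysteresis, then catchment basins
    as connected 0-valued regions of the binarised ridge image."""
    disk = np.array([[0, 1, 1, 1, 0],
                     [1, 1, 1, 1, 1],
                     [1, 1, 1, 1, 1],
                     [1, 1, 1, 1, 1],
                     [0, 1, 1, 1, 0]], dtype=bool)
    closed = ndimage.grey_closing(mp, footprint=disk)  # fills "potholes"
    ridges = closed > high * low_ratio                 # binarised ridges
    basins, n = ndimage.label(~ridges)                 # watershed regions
    return basins, n

# A single vertical ridge splits the image into two catchment basins.
mp = np.zeros((9, 9))
mp[:, 4] = 0.05
basins, n = clean_and_segment(mp)
```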

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering the layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated whenever there are more than two layers remaining.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by a mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M way.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N}  T[L_p ≠ L_q] / |e_pq|     (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, λ is the regularization weight, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After the spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). Then we iterate the E and M steps until convergence or a predefined number of iterations is reached.
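The E-M loop above can be sketched as follows. This is an illustrative Python/NumPy version with several simplifications that are not from the paper: the E-step uses an ICM-style local update (appearance cost plus a Potts penalty over the 4-neighbourhood) as a stand-in for the graph-cut solver, the edge-length weight is dropped, the neighbourhood wraps at the image border, and the initial centers come from a simple farthest-point heuristic.

```python
import numpy as np

def em_segment(features, k, lam=1.0, iters=5):
    """E-M layer segmentation sketch: E-step labels each pixel by
    appearance cost + Potts smoothness against current neighbour labels
    (ICM stand-in for graph cut); M-step re-estimates cluster means."""
    h, w, d = features.shape
    flat = features.reshape(-1, d)
    # Farthest-point initialisation of the k mean chromatic vectors.
    centers = [flat[0]]
    for _ in range(1, k):
        dist = np.min([np.square(flat - c).sum(1) for c in centers], axis=0)
        centers.append(flat[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    labels = np.argmin(((features[:, :, None, :] - centers) ** 2).sum(-1), -1)
    for _ in range(iters):
        # E-step: data term ||f_p - c_L||^2 plus lam * (#neighbours != L).
        data_cost = ((features[:, :, None, :] - centers) ** 2).sum(-1)
        smooth = np.zeros_like(data_cost)
        for c in range(k):
            diff = (labels != c).astype(float)
            smooth[..., c] = (np.roll(diff, 1, 0) + np.roll(diff, -1, 0) +
                              np.roll(diff, 1, 1) + np.roll(diff, -1, 1))
        labels = np.argmin(data_cost + lam * smooth, axis=-1)
        # M-step: mean chromatic vector of each cluster.
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers

# Two flat colour regions separate into two spatially coherent clusters.
features = np.zeros((6, 6, 3))
features[:, 3:] = 1.0
labels, centers = em_segment(features, k=2, lam=1.0, iters=3)
```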

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Therefore, such methods are well suited for inpainting images with no or very few textures, due to the fact that level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.7 that formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, in this approach curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B Method

The stages that have been followed while doing lung segmentation from CTA images in this work are shown in Figure 1. There are 250 CTA images at hand, in 2D form. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels that are in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is carried out first, keeping the parts whose values are greater than 700 HU. At the end of thresholding, the new images are in logical (binary) form:

Thresh = image > 700

In each of these new images, sub-segment vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: firstly, each of the 2D images has been considered one by one, and each of the components in the image has been labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 were removed from the image (Figure 3). Next, the image in Figure 3 has been labeled again with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component has been kept and the other parts have been removed from the image, and then its complement has been taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4). Since the parts outside the body in the image shown in Figure 4 reach the first or 512th pixel (the image border), the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Due to the fact that the airway in Figure 5 is very small compared to the lung, each of the images has been labeled with the connected component labeling algorithm, and the components whose pixel counts are below 1000 have been determined to be airways and removed from the image. The final image at hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained (Figure 6(c)).
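The pipeline described in this section can be sketched end-to-end for one 2-D slice (Python/SciPy, illustrative; the thesis uses MATLAB). The border test stands in for the "first or 512th pixel" check, and the synthetic slice and size thresholds below are made up for the demonstration:

```python
import numpy as np
from scipy import ndimage

def remove_small(mask, min_size):
    """Drop connected components with fewer than min_size pixels."""
    labels, n = ndimage.label(mask)
    sizes = np.bincount(labels.ravel(), minlength=n + 1)
    keep = sizes >= min_size
    keep[0] = False
    return keep[labels]

def largest_component(mask):
    """Keep only the biggest connected component (the patient's body)."""
    labels, n = ndimage.label(mask)
    if n == 0:
        return mask
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0
    return labels == sizes.argmax()

def segment_lungs(slice_hu, body_thresh=700, min_vessel=1000, min_airway=1000):
    mask = slice_hu > body_thresh            # step 1: threshold at 700 HU
    mask = remove_small(mask, min_vessel)    # step 2: drop small vessels
    body = largest_component(mask)           # step 3: biggest piece = body
    inv = ~body                              # step 4: complement (0 <-> 1)
    # step 5: remove components touching the image border (outside air)
    labels, _ = ndimage.label(inv)
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    inside = inv & ~np.isin(labels, border)
    return remove_small(inside, min_airway)  # step 6: drop small airways

# Synthetic slice: a 1000 HU "body" block with a 0 HU "lung" hole.
slice_hu = np.zeros((40, 40))
slice_hu[5:35, 5:35] = 1000.0
slice_hu[12:28, 12:28] = 0.0
lungs = segment_lungs(slice_hu, body_thresh=700, min_vessel=50, min_airway=50)
```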

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a 'voxel' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object, or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of an X-ray CT brain scan generated with histogram values, with lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM models work in the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((y2 − y1)² + (x2 − x1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
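The 2n × 1 shape representation, the centroid, and the size defined above can be sketched in a few lines (a minimal sketch; the unit square is an illustrative shape):

```python
import numpy as np

def shape_to_points(shape_vec):
    """Convert a 2n-by-1 shape vector (x co-ordinates first, then y) to n-by-2 points."""
    n = len(shape_vec) // 2
    return np.column_stack([shape_vec[:n], shape_vec[n:]])

def centroid(shape_vec):
    """Centroid: the mean of the point positions."""
    return shape_to_points(shape_vec).mean(axis=0)

def shape_size(shape_vec):
    """Size: root mean square distance from the points to the centroid."""
    pts = shape_to_points(shape_vec)
    return np.sqrt(np.mean(np.sum((pts - pts.mean(axis=0)) ** 2, axis=1)))

# A unit square as a 2n-by-1 shape vector: x co-ordinates, then y.
square = np.array([0, 1, 1, 0, 0, 0, 1, 1], dtype=float)
```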

Algorithm 1 Aligning shapes

Input: a set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape
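Algorithm 1 can be sketched as follows. This is a simplified version that aligns shapes by translation and scale only; the full ASM alignment also solves for rotation (e.g., by Procrustes analysis):

```python
import numpy as np

def center(shape):
    """Step 2: translate an n-by-2 shape so its centroid is at the origin."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Scale a centered shape to unit RMS size."""
    size = np.sqrt(np.mean(np.sum(shape ** 2, axis=1)))
    return shape / size

def align_shapes(shapes, max_iters=100, tol=1e-8):
    """Algorithm 1: iteratively align a list of n-by-2 shapes to their mean."""
    shapes = [center(s) for s in shapes]           # step 2
    mean = to_unit_size(shapes[0])                 # steps 1 and 3
    for _ in range(max_iters):                     # step 4
        # (a) Align each shape to the mean (least-squares scale fit).
        aligned = [s * (np.sum(s * mean) / np.sum(s * s)) for s in shapes]
        # (b) Recalculate the mean shape from the aligned shapes.
        new_mean = np.mean(aligned, axis=0)
        # (c) Constrain the mean shape: center it and scale to unit size.
        new_mean = to_unit_size(center(new_mean))
        if np.linalg.norm(new_mean - mean) < tol:  # step 5: convergence
            return aligned, new_mean
        mean = new_mean
    return aligned, mean
```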

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. Training was done by manually selecting landmarks on the images. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it. When searching for the shape in a test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (4.3)

where

x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points. The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
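Shape generation from the model of Equation 4.3 can be sketched with PCA over aligned training shape vectors (a minimal sketch; the jittered squares are hypothetical training data, and the number of retained modes is illustrative):

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """Build a PCA shape model from aligned 2n-by-1 shape vectors (one per row)."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)                          # x_bar, the mean shape
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_modes]    # largest modes first
    return mean, eigvecs[:, order], eigvals[order]

def generate_shape(mean, phi, b):
    """Equation 4.3: x_hat = x_bar + Phi b."""
    return mean + phi @ b

rng = np.random.default_rng(0)
base = np.array([0, 1, 1, 0, 0, 0, 1, 1], dtype=float)  # unit square
# Hypothetical training set: squares with jittered landmark points.
training = [base + 0.05 * rng.standard_normal(8) for _ in range(20)]
mean, phi, lam = build_shape_model(training)
new_shape = generate_shape(mean, phi, np.array([0.1, -0.1]))
```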

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ).

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape stays close to the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results while the resulting shape could be completely different. As mentioned before, the two models restrict each other.

A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is used for illustration.
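The profile search with the Mahalanobis distance can be sketched as follows (a minimal sketch; the profile length, offsets, and covariance below are illustrative toy values):

```python
import numpy as np

def mahalanobis(g, g_mean, S_inv):
    """Mahalanobis distance (g - g_mean)^T S^-1 (g - g_mean)."""
    d = g - g_mean
    return float(d @ S_inv @ d)

def best_offset(profiles, g_mean, S):
    """Pick the whisker offset whose sampled profile is closest to the mean profile.

    `profiles` maps candidate offsets along the whisker (e.g. -3..+3 pixels)
    to the profile sampled at that offset."""
    S_inv = np.linalg.inv(S)
    return min(profiles, key=lambda k: mahalanobis(profiles[k], g_mean, S_inv))

# Toy example: a 3-sample mean profile and an identity covariance.
g_mean = np.array([0.0, 1.0, 0.0])
S = np.eye(3)
profiles = {
    -1: np.array([0.5, 0.5, 0.5]),
     0: np.array([0.1, 0.9, 0.1]),   # closest to the mean profile
    +1: np.array([1.0, 0.0, 1.0]),
}
```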


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images within the test image. It creates a mean shape profile from all the training images using the landmark points. The ASM starts where the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


be inaccurate. Manual definition of the initial conditions, such as the scaling or orientation of the contour, is needed, so the process is not automatic. [14] tries to detect fractures in long shaft bones using Computer Aided Design (CAD) techniques.

The tradeoff between automating the algorithm and the accuracy of the results using the Active Shape and Active Contour Models is examined in [31]. If the model is made fully automatic by estimating the initial conditions, the accuracy will be lower than when the initial conditions of the model are defined by user inputs. [31] implements both manual and automatic approaches, and identifies that automatically segmenting bone structures from noisy X-ray images is a complex problem. This thesis project tackles these limitations: the manual and automatic approaches are tried using Active Shape Models, and the relationship between the size of the training set, the computation time, and the error is studied.

3.2 Edge Detection

Edge detection falls under the category of feature detection in images, which also includes methods like ridge detection, blob detection, interest point detection, and scale-space models. In digital imaging, edges are defined as a set of connected pixels that lie on the boundary between two regions of an image where the image intensity changes, formally known as discontinuities [15]. The pixels that form an edge are generally of the same, or close to the same, intensity. Edge detection can be used to segment images with respect to these edges and to display the edges separately [26][15]. Edge detection can be used to separate tibia bones from X-rays, as bones have strong boundaries or edges. Figure 3.1 is an example of basic edge detection in images.

3.2.1 Sobel Edge Detector

The Sobel operator calculates the gradient of the image intensity at each pixel. The gradient of a 2-D image is a 2-D vector with the partial horizontal and vertical derivatives as its components; it can also be expressed as a magnitude and an angle. If Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.1 and 3.2 show the magnitude and angle (direction) representation of the gradient vector ∇D. The gradient is a measure of the rate of change in an image, from light to dark pixels in the case of grayscale images, at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction, that approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2-D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]; the matrices are representations of the convolution kernels used.
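A minimal sketch of Sobel gradient computation (the kernels below are the standard Sobel masks; the step-edge image is an illustrative stand-in for an X-ray):

```python
import numpy as np
from scipy.ndimage import convolve

# Standard Sobel kernels for the x and y derivatives.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_gradient(image):
    """Return gradient magnitude and direction (radians) of a grayscale image."""
    dx = convolve(image, KX)
    dy = convolve(image, KY)
    magnitude = np.hypot(dx, dy)      # Eq. 3.1: sqrt(Dx^2 + Dy^2)
    direction = np.arctan2(dy, dx)    # Eq. 3.2: atan2(Dy, Dx)
    return magnitude, direction

# A vertical step edge: dark on the left, bright on the right.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
magnitude, direction = sobel_gradient(image)
```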

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector: it also approximates the derivatives using convolution kernels to find the localized orientation at each pixel in an image. The convolution kernels used by Prewitt differ from those of Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel when calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle column where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels used to calculate the directional derivatives differ.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.

3.2.4 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique, as it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. The gradient of the image is then calculated, as in other filters like Sobel and Prewitt. Non-maximal suppression is applied after the gradient, so that pixels below a certain threshold are suppressed. A multi-level thresholding technique, similar to the two-level example in Section 2.4, is then applied to the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
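The double-threshold linking step described above can be sketched as follows (a minimal sketch of only the hysteresis stage, not the full Canny pipeline; the gradient array and thresholds are illustrative):

```python
import numpy as np
from scipy.ndimage import label

def hysteresis_threshold(gradient, low, high):
    """Keep weak pixels (>= low) only if 8-connected to a strong pixel (>= high)."""
    weak = gradient >= low
    strong = gradient >= high
    # Label 8-connected components of the weak mask ...
    structure = np.ones((3, 3), dtype=int)
    labels, n = label(weak, structure=structure)
    # ... and keep only the components that contain at least one strong pixel.
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False  # background label
    return keep[labels]

# Toy gradient magnitudes: one strong pixel with weak neighbors.
g = np.array([[0.0, 0.3, 0.9, 0.3, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.3]])
edges = hysteresis_threshold(g, low=0.2, high=0.8)
```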

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of the image to analyze it, quantifying visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image, like roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
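Range filtering and standard-deviation filtering (cf. MATLAB's rangefilt and stdfilt) can be sketched as follows (a minimal sketch over a 3×3 neighborhood; the flat and checkerboard test images are illustrative):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter, uniform_filter

def range_filter(image, size=3):
    """Local range: max minus min over each size-by-size neighborhood."""
    return maximum_filter(image, size) - minimum_filter(image, size)

def std_filter(image, size=3):
    """Local standard deviation over each size-by-size neighborhood."""
    mean = uniform_filter(image, size)
    mean_sq = uniform_filter(image ** 2, size)
    # Clamp tiny negative values from floating-point round-off.
    return np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))

flat = np.ones((5, 5))                           # no texture
textured = np.indices((5, 5)).sum(axis=0) % 2    # checkerboard: high texture
```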

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |        (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background.

Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2..6. This set of images spans the first octave, consisting of six images I11 to I16. Image I14 is down-sampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale.

Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
  ⋮
MPn2  MPn3  MPn4  MPn5        (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
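The principal curvature image of Eq. 2 can be sketched using the closed-form maximum eigenvalue of the 2×2 Hessian at each pixel (a minimal sketch; the Gaussian scale σ and the dark-line test image are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvature_image(image, sigma=2.0):
    """Eq. 2: P(x) = max(lambda1(x), 0), lambda1 the max Hessian eigenvalue."""
    smoothed = gaussian_filter(image, sigma)
    # Second-order partial derivatives of the smoothed image.
    Ixx = np.gradient(np.gradient(smoothed, axis=1), axis=1)
    Iyy = np.gradient(np.gradient(smoothed, axis=0), axis=0)
    Ixy = np.gradient(np.gradient(smoothed, axis=1), axis=0)
    # Closed-form maximum eigenvalue of the symmetric 2x2 Hessian.
    trace_half = (Ixx + Iyy) / 2.0
    disc = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    lambda1 = trace_half + disc
    return np.maximum(lambda1, 0.0)

# A dark horizontal line on a light background gives a strong response.
img = np.ones((32, 32))
img[16, :] = 0.0
P = principal_curvature_image(img)
```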

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image: small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding, both of which can still miss weak image structures, we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations.

Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions

(Fig. 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
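The cleaning-plus-watershed pipeline can be sketched with scipy.ndimage (a minimal sketch that uses plain hysteresis thresholding in place of the eigenvector-guided variant; the synthetic ridge image and the structuring-element size are illustrative):

```python
import numpy as np
from scipy import ndimage

def pcbr_style_regions(mp, high=0.04, ratio=0.2):
    """Clean a principal curvature image, then run a watershed on the binarized result."""
    # Grayscale morphological closing: dilation followed by erosion.
    cleaned = ndimage.grey_closing(mp, size=(5, 5))
    # Plain hysteresis thresholding (PCBR uses an eigenvector-guided low threshold):
    # weak pixels survive only if 8-connected to a strong seed pixel.
    weak = cleaned >= high * ratio
    strong = cleaned >= high
    lbl, n = ndimage.label(weak, structure=np.ones((3, 3), dtype=int))
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(lbl[strong])] = True
    keep[0] = False
    binary = keep[lbl]
    # Watershed: 0-valued (background) pixels become catchment basins.
    markers, _ = ndimage.label(~binary)
    return ndimage.watershed_ift((binary * 255).astype(np.uint8), markers)

rng = np.random.default_rng(1)
mp = 0.001 * rng.random((64, 64))  # weak noise floor
mp[:, 32] = 0.05                   # one strong vertical ridge
regions = pcbr_style_regions(mp)
```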

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.
2. Identify the current top layer.
3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting have similar colors, regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its cluster, and each cluster can be described by its mean chromatic feature vector. The top layer is then identified by human experts based on visual occlusion cues. Ideally this step should be fully automatic, but this challenge is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by combining k-means with a spatial-coherence regularity in an iterative E-M fashion.10, 11 We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N}  δ[L_p ≠ L_q] / |e_pq|        (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and δ[·] is the delta function, equal to 1 when its argument holds and 0 otherwise.

The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm.12 The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
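The E-M loop above can be sketched as follows. Note the assumption: a few sweeps of ICM (iterated conditional modes) stand in for the graph-cut solver of the paper, and `lam` plays the role of the smoothness weight in Eq. (1):

```python
import numpy as np

def em_segment(features, centers, lam=1.0, n_iter=5):
    """E-M layer segmentation sketch for an H x W x 3 feature image.

    centers: float array of shape (k, 3), one mean chromatic vector per layer.
    The E-step uses ICM sweeps as a simplified stand-in for graph cuts."""
    h, w, _ = features.shape
    k = len(centers)
    for _ in range(n_iter):
        # E-step: unary costs = squared distance to each cluster center
        d = ((features[..., None, :] - centers[None, None]) ** 2).sum(-1)
        labels = d.argmin(-1)
        for _ in range(3):  # ICM sweeps approximating the Potts smoothness term
            for dy, dx in ((1, 0), (0, 1)):
                nb = np.roll(labels, (dy, dx), axis=(0, 1))
                cost = d + lam * (np.arange(k)[None, None] != nb[..., None])
                labels = cost.argmin(-1)
        # M-step: re-estimate each center as the mean feature of its pixels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(0)
    return labels, centers
```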

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines.5, 7 Here level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al.3) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al.,7 which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
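The per-channel treatment can be sketched as follows. Note the assumption: a simple harmonic (Laplace-equation) fill replaces the linear-program formulation of Schoenemann et al.; what it shares with the paper's approach is that each chromatic channel is solved independently and the boundary intensities act as constraints:

```python
import numpy as np

def inpaint_channel(channel, mask, n_iter=500):
    """Fill masked pixels of one chromatic channel by harmonic inpainting.

    Jacobi iteration on the Laplace equation: each damaged pixel is
    repeatedly replaced by the average of its 4 neighbors, while the
    undamaged pixels stay fixed and act as boundary conditions."""
    out = channel.astype(float).copy()
    for _ in range(n_iter):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]  # update only the damaged region
    return out

def inpaint_rgb(image, mask):
    # Solve each chromatic channel independently, as in the text
    return np.dstack([inpaint_channel(image[..., c], mask) for c in range(3)])
```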

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT scanner (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent is delivered at a rate of 4 ml/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in a mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B Method

The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.

The data at hand consists of 250 2D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are binary (logical):

Thresh = image > 700

In each of these new images, subsegmental vessels remain in the lung region. At the second step, the following method was used to remove these vessels: first, each 2D image was considered one by one, and each component in the image was labeled with the connected-component labeling algorithm. Then, looking at the size of each labeled piece, items with fewer than 1000 pixels were removed from the image (Figure 3).

Next, the image in Figure 3 was labeled with the connected-component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component was kept, the other parts were removed from the image, and the complement of the result was taken, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch the first or 512th pixel column, the parts satisfying this condition were removed, leaving the lung and airways as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared with the lung, each image was labeled with the connected-component labeling algorithm, and the components with fewer than 1000 pixels were identified as airways and removed from the image. The resulting image is the segmented target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the boundaries of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image was obtained (Figure 6(c)).
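The thresholding and component-filtering steps above can be sketched as follows on one 2D slice. The 700 HU threshold and the 1000-pixel size cutoff are taken from the text; the use of scipy's labeling in place of the unspecified connected-component implementation is an assumption:

```python
import numpy as np
from scipy import ndimage

def segment_lung_slice(slice_hu, thresh=700, min_size=1000):
    """Rough sketch of the described pipeline for one 2D CTA slice."""
    body = slice_hu > thresh                      # Thresh = image > 700
    # Keep only the largest connected component (the patient's body)
    labels, n = ndimage.label(body)
    if n == 0:
        return np.zeros_like(body)
    sizes = ndimage.sum(body, labels, range(1, n + 1))
    body = labels == (1 + sizes.argmax())
    # Complement: lungs and outside-of-body air become foreground
    inv = ~body
    # Drop regions touching the image border (air outside the body)
    labels, n = ndimage.label(inv)
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    lung = inv & ~np.isin(labels, border)
    # Remove components below min_size pixels (vessels / airways)
    labels, n = ndimage.label(lung)
    sizes = ndimage.sum(lung, labels, range(1, n + 1))
    keep = 1 + np.flatnonzero(sizes >= min_size)
    return np.isin(labels, keep)
```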

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
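The meshgrid behavior described above can be illustrated in NumPy, whose `meshgrid` mirrors the MATLAB function:

```python
import numpy as np

# Two coordinate vectors spanning the domain
x = np.array([1, 2, 3])
y = np.array([10, 20])

# X repeats x along the rows; Y repeats y along the columns
X, Y = np.meshgrid(x, y)
# X = [[1, 2, 3],        Y = [[10, 10, 10],
#      [1, 2, 3]]             [20, 20, 20]]

# Evaluate a function of two variables over the whole grid at once
Z = X + Y  # Z[i, j] = x[j] + y[i]
```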

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

'Image' creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
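The alignment loop above can be sketched as follows. Note the assumption: only translation and scaling are handled; the rotation step of full Procrustes alignment is omitted for brevity:

```python
import numpy as np

def center(shape):
    """Translate an n x 2 shape so its centroid is at the origin."""
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    """Scale a centered shape to unit RMS size."""
    size = np.sqrt((shape ** 2).sum(axis=1).mean())
    return shape / size

def align_shapes(shapes, n_iter=10):
    """Algorithm 1: iteratively align shapes to an evolving mean shape."""
    shapes = [center(s.astype(float)) for s in shapes]
    mean = to_unit_size(shapes[0])          # reference shape x0
    for _ in range(n_iter):
        aligned = []
        for s in shapes:
            # Least-squares scale of s onto the current mean shape
            scale = (s * mean).sum() / (s * s).sum()
            aligned.append(s * scale)
        # Recalculate and constrain the mean shape
        new_mean = to_unit_size(center(np.mean(aligned, axis=0)))
        if np.allclose(new_mean, mean):     # convergence check
            break
        mean = new_mean
    return aligned, mean
```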

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. Training was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. So while the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations about it [24]:

x̂ = x̄ + Φb        (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes xi,

Φ is the matrix of eigenvectors of the covariance of the aligned shapes, and b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.
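The shape model x̂ = x̄ + Φb can be sketched as a plain PCA over aligned shape vectors; the number of retained modes t is an assumption for illustration:

```python
import numpy as np

def build_shape_model(shapes, t=2):
    """Build a PCA shape model from aligned 2n-element shape vectors.

    shapes: array-like of shape (m, 2n), one aligned training shape per row.
    Returns the mean shape and the first t eigenvector modes (Phi)."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    Phi = eigvecs[:, order[:t]]                # modes of variation
    return mean, Phi

def generate_shape(mean, Phi, b):
    """x_hat = x_bar + Phi b: generate a new shape from parameters b."""
    return mean + Phi @ b
```

Varying the entries of `b` (within limits learned from the training set) then produces the family of allowable shapes described above.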

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
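This profile comparison can be sketched as follows; the profiles and covariance here are synthetic stand-ins, whereas the real model would use sampled image gradients along each whisker:

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """Squared Mahalanobis distance between a profile and the mean profile."""
    diff = g - g_mean
    return float(diff @ np.linalg.inv(S_g) @ diff)

# Mean profile and covariance learned from (synthetic) training profiles
g_mean = np.array([0.0, 1.0, 0.0])
S_g = np.diag([1.0, 0.25, 1.0])   # tighter variance on the middle sample

# The candidate profile with the lowest Mahalanobis distance wins
candidates = [np.array([0.0, 0.5, 0.0]), np.array([0.5, 1.0, 0.0])]
best = min(candidates, key=lambda g: mahalanobis(g, g_mean, S_g))
```

Note how the covariance weighting matters: the first candidate is closer in the Euclidean sense, but its deviation falls on the low-variance middle sample, so the second candidate is preferred.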

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is performed for every landmark point, and then the shape model confirms that the shape is still consistent with the mean shape; that is, the shape model assures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock onto the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks onto the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock onto the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


pixel (in the case of grayscale images) at every point. At each point in the image, the direction of the gradient vector shows the direction of the largest increase in the intensity of the image, while the magnitude of the gradient vector denotes the rate of change in that direction [15][26]. This implies that the result of the Sobel operator at an image point in a region of constant image intensity is a zero vector, and at a point on an edge it is a vector that points across the edge from darker to brighter values. Mathematically, Sobel edge detection is implemented using two 3×3 convolution masks, or kernels, one for the horizontal direction and one for the vertical direction of the image, which approximate the derivatives in those directions. The derivatives in the x and y directions are calculated by 2D convolution of the original image with the convolution masks. If A is the original image and Dx and Dy are the derivatives in the x and y directions respectively, Equations 3.3 and 3.4 show how the directional derivatives are calculated [26]. The matrices are a representation of the convolution kernels that are used.
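Sobel filtering with the standard kernels can be sketched as follows; scipy's 2D convolution is used here as a stand-in for the MATLAB implementation the text describes:

```python
import numpy as np
from scipy.signal import convolve2d

# Standard Sobel kernels for the horizontal (x) and vertical (y) derivatives;
# note the weight of 2 on the center row/column, which Prewitt replaces with 1
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]])
KY = KX.T

def sobel_gradient(image):
    """Return the gradient magnitude of a 2D grayscale image."""
    dx = convolve2d(image, KX, mode="same", boundary="symm")
    dy = convolve2d(image, KY, mode="same", boundary="symm")
    return np.hypot(dx, dy)
```

On a region of constant intensity the response is exactly zero (the kernel weights sum to zero), while a step edge produces a strong response across it, as stated above.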

3.2.2 Prewitt Edge Detector

The Prewitt edge detector is similar to the Sobel detector: it also approximates the derivatives using convolution kernels to find the localized orientation of each pixel in an image. The convolution kernels used in Prewitt differ from those in Sobel. Prewitt is more prone to noise than Sobel, as it does not give extra weight to the current pixel while calculating the directional derivative at that point [15][26]; this is why Sobel has a weight of 2 in the middle of the kernel where Prewitt has a 1 [26]. Equations 3.5 and 3.6 show the difference between the Prewitt and Sobel detectors by giving the kernels for Prewitt. The same variables as in the Sobel case are used; only the kernels for calculating the directional derivatives differ.

3.2.3 Roberts Edge Detector

The Roberts edge detector, also known as the Roberts Cross operator, finds edges by calculating the sum of the squares of the differences between diagonally adjacent pixels [26][15]. In simple terms, it calculates the magnitude between the pixel in question and its diagonally adjacent pixels. It is one of the oldest methods of edge detection, and its performance decreases if the images are noisy, but it is still used because it is simple, easy to implement, and faster than other methods. The implementation is done by convolving the input image with 2×2 kernels.

324 Canny Edge Detector

The Canny edge detector is considered a very effective edge detection technique because it detects faint edges even when the image is noisy. This is because, at the beginning of the process, the data is convolved with a Gaussian filter. The Gaussian filtering results in a blurred image, so the output of the filter does not depend on a single noisy pixel, also known as an outlier. Then the gradient of the image is calculated, as in other detectors such as Sobel and Prewitt. Non-maximal suppression is applied after the gradient step so that pixels that do not form a local maximum along the gradient direction are suppressed. A multi-level thresholding technique, like the two-level example in 2.4, is then used on the data. If a pixel value is less than the lower threshold it is set to 0, and if it is greater than the higher threshold it is set to 1. If a pixel falls between the two thresholds and is adjacent or diagonally adjacent to a high-value pixel, it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray image and the image after Canny edge detection.
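The final two-threshold linking step can be sketched as follows; the Gaussian smoothing and non-maximal suppression stages are omitted, and the threshold values and toy gradient array are arbitrary:

```python
import numpy as np

def hysteresis(grad, low, high):
    """Two-level thresholding with 8-connectivity linking, as in the last
    stage of Canny (a simplified sketch of that stage only)."""
    strong = grad >= high
    weak = (grad >= low) & ~strong
    out = strong.copy()
    changed = True
    while changed:                        # grow strong pixels into weak ones
        changed = False
        for i in range(grad.shape[0]):
            for j in range(grad.shape[1]):
                if weak[i, j] and not out[i, j]:
                    # adjacent or diagonally adjacent accepted pixel?
                    nb = out[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
                    if nb.any():
                        out[i, j] = True
                        changed = True
    return out.astype(int)

grad = np.array([[0.0, 0.3, 0.9, 0.3, 0.0],
                 [0.0, 0.0, 0.3, 0.0, 0.0]])
# 0.9 exceeds the high threshold; the 0.3 pixels survive only because
# they connect (directly or diagonally) to it.
linked = hysteresis(grad, low=0.2, high=0.5)
```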

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis uses the texture of an image to analyze it: it attempts to quantify visual or other simple characteristics so that the image can be analyzed according to them [23]. For example, visible properties of an image, such as roughness or smoothness, can be converted into numbers that describe the pixel layout or brightness intensity in the region in question. In the bone segmentation problem, texture-based image processing can be used because bones are expected to have more texture than the mesh. Range filtering and standard deviation filtering were the texture analysis techniques used in this thesis. Range filtering calculates the local range of an image.
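A minimal sketch of range filtering as a texture measure (the window size and test arrays are ours):

```python
import numpy as np

def range_filter(image, size=3):
    """Local range (max - min) over a size x size neighbourhood: a simple
    texture measure where flat regions give 0 and textured regions do not."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + size, j:j + size]
            out[i, j] = window.max() - window.min()
    return out

smooth = np.full((4, 4), 5.0)           # texture-less, mesh-like region
rough = np.array([[1., 9.],
                  [9., 1.]])            # high local variation, bone-like
```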

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature in the orthogonal direction: lines (i.e., straight or nearly straight curvilinear features) and edges. Viewing an image as an intensity surface, the curvilinear structures correspond to ridges and valleys of this surface. The local shape characteristics of the surface at a particular point can be described by the Hessian matrix

H(x, σ_D) = [ I_xx(x, σ_D)   I_xy(x, σ_D)
              I_xy(x, σ_D)   I_yy(x, σ_D) ]        (1)

where I_xx, I_xy, and I_yy are the second-order partial derivatives of the image evaluated at the point x, and σ_D is the Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes. Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues for a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term I_xy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)        (2)

or

P(x) = min(λ2(x), 0)        (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I_{1,1}, and then produce increasingly Gaussian-smoothed images I_{1,j} with scales of σ = k^{j−1}, where k = 2^{1/3} and j = 2, ..., 6. This set of images spans the first octave, consisting of six images I_{1,1} to I_{1,6}. Image I_{1,4} is downsampled to half its size to produce image I_{2,1}, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image P_{i,j} for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP_{1,2}  MP_{1,3}  MP_{1,4}  MP_{1,5}
MP_{2,2}  MP_{2,3}  MP_{2,4}  MP_{2,5}
  ...
MP_{n,2}  MP_{n,3}  MP_{n,4}  MP_{n,5}        (4)

where MP_{i,j} = max(P_{i,j−1}, P_{i,j}, P_{i,j+1}).

Figure 2(b) shows one of the maximum curvature images MP created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images we find the stable regions via our watershed algorithm.
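The per-pixel eigenvalue computation behind P_{i,j} (Eq. 2) can be sketched as follows, with plain finite differences standing in for the Gaussian-derivative filters and with the scale-space machinery omitted:

```python
import numpy as np

def principal_curvature(image):
    """P(x) = max(lambda_1(x), 0) from Eq. 2, sketched with finite
    differences instead of Gaussian derivatives at scale sigma_D."""
    Iy, Ix = np.gradient(image.astype(float))
    Ixy, Ixx = np.gradient(Ix)       # d(Ix)/dy, d(Ix)/dx
    Iyy, _ = np.gradient(Iy)
    # Eigenvalues of the symmetric 2x2 Hessian in closed form:
    # lambda = (Ixx + Iyy)/2 +- sqrt(((Ixx - Iyy)/2)^2 + Ixy^2)
    mean = 0.5 * (Ixx + Iyy)
    disc = np.sqrt((0.5 * (Ixx - Iyy)) ** 2 + Ixy ** 2)
    lam1 = mean + disc               # maximum eigenvalue
    return np.maximum(lam1, 0.0)     # high on dark lines over light ground

img = np.full((7, 7), 10.0)
img[3, :] = 0.0                      # a dark horizontal line
P = principal_curvature(img)         # the response concentrates on the line
```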

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; as a consequence, small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the oversegmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
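The per-pixel low-threshold rule of the eigenvector-flow hysteresis can be sketched as below; the 0.04 high threshold and the 0.2/0.7 ratios come from the text, while the 0.9 cutoff on the average dot product and the array layout are our assumptions:

```python
import numpy as np

HIGH = 0.04                     # strong principal-curvature response (from the text)

def low_threshold(evecs):
    """Per-pixel low threshold from eigenvector-flow support (a sketch).
    evecs: (h, w, 2) array of normalized major eigenvectors."""
    h, w, _ = evecs.shape
    low = np.full((h, w), HIGH * 0.7)            # default low/high ratio: 0.7
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            v = evecs[i, j]
            dots = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di or dj:
                        # |inner product| ignores the eigenvector sign ambiguity
                        dots.append(abs(v @ evecs[i + di, j + dj]))
            if np.mean(dots) > 0.9:              # assumed "high enough" cutoff
                low[i, j] = HIGH * 0.2           # strong flow support: 0.008
    return low

aligned = np.zeros((3, 3, 2))
aligned[..., 0] = 1.0            # perfectly uniform eigenvector flow
```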

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
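A sketch of the scale-stability test on pixel masks; the actual criterion uses the ellipse-overlap error of [19], and the 0.3 tolerance here is an assumed value:

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """1 - |A intersect B| / |A union B| between two region masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union

def stable_across_scales(masks, max_error=0.3):
    """Keep a region only if it reappears (low overlap error) at every
    consecutive scale in the triplet; max_error is an assumed tolerance."""
    return all(overlap_error(masks[s], masks[s + 1]) <= max_error
               for s in range(len(masks) - 1))

a = np.zeros((8, 8), bool); a[2:6, 2:6] = True          # region at scale s
b = np.zeros((8, 8), bool); b[2:6, 3:7] = True          # shifted at scale s+1
err = overlap_error(a, b)   # 12 shared pixels out of a 20-pixel union
```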

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
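The clustering step can be sketched with plain k-means; the complete-linkage refinement is omitted, and the evenly spaced initialization and toy two-layer data are simplifications of ours:

```python
import numpy as np

def kmeans(features, k, iters=20):
    """Plain k-means over per-pixel chromatic features, as in the De-pict
    clustering step (initialization by evenly spaced samples is a
    simplification for this sketch)."""
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centers = features[idx].astype(float).copy()
    labels = np.zeros(len(features), int)
    for _ in range(iters):
        # assign each pixel to its nearest mean chromatic vector
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels, centers

# Two well-separated "layers": reddish and bluish RGB pixels.
rng = np.random.default_rng(0)
reds = rng.normal([200, 40, 40], 5.0, (50, 3))
blues = rng.normal([40, 40, 200], 5.0, (50, 3))
labels, centers = kmeans(np.vstack([reds, blues]), k=2)
```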

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||²_2  +  λ Σ_{{p,q}∈N} |e_{pq}| · δ[L_p ≠ L_q]        (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, and δ is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to different clusters. After the spatially coherent refinement, we can re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
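To make the E-step objective concrete, the energy of Eq. (1) can be evaluated directly; the graph-cut minimization itself is not reproduced here, and the tiny 1-D example is ours:

```python
import numpy as np

def energy(labels, feats, centers, edges, lam):
    """Energy of Eq. (1): appearance term plus an edge-weighted Potts
    smoothness term. edges holds (p, q, weight) neighbour pairs, the
    weight standing in for the edge length |e_pq|."""
    data = sum(np.sum((feats[p] - centers[l]) ** 2)
               for p, l in enumerate(labels))
    smooth = sum(w for p, q, w in edges if labels[p] != labels[q])
    return data + lam * smooth

# Tiny 1-D "image" of 4 pixels with 2 clusters and unit edge weights.
feats = np.array([[0.0], [0.1], [5.0], [5.1]])
centers = np.array([[0.05], [5.05]])
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
good = energy([0, 0, 1, 1], feats, centers, edges, lam=1.0)
bad = energy([0, 1, 0, 1], feats, centers, edges, lam=1.0)
# The spatially coherent labeling pays one boundary edge; the
# alternating one pays three plus a large appearance cost.
```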

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless; therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail.

To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
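The discrete-curvature idea can be sketched on a polyline; this only sums absolute angle changes at vertices and omits Schoenemann et al.'s exact edge-length weighting and LP formulation:

```python
import numpy as np

def polyline_curvature(points):
    """Discrete curvature of a level line, approximated as the sum of
    absolute turning angles at interior vertices (a sketch of the
    discretization idea only)."""
    pts = np.asarray(points, float)
    total = 0.0
    for i in range(1, len(pts) - 1):
        a = pts[i] - pts[i - 1]          # incoming segment
        b = pts[i + 1] - pts[i]          # outgoing segment
        cos = np.clip(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)), -1, 1)
        total += np.arccos(cos)          # turning angle at this vertex
    return total

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]   # no turning: zero curvature
corner = [(0, 0), (1, 0), (1, 1)]             # one 90-degree turn
```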

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B Method

The stages followed while performing lung segmentation from the CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding was first applied so as to retain parts greater than 700 HU. At the end of thresholding, the new images have logical (binary) values:

Thresh = image > 700

In each of these new images, sub-segmental vessels remain in the lung region. At the second step, the following method has been used to get rid of these vessels: each 2D image is considered one by one, and each component in the image is labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled again with the connected component labeling algorithm. The biggest component with logical value 1 is the patient's body. This biggest component is kept and the other parts are removed from the image. Then the complement is taken, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach row or column 1 or 512 as logical 1, the parts meeting this condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm, and components whose pixel counts are below 1000 are determined to be airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
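The thresholding and connected-component steps above can be sketched as follows (a Python/SciPy stand-in for the MATLAB pipeline; the toy slice, the scaled-down size cutoff, and the border test are ours):

```python
import numpy as np
from scipy import ndimage

def segment_lung_slice(slice_hu, min_size=1000):
    """Sketch of the thresholding/labelling steps described above;
    min_size=1000 pixels matches the text's small-component cutoff."""
    thresh = slice_hu > 700                       # step 1: binarize
    # step 2: remove small bright components (sub-segmental vessels)
    labels, n = ndimage.label(thresh)
    sizes = ndimage.sum(thresh, labels, range(1, n + 1))
    keep = np.isin(labels, 1 + np.flatnonzero(sizes >= min_size))
    # step 3: keep the largest component (the body), then take the complement
    labels, n = ndimage.label(keep)
    sizes = ndimage.sum(keep, labels, range(1, n + 1))
    body = labels == (1 + sizes.argmax())
    inverted = ~body
    # step 4: drop components touching the image border (outside air)
    labels, n = ndimage.label(inverted)
    border = set(labels[0]) | set(labels[-1]) | set(labels[:, 0]) | set(labels[:, -1])
    return np.isin(labels, [l for l in range(1, n + 1) if l not in border])

sl = np.zeros((20, 20))
sl[2:18, 2:18] = 1000.0              # "body" ring of high HU values
sl[6:14, 6:14] = 0.0                 # "lung" cavity inside the body
lung = segment_lung_slice(sl, min_size=50)   # scaled-down cutoff for the toy
```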

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The "image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through the colors cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The contour plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph displays a matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications exist, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis the distance means the Euclidean distance:

d = √((y2 − y1)² + (x2 − x1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which will help with the automatic initialization (discussed in 4.4).
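With the 2n × 1 shape layout above, the centroid and size can be sketched as follows (the unit-square example is ours):

```python
import numpy as np

def centroid(shape):
    """Centroid of a shape stored as a 2n x 1 vector [x1..xn, y1..yn]."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    n = len(shape) // 2
    cx, cy = centroid(shape)
    d2 = (shape[:n] - cx) ** 2 + (shape[n:] - cy) ** 2
    return np.sqrt(d2.mean())

# Unit square: the four x-coordinates first, then the four y-coordinates.
square = np.array([0., 1., 1., 0., 0., 0., 1., 1.])
```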

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape
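Algorithm 1 can be sketched as follows; the SVD-based similarity alignment used for steps 4(a) and 4(c) is a standard Procrustes solution, not necessarily the thesis's exact implementation, and the triangle example is ours:

```python
import numpy as np

def center(shape):
    """Translate an n x 2 shape so its centroid is at the origin (step 2)."""
    return shape - shape.mean(axis=0)

def align(shape, ref):
    """Least-squares similarity (scale + rotation) alignment of one
    centred shape onto another: a standard Procrustes step."""
    u, s, vt = np.linalg.svd(shape.T @ ref)
    rot = u @ vt                                  # optimal rotation
    scale = s.sum() / (shape ** 2).sum()          # optimal scale
    return scale * shape @ rot

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1 on a list of n x 2 shape arrays."""
    shapes = [center(s) for s in shapes]          # step 2
    x0 = shapes[0] / np.linalg.norm(shapes[0])    # steps 1 and 3
    mean = x0.copy()
    for _ in range(iters):                        # steps 4-5
        shapes = [align(s, mean) for s in shapes]         # 4(a)
        mean = np.mean(shapes, axis=0)                    # 4(b)
        mean = align(mean, x0)                            # 4(c): constrain
        mean /= np.linalg.norm(mean)
    return shapes, mean

theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
tri = np.array([[0., 0.], [1., 0.], [0., 1.]])
aligned, mean = align_shapes([tri, 2.0 * tri @ rot])   # rotated, scaled copy
```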

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of landmarks. This introduces a constraint on the shape. So as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the shape does not deviate too far from the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual
profiles into an allowable shape. It tries to find the area in the image that
closely matches the profiles of the individual landmarks while keeping the
overall shape consistent with the model.

The shape is learnt from manually landmarked training images. These images are
aligned, and a mean shape is formulated together with the permissible variations
of it [24]:

x̂ = x̄ + Φb    (4.3)

where

x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the shape covariance, and
b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value
of b. The model is varied in height and width, finding optimum values for the
landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed
on the bone X-ray image. The lines perpendicular to the model are called
whiskers, and they help the profile model analyze the area around the landmark
points.

The shape created by the landmark points is used for the shape model, and the
whisker profiles around the landmark points are used for the profile model. A
profile and a covariance matrix are built for each landmark. It is assumed that
the profiles are distributed as a multivariate Gaussian, and so they can be
described by their mean profile ḡ and covariance matrix Sg.
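Under the multivariate-Gaussian assumption above, the mean profile ḡ and covariance Sg for one landmark could be estimated as follows (an illustrative Python/NumPy sketch; the function name and input layout are my own):

```python
import numpy as np

def build_profile_model(profiles):
    """profiles: (n_images, profile_len) array of gray-level profiles sampled
    along the whisker at ONE landmark, one row per training image.
    Returns the mean profile g_bar and sample covariance S_g, i.e. the
    parameters of the assumed multivariate Gaussian."""
    profiles = np.asarray(profiles, dtype=float)
    g_bar = profiles.mean(axis=0)
    d = profiles - g_bar
    S_g = d.T @ d / (len(profiles) - 1)   # unbiased sample covariance
    return g_bar, S_g
```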

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape
calculated from the training images is imposed on the image, and the profiles
around the landmark points are sampled and examined. The profiles are offset up
to 3 pixels along the whisker, which is perpendicular to the shape, to find the
area that most closely resembles the mean profile [24]. The distance between a
test profile g and the mean profile ḡ is calculated using the Mahalanobis
distance, given by

d(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the
profiles will have the lowest distance. This procedure is carried out for every
landmark point, and then the shape model confirms that the shape is still
consistent with the mean shape. The shape model ensures that the profile model
has not changed the shape. If the shape model were not employed, the profile
model might give the best profile matches, but the resulting shape could be
completely different. So, as mentioned before, the two models restrict each
other.

A multi-resolution search is done to make the model more robust. This enables
the model to be more accurate, as it can lock on to the shape from further
away. The model searches over a series of different resolutions of the same
image, called an image pyramid. The resolutions of the images can be set and
changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the
sizes of the images are given relative to the first image (a general picture,
not a bone X-ray, is shown for illustration).
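The Mahalanobis-based profile search can be sketched as follows (illustrative Python/NumPy; `best_offset` and its inputs are hypothetical names, and the candidate profiles stand for the profiles sampled at each offset along the whisker):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g_inv):
    """Mahalanobis distance of profile g from the model (g_bar, S_g)."""
    d = g - g_bar
    return float(d @ S_g_inv @ d)

def best_offset(candidates, g_bar, S_g):
    """candidates: profiles sampled at successive offsets along the whisker.
    Returns the index of the candidate closest to the model profile."""
    S_inv = np.linalg.inv(S_g)
    dists = [mahalanobis(np.asarray(c, float), g_bar, S_inv) for c in candidates]
    return int(np.argmin(dists))
```

The landmark would then be moved to the offset whose profile yields the lowest distance, after which the shape model constrains the result.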


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it
depends on. The number of landmark points and the number of training images are
investigated in this thesis.

The number of landmark points is an important variable that affects the ASM.
The profile model of the ASM works with these landmark points to create
profiles, so the position of the landmark points is as important as their
number. In the training images, landmark points are equally spaced along the
boundary of the bone. Images are landmarked with 60 points, and subsets of these
points are chosen to conduct experiments. The impact of the number of landmark
points on computing time and mean error (defined in Section 4.5) is tested by
running the algorithm with different numbers of landmarks. As the number of
landmark points is increased, the computing time is expected to increase and the
error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training
images increases, the model becomes more robust and intelligent. The computing
time is expected to increase, as it takes time to train and create profile
models for each image. However, as the number of training images increases, the
mean profile improves and the model performs better, so the error is expected to
decrease. The model in this thesis uses 12 images: 11 are used to train the ASM
and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure
4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b
displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in
the test image. It creates a mean shape profile from all the training images
using landmark points. However, the ASM starts off where the mean shape is
located, which may not be near the bone in a test image. So the model needs to
be initialized, or started, somewhere close to the bone boundary in the test
image. Experiments were conducted to see the effect of initialization on the
error and on the tracking of the shape. It was observed that if the
initialization is poor, meaning that the mean shape starts away from the bone in
the test X-ray, the model does not lock on to the bone. The shape and profile
models fail to perform, because the profile model looks for regions similar to
those of the training images in areas away from the bone. It is thus unable to
find the bone, as it is looking in a different region altogether. The error
increases considerably if the mean shape starts 40-50 pixels away from the bone
in the test image. Figure 4.7a shows such an initialization: the pink contour is
the mean shape, and it starts away from the bone, so the result is poor tracking
of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.


detects faint edges even when the image is noisy. This is because at the
beginning of the process the data is convolved with a Gaussian filter. The
Gaussian filtering results in a blurred image, so the output of the filter does
not depend on a single noisy pixel, also known as an outlier. Then the gradient
of the image is calculated, as in other filters such as Sobel and Prewitt.
Non-maximal suppression is applied after the gradient so that pixels below a
certain threshold are suppressed. A multi-level thresholding technique, similar
to the example in Section 2.4, involving two levels is then used on the data. If
a pixel value is less than the lower threshold, it is set to 0, and if it is
greater than the higher threshold, it is set to 1. If a pixel falls between the
two thresholds and is adjacent or diagonally adjacent to a high-value pixel,
then it is set to 1; otherwise it is set to 0 [26]. Figure 3.5 shows the X-ray
image and the image after Canny edge detection.
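The two-level (hysteresis) thresholding step described above can be sketched in Python/NumPy. This is an illustrative implementation, not the thesis code; note that np.roll wraps at image borders, which a production version would avoid by padding:

```python
import numpy as np

def hysteresis(grad, low, high):
    """Double thresholding: pixels above `high` are kept, pixels below `low`
    are dropped, and in-between pixels survive only if they connect
    (8-adjacency) to a kept pixel."""
    strong = grad > high
    weak = (grad >= low) & ~strong
    out = strong.copy()
    changed = True
    while changed:                      # grow the strong set into weak pixels
        changed = False
        grown = np.zeros_like(out)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                grown |= np.roll(np.roll(out, dy, 0), dx, 1)
        newly = weak & grown & ~out
        if newly.any():
            out |= newly
            changed = True
    return out.astype(np.uint8)
```

A weak ridge attached to one strong pixel is kept whole, while isolated weak pixels are discarded.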

3.3 Image Segmentation

3.3.1 Texture Analysis

Texture analysis attempts to use the texture of an image to analyze it. It
attempts to quantify the visual or other simple characteristics of the image so
that it can be analyzed according to them [23]. For example, visible properties
of an image, such as roughness or smoothness, can be converted into numbers that
describe the pixel layout or brightness intensity in the region in question. In
the bone segmentation problem, texture-based image processing can be used
because bones are expected to have more texture than the mesh. Range filtering
and standard deviation filtering were the texture analysis techniques used in
this thesis. Range filtering calculates the local range of an image.
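A local range filter can be sketched in Python/NumPy as a small stand-in for MATLAB's rangefilt (illustrative only; the function name and the 3×3 default window are my own choices):

```python
import numpy as np

def range_filter(img, size=3):
    """Local range (max - min) over a size x size neighborhood, computed with
    edge-replicated padding. High output indicates textured regions."""
    r = size // 2
    p = np.pad(img, r, mode="edge")
    h, w = img.shape
    # stack every shifted view of the padded image, then reduce
    stack = np.stack([p[dy:dy + h, dx:dx + w]
                      for dy in range(size) for dx in range(size)])
    return stack.max(axis=0) - stack.min(axis=0)
```

A standard-deviation filter would replace the max/min reduction with `stack.std(axis=0)`.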

3 Principal Curvature-Based Region Detector

3.1 Principal Curvature Image

Two types of structures have high curvature in one direction and low curvature
in the orthogonal direction: lines (i.e., straight or nearly straight
curvilinear features) and edges. Viewing an image as an intensity surface, the
curvilinear structures correspond to ridges and valleys of this surface. The
local shape characteristics of the surface at a particular point can be
described by the Hessian matrix

H(x, σD) = [ Ixx(x, σD)  Ixy(x, σD)
             Ixy(x, σD)  Iyy(x, σD) ]    (1)

where Ixx, Ixy and Iyy are the second-order partial derivatives of the image
evaluated at the point x, and σD is the Gaussian scale of the partial
derivatives. We note that both the Hessian matrix and the related second moment
matrix have been applied in several other interest operators (e.g., the Harris
[7], Harris-affine [19] and Hessian-affine [18] detectors) to find image
positions where the local image geometry is changing in more than one
direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13]
also uses components of the Hessian matrix (or at least approximates the sum of
the diagonal elements) to find points of interest. However, our PCBR detector is
quite different from these other methods and is complementary to them. Rather
than finding extremal "points", our detector applies the watershed algorithm to
ridges, valleys and cliffs of the image principal-curvature surface to find
"regions". As with extremal points, the ridges, valleys and cliffs can be
detected over a range of viewpoints, scales and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or
a similar metric [13]) to determine a point's saliency. The Harris measure is
given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is
the trace, and the matrix A is either the Hessian matrix or the second moment
matrix. One advantage of the Harris metric is that it does not require explicit
computation of the eigenvalues. However, computing the eigenvalues of a 2×2
matrix requires only a single Jacobi rotation to eliminate the off-diagonal term
Ixy, as noted by Steger [25]. The Harris measure produces low values for "long"
structures that have a small first or second derivative in one particular
direction. Our PCBR detector complements previous interest point detectors in
that we abandon the Harris measure and exploit those very long structures as
detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)    (2)

or

P(x) = min(λ2(x), 0)    (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of
H at x. Eq. 2 provides a high response only for dark lines on a light background
(or on the dark side of edges), while Eq. 3 is used to detect light lines
against a darker background. Like SIFT [13]

and other detectors, principal curvature images are calculated in scale space.
We first double the size of the original image to produce our initial image,
I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of
σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the
first octave, consisting of six images, I11 to I16. Image I14 is down-sampled to
half its size to produce image I21, which becomes the first image in the second
octave. We apply the same smoothing process to build the second octave, and
continue until we have created a total of n = log2(min(w, h)) − 3 octaves, where
w and h are the width and height of the doubled image, respectively. Finally, we
calculate a principal curvature image, Pij, for each smoothed image by computing
the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For
computational efficiency, each smoothed image and its corresponding Hessian
image is computed from the previous smoothed image using an incremental Gaussian
scale.

Given the principal curvature scale space images, we calculate the maximum
curvature over each set of three consecutive principal curvature images to form
the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
 ...
MPn2  MPn3  MPn4  MPn5    (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing
the principal curvature at each pixel over three consecutive principal curvature
images. From these maximum principal curvature images we find the stable regions
via our watershed algorithm.
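The core computation of Section 3.1 can be sketched in Python/NumPy. This is an illustrative simplification, not the paper's pipeline: it uses a plain separable Gaussian blur and finite differences instead of incremental Gaussian derivatives, and the closed-form maximum eigenvalue of the 2×2 Hessian in place of Eq. 2's full scale-space construction:

```python
import numpy as np

def principal_curvature(img, sigma=1.0):
    """Principal curvature image P(x) = max(lambda1(x), 0) (Eq. 2 style),
    computed from a Gaussian-smoothed image via finite differences."""
    # crude separable Gaussian smoothing at scale sigma
    radius = int(3 * sigma)
    xs = np.arange(-radius, radius + 1)
    k = np.exp(-xs**2 / (2 * sigma**2))
    k /= k.sum()
    sm = np.apply_along_axis(lambda r: np.convolve(r, k, "same"), 1, img)
    sm = np.apply_along_axis(lambda c: np.convolve(c, k, "same"), 0, sm)
    # second derivatives of the smoothed surface
    Iy, Ix = np.gradient(sm)
    Ixy, Ixx = np.gradient(Ix)
    Iyy, _ = np.gradient(Iy)
    # closed-form max eigenvalue of [[Ixx, Ixy], [Ixy, Iyy]] at each pixel
    tr = 0.5 * (Ixx + Iyy)
    disc = np.sqrt((0.5 * (Ixx - Iyy))**2 + Ixy**2)
    lam1 = tr + disc
    return np.maximum(lam1, 0.0)
```

A dark line on a light background produces a strong positive response along the line and (near-)zero response in flat regions, as Eq. 2 intends.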

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for
image segmentation. It is normally applied either to an intensity image directly
or to the gradient magnitude of an image. We instead apply the watershed
transform to the principal curvature image. However, the watershed transform is
sensitive to noise (and other small perturbations) in the intensity image. A
consequence of this is that small image variations form local minima that result
in many small watershed regions. Figure 3(a) shows the over-segmentation that
results when the watershed algorithm is applied directly to the principal
curvature image in Figure 2(b). To achieve a more stable watershed segmentation,
we first

apply a grayscale morphological closing followed by hysteresis thresholding. The
grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b,
where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element,
and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing
operation removes small "potholes" in the principal curvature terrain, thus
eliminating many local minima that result from noise and that would otherwise
produce watershed catchment basins. Beyond the small (in terms of area of
influence) local minima, there are other variations that have larger zones of
influence and that are not reclaimed by the morphological closing. To further
eliminate spurious or unstable watershed regions, we threshold the principal
curvature image to create a clean, binarized

principal curvature image. However, rather than apply a straight threshold, or
even hysteresis thresholding (both of which can still miss weak image
structures), we apply a more robust eigenvector-guided hysteresis thresholding
to help link structural cues and remove perturbations. Since the eigenvalues of
the Hessian matrix are directly related to signal strength (i.e., the line or
edge contrast), the principal curvature image may at times become weak due to
low-contrast portions of an edge or curvilinear structure. These low-contrast
segments may cause gaps in the thresholded principal curvature image, which in
turn cause watershed regions that should otherwise be separate to merge.
However, the directions of the eigenvectors provide a strong indication of where
curvilinear structures appear, and they are more robust to these intensity
perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and
low), just as in traditional hysteresis thresholding. The high threshold (set at
0.04) indicates a strong principal curvature response. Pixels with a strong
response act as seeds that expand to include connected pixels that are above the
low threshold. Unlike traditional hysteresis thresholding, our low threshold is
a function of the support that each pixel's major eigenvector receives from
neighboring pixels. Each pixel's low threshold is set by comparing the direction
of the major (or minor) eigenvector to the directions of the 8 adjacent pixels'
major (or minor) eigenvectors. This can be done by taking the absolute value of
the inner product of a pixel's normalized eigenvector with that of each
neighbor. If the average dot product over all neighbors is high enough, we set
the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 =
0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of
0.028). The threshold values are based on visual inspection of detection results
on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region.
The red arrows are the major eigenvectors and the yellow arrows are the minor
eigenvectors; to improve visibility we draw them at every fourth pixel. At the
point indicated by the large white arrow, we see that the eigenvalue magnitudes
are small and the ridge there is almost invisible. Nonetheless, the directions
of the eigenvectors are quite uniform. This eigenvector-based active
thresholding process yields better performance in building continuous ridges and
in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image
(Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become
catchment basins, and the midlines of the thresholded white ridge pixels become
watershed lines if they separate two distinct catchment basins. To define the
interest regions of the PCBR detector at one scale, the resulting segmented
regions are fit with ellipses, via PCA, that have the same second moments as the
watershed regions (Fig. 2(e)).
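The per-pixel low-threshold rule of eigenvector-flow hysteresis can be sketched as follows (illustrative Python/NumPy; the input is assumed to be a field of unit major eigenvectors, and the 0.9 cut-off on the average dot product is a hypothetical choice, since the text only says "high enough"):

```python
import numpy as np

def eigvec_low_threshold(vecs, high=0.04, support_thresh=0.9):
    """Per-pixel low threshold for eigenvector-flow hysteresis.
    vecs: (H, W, 2) unit major eigenvectors of the Hessian.
    Where neighboring eigenvectors agree (mean |dot product| with the 8
    neighbors at or above support_thresh) the low/high ratio is 0.2,
    otherwise 0.7 -- the ratios quoted in the text."""
    H, W, _ = vecs.shape
    low = np.full((H, W), 0.7 * high)
    for y in range(H):
        for x in range(W):
            dots = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        dots.append(abs(vecs[y, x] @ vecs[ny, nx]))
            if dots and np.mean(dots) >= support_thresh:
                low[y, x] = 0.2 * high
    return low
```

A uniform flow field gets the permissive 0.008 threshold everywhere; a pixel whose neighborhood disagrees falls back to 0.028.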

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to
achieve stable region detections. To further improve robustness, we adopt a key
idea from MSER and keep only those regions that can be detected in at least
three consecutive scales. Similar to the process of selecting stable regions via
thresholding in MSER, we select regions that are stable across local scale
changes. To achieve this, we compute the overlap error of the detected regions
across each triplet of consecutive scales in every octave. The overlap error is
calculated as in [19]. Overlapping regions that are detected at different scales
normally exhibit some variation. This variation is valuable for object
recognition because it provides multiple descriptions of the same pattern. An
object category normally exhibits large within-class variation in the same area.
Since detectors have difficulty locating the interest area accurately, rather
than attempt to detect the "correct" region and extract a single descriptor
vector, it is better to extract multiple descriptors for several overlapping
regions, provided that these descriptors are handled properly by the classifier.

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush
strokes, that is, where lower-layer strokes are visible. The task of recovering
layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding
to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and
complete-linkage as a clustering step to obtain chromatically consistent
regions. Under the assumption that brush strokes of the same layer of the
painting are of similar colors, such regions in different clusters are good
representatives of brush strokes at different layers in the painting, as shown
in Fig. 3c. Note that after the clustering step each pixel of the image is
assigned a label corresponding to its assigned cluster, and each label can be
described by the mean chromatic feature vector. Then the top layer is identified
by human experts based on visual occlusion cues, etc. Ideally this step should
be fully automatic, but this challenging step is not the focus of our current
work. Lastly, the regions of the top layer are removed and inpainted by a
k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence
regularity in an iterative E-M way [10, 11]. We model the appearances of brush
strokes of different layers by a set of feature centers (mean chromatic vectors,
as in k-means). In other words, we assume that each layer is modeled as an
independent Gaussian with the same covariance, differing only in the mean. Given
the initial models, i.e., the k mean chromatic vectors, we can refine the
segmentation with spatial coherence priors by minimizing the following energy
function (E-step):

min over L of:  Σp ||fp − cLp||²  +  λ Σ{p,q}∈N |epq| · T[Lp ≠ Lq]    (1)

where Lp ∈ {1, ..., k} is the cluster label of pixel p, fp is the color feature
of pixel p, ci is the color model for cluster i, |epq| is the edge length
between p and q, and T is the delta (indicator) function. The first term in
Eq. (1) measures the appearance similarity between the pixels and the clusters
they are assigned to, and the second term penalizes the situation where pixels
in the same neighborhood belong to different clusters. By fixing the k
appearance models, the minimization problem can be solved with the graph-cut
algorithm [12]. The solution gives us, under spatial regularization, the optimal
labeling of pixels into the different clusters. After the spatially coherent
refinement, we re-estimate the k models as the mean chromatic vectors (M-step).
We then iterate the E and M steps until convergence or until a predefined number
of iterations is reached.
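The E-M loop above can be sketched in Python/NumPy. This is an illustrative simplification: the paper's graph-cut E-step is replaced here by ICM (greedy per-pixel relabeling), which only approximates the minimization of Eq. (1), and the deterministic center initialization is my own choice:

```python
import numpy as np

def em_segment(feats, k, lam=0.5, n_iter=10):
    """Simplified E-M layer segmentation sketch of Eq. (1).
    feats: (H, W, C) per-pixel chromatic features.
    Data term: squared distance to each cluster mean; smoothness term:
    a Potts penalty counting disagreeing 4-neighbors (edge lengths taken
    as 1 on the pixel grid)."""
    H, W, C = feats.shape
    flat = feats.reshape(-1, C)
    # spread initial centers over the image (a real system would use a
    # smarter init, e.g. k-means++)
    centers = flat[np.linspace(0, len(flat) - 1, k).astype(int)].astype(float)
    labels = np.zeros((H, W), dtype=int)
    for _ in range(n_iter):
        # E-step: one ICM sweep in place of the paper's graph cut
        data = ((feats[..., None, :] - centers) ** 2).sum(-1)  # (H, W, k)
        for y in range(H):
            for x in range(W):
                cost = data[y, x].copy()
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        cost += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(np.argmin(cost))
        # M-step: re-estimate each cluster's mean chromatic vector
        for i in range(k):
            if (labels == i).any():
                centers[i] = feats[labels == i].mean(axis=0)
    return labels, centers
```

On an image with two flat color regions, the labeling converges to the two regions, with the smoothness term discouraging speckle.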

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods
focus on reconstructing the geometric structure of (chromatic) intensities,
which is usually represented by level lines [5, 7]. Here, level lines can be
contours that connect pixels of the same gray/chromatic intensity in an image.
Such methods are therefore well suited for inpainting images with no or very few
textures, because level lines concisely capture the structure and information of
the texture-less regions. For van Gogh's paintings, the brush strokes at each
layer are close to textureless. Therefore curvature-based inpainting can be
superior to exemplar-based methods (for instance, those in De-pict and Criminisi
et al. [3]) for recovering the structures of underlying brush strokes. In this
paper we evaluate the recent method proposed by Schoenemann et al. [7], which
formulates curvature-based inpainting as a linear program. Unlike other methods,
this method is independent of initialization and can handle general inpainting
regions, e.g., regions with holes. In the following we briefly review
Schoenemann et al.'s method in detail.

To formulate the problem as a linear program, curvature in this approach is
modeled in a discrete sense (where a possible reconstruction of the level line
with intensity 100 is shown). Specifically, we impose a discrete grid of certain
connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line
segments, and line segment pairs are used to represent level lines, while the
basic regions represent the pixels. Then, for each potential discrete level
line, the curvature is approximated by the sum of angle changes at all vertices
along the level line, with proper weighting by the edge length. To ensure that
regions and level lines are consistent (for instance, level lines should be
continuous), two sets of linear constraints, i.e., surface continuation
constraints and boundary continuation constraints, are imposed on the variables.
Finally, the boundary condition (the intensities of the boundary pixels) of the
damaged region can also easily be formulated as linear constraints. With proper
handling of all these constraints, the inpainting problem can be solved as a
linear program. To handle color images, we simply formulate and solve a linear
program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and
Cardiovascular Surgery Training and Research Hospital. All pulmonary computed
tomographic angiography exams were performed with 16-detector CT equipment
(Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed
about the examination and about breath holding. Imaging was performed with a
bolus-tracking program. After the scanogram, a single slice is taken at the
level of the pulmonary truncus. A bolus-tracking region is placed at the
pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70
ml of non-ionic contrast agent is delivered at a rate of 4 mL/sec with an
automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA).
When opacification reaches the pre-adjusted level, the exam is performed from
the supraclavicular region to the diaphragms. Contrast injection is performed
via an 18-20G intravenous cannula placed in the antecubital vein. Scanning
parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images
were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal
window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen,
Germany) in the coronal, sagittal and axial planes. Oblique planes were used if
needed. Each exam consists of 400-500 images with 512x512 resolution.

B. Method

The stages followed for lung segmentation from CTA images in this work are shown
in Figure 1.

The dataset at hand consists of 250 2D CTA images. The first step is
thresholding the image. A thoracic CT contains two main groups of pixels: 1)
high-intensity pixels located in the body (body pixels), and 2) low-intensity
pixels in the lung and the surrounding air (non-body pixels). Due to the large
difference in intensity between these two groups, thresholding leads to a good
separation. In this study, thresholding was first applied so as to retain the
parts greater than 700 HU. At the end of thresholding, the new images hold
logical (binary) values:

Thresh = image > 700

In each of these new images, subsegmental vessels remain in the lung region. In
the second step, the following method was used to remove these vessels: each 2D
image was considered one by one, and each component in the image was labeled
with a connected component labeling algorithm. Then, looking at the size of each
labeled piece, items whose pixel counts were under 1000 were removed from the
image (Figure 3).

Next, the image in Figure 3 was labeled again with the connected component
labeling algorithm. The biggest component of logical 1s is the patient's body.
This biggest component was kept, the other parts were removed from the image,
and then its complement was taken, so all "0" pixels turn into "1" and all "1"
pixels turn into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach the image
border (pixel rows/columns 1 or 512) and are logical 1, the parts meeting this
condition were removed, and the lungs and airways appear as in Figure 5
(segmentation of lung and airway). Because the airways in Figure 5 are very
small compared to the lung size, each image was labeled with the connected
component labeling algorithm, and components whose number of pixels was below
1000 were determined to be airways and removed from the image. The final image
is the segmented form of the target lung. Before the airways were removed, the
edges of the image were found with the Sobel algorithm and overlaid on the
original image, so the edges of the lung and airway regions are shown on the
original image (Figure 6(b)). Also, by multiplying the defined lung region with
the original CTA image, the original segmented lung image was obtained
(Figure 6(c)).
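The pipeline described above can be sketched in Python/NumPy (illustrative only; the original work uses MATLAB's bwlabel, the threshold direction follows the text's Thresh = image > 700, and the final airway-removal pass, which is just the same small-component filter applied once more, is omitted for brevity):

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labeling via BFS (a small stand-in for bwlabel)."""
    labels = np.zeros(mask.shape, dtype=int)
    cur = 0
    for y, x in zip(*np.nonzero(mask)):
        if labels[y, x]:
            continue
        cur += 1
        labels[y, x] = cur
        q = deque([(y, x)])
        while q:
            cy, cx = q.popleft()
            for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = cur
                    q.append((ny, nx))
    return labels, cur

def segment_lungs(img, hu_thresh=700, min_size=1000):
    body = img > hu_thresh                       # step 1: threshold
    labels, n = label_components(body)
    for i in range(1, n + 1):                    # step 2: drop small vessels
        if (labels == i).sum() < min_size:
            body[labels == i] = False
    labels, n = label_components(body)           # step 3: keep biggest = body
    if n:
        sizes = [(labels == i).sum() for i in range(1, n + 1)]
        body = labels == (1 + int(np.argmax(sizes)))
    inv = ~body                                  # step 4: complement
    labels, n = label_components(inv)            # step 5: drop border regions
    for i in range(1, n + 1):
        region = labels == i
        ys, xs = np.nonzero(region)
        if (ys.min() == 0 or xs.min() == 0
                or ys.max() == img.shape[0] - 1 or xs.max() == img.shape[1] - 1):
            inv[region] = False
    return inv                                   # lungs (and airways)
```

On a toy "body with a cavity" image, the cavity survives while the border-touching background is discarded.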

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2-D and 3-D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figure 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (2 significant digits). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan
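The cubic fit to the histogram data described for Figure 3a can be reproduced in outline. This is an illustrative Python/numpy sketch on a synthetic image, not the original MATLAB session:

```python
import numpy as np

# Synthetic 8-bit image standing in for the X-ray CT slice.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128))

# Image histogram: one count per intensity level 0..255.
counts, _ = np.histogram(image, bins=256, range=(0, 256))

# Cubic best-fit model for the histogram data, as in Figure 3a.
x = np.arange(256)
coeffs = np.polyfit(x, counts, deg=3)   # 4 polynomial coefficients
fit = np.polyval(coeffs, x)             # fit curve evaluated per level
```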

The 3-D Surface Plot displays a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object, or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
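The meshgrid behaviour described above can be illustrated with numpy's equivalent function:

```python
import numpy as np

# numpy analogue of MATLAB's meshgrid: rows of X are copies of x,
# columns of Y are copies of y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
# X == [[1, 2, 3], [1, 2, 3]]
# Y == [[10, 10, 10], [20, 20, 20]]

# Evaluate a function of two variables over the whole grid at once.
Z = X**2 + Y
```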

3-D Surface Plot with Contour (the surfc function) displays a matrix as a surface with a contour plot below it. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The "image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
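The indexed-image/colormap relationship described above can be sketched as follows. The tiny 4-color map and 2×2 image are made-up illustrations, not MATLAB's built-in Jet map:

```python
import numpy as np

# A colormap as an m-by-3 matrix of values in [0.0, 1.0]; each row is
# one RGB color (here m = 4, a toy blue-to-red ramp).
cmap = np.array([
    [0.0, 0.0, 1.0],   # blue
    [0.0, 1.0, 1.0],   # cyan
    [1.0, 1.0, 0.0],   # yellow
    [1.0, 0.0, 0.0],   # red
])

# An indexed image: each pixel value is a row index into the colormap.
indexed = np.array([[0, 1],
                    [2, 3]])

# Mapping indices to rows yields the displayed RGB image.
rgb = cmap[indexed]            # shape (2, 2, 3)
```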

Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
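FVTool itself is an interactive tool, but the magnitude response it computes for a filter with numerator b and denominator a can be evaluated directly. This is a minimal sketch (the freq_response helper and the moving-average example are illustrative, not MathWorks code):

```python
import numpy as np

def freq_response(b, a, n=8):
    """Evaluate H(e^{jw}) = B(e^{jw}) / A(e^{jw}) for a digital filter
    with numerator coefficients b and denominator coefficients a, at n
    frequencies in [0, pi)."""
    w = np.linspace(0, np.pi, n, endpoint=False)
    z = np.exp(-1j * w)                                # z^{-1} terms
    B = sum(bk * z**k for k, bk in enumerate(b))
    A = sum(ak * z**k for k, ak in enumerate(a))
    return w, B / A

# Two-tap moving average b = [0.5, 0.5], a = [1]: a simple lowpass
# whose magnitude is 1 at DC and falls off toward high frequencies.
w, H = freq_response([0.5, 0.5], [1.0])
```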

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size in the test image, which helps with the automatic initialization (discussed in Section 4.4).
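The centroid and shape-size definitions above can be written directly. This is a small sketch assuming shapes are stored as (n, 2) arrays of points:

```python
import numpy as np

def centroid_and_size(shape):
    """shape: (n, 2) array of (x, y) landmark points.
    Returns the centroid (mean point position) and the shape size,
    taken as the RMS distance from the points to the centroid."""
    c = shape.mean(axis=0)
    size = np.sqrt(((shape - c) ** 2).sum(axis=1).mean())
    return c, size
```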

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes.

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape.
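Algorithm 1 can be sketched as follows. This simplified version handles translation and scale only (a full implementation would also solve for rotation), and the least-squares scale step is an illustrative choice:

```python
import numpy as np

def center(shape):
    """Translate a shape ((n, 2) array) so its centroid is at the origin."""
    return shape - shape.mean(axis=0)

def to_unit(shape):
    """Scale a centered shape to unit size (unit Frobenius norm)."""
    return shape / np.linalg.norm(shape)

def align_shapes(shapes, iters=10):
    """Simplified Algorithm 1: translation and scale only."""
    shapes = [center(s) for s in shapes]          # step 2
    mean = to_unit(shapes[0])                     # step 3: reference shape
    for _ in range(iters):                        # step 4
        aligned = []
        for s in shapes:
            # (a) least-squares scale of s onto the current mean shape
            k = (s * mean).sum() / (s * s).sum()
            aligned.append(k * s)
        shapes = aligned
        # (b) + (c) recalculate and constrain the mean shape
        new_mean = to_unit(center(np.mean(shapes, axis=0)))
        if np.allclose(new_mean, mean):           # step 5: convergence
            break
        mean = new_mean
    return shapes, mean
```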

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb        (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called "whiskers", and they help the profile model analyze the area around the landmark points.
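The shape model x̂ = x̄ + Φb can be built with a straightforward PCA over the aligned training shape vectors. This is a sketch assuming shapes are stacked as rows of an (N, 2n) array:

```python
import numpy as np

def build_shape_model(shapes):
    """shapes: (N, 2n) array of aligned training shape vectors.
    Returns the mean shape xbar, the eigenvector matrix Phi (columns
    sorted by decreasing eigenvalue), and the eigenvalues."""
    xbar = shapes.mean(axis=0)
    X = shapes - xbar
    cov = X.T @ X / (len(shapes) - 1)          # sample covariance
    evals, evecs = np.linalg.eigh(cov)         # ascending eigenvalues
    order = np.argsort(evals)[::-1]
    return xbar, evecs[:, order], evals[order]

def generate_shape(xbar, phi, b):
    """Generate a new shape x^ = xbar + Phi * b by varying b."""
    return xbar + phi[:, :len(b)] @ b
```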

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ).
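The Mahalanobis profile distance can be computed directly from the mean profile and profile covariance. A minimal sketch:

```python
import numpy as np

def mahalanobis2(g, gbar, Sg):
    """Squared Mahalanobis distance between a sampled profile g and the
    mean profile gbar, given the profile covariance matrix Sg."""
    d = g - gbar
    # Solve Sg x = d instead of explicitly inverting Sg.
    return float(d @ np.linalg.solve(Sg, d))
```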

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is the same as the mean shape. The shape model ensures that the profile model has not changed the shape; if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, rather than a bone X-ray, is shown).


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, alongside the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


H(x, σD) = | Ixx(x, σD)  Ixy(x, σD) |
           | Ixy(x, σD)  Iyy(x, σD) |        (1)

where Ixx, Ixy, and Iyy are the second-order partial derivatives of the image evaluated at the point x, and σD is the

Gaussian scale of the partial derivatives. We note that both the Hessian matrix and the related second moment matrix have been applied in several other interest operators (e.g., the Harris [7], Harris-affine [19], and Hessian-affine [18] detectors) to find image positions where the local image geometry is changing in more than one direction. Likewise, Lowe's maximal difference-of-Gaussian (DoG) detector [13] also uses components of the Hessian matrix (or at least approximates the sum of the diagonal elements) to find points of interest. However, our PCBR detector is quite different from these other methods and is complementary to them. Rather than finding extremal "points", our detector applies the watershed algorithm to ridges, valleys, and cliffs of the image principal-curvature surface to find "regions". As with extremal points, the ridges, valleys, and cliffs can be detected over a range of viewpoints, scales, and appearance changes.

Many previous interest point detectors [7, 19, 18] apply the Harris measure (or a similar metric [13]) to determine a point's saliency. The Harris measure is given by det(A) − k · tr²(A) > threshold, where det is the determinant, tr is the trace, and the matrix A is either the Hessian matrix or the second moment matrix. One advantage of the Harris metric is that it does not require explicit computation of the eigenvalues. However, computing the eigenvalues of a 2×2 matrix requires only a single Jacobi rotation to eliminate the off-diagonal term Ixy, as noted by Steger [25]. The Harris measure produces low values for "long" structures that have a small first or second derivative in one particular direction. Our PCBR detector complements previous interest point detectors in that we abandon the Harris measure and exploit those very long structures as detection cues. The principal curvature image is given by either

P(x) = max(λ1(x), 0)        (2)

or

P(x) = min(λ2(x), 0)        (3)

where λ1(x) and λ2(x) are the maximum and minimum eigenvalues, respectively, of H at x. Eq. 2 provides a high response only for dark lines on a light background (or on the dark side of edges), while Eq. 3 is used to detect light lines against a darker background. Like SIFT [13] and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image, I11, and then produce increasingly Gaussian-smoothed images, I1j, with scales of σ = k^(j−1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) − 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image, Pij, for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12  MP13  MP14  MP15
MP22  MP23  MP24  MP25
  ⋮
MPn2  MPn3  MPn4  MPn5        (4)

where MPij = max(Pi,j−1, Pij, Pi,j+1).
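The per-pixel maximum eigenvalue of the Hessian (Eq. 2) can be sketched with finite differences. This illustrative version omits the Gaussian scale space and incremental smoothing described above:

```python
import numpy as np

def principal_curvature(img):
    """P(x) = max(lambda1(x), 0): per-pixel maximum eigenvalue of the
    image Hessian, via central finite differences (no scale space).
    img: 2-D float array."""
    Iy, Ix = np.gradient(img)          # first derivatives (rows, cols)
    Ixy, Ixx = np.gradient(Ix)         # second derivatives of Ix
    Iyy, _ = np.gradient(Iy)
    # Closed-form eigenvalues of [[Ixx, Ixy], [Ixy, Iyy]].
    tr = Ixx + Iyy
    det = Ixx * Iyy - Ixy ** 2
    disc = np.sqrt(np.maximum(tr ** 2 / 4 - det, 0.0))
    lam1 = tr / 2 + disc               # maximum eigenvalue
    return np.maximum(lam1, 0.0)
```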

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images, we find the stable regions via our watershed algorithm.

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image; a consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation that results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first

apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5×5 disk-shaped structuring element, and ⊕ and ⊖ are the grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than applying a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moments as the watershed regions (Fig. 2(e)).
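Traditional hysteresis thresholding, the baseline that the eigenvector-flow variant extends, can be sketched as follows. The eigenvector-guided version would replace the fixed low threshold with a per-pixel one chosen from neighboring eigenvector agreement:

```python
import numpy as np
from collections import deque

def hysteresis_threshold(img, high, low):
    """Pixels >= high act as seeds; regions grow through 8-connected
    neighbors whose response is >= low."""
    strong = img >= high
    out = strong.copy()
    q = deque(zip(*np.nonzero(strong)))
    h, w = img.shape
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not out[ny, nx] and img[ny, nx] >= low:
                    out[ny, nx] = True
                    q.append((ny, nx))
    return out
```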

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave; the overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
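The overlap criterion used to keep regions stable across consecutive scales can be sketched on pixel masks ([19] defines it on fitted ellipses; the mask form here is an illustrative simplification):

```python
import numpy as np

def overlap_error(mask_a, mask_b):
    """Overlap error between two detected regions given as boolean
    masks: 1 - |A intersect B| / |A union B|. Zero means identical
    regions; one means no overlap."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return 1.0 - inter / union if union else 0.0
```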

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its assigned cluster, and each cluster can be described by its mean chromatic feature vector. Then the top layer is identified by human experts, based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and a spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatial coherence priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p - c_{L_p}||₂²  +  λ Σ_{{p,q} ∈ N} |e_pq| · T[L_p ≠ L_q]     (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function.

The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After spatially coherent refinement we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
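A minimal sketch of one E-step. We substitute ICM (iterated conditional modes) for the graph-cut solver of [12] so the example stays short; it greedily minimizes the energy of Eq. (1) with unit edge lengths. The function name, toy image, and the lam value are ours.

```python
import numpy as np

def icm_estep(feat, centers, lam=4000.0, sweeps=5):
    """Approximate E-step of Eq. (1) via ICM: each pixel takes the label
    minimizing its squared color distance to the cluster center plus a
    Potts penalty lam for each 4-neighbor with a different label."""
    H, W, _ = feat.shape
    k = len(centers)
    # initial labels: nearest center by color alone
    labels = np.linalg.norm(feat[:, :, None, :] - centers[None, None, :, :],
                            axis=3).argmin(axis=2)
    for _ in range(sweeps):
        for y in range(H):
            for x in range(W):
                best, best_e = labels[y, x], np.inf
                for l in range(k):
                    e = np.sum((feat[y, x] - centers[l]) ** 2)
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and labels[ny, nx] != l:
                            e += lam
                    if e < best_e:
                        best, best_e = l, e
                labels[y, x] = best
    return labels

# toy image: a red layer with one ambiguous pixel that color distance
# alone would assign to the blue cluster; smoothing pulls it back
feat = np.zeros((4, 4, 3))
feat[...] = [255.0, 0.0, 0.0]
feat[1, 1] = [120.0, 0.0, 135.0]
centers = np.array([[255.0, 0.0, 0.0], [0.0, 0.0, 255.0]])
labels = icm_estep(feat, centers, lam=4000.0)
```

With lam = 0 the result reduces to plain nearest-center assignment; the spatial term is what removes the isolated label.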

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very little texture, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Curvature-based inpainting can therefore be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting of the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with a 16-detector CT scanner (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath-holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA) is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B Method

The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.

The data in hand consists of 250 2-D CTA images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels) and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to retain the parts greater than 700 HU. At the end of thresholding, the new images are binary (logical) images:

Thresh = image > 700

In each of these new images, subsegmental vessels remain in the lung region. In the second step, the following method is used to remove these vessels: first, each 2-D image is considered one by one and each component in the image is labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items with fewer than 1000 pixels are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image. The image is then inverted, so all "0"s turn into "1"s and all "1"s turn into "0"s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach the 1st or 512th pixel column (the image border) with logical 1, the parts that satisfy this condition are removed, and the lung and airways appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung, each image is labeled with the connected component labeling algorithm and the components with fewer than 1000 pixels are identified as airways and removed from the image. The final image in hand is the segmented form of the target lung. Before the airways are removed, the edges of the image are found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown on the original image (Figure 6 (b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6 (c)).
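The labeling and size filtering used throughout this pipeline can be sketched as below. This is our illustrative stand-in (4-connected BFS labeling), not the MATLAB code used in the study; the 1000-pixel threshold from the text becomes the min_pixels parameter, and the toy mask is made up.

```python
import numpy as np
from collections import deque

def label_components(binary):
    """4-connected component labeling via breadth-first search."""
    H, W = binary.shape
    labels = np.zeros((H, W), dtype=int)
    n = 0
    for sy in range(H):
        for sx in range(W):
            if binary[sy, sx] and labels[sy, sx] == 0:
                n += 1
                labels[sy, sx] = n
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if (0 <= ny < H and 0 <= nx < W
                                and binary[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = n
                            q.append((ny, nx))
    return labels, n

def remove_small(binary, min_pixels):
    """Drop every connected component smaller than min_pixels
    (the text uses 1000 pixels for vessels and airways)."""
    labels, n = label_components(binary)
    keep = np.zeros_like(binary, dtype=bool)
    for l in range(1, n + 1):
        mask = labels == l
        if mask.sum() >= min_pixels:
            keep |= mask
    return keep

# toy mask: one 6-pixel blob (kept) and one isolated pixel (removed)
img = np.zeros((5, 6), dtype=bool)
img[1:3, 1:4] = True
img[4, 5] = True
cleaned = remove_small(img, min_pixels=3)
```

Keeping only the largest component (the body) is the same loop with the maximum-size label retained instead of a size cutoff.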

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a ''voxel'' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram that represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The histogram of an X-ray CT image and the plot fits (2 significant digits). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An Area Graph displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The ''meshgrid'' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
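NumPy's meshgrid behaves the same way as MATLAB's, which makes the statement about rows of X and columns of Y easy to check:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)  # rows of X are copies of x; columns of Y are copies of y
Z = X + Y                 # evaluate f(x, y) = x + y over the whole grid
```

Z can then be handed directly to a surface-plotting routine, exactly as surf(X, Y, Z) is used in MATLAB.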

The 3-D Surface Plot with Contour (''surfc'') displays a matrix as a surface with a contour plot below. ''Lighting'' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The ''image'' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). The ''image'' with colormap scaling (the ''imagesc'' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. ''Jet'' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
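The magnitude and phase response that FVTool plots can be reproduced by evaluating H(z) = B(z)/A(z) on the unit circle. The sketch below is ours (NumPy rather than MATLAB) and assumes b and a are the filter's coefficient lists:

```python
import numpy as np

def freq_response(b, a, n=8):
    """Magnitude and phase of H(z) = B(z)/A(z) sampled at n frequencies
    from 0 to pi - the quantity FVTool plots for a digital filter."""
    w = np.linspace(0, np.pi, n)
    z = np.exp(-1j * w)  # z^-1 evaluated on the upper unit circle
    B = sum(bk * z**k for k, bk in enumerate(b))
    A = sum(ak * z**k for k, ak in enumerate(a))
    H = B / A
    return np.abs(H), np.angle(H)

# 2-point moving average: passes DC, nulls the Nyquist frequency
mag, phase = freq_response([0.5, 0.5], [1.0])
```

At w = 0 the moving average has unit gain, and at w = pi its gain is zero, which is the lowpass shape FVTool would display.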

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM models work the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array where the n rows represent the points and the two columns represent their x and y co-ordinates respectively. In this thesis and in the code used, a shape will be defined as a 2n x 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape
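The two equivalent representations of a shape can be made concrete with a small example (the point values are made up):

```python
import numpy as np

# a shape with n = 3 points, as an n x 2 array (rows: points; cols: x, y)
pts = np.array([[1.0, 4.0],
                [2.0, 5.0],
                [3.0, 6.0]])

# the 2n x 1 representation used in the thesis: all x's, then all y's
shape_vec = np.concatenate([pts[:, 0], pts[:, 1]])

# recover the n x 2 form from the flat vector
n = len(shape_vec) // 2
pts_back = np.stack([shape_vec[:n], shape_vec[n:]], axis=1)
```

The flat 2n-vector form is what the shape model operates on, while the n x 2 form is convenient for plotting.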

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((y2 - y1)^2 + (x2 - x1)^2)     (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful while aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the 1st shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
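A minimal sketch of Algorithm 1, simplified to translation and scaling only; the rotation part of the alignment is omitted, so this illustrates the loop structure rather than full Procrustes alignment. Function names and the toy shapes are ours.

```python
import numpy as np

def normalize(s):
    """Center a shape (n x 2) on the origin and scale it to unit size."""
    s = s - s.mean(axis=0)
    return s / np.linalg.norm(s)

def align_shapes(shapes, iters=10):
    """Algorithm 1, simplified: translation + scale, no rotation."""
    mean_shape = normalize(shapes[0])              # reference: the 1st shape
    for _ in range(iters):
        aligned = [normalize(s) for s in shapes]   # step (a), simplified
        mean_shape = normalize(np.mean(aligned, axis=0))  # steps (b)-(c)
    return aligned, mean_shape

# two shapes that differ only by translation and scale align to the same shape
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
shapes = [base, 3.0 * base + 5.0]
aligned, mean_shape = align_shapes(shapes)
```

With rotation included, step (a) would additionally solve for the best rotation of each shape onto the current mean.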

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that fits closely to the profile model. The tentative location of the landmarks is obtained from the suggested shape.

2. The shape model defines the permissible relative positions of landmarks. This introduces a constraint on the shape. So as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.

The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated with the permissible variations in it [24]:

x̂ = x̄ + Φb     (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes xi, and

Φb is the allowed variation, with Φ the matrix of eigenvectors of the training-shape covariance and b the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks.
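Equation 4.3 can be sketched as PCA on the aligned shape vectors. In this illustration (ours), Φ is taken as the eigenvectors of the sample covariance of the training shapes and b is the parameter vector; setting b = 0 reproduces the mean shape.

```python
import numpy as np

def train_shape_model(X):
    """X: one aligned 2n-dim shape vector per row.
    Returns the mean shape x_bar and eigenvectors Phi of the covariance,
    sorted by decreasing eigenvalue (largest variance mode first)."""
    xbar = X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(X.T))
    order = np.argsort(evals)[::-1]
    return xbar, evecs[:, order]

def generate_shape(xbar, phi, b):
    """x_hat = x_bar + Phi b  (Equation 4.3)."""
    return xbar + phi[:, :len(b)] @ b

# three toy training shapes (2n = 4 each): x's then y's
X = np.array([[0.0, 1.0, 0.0, 1.0],
              [0.2, 1.2, 0.0, 1.0],
              [0.4, 1.4, 0.0, 1.0]])
xbar, phi = train_shape_model(X)
mean_again = generate_shape(xbar, phi, np.zeros(2))  # b = 0 -> mean shape
```

Varying the leading entries of b then sweeps the shape along its principal modes of variation, which is what "varying the model in height and width" amounts to.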

Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The points that are perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, and so they can be described by their mean profile ḡ and the covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g - ḡ)^T Sg^-1 (g - ḡ).
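The Mahalanobis profile distance d = (g - ḡ)^T Sg^-1 (g - ḡ) can be computed directly; the numbers below are made up to show that an offset along a high-variance direction scores lower than the same offset along a low-variance one:

```python
import numpy as np

def mahalanobis2(g, gbar, Sg):
    """Squared Mahalanobis distance between a sampled profile g and the
    mean profile gbar, with profile covariance Sg."""
    d = g - gbar
    return float(d @ np.linalg.solve(Sg, d))

gbar = np.array([0.0, 0.0])
Sg = np.array([[4.0, 0.0],
               [0.0, 1.0]])  # first direction has 4x the variance
a = mahalanobis2(np.array([2.0, 0.0]), gbar, Sg)  # along high variance
b = mahalanobis2(np.array([0.0, 2.0]), gbar, Sg)  # along low variance
```

The candidate profile with the lowest d along the whisker is the one the landmark is moved to.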

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is the same as the mean shape. The shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, and not a bone X-ray, is shown).

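An image pyramid like the one in Figure 4.5 can be built by repeated 2x downsampling. The sketch below (ours) uses simple block averaging; a production ASM would typically smooth before subsampling.

```python
import numpy as np

def downsample2(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    H, W = img.shape
    return img[:H//2*2, :W//2*2].reshape(H//2, 2, W//2, 2).mean(axis=(1, 3))

def pyramid(img, levels=3):
    """Return [full-res, half-res, quarter-res, ...]."""
    out = [img]
    for _ in range(levels - 1):
        out.append(downsample2(out[-1]))
    return out

levels = pyramid(np.ones((8, 8)), 3)
```

The search starts at the coarsest level, where the shape can be found from further away, and the result seeds the search at the next finer level.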

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is a poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


As with SIFT and other detectors, principal curvature images are calculated in scale space. We first double the size of the original image to produce our initial image I11, and then produce increasingly Gaussian-smoothed images I1j with scales of σ = k^(j-1), where k = 2^(1/3) and j = 2, ..., 6. This set of images spans the first octave, consisting of the six images I11 to I16. Image I14 is downsampled to half its size to produce image I21, which becomes the first image in the second octave. We apply the same smoothing process to build the second octave, and continue to create a total of n = log2(min(w, h)) - 3 octaves, where w and h are the width and height of the doubled image, respectively. Finally, we calculate a principal curvature image Pij for each smoothed image by computing the maximum eigenvalue (Eq. 2) of the Hessian matrix at each pixel. For computational efficiency, each smoothed image and its corresponding Hessian image is computed from the previous smoothed image using an incremental Gaussian scale. Given the principal curvature scale-space images, we calculate the maximum curvature over each set of three consecutive principal curvature images to form the following set of four images in each of the n octaves:

MP12 MP13 MP14 MP15
MP22 MP23 MP24 MP25
...
MPn2 MPn3 MPn4 MPn5     (4)

where MPij = max(Pi,j-1, Pij, Pi,j+1).

Figure 2(b) shows one of the maximum curvature images, MP, created by maximizing the principal curvature at each pixel over three consecutive principal curvature images. From these maximum principal curvature images, we find the stable regions via our watershed algorithm.
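The per-pixel computation behind Pij can be sketched as follows: the larger eigenvalue of the 2x2 Hessian, estimated here (our sketch) with plain finite differences rather than the incremental Gaussian derivatives described above.

```python
import numpy as np

def principal_curvature(img):
    """Maximum eigenvalue of the 2x2 Hessian [[Ixx, Ixy], [Ixy, Iyy]] at
    each pixel, via central finite differences. For a symmetric 2x2 matrix
    the larger eigenvalue is (Ixx+Iyy)/2 + sqrt(((Ixx-Iyy)/2)^2 + Ixy^2)."""
    Iy, Ix = np.gradient(img)        # first derivatives (axis 0 = y)
    Iyy, _ = np.gradient(Iy)
    Ixy, Ixx = np.gradient(Ix)
    half_tr = (Ixx + Iyy) / 2.0
    root = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
    return half_tr + root

# a dark horizontal line on a bright background gives a strong
# positive response along the line and ~0 in flat regions
img = np.ones((9, 9))
img[4, :] = 0.0
P = principal_curvature(img)
```

In the full detector this map is computed for every smoothed image Iij in the scale space, then maximized over triples of scales to form MPij.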

3.2 Enhanced Watershed Region Detection

The watershed transform is an efficient technique that is widely employed for image segmentation. It is normally applied either to an intensity image directly or to the gradient magnitude of an image. We instead apply the watershed transform to the principal curvature image. However, the watershed transform is sensitive to noise (and other small perturbations) in the intensity image. A consequence of this is that small image variations form local minima that result in many small watershed regions. Figure 3(a) shows the over-segmentation results when the watershed algorithm is applied directly to the principal curvature image in Figure 2(b). To achieve a more stable watershed segmentation, we first apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ are grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins.

Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations. Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may potentially cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude.

In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of the major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images. Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit with ellipses, via PCA, that have the same second moment as the watershed regions (Fig. 2(e)).
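As a concrete illustration, the eigenvector-flow hysteresis step described above can be sketched in Python. The high threshold (0.04) and the 0.2/0.7 low-to-high ratios come from the text; the uniformity cutoff `uniform_tol` and the plain flood-fill implementation are assumptions, since the text only says the average dot product must be "high enough".

```python
import math
from collections import deque

def eig_flow_hysteresis(mag, theta, high=0.04, uniform_tol=0.9):
    """Binarize a principal-curvature magnitude image `mag` (list of rows)
    with eigenvector-flow hysteresis. `theta` holds the major-eigenvector
    orientation (radians) at each pixel. Pixels >= `high` seed the flood
    fill; a pixel's low threshold is 0.2*high when its eigenvector agrees
    with its 8 neighbours (average |dot| >= uniform_tol), else 0.7*high."""
    h, w = len(mag), len(mag[0])
    nbrs = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

    def low_thresh(y, x):
        dots, n = 0.0, 0
        for dy, dx in nbrs:
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                # |cos| of the angle between the two unit eigenvectors
                dots += abs(math.cos(theta[y][x] - theta[yy][xx]))
                n += 1
        ratio = 0.2 if n and dots / n >= uniform_tol else 0.7
        return high * ratio

    out = [[0] * w for _ in range(h)]
    q = deque((y, x) for y in range(h) for x in range(w) if mag[y][x] >= high)
    for y, x in q:          # mark the strong-response seeds
        out[y][x] = 1
    while q:                # grow seeds through pixels above their low threshold
        y, x = q.popleft()
        for dy, dx in nbrs:
            yy, xx = y + dy, x + dx
            if (0 <= yy < h and 0 <= xx < w and not out[yy][xx]
                    and mag[yy][xx] >= low_thresh(yy, xx)):
                out[yy][xx] = 1
                q.append((yy, xx))
    return out
```

With uniform eigenvector directions along a weak ridge, the low threshold drops to 0.008 and the whole ridge survives; with incoherent directions the 0.028 threshold suppresses it.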

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated the same as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
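A minimal sketch of this scale-selection step, assuming regions are represented as pixel sets rather than the fitted ellipses of [19], and assuming an overlap-error tolerance of 0.3 (the text defers the exact criterion to [19]):

```python
def overlap_error(a, b):
    """1 - |A∩B| / |A∪B| for two regions given as sets of pixel coordinates.
    (The detector computes this on fitted ellipses as in [19]; pixel sets
    are a simplification for illustration.)"""
    union = len(a | b)
    return 1.0 - len(a & b) / union if union else 1.0

def stable_regions(scales, max_err=0.3):
    """Keep a region from the middle scale of each consecutive triplet if a
    region with small overlap error exists in both the previous and the
    next scale. `scales[s]` is the list of regions detected at scale s."""
    keep = []
    for s in range(1, len(scales) - 1):
        for r in scales[s]:
            ok_prev = any(overlap_error(r, q) <= max_err for q in scales[s - 1])
            ok_next = any(overlap_error(r, q) <= max_err for q in scales[s + 1])
            if ok_prev and ok_next:
                keep.append((s, r))
    return keep
```

A region present (with minor variation) at three consecutive scales is kept; a region seen at only one scale is discarded.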

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of a painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step each pixel of the image is assigned a label corresponding to its assigned cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
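The k-nearest-neighbor inpainting step can be sketched as follows for a single-channel image; the choice of k and the brute-force nearest-neighbor search are illustrative assumptions, not the De-pict implementation:

```python
def knn_inpaint(img, mask, k=3):
    """Fill pixels where mask == 1 with the mean value of the k nearest
    known pixels (nearest in image coordinates): a minimal stand-in for
    the k-nearest-neighbour inpainting step. `img` and `mask` are lists
    of rows; known pixels are those with mask == 0."""
    known = [(y, x) for y, row in enumerate(mask) for x, m in enumerate(row) if not m]
    out = [row[:] for row in img]
    for y, row in enumerate(mask):
        for x, m in enumerate(row):
            if m:
                # brute-force search for the k closest known pixels
                near = sorted(known, key=lambda p: (p[0] - y) ** 2 + (p[1] - x) ** 2)[:k]
                out[y][x] = sum(img[py][px] for py, px in near) / len(near)
    return out
```

For an RGB image this would be run per channel, mirroring the per-channel treatment used later for the linear-program inpainting.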

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and a spatial-coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ‖f_p − c_{L_p}‖₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, λ is a regularization weight, and T[·] is the delta function (1 when its argument holds, 0 otherwise).

The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels into the different clusters. After spatially coherent refinement we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
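The energy of Eq. (1) can be evaluated directly for any candidate labeling. The sketch below assumes unit edge lengths and a placeholder weight `lam`; the actual minimization in the text is done with graph cuts [12], which this evaluator does not replace:

```python
def segmentation_energy(labels, feats, centers, neighbors, lam=1.0):
    """Evaluate the energy of Eq. (1) for a given labeling.
    labels[p]  : cluster label of pixel p
    feats[p]   : color feature of pixel p (a tuple)
    centers[i] : mean chromatic vector of cluster i
    neighbors  : list of (p, q, edge_len) pairs in the neighborhood system N
    lam        : regularization weight (placeholder; not given in the text)."""
    # data term: squared distance from each pixel to its cluster center
    data = sum(
        sum((f - c) ** 2 for f, c in zip(feats[p], centers[labels[p]]))
        for p in range(len(feats))
    )
    # smoothness term: edge length wherever neighboring labels disagree
    smooth = sum(e for p, q, e in neighbors if labels[p] != labels[q])
    return data + lam * smooth
```

Comparing candidate labelings with this function shows the trade-off the E-step resolves: a labeling may lower the data term while paying a smoothness penalty on every label boundary.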

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of the texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that regions and level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and instructed on breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent is delivered at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragm. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B Method

The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.

The available CTA images number 250, all 2D. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to keep the parts brighter than 700 HU; after thresholding, the new images are logical (binary):

Thresh = image > 700

Each of these new images contains subsegmental vessels in the lung region. In the second step, these vessels are removed as follows: each 2D image is considered one by one, and every component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items with fewer than 1000 pixels are removed from the image (Figure 3).
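The vessel-removal step (connected-component labeling followed by discarding components under 1000 pixels) might look like this in Python; 4-connectivity is an assumption, as the text does not state which connectivity the labeling used:

```python
from collections import deque

def remove_small_components(img, min_size=1000):
    """Label 4-connected components of a binary image (list of rows of 0/1)
    and zero out every component smaller than `min_size` pixels, as done
    here to suppress sub-segmental vessels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                # flood-fill one component, collecting its pixels
                comp, q = [], deque([(y, x)])
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < min_size:   # discard undersized components
                    for cy, cx in comp:
                        out[cy][cx] = 0
    return out
```

The same routine, applied with the size test inverted (keep only the largest component), implements the subsequent body-extraction step.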

Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image. The image is then inverted, so every "0" turns into "1" and every "1" turns into "0" (Figure 4). Since the parts outside the body in the image shown in Figure 4 touch the image border (pixel row or column 1 or 512), the parts meeting this condition are removed, and the lungs and airways appear as in Figure 5 (segmentation of lung and airway). Because the airways in Figure 5 are very small compared to the lung size, each image is labeled with the connected-component labeling algorithm and the components with fewer than 1000 pixels are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so that the edges of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms. MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The operations for image processing allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image-segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig. 3a,b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image with colormap scaling' ('imagesc') function displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, basic ASM models work the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used for measuring the size of the test image, which will help with the automatic initialization (discussed in 4.4).
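The shape representation, centroid, and size defined above can be written directly (the function names are my own, not from the thesis code):

```python
def shape_to_vector(points):
    """Pack n (x, y) points into the 2n x 1 form used in this thesis:
    all x coordinates first, then all y coordinates."""
    return [p[0] for p in points] + [p[1] for p in points]

def centroid(points):
    """Mean of the point positions."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def shape_size(points):
    """Root mean square distance of the points from the centroid."""
    cx, cy = centroid(points)
    return (sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points)) ** 0.5
```

For the unit square with corners (0,0), (2,0), (2,2), (0,2) the centroid is (1, 1) and the size is √2.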

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
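Algorithm 1 can be sketched by treating 2-D points as complex numbers, which gives the optimal scale-and-rotation alignment in closed form. The fixed iteration count and the simplified constraint step (re-centering and re-scaling the mean, without re-aligning it to x̄0) are assumptions of this sketch, not the thesis implementation:

```python
def center(shape):
    """Translate a shape (list of complex points x + y*1j) so its centroid is at the origin."""
    c = sum(shape) / len(shape)
    return [p - c for p in shape]

def unit_size(shape):
    """Scale a centred shape to unit RMS size."""
    s = (sum(abs(p) ** 2 for p in shape) / len(shape)) ** 0.5
    return [p / s for p in shape]

def align(shape, ref):
    """Best similarity transform (scale + rotation) mapping `shape` onto `ref`,
    obtained in closed form as a single complex multiplier."""
    a = sum(r * p.conjugate() for p, r in zip(shape, ref)) / \
        sum(abs(p) ** 2 for p in shape)
    return [a * p for p in shape]

def align_shapes(shapes, iters=10):
    """Algorithm 1: iteratively align all (centred) shapes to the running
    mean shape, re-estimating and re-normalising the mean each pass."""
    shapes = [center(s) for s in shapes]
    mean = unit_size(shapes[0])
    for _ in range(iters):
        shapes = [align(s, mean) for s in shapes]
        new_mean = [sum(ps) / len(shapes) for ps in zip(*shapes)]
        mean = unit_size(center(new_mean))
    return shapes, mean
```

Two copies of the same square, one rotated 90° and doubled in size, align exactly onto the unit-size mean after the first pass.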

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in 1.2) and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. Training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So while the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant. The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb

where x̂ is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes x_i), Φ holds the principal modes of variation of the training shapes, and b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix S_g.

4.2.3 Searching the test image

After training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ).

If the model is initialized correctly (discussed in 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is the same as the mean shape. The shape model ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model thus searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray); the sizes of the images are given relative to the first image.
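The profile search with the Mahalanobis distance can be sketched as follows; the inverse covariance is passed in precomputed, which is an implementation convenience rather than anything the text specifies:

```python
def mahalanobis_sq(g, g_mean, s_inv):
    """Squared Mahalanobis distance (g - ḡ)ᵀ S⁻¹ (g - ḡ) between a sampled
    profile g and the mean profile ḡ, with S⁻¹ (the inverse profile
    covariance) given as a list of rows."""
    d = [a - b for a, b in zip(g, g_mean)]
    n = len(d)
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(n) for j in range(n))

def best_offset(profiles, g_mean, s_inv):
    """Pick the offset along the whisker whose sampled profile is closest
    to the mean profile; profiles[k] is the profile sampled at offset k."""
    return min(range(len(profiles)),
               key=lambda k: mahalanobis_sq(profiles[k], g_mean, s_inv))
```

With an identity covariance this reduces to squared Euclidean distance, which makes the offset selection easy to check by hand.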


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using landmark points. The ASM starts off where the mean shape is located, but this may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


apply a grayscale morphological closing followed by hysteresis thresholding. The grayscale morphological closing operation is defined as f • b = (f ⊕ b) ⊖ b, where f is the image MP from Eq. 4, b is a 5 × 5 disk-shaped structuring element, and ⊕ and ⊖ denote grayscale dilation and erosion, respectively. The closing operation removes small "potholes" in the principal curvature terrain, thus eliminating many local minima that result from noise and that would otherwise produce watershed catchment basins. Beyond the small (in terms of area of influence) local minima, there are other variations that have larger zones of influence and that are not reclaimed by the morphological closing. To further eliminate spurious or unstable watershed regions, we threshold the principal curvature image to create a clean, binarized principal curvature image. However, rather than apply a straight threshold, or even hysteresis thresholding (both of which can still miss weak image structures), we apply a more robust eigenvector-guided hysteresis thresholding to help link structural cues and remove perturbations.

Since the eigenvalues of the Hessian matrix are directly related to the signal strength (i.e., the line or edge contrast), the principal curvature image may at times become weak due to low-contrast portions of an edge or curvilinear structure. These low-contrast segments may cause gaps in the thresholded principal curvature image, which in turn cause watershed regions to merge that should otherwise be separate. However, the directions of the eigenvectors provide a strong indication of where curvilinear structures appear, and they are more robust to these intensity perturbations than is the eigenvalue magnitude. In eigenvector-flow hysteresis thresholding there are two thresholds (high and low), just as in traditional hysteresis thresholding. The high threshold (set at 0.04) indicates a strong principal curvature response. Pixels with a strong response act as seeds that expand to include connected pixels that are above the low threshold. Unlike traditional hysteresis thresholding, our low threshold is a function of the support that each pixel's major eigenvector receives from neighboring pixels. Each pixel's low threshold is set by comparing the direction of its major (or minor) eigenvector to the directions of the 8 adjacent pixels' major (or minor) eigenvectors. This can be done by taking the absolute value of the inner product of a pixel's normalized eigenvector with that of each neighbor. If the average dot product over all neighbors is high enough, we set the low-to-high threshold ratio to 0.2 (for a low threshold of 0.04 · 0.2 = 0.008); otherwise the low-to-high ratio is set to 0.7 (giving a low threshold of 0.028). The threshold values are based on visual inspection of detection results on many images.

Figure 4 illustrates how the eigenvector flow supports an otherwise weak region. The red arrows are the major eigenvectors and the yellow arrows are the minor eigenvectors; to improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, the eigenvalue magnitudes are small and the ridge there is almost invisible. Nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig. 3(b)).

The final step is to perform the watershed transform on the clean binary image (Fig. 2(c)). Since the image is binary, all black (0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit, via PCA, with ellipses that have the same second moment as the watershed regions (Fig. 2(e)).
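The eigenvector-guided low-threshold selection and seed growth described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the principal curvature response `curv` and the major-eigenvector orientation `theta` (in radians) have already been computed, and it reuses the 0.04 high threshold and the 0.2/0.7 ratios from the text.

```python
import numpy as np
from collections import deque

def eigenvector_flow_hysteresis(curv, theta, high=0.04, agree=0.9):
    """Binarize a principal-curvature image with an eigenvector-guided low
    threshold: pixels whose major-eigenvector direction agrees with their
    8 neighbours get the permissive 0.2 ratio, others the strict 0.7."""
    h, w = curv.shape
    low = np.empty_like(curv)
    for y in range(h):
        for x in range(w):
            dots = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # |cos| of the angle between the two eigenvectors
                        dots.append(abs(np.cos(theta[y, x] - theta[ny, nx])))
            ratio = 0.2 if np.mean(dots) >= agree else 0.7
            low[y, x] = high * ratio
    # standard hysteresis growth: seeds above `high` expand into connected
    # pixels that exceed their own (direction-dependent) low threshold
    out = curv >= high
    queue = deque(zip(*np.nonzero(out)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and not out[ny, nx]
                        and curv[ny, nx] >= low[ny, nx]):
                    out[ny, nx] = True
                    queue.append((ny, nx))
    return out
```

A weak pixel on a ridge with uniform eigenvector directions survives (low threshold 0.008), while the same pixel surrounded by incoherent directions would need to reach 0.028.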

3.3 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq. 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempt to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
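The keep-if-stable-across-three-scales rule can be sketched with a pixel-mask intersection-over-union standing in for the elliptical overlap measure of [19] (an assumption made here to keep the example short; masks are flat boolean lists over a common grid):

```python
def overlap_error(mask_a, mask_b):
    """1 - |A ∩ B| / |A ∪ B|: the overlap error between two region masks."""
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return 1.0 - inter / union if union else 1.0

def stable_across_scales(masks_per_scale, max_err=0.3):
    """Keep a region only if a matching detection (overlap error below
    max_err) exists at both the previous and the next scale."""
    stable = []
    for s in range(1, len(masks_per_scale) - 1):
        for m in masks_per_scale[s]:
            if (any(overlap_error(m, p) <= max_err for p in masks_per_scale[s - 1])
                    and any(overlap_error(m, n) <= max_err for n in masks_per_scale[s + 1])):
                stable.append(m)
    return stable
```

The `max_err` tolerance is illustrative; the text does not state the exact value used.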

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering the layers of strokes involves three main steps:

1. Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes.

2. Identify the current top layer.

3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
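The color clustering at the heart of this step can be sketched as follows. This is a minimal k-means on RGB vectors, standing in for the k-means-plus-complete-linkage combination actually used; the deterministic seeding is an assumption for illustration.

```python
def kmeans(pixels, k, iters=20):
    """Plain k-means on RGB feature vectors: groups pixels into
    chromatically consistent layer candidates. Centers are seeded
    with the first k distinct pixels, so the result is deterministic."""
    centers = []
    for p in pixels:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    labels = [0] * len(pixels)
    for _ in range(iters):
        # assignment step: each pixel goes to its nearest center in RGB space
        for i, p in enumerate(pixels):
            labels[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
        # update step: each center becomes the mean of its assigned pixels
        for c in range(k):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers
```

Each resulting label plays the role of one chromatic cluster, described by its mean chromatic feature vector as in the text.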

3.1 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M way [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L Σ_p ‖f_p − c_{L_p}‖₂² + λ Σ_{{p,q}∈N} |e_{pq}| T[L_p ≠ L_q]    (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_{pq}| is the edge length between p and q, T is the delta function, and λ weights the spatial regularization. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where neighboring pixels belong to different clusters. By fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to the different clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
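The E-M loop can be sketched compactly. Since a full graph-cut E-step would be long, this illustration substitutes iterated conditional modes (ICM) for the graph-cut solver of [12] (an assumption for brevity); the energy being minimized is the data term plus a Potts-style neighborhood penalty, in the spirit of Eq. (1).

```python
def em_segment(img, centers, lam=1.0, iters=5):
    """E-M sketch of Eq. (1): ICM label refinement (a simple stand-in for
    the graph-cut E-step) alternated with re-estimating the k mean
    chromatic vectors (M-step). `img` is a 2-D grid of feature tuples."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    k = len(centers)

    def unary(p, c):
        # squared distance between a pixel feature and a cluster center
        return sum((a - b) ** 2 for a, b in zip(p, centers[c]))

    for _ in range(iters):
        # E-step (approximate): pick the label minimising the data term
        # plus a penalty for disagreeing with 4-connected neighbours
        for y in range(h):
            for x in range(w):
                def cost(c):
                    pen = sum(1 for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                              if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != c)
                    return unary(img[y][x], c) + lam * pen
                labels[y][x] = min(range(k), key=cost)
        # M-step: re-estimate each center as the mean feature of its pixels
        for c in range(k):
            members = [img[y][x] for y in range(h) for x in range(w) if labels[y][x] == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels
```

ICM only finds a local minimum of the energy, which is why the paper uses graph cuts; the alternation structure (refine labels, then re-estimate centers) is the same.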

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well-suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless, so curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense. Specifically, we impose a discrete grid of a certain connectivity on the image (8-connectivity in Fig. 4, where a possible reconstruction of a level line with intensity 100 is shown). The edges constitute line segments, and pairs of line segments are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.
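The "sum of angle changes at all vertices" approximation can be illustrated on a polyline. This sketch keeps only the angle term; the linear-programming formulation of [7] additionally weights each vertex by edge length, which is omitted here for brevity.

```python
import math

def total_turning(points):
    """Discrete stand-in for the curvature of a level line: sum the
    absolute exterior (turning) angles at the interior vertices of a
    polyline given as a list of (x, y) points."""
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        d = a2 - a1
        # wrap the angle difference into (-pi, pi]
        while d <= -math.pi:
            d += 2 * math.pi
        while d > math.pi:
            d -= 2 * math.pi
        total += abs(d)
    return total
```

A straight level line costs nothing, while every bend adds its turning angle, which is exactly why the linear program prefers smooth continuations of level lines across the damaged region.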

II MATERIALS AND METHODS

A Data Retrieval

In this study, data were collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent, injected at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, and pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in the coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B Method

The stages followed in performing lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to retain the parts brighter than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700

In each of these new images, subsegmental vessels remain in the lung region. In the second step, the following method is used to get rid of these vessels: first, each 2D image is considered one by one, and each component in the image is labeled with a connected component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled again with the connected component labeling algorithm. The biggest component, which is logical 1, is the patient's body. This biggest component is kept, the other parts are removed from the image, and the complement is then taken, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach columns 1 or 512 and are logical 1, the parts meeting this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected component labeling algorithm, and the components whose pixel counts are below 1000 are determined to be airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so the edges of the lung and airway regions are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6c).
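The labeling-and-size-filtering steps above can be sketched in a language-neutral way. This pure-Python flood fill is a stand-in for MATLAB's connected component labeling (e.g., bwlabel), with the 1000-pixel size cut shown as a parameter; 4-connectivity is an assumption, since the text does not state which connectivity was used.

```python
from collections import deque

def label_components(mask):
    """4-connected component labelling of a binary 2-D grid."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                current += 1           # start a new component
                labels[y][x] = current
                q = deque([(y, x)])
                while q:
                    cy, cx = q.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = current
                            q.append((ny, nx))
    return labels, current

def remove_small(mask, min_size):
    """Drop every connected component with fewer than `min_size` pixels,
    mirroring the vessel-removal step (min_size = 1000 in the text)."""
    labels, n = label_components(mask)
    sizes = [0] * (n + 1)
    for row in labels:
        for l in row:
            sizes[l] += 1
    return [[mask[y][x] and sizes[labels[y][x]] >= min_size
             for x in range(len(mask[0]))] for y in range(len(mask))]
```

The thresholding step itself is just `mask = hu > 700` elementwise, matching `Thresh = image > 700` above.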

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as the mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b - Area graph of an X-ray CT brain scan

The 3-D Surface Plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, alpha(0.4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of an X-ray CT brain scan generated with histogram values, mesh
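What meshgrid produces can be reproduced in a few lines. This is a plain-Python sketch of the behaviour described above, not MATLAB code:

```python
def meshgrid(xs, ys):
    """What MATLAB's meshgrid computes: every row of X is a copy of xs and
    every column of Y is a copy of ys, so (X[i][j], Y[i][j]) sweeps the
    Cartesian product of the two coordinate axes."""
    X = [list(xs) for _ in ys]
    Y = [[y] * len(xs) for y in ys]
    return X, Y
```

Evaluating a two-variable function elementwise over X and Y then yields the surface that the 3-D plots above display.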

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of an X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image' function with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) (or surfc) function creates a graph of f(x, y), where f is a string representing a mathematical function of two variables, such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 - (a) Impulse Response, (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array in which the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 - Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x₁, y₁) and (x₂, y₂). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x₂ − x₁)² + (y₂ − y₁)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄₀, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align it to x̄₀ and scale it to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape
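Algorithm 1 can be sketched compactly by representing each 2-D point as a complex number, for which the optimal scale-plus-rotation alignment has a closed form. This is an illustrative reading of the algorithm, assuming similarity alignment (translation, scale, rotation), not the thesis code:

```python
def center(shape):
    """Translate a shape (points as complex numbers) so its centroid is the origin."""
    c = sum(shape) / len(shape)
    return [x - c for x in shape]

def align(shape, target):
    """Least-squares similarity (scale + rotation) alignment of one centred
    shape onto another: s = sum(conj(x_i) * t_i) / sum(|x_i|^2)."""
    s = sum(x.conjugate() * t for x, t in zip(shape, target)) / sum(abs(x) ** 2 for x in shape)
    return [s * x for x in shape]

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1: centre all shapes, fix the first (unit-sized)
    as the initial mean, then alternately align to the mean and re-estimate it."""
    unit = lambda s: [x / (sum(abs(x) ** 2 for x in s) ** 0.5) for x in s]
    shapes = [center(s) for s in shapes]
    mean = unit(shapes[0])                                             # steps 1-3
    for _ in range(iters):
        shapes = [align(s, mean) for s in shapes]                      # step 4(a)
        mean = [sum(col) / len(shapes) for col in zip(*shapes)]        # step 4(b)
        mean = unit(center(mean))                                      # step 4(c)
    return shapes, mean
```

Constraining the mean at each pass (step 4c) is what keeps the iteration from drifting in scale or rotation.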

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.

Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests using different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Pb    (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes xᵢ,

P is the matrix of the principal eigenvectors of the covariance of the training shapes, and b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called 'whiskers', and they help the profile model analyze the area around the landmark points.
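Building the shape model of Equation 4.3 and generating new shapes by varying b can be sketched as follows. This is the standard PCA construction, assuming the training shapes have already been aligned; the function and variable names are illustrative, not from the thesis code.

```python
import numpy as np

def build_shape_model(shapes):
    """PCA shape model from aligned training shapes, each a 2n-vector
    (x coordinates followed by y coordinates). Returns the mean shape
    x_bar, the eigenvector matrix P, and the eigenvalues, so that new
    shapes are x_hat = x_bar + P b."""
    X = np.asarray(shapes, dtype=float)
    x_bar = X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(X.T))
    order = np.argsort(vals)[::-1]      # sort modes by variance, largest first
    return x_bar, vecs[:, order], vals[order]

def generate_shape(x_bar, P, b):
    """Generate a new shape x_hat = x_bar + P b, using the first len(b) modes."""
    b = np.asarray(b, dtype=float)
    return x_bar + P[:, :len(b)] @ b
```

In practice each parameter is typically clamped (e.g. |bᵢ| ≤ 3√λᵢ) so that generated shapes stay within the variation seen in training.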

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix S_g.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ S_g⁻¹ (g − ḡ)
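The profile search can be sketched as follows: compute the Mahalanobis distance of each candidate profile sampled along the whisker, and keep the offset with the smallest distance. The function names are illustrative, not from the thesis code.

```python
import numpy as np

def mahalanobis(g, g_bar, S):
    """Mahalanobis distance between a sampled profile g and the model's
    mean profile g_bar with covariance S: (g - g_bar)^T S^{-1} (g - g_bar)."""
    d = np.asarray(g, float) - np.asarray(g_bar, float)
    return float(d @ np.linalg.solve(S, d))

def best_offset(profiles, g_bar, S):
    """Among candidate profiles sampled at offsets along the whisker,
    pick the index of the one with the smallest Mahalanobis distance."""
    return min(range(len(profiles)), key=lambda i: mahalanobis(profiles[i], g_bar, S))
```

Using S⁻¹ rather than the plain Euclidean distance means directions in which the training profiles vary a lot are penalized less than directions in which they are nearly constant.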

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and then the shape model confirms that the shape remains consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile matches, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is shown).


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and mean error (defined in Section 45) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile and the model perform better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 46 gives an overview of the ASM: Figure 46a shows the unaligned shapes learnt from the training images, and Figure 46b displays the aligned shapes.

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. But the ASM starts where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, as the profile model looks for regions similar to those of the training images in regions away from the bone. It is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 47a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


eigenvectors. To improve visibility, we draw them at every fourth pixel. At the point indicated by the large white arrow, we see that the eigenvalue magnitudes are small and the ridge there is almost invisible; nonetheless, the directions of the eigenvectors are quite uniform. This eigenvector-based active thresholding process yields better performance in building continuous ridges and in handling perturbations, which results in more stable regions (Fig 3(b)). The final step is to perform the watershed transform on the clean binary image (Fig 2(c)). Since the image is binary, all black (or 0-valued) pixels become catchment basins, and the midlines of the thresholded white ridge pixels become watershed lines if they separate two distinct catchment basins. To define the interest regions of the PCBR detector at one scale, the resulting segmented regions are fit, via PCA, with ellipses that have the same second moment as the watershed regions (Fig 2(e)).

33 Stable Regions Across Scale

Computing the maximum principal curvature image (as in Eq 4) is only one way to achieve stable region detections. To further improve robustness, we adopt a key idea from MSER and keep only those regions that can be detected in at least three consecutive scales. Similar to the process of selecting stable regions via thresholding in MSER, we select regions that are stable across local scale changes. To achieve this, we compute the overlap error of the detected regions across each triplet of consecutive scales in every octave. The overlap error is calculated in the same way as in [19]. Overlapping regions that are detected at different scales normally exhibit some variation. This variation is valuable for object recognition because it provides multiple descriptions of the same pattern. An object category normally exhibits large within-class variation in the same area. Since detectors have difficulty locating the interest area accurately, rather than attempting to detect the "correct" region and extract a single descriptor vector, it is better to extract multiple descriptors for several overlapping regions, provided that these descriptors are handled properly by the classifier.
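The overlap criterion above can be sketched concretely. This is a toy illustration in the spirit of the protocol of [19] (rasterized pixel sets and axis-aligned boxes standing in for the fitted ellipses): the overlap error is one minus intersection over union, and a region is kept only when its error against the matching region at an adjacent scale is small.

```python
# Toy sketch of the overlap error between two detected regions,
# rasterized as pixel sets: error = 1 - |A ∩ B| / |A ∪ B|.
def overlap_error(region_a, region_b):
    a, b = set(region_a), set(region_b)
    return 1.0 - len(a & b) / len(a | b)

def box(x0, y0, x1, y1):
    """Axis-aligned toy 'region' standing in for a fitted ellipse."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

r1 = box(0, 0, 10, 10)          # 100 pixels at scale s
r2 = box(5, 0, 15, 10)          # 100 pixels at scale s+1, half overlapping
err = overlap_error(r1, r2)     # intersection 50, union 150 -> error = 2/3

# Keep the region only if it is re-detected with small overlap error
# at the adjacent scale (0.3 is an illustrative threshold).
stable = err < 0.3
```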

2 BACKGROUND AND RELATED WORK

Consider an RGB image of a passage in a painting consisting of open brush strokes, that is, where lower-layer strokes are visible. The task of recovering layers of strokes involves mainly three steps:

1 Partition the image into regions with consistent colors/shapes corresponding to different layers of strokes

2 Identify the current top layer

3 Inpaint the regions of the top layer

The three steps are repeated as long as more than two layers remain.

21 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage as a clustering step to obtain chromatically consistent regions. Under the assumption that brush strokes of the same layer of the painting are of similar colors, such regions in different clusters are good representatives of brush strokes at different layers in the painting, as shown in Fig 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by the mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by the k-nearest-neighbor algorithm.

31 Spatially coherent segmentation

We improve the layer segmentation by incorporating k-means and spatial coherence regularity in an iterative E-M manner [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||²  +  λ Σ_{{p,q} ∈ N} |e_pq| · T[L_p ≠ L_q]    (1)

where L_p ∈ {1, ..., k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model for cluster i, |e_pq| is the edge length between p and q, and T is the delta function.

The first term in Eq (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes the situation where pixels in the same neighborhood belong to different clusters. Fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels across the clusters. After this spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
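The E-M loop above can be illustrated on a tiny example. Note the hedge: the actual E-step is solved with graph cuts [12]; the sketch below substitutes a greedy per-pixel sweep (ICM) on a 1-D strip with unit edge lengths and hypothetical feature values, so it shows the structure of the iteration, not the real solver.

```python
# Simplified sketch of the E-M segmentation loop: appearance cost plus
# a spatial penalty for label disagreement between neighbors.
def em_segment(features, lam=0.1, iters=10):
    # Deterministic init for k = 2 clusters: darkest and brightest value.
    centers = [min(features), max(features)]
    k = len(centers)
    labels = [min(range(k), key=lambda c: (f - centers[c]) ** 2) for f in features]
    for _ in range(iters):
        # E-step (approximate, ICM instead of graph cut): each pixel takes
        # the label minimizing appearance cost + neighbor-disagreement penalty.
        for p in range(len(features)):
            def cost(c):
                e = (features[p] - centers[c]) ** 2
                for q in (p - 1, p + 1):
                    if 0 <= q < len(features) and labels[q] != c:
                        e += lam
                return e
            labels[p] = min(range(k), key=cost)
        # M-step: re-estimate each center as the mean of its members.
        for c in range(k):
            members = [f for f, l in zip(features, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers

feats = [0.1, 0.12, 0.9, 0.5, 0.95, 0.92]
labels, centers = em_segment(feats)
# The ambiguous pixel 0.5 (slightly closer to the dark cluster) is pulled
# into its bright neighbors' cluster: labels == [0, 0, 1, 1, 1, 1]
```

Without the spatial term the 0.5 pixel would be labeled dark; the pairwise penalty is exactly what makes the segmentation spatially coherent.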

32 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited for inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless. Therefore, curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structures of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of certain connectivity (8-connectivity in Fig 4) on the image. The edges constitute line segments, and line segment pairs are used to represent level lines; the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting of the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield Units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec, with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification

reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragm. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated in the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B Method

The stages followed while doing lung segmentation from CTA images in this work are shown in Figure 1.

The CTA data at hand consists of 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so that only parts brighter than 700 HU are retained. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700

In each of these new images, subsegmental vessels exist in the lung region. At the second step, the following method has been used to get rid of these vessels: each 2D image is considered one by one, and each component in the image is labeled with the "connected component labeling algorithm". Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled with the "connected component labeling algorithm". The biggest component, which is logical 1, is the patient's body. This biggest component is kept, the other parts are removed from the image, and the complement is then taken, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch pixel row or column 1 or 512 (the image border) and are logical 1, the components satisfying this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the "connected component labeling algorithm", and the components whose pixel counts are below 1000 are identified as airways and removed from the image. The image now in hand is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so the edges of the lung and airway region are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
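The pipeline above (threshold, label, drop small components, invert, drop border-touching components) can be sketched on a toy slice. This is a simplified illustration with hypothetical data: an 8x8 "CT" with 4-connectivity labeling stands in for the thesis's 512x512 slices, and the size thresholds are scaled down accordingly.

```python
# Toy sketch of the lung-segmentation steps: a bright "body" (900 HU)
# surrounding two dark "lungs" (100 HU), with one bright vessel speck.
from collections import deque

def threshold(img, t):
    return [[1 if v > t else 0 for v in row] for row in img]

def label_components(mask):
    """Connected-component labeling (4-connectivity) by BFS."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    nxt = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                nxt += 1
                labels[y][x] = nxt
                q = deque([(y, x)])
                while q:
                    cy, cx = q.popleft()
                    for ny, nz in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nz < w and mask[ny][nz] and not labels[ny][nz]:
                            labels[ny][nz] = nxt
                            q.append((ny, nz))
    return labels, nxt

def remove_small(mask, min_pixels):
    labels, _ = label_components(mask)
    sizes = {}
    for row in labels:
        for l in row:
            if l:
                sizes[l] = sizes.get(l, 0) + 1
    return [[1 if l and sizes[l] >= min_pixels else 0 for l in row] for row in labels]

img = [[900] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in (1, 2, 3):
        img[y][x] = 100          # left "lung"
    for x in (5, 6):
        img[y][x] = 100          # right "lung"
img[3][2] = 900                  # small bright vessel inside the left lung

body = threshold(img, 700)       # step 1: keep parts brighter than 700 HU
body = remove_small(body, 4)     # step 2: drop tiny bright specks (vessels)
lungs = [[1 - v for v in row] for row in body]   # step 3: invert
labels, _ = label_components(lungs)              # step 4: drop border-touching air
border = {labels[y][x] for y in range(8) for x in range(8)
          if (y in (0, 7) or x in (0, 7)) and labels[y][x]}
lungs = [[1 if l and l not in border else 0 for l in row] for row in labels]
```

On this toy slice the vessel speck is absorbed into the left lung and nothing touches the border, leaving the two lung regions (20 pixels total).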

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, which furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Fig 3a, b).

Figure 3a - The Histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4 a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4 a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
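The meshgrid behavior described above is easy to reproduce; a minimal sketch (in Python rather than MATLAB, with toy vectors):

```python
# What 'meshgrid' does with vectors x and y: rows of X are copies of x,
# columns of Y are copies of y, so X[i][j], Y[i][j] enumerate the grid.
def meshgrid(x, y):
    X = [list(x) for _ in y]           # one copy of x per row
    Y = [[v] * len(x) for v in y]      # each row filled with one y value
    return X, Y

X, Y = meshgrid([1, 2, 3], [10, 20])
# X == [[1, 2, 3], [1, 2, 3]]
# Y == [[10, 10, 10], [20, 20, 20]]

# Evaluate f(x, y) = x + y over the whole grid:
Z = [[xv + yv for xv, yv in zip(rx, ry)] for rx, ry in zip(X, Y)]
# Z == [[11, 12, 13], [21, 22, 23]]
```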

3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
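The magnitude response that FVTool plots for a filter with numerator b and denominator a can be sketched by evaluating H(e^jw) = B(e^jw) / A(e^jw) directly on the unit circle. A minimal illustration (Python rather than MATLAB, with a hypothetical 4-tap moving-average filter):

```python
# Evaluate the frequency response H(e^jw) = B(e^jw) / A(e^jw) of a
# digital filter with coefficients b (numerator) and a (denominator).
import cmath
import math

def freq_response(b, a, n_points=8):
    ws = [math.pi * k / (n_points - 1) for k in range(n_points)]
    H = []
    for w in ws:
        z = cmath.exp(1j * w)
        num = sum(bk * z ** (-k) for k, bk in enumerate(b))
        den = sum(ak * z ** (-k) for k, ak in enumerate(a))
        H.append(num / den)
    return ws, H

b = [0.25, 0.25, 0.25, 0.25]   # 4-tap moving average (FIR, so a = [1])
a = [1.0]
ws, H = freq_response(b, a)
mags = [abs(h) for h in H]
# DC gain of the moving average is 1, and the response falls to (near) 0
# at w = pi, the lowpass behavior FVTool would plot.
```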

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, basic ASM models work in the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 41 introduces shapes and shape models in general. Section 42 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 43; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initializing the model in a test image is tackled in Section 44. Section 45 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

41 Shape Models

A shape is a collection of points. As shown in Figure 41, a shape can be represented by a diagram showing the points, or as an n x 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n x 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 41c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 41 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (41)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in 44). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in 44).
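The two quantities just defined, centroid and size, can be computed directly. A small sketch with a toy rectangular shape:

```python
# Centroid = mean point position; size = RMS distance of the points
# from the centroid (toy 4-point shape).
import math

shape = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]

n = len(shape)
cx = sum(x for x, _ in shape) / n
cy = sum(y for _, y in shape) / n      # centroid is (2.0, 1.0)

size = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in shape) / n)
# every corner is sqrt(4 + 1) from the centroid, so size == sqrt(5)
```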

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes

1. Choose a reference shape (usually the 1st shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape
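The steps of Algorithm 1 can be sketched for the simplified case where shapes differ only by translation and scale (the full ASM alignment also removes rotation); the toy squares below are hypothetical data:

```python
# Simplified sketch of Algorithm 1: center each shape, then iterate
# aligning to the mean and re-estimating the constrained mean shape.
import math

def center(shape):
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    return [(x - cx, y - cy) for x, y in shape]

def scale_to_unit(shape):
    s = math.sqrt(sum(x * x + y * y for x, y in shape))
    return [(x / s, y / s) for x, y in shape]

def mean_shape(shapes):
    m = len(shapes)
    return [(sum(s[i][0] for s in shapes) / m,
             sum(s[i][1] for s in shapes) / m)
            for i in range(len(shapes[0]))]

raw_shapes = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],          # small square
    [(10, 10), (14, 10), (14, 14), (10, 14)],  # larger, translated square
]
shapes = [center(s) for s in raw_shapes]       # steps 1-2: center on origin
mean = scale_to_unit(shapes[0])                # step 3: unit-size reference
for _ in range(10):                            # step 4: iterate to convergence
    aligned = [scale_to_unit(s) for s in shapes]   # (a) align (scale only here)
    mean = scale_to_unit(mean_shape(aligned))      # (b) + (c) re-estimate, constrain
```

After alignment the two squares coincide exactly, since they differed only in position and scale.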

42 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in 12), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. Training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.

Figure 43 shows the original image and the manually landmarked image for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1 The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2 The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, as the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

421 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

^x = x̄ + P b

where ^x is the shape vector generated by the model, x̄ is the mean shape (the average of the aligned training shapes xi), P is the matrix of eigenvectors of the covariance of the training shapes, and b is a vector of shape parameters.

422 Generating shapes from the model

As seen in Equation 43, different shapes can be generated by changing the value of b. The model is varied in height and width, finding optimum values for the landmarks. Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model in analyzing the area around the landmark points.
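Generating shapes from the model is a single matrix-vector operation. A toy sketch with a hypothetical 4-point mean shape in the 2n x 1 layout used above, and a single made-up mode of variation (one column of P) that stretches the shape horizontally:

```python
# Sketch of ^x = x̄ + P b with one mode of variation (hypothetical data).
mean = [0.0, 4.0, 4.0, 0.0,     # x coordinates of 4 points
        0.0, 0.0, 2.0, 2.0]     # y coordinates (2n x 1 layout)

P = [[-1.0], [1.0], [1.0], [-1.0],   # eigenvector: widens/narrows in x
     [0.0], [0.0], [0.0], [0.0]]     # leaves y untouched

def generate(mean, P, b):
    return [m + sum(P[i][j] * b[j] for j in range(len(b)))
            for i, m in enumerate(mean)]

wide = generate(mean, P, [0.5])     # x -> [-0.5, 4.5, 4.5, -0.5]
narrow = generate(mean, P, [-0.5])  # x -> [0.5, 3.5, 3.5, 0.5]
```

Varying b within learned limits sweeps out exactly the family of allowable shapes the shape model permits.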

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and the covariance matrix Sg.

423 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in 44), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is the same as the mean shape. The shape model assures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 45 shows a sample image pyramid (of a general picture, not a bone); the sizes of the images are given relative to the first image.


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters on which it depends. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5. A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are also displayed.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape and profiles from all the training images using the landmark points. The ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, since the profile model looks for regions similar to those of the training images in regions away from the bone. It is therefore unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. Susan – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


3. Inpaint the regions of the top layer.

The three steps are repeated as long as more than two layers remain.

2.1 De-pict Algorithm

Given an image as input, the De-pict algorithm starts by applying k-means and complete-linkage clustering to obtain chromatically consistent regions. Under the assumption that brush strokes on the same layer of the painting have similar colors, such regions in different clusters are good representatives of brush strokes at different layers of the painting, as shown in Fig. 3c. Note that after the clustering step, each pixel of the image is assigned a label corresponding to its cluster, and each label can be described by a mean chromatic feature vector. Then the top layer is identified by human experts based on visual occlusion cues, etc. Ideally this step should be fully automatic, but this challenging step is not the focus of our current work. Lastly, the regions of the top layer are removed and inpainted by a k-nearest-neighbor algorithm.
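The clustering step can be sketched with a minimal k-means over per-pixel color features (illustrative only: the actual De-pict pipeline additionally applies complete-linkage clustering, and the function names here are ours):

```python
import numpy as np

def kmeans_colors(pixels, k, iters=20, seed=0):
    """Minimal Lloyd's k-means over color feature vectors of shape (n, 3).
    Returns a label per pixel and the mean chromatic vector of each cluster."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iters):
        # assignment step: each pixel goes to the nearest cluster center
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: recompute each cluster's mean chromatic vector
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers
```

The resulting labels partition the pixels into chromatically consistent groups, each summarized by its mean chromatic vector, as described above.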

3.1 Spatially coherent segmentation

We improve the layer segmentation by combining k-means with a spatial coherence regularity in an iterative E-M fashion [10, 11]. We model the appearances of brush strokes of different layers by a set of feature centers (mean chromatic vectors, as in k-means). In other words, we assume that each layer is modeled as an independent Gaussian with the same covariance, differing only in the mean. Given the initial models, i.e., the k mean chromatic vectors, we can refine the segmentation with spatially coherent priors by minimizing the following energy function (E-step):

min_L  Σ_p ||f_p − c_{L_p}||₂²  +  λ Σ_{{p,q}∈N} |e_pq| · T[L_p ≠ L_q]        (1)

where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model of cluster i, |e_pq| is the edge length between neighboring pixels p and q, λ weighs the spatial term, and T is the delta function.

The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, and the second term penalizes neighboring pixels belonging to different clusters. Fixing the k appearance models, the minimization problem can be solved with a graph-cut algorithm [12]. The solution gives us, under spatial regularization, the optimal labeling of pixels to clusters. After the spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). We then iterate the E and M steps until convergence or until a predefined number of iterations is reached.
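The E-M loop above can be sketched as follows. For illustration, the graph-cut E-step is replaced by simple ICM (iterated conditional modes) sweeps over a 4-connected grid with unit edge lengths; this is a simplification of the method in the text, and the function names and λ default are our assumptions:

```python
import numpy as np

def em_spatial_segmentation(features, k, lam=1.0, iters=5):
    """Sketch of the E-M loop: E-step approximated by ICM sweeps (instead
    of graph cuts), M-step re-estimating the mean chromatic vectors.
    features: (H, W, C) float array of per-pixel color features."""
    H, W, C = features.shape
    flat = features.reshape(-1, C)
    centers = flat[np.linspace(0, len(flat) - 1, k, dtype=int)].copy()
    labels = np.zeros((H, W), dtype=int)
    for _ in range(iters):
        # E-step (ICM sweep): unary appearance cost + pairwise smoothness
        for y in range(H):
            for x in range(W):
                costs = np.sum((features[y, x] - centers) ** 2, axis=1)
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        # penalize labels disagreeing with this neighbor
                        costs += lam * (np.arange(k) != labels[ny, nx])
                labels[y, x] = int(costs.argmin())
        # M-step: re-estimate each cluster's mean chromatic vector
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers
```

On a synthetic two-color image, the loop converges to a labeling that respects both appearance and spatial coherence.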

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here, level lines can be contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless, so curvature-based inpainting can be superior to exemplar-based methods (for instance, those in De-pict and Criminisi et al. [3]) for recovering the structure of underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in this approach in a discrete sense (a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute the line segments and line-segment pairs that are used to represent level lines, and the basic regions represent the pixels. Then, for each potential discrete level line, the curvature is approximated by the sum of angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also easily be formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II MATERIALS AND METHODS

A Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and instructed in breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is set to 100 HU (Hounsfield units). 70 ml of non-ionic contrast agent is injected at a rate of 4 mL/sec with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-set level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes; oblique planes were used if needed. Each exam consists of 400-500 images at 512×512 resolution.

B Method

The stages followed for lung segmentation from CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so as to keep the parts brighter than 700 HU. At the end of thresholding, the new images are logical (binary):

Thresh = image > 700

In each of these new images, sub-segmental vessels remain in the lung region. In the second step, these vessels are removed: each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, items whose pixel counts are under 1000 are removed from the image (Figure 3).

Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The biggest component of logical 1s is the patient's body. This biggest component is kept, the other parts are removed from the image, and the result is then inverted, so all 0s turn into 1s and all 1s turn into 0s (Figure 4).

Since the parts outside the body in the image shown in Figure 4 reach row 1 or row 512 (the image border), the parts meeting this condition are removed, and the lung and airways appear as in Figure 5 (segmentation of lung and airways). Because the airways in Figure 5 are very small compared to the lung, each image is labeled with the connected-component labeling algorithm, and components whose pixel counts are below 1000 are identified as airways and removed from the image. The resulting image is the segmented target lung. Before the airways are removed, the edges of the image found with the Sobel algorithm are added to the original image, so the edges of the lung and airway regions are shown on the original image (Figure 6b). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6c).
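The slice-wise pipeline above can be sketched as follows, with SciPy's connected-component labeling standing in for the labeling algorithm in the text. The 700 HU threshold and 1000-pixel limit follow the text; the helper names and border-based removal of outside air are our reading of the description:

```python
import numpy as np
from scipy import ndimage

def remove_small_components(mask, min_pixels=1000):
    """Connected-component labeling, then drop components below min_pixels."""
    labeled, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))
    return np.isin(labeled, 1 + np.flatnonzero(sizes >= min_pixels))

def segment_lung(slice_hu, threshold=700, min_pixels=1000):
    """Sketch of the described pipeline on one 2-D CTA slice (values in HU)."""
    # 1) threshold: keep high-intensity body pixels (> 700 HU)
    body = slice_hu > threshold
    # 2) remove small bright components (sub-segmental vessels)
    body = remove_small_components(body, min_pixels)
    # 3) keep the largest component (the patient's body), then invert
    labeled, n = ndimage.label(body)
    if n:
        largest = 1 + np.argmax(ndimage.sum(body, labeled, range(1, n + 1)))
        body = labeled == largest
    non_body = ~body
    # 4) discard non-body regions touching the image border (outside air)
    labeled, n = ndimage.label(non_body)
    border_labels = np.unique(np.concatenate(
        [labeled[0], labeled[-1], labeled[:, 0], labeled[:, -1]]))
    lung = non_body & ~np.isin(labeled, border_labels)
    # 5) small remaining components are airways -> remove them
    return remove_small_components(lung, min_pixels)
```

On a real slice the result is the lung mask; multiplying it with the original slice yields the segmented lung image described above.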

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon transforms. Making graphics objects semi-transparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. Toolbox functions implemented in the open MATLAB language have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. Toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, called a 'voxel' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image from the pixel information given at the bottom of the tool; in this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram representing the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image-segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Fig. 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (2 significant digits). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. An area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot renders a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
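NumPy's `meshgrid` mirrors the MATLAB behaviour described above; a small sketch of evaluating a function of two variables on the resulting grid:

```python
import numpy as np

# 'meshgrid': rows of X are copies of x, columns of Y are copies of y
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X and Y both have shape (len(y), len(x))
# X = [[1, 2, 3],          Y = [[10, 10, 10],
#      [1, 2, 3]]               [20, 20, 20]]

# evaluate a function of two variables over the whole grid at once
Z = X ** 2 + Y
```

`Z` could then be passed to a surface-plotting routine exactly as in the MATLAB examples.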

The 3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface plot of X-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). 'Image' with colormap scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red, passing through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string representing a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, basic ASM models work in the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold their x and y coordinates, respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector in which the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. The shape is the basic block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)        (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with automatic initialization (discussed in Section 4.4).
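These definitions can be computed directly on the 2n × 1 shape vector described above (the function names are ours):

```python
import numpy as np

def centroid(shape_vec):
    """Centroid of a 2n-by-1 shape vector (x coordinates first, then y)."""
    n = len(shape_vec) // 2
    return np.array([shape_vec[:n].mean(), shape_vec[n:].mean()])

def shape_size(shape_vec):
    """Root mean square distance of the points from the centroid."""
    n = len(shape_vec) // 2
    pts = np.column_stack([shape_vec[:n], shape_vec[n:]])
    return float(np.sqrt(np.mean(np.sum((pts - centroid(shape_vec)) ** 2,
                                        axis=1))))
```

For the unit-square shape (0,0), (2,0), (2,2), (0,2), the centroid is (1,1) and the size is √2.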

Algorithm 1 Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
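Algorithm 1 can be sketched with a similarity-Procrustes alignment step. This is an illustrative reading of the algorithm, not the thesis code: reflection handling and convergence testing are omitted for brevity, and the names are ours:

```python
import numpy as np

def align(shape, ref):
    """Similarity-align an (n, 2) point set to a reference point set."""
    a = shape - shape.mean(axis=0)
    b = ref - ref.mean(axis=0)
    U, _, Vt = np.linalg.svd(a.T @ b)
    R = U @ Vt                                         # optimal rotation
    s = np.trace((a @ R).T @ b) / np.trace(a.T @ a)    # optimal scale
    return s * a @ R + ref.mean(axis=0)

def mean_shape(shapes, iters=10):
    """Algorithm 1: iteratively align shapes and re-estimate the mean."""
    ref = shapes[0] - shapes[0].mean(axis=0)
    ref = ref / np.sqrt((ref ** 2).sum() / len(ref))   # unit RMS size
    for _ in range(iters):
        aligned = [align(s, ref) for s in shapes]      # step 4(a)
        m = np.mean(aligned, axis=0)                   # step 4(b)
        m = m - m.mean(axis=0)                         # step 4(c): constrain
        ref = m / np.sqrt((m ** 2).sum() / len(m))
    return ref, aligned
```

Given copies of one shape that differ only by rotation, scale, and translation, the loop brings them into agreement.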

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around the landmark points and builds a profile model for each landmark point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, introducing a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models therefore correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x̂ = x̄ + P b, where

x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes x_i,
P is the matrix of eigenvectors of the covariance of the training shapes, and
b is the vector of shape parameters.
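A minimal sketch of building the model and generating shapes from it, using PCA on the aligned shape vectors (an illustrative reading of the classical formulation; the function names are ours):

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """PCA shape model from aligned 2n-by-1 shape vectors (one per row).
    Returns the mean shape and the first n_modes covariance eigenvectors P."""
    X = np.asarray(shapes)
    x_mean = X.mean(axis=0)
    cov = np.cov(X.T)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_modes]   # largest eigenvalues first
    return x_mean, vecs[:, order]

def generate_shape(x_mean, P, b):
    """x_hat = x_mean + P b : a new shape from the shape parameters b."""
    return x_mean + P @ b
```

Setting b = 0 reproduces the mean shape; varying the entries of b sweeps through the permissible variations learnt from the training set.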

4.2.2 Generating shapes from the model


1998 2

[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues

from affine deformations of local 2-d brightness structure Image and Vision Computing

pages 415ndash434 1997 2

[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash

110 2004 2 3

[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features

ECCV pages 508ndash521 2006 7

[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from

maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2

[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-

splines CVGIP 39267ndash278 1987 2

[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV

1(1)128ndash142 2002 2

[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV

60(1)63ndash86 2004 2 3

[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T

Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5

[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale

space PAMI 20(12)1376ndash 1381 1998 2

[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on

Artificial Intelligence page 584 1977 2

[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for

generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7

[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence

CVPR pages 976ndash981 1997

[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV

23(1)45ndash78 1997

[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1

3

[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely

invariant regions BMVC pages 412ndash 425 2000 2


where L_p ∈ {1, …, k} is the cluster label of pixel p, f_p is the color feature of pixel p, c_i is the color model of cluster i, |e_pq| is the edge length between p and q, and T is the delta function. The first term in Eq. (1) measures the appearance similarity between the pixels and the clusters they are assigned to, while the second term penalizes neighboring pixels that carry different cluster labels. By fixing the k appearance models, the minimization problem can be solved with the graph-cut algorithm [12]. The solution gives the optimal labeling of the pixels under spatial regularization. After this spatially coherent refinement, we re-estimate the k models as the mean chromatic vectors (M-step). The E and M steps are then iterated until convergence or until a predefined number of iterations is reached.
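The E/M loop described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: a single ICM-style sweep stands in for the graph-cut solver, and the function name and the λ weight (`lam`) are assumptions.

```python
import numpy as np

def segment_colors(img, k=2, lam=1.0, iters=10):
    """Alternate label refinement (E-step) and model re-estimation (M-step).

    Hypothetical sketch: an ICM sweep replaces the graph-cut solver used
    in the paper; `lam` weighs the neighborhood (Potts) penalty.
    """
    h, w, _ = img.shape
    pixels = img.reshape(-1, 3).astype(float)
    # Deterministic initial color models: k pixels spread over the image.
    means = pixels[np.linspace(0, h * w - 1, k).astype(int)].copy()
    labels = np.zeros((h, w), dtype=int)
    for _ in range(iters):
        # E-step: appearance cost plus a penalty for disagreeing neighbors.
        cost = ((img[:, :, None, :] - means[None, None, :, :]) ** 2).sum(-1)
        for i in range(k):
            disagree = np.zeros((h, w))
            for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                disagree += (np.roll(labels, shift, axis=(0, 1)) != i)
            cost[:, :, i] += lam * disagree
        labels = cost.argmin(axis=2)
        # M-step: re-estimate each model as the mean chromatic vector.
        for i in range(k):
            if np.any(labels == i):
                means[i] = img[labels == i].mean(axis=0)
    return labels, means
```

On a synthetic two-color image this converges to the two color regions in a few iterations; a real graph-cut solver would give globally better smoothing at region boundaries.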

3.2 Curvature-based inpainting

Unlike exemplar-based inpainting methods, curvature-based inpainting methods focus on reconstructing the geometric structure of (chromatic) intensities, which is usually represented by level lines [5, 7]. Here level lines are contours that connect pixels of the same gray/chromatic intensity in an image. Such methods are therefore well suited to inpainting images with no or very few textures, because level lines concisely capture the structure and information of texture-less regions. For van Gogh's paintings, the brush strokes at each layer are close to textureless, so curvature-based inpainting can be superior to exemplar-based methods (for instance, Depict and Criminisi et al. [3]) for recovering the structure of the underlying brush strokes. In this paper we evaluate the recent method proposed by Schoenemann et al. [7], which formulates curvature-based inpainting as a linear program. Unlike other methods, this method is independent of initialization and can handle general inpainting regions, e.g., regions with holes. In the following we briefly review Schoenemann et al.'s method in detail. To formulate the problem as a linear program, curvature is modeled in a discrete sense (where a possible reconstruction of the level line with intensity 100 is shown). Specifically, we impose a discrete grid of a certain connectivity (8-connectivity in Fig. 4) on the image. The edges constitute the line segments and line segment pairs that are used to represent level lines, and the basic regions represent the pixels. For each potential discrete level line, the curvature is then approximated by the sum of the angle changes at all vertices along the level line, with proper weighting by the edge length. To ensure that the regions and the level lines are consistent (for instance, level lines should be continuous), two sets of linear constraints, i.e., surface continuation constraints and boundary continuation constraints, are imposed on the variables. Finally, the boundary condition (the intensities of the boundary pixels) of the damaged region can also be easily formulated as linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus tracking program. After the scenogram, a single slice is taken at the level of the pulmonary truncus. A bolus tracker is placed at the pulmonary truncus and the trigger is adjusted to 100 HU (Hounsfield units). 70 ml of nonionic contrast agent at a rate of 4 mL/sec, delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA), is used. When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed at the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images with 512x512 resolution.

B. Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.

The CTA images at hand are 250 2D slices. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is first applied so that only parts brighter than 700 HU are retained. After thresholding, the new images are binary (logical) images:

Thresh = image > 700;
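The same thresholding step can be sketched in Python with NumPy (the 700 HU cutoff comes from the text; the function name is an assumption):

```python
import numpy as np

def threshold_body(slice_hu):
    """Binary mask of high-intensity (body) pixels, mirroring the
    MATLAB step `Thresh = image > 700`."""
    return slice_hu > 700

# Tiny synthetic slice: air (-1000 HU), soft tissue, and bright voxels.
demo = np.array([[-1000,  800],
                 [  200, 1200]])
mask = threshold_body(demo)
```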

In each of these new images, subsegmental vessels remain in the lung region. In the second step, the following method is used to remove these vessels: each 2D image is considered one by one, and each component in the image is labeled with a connected-component labeling algorithm. Then, looking at the size of each labeled piece, components whose pixel count is under 1000 are removed from the image (Figure 3).
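The vessel-removal step can be sketched with SciPy's connected-component labeling; the 1000-pixel cutoff is the one stated above, while the helper name is an assumption:

```python
import numpy as np
from scipy import ndimage

def remove_small_components(mask, min_pixels=1000):
    """Label connected components and drop those below min_pixels.

    scipy.ndimage.label plays the role of the "connected component
    labeling algorithm" described in the text.
    """
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)   # keep[0] stays False (background)
    keep[1:] = sizes >= min_pixels
    return keep[labels]
```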

Next, the image in Figure 3 is labeled with the connected-component labeling algorithm. The largest component of logical 1s is the patient's body. This largest component is kept and the other parts are removed from the image. Then the complement is taken, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch row or column 1 or 512, the components that meet this condition are removed, and the lung and airway appear as in Figure 5 (segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the connected-component labeling algorithm, and the components whose pixel count is below 1000 are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and overlaid on the original image, so the borders of the lung and airway regions are shown in the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
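Putting the stages together, a hedged sketch of the whole pipeline (threshold, keep the largest component as the body, invert, discard border-touching air, drop small leftover components as airways) might look like this; the function name and the synthetic test geometry are assumptions, not the authors' code:

```python
import numpy as np
from scipy import ndimage

def segment_lung(slice_hu, small=1000):
    """Sketch of the lung-segmentation pipeline described above."""
    body = slice_hu > 700                        # step 1: threshold at 700 HU
    labels, n = ndimage.label(body)
    if n == 0:
        return np.zeros_like(body)
    sizes = ndimage.sum(body, labels, index=range(1, n + 1))
    body = labels == (1 + int(np.argmax(sizes)))  # keep largest blob = body
    holes = ~body                                 # invert: lungs + outside air
    labels, n = ndimage.label(holes)
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    lung = holes & ~np.isin(labels, border)       # drop border-touching air
    labels, n = ndimage.label(lung)
    for i in range(1, n + 1):                     # drop small parts (airways)
        if (labels == i).sum() < small:
            lung[labels == i] = False
    return lung
```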

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, were also used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray computed tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel-region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images there are three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 - Pixel Region of an X-ray CT scan and the Adjust Contrast tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3. PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve was plotted as a magenta line through the data. An area graph displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b - Area graph of X-ray CT brain scan

The 3-D surface plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3D surface plot of X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3D surface plot of X-ray CT brain scan generated with histogram values, alpha(4)

The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
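NumPy's meshgrid behaves the same way; a minimal sketch of evaluating a function of two variables on such a grid:

```python
import numpy as np

# Build coordinate matrices: rows of X copy x, columns of Y copy y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)   # X, Y have shape (len(y), len(x))

# Evaluate a function of two variables over the whole grid at once.
Z = X + Y
```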

Figure 4c - 3D surface plot of X-ray CT brain scan generated with histogram values, mesh

The 3-D surface plot with contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source; it can make subtle differences in surface shape easier to see and can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface plot of X-ray CT brain scan generated with histogram values, lighting

The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).

A contour plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D lit surface plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D ribbon graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D ribbon graph of X-ray CT brain scan

4. FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole/Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way. Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in that section. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array, where the n rows represent the number of points and the two columns hold the x and y coordinates of the points, respectively. In this thesis and in the code used, a shape will be defined as a 2n × 1 vector, where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, like the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).
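With the 2n-vector shape representation defined in Section 4.1 (x coordinates followed by y coordinates), the centroid and the size can be sketched as follows; the function names are assumptions:

```python
import numpy as np

def centroid(shape_vec):
    """Centroid of a shape stored as a 2n-vector (all x's, then all y's)."""
    n = len(shape_vec) // 2
    return np.array([shape_vec[:n].mean(), shape_vec[n:].mean()])

def shape_size(shape_vec):
    """Root mean square distance of the points from the centroid."""
    n = len(shape_vec) // 2
    pts = np.stack([shape_vec[:n], shape_vec[n:]], axis=1)  # n x 2 points
    d2 = ((pts - centroid(shape_vec)) ** 2).sum(axis=1)
    return np.sqrt(d2.mean())
```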

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x̄0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
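Algorithm 1 can be sketched as follows. For brevity, this illustration aligns translation and scale only and omits the rotation step, so it is a simplified stand-in for full Procrustes alignment; shapes are n×2 point arrays and all names are assumptions.

```python
import numpy as np

def align_shapes(shapes, iters=10):
    """Simplified sketch of Algorithm 1 (translation + scale only)."""
    def center(s):                       # step 2: move centroid to the origin
        return s - s.mean(axis=0)
    def unit(s):                         # scale so the RMS point radius is 1
        return s / np.sqrt((s ** 2).sum(axis=1).mean())
    shapes = [center(s.astype(float)) for s in shapes]
    mean = unit(shapes[0])               # step 3: reference shape -> x0
    for _ in range(iters):
        # (a) align all shapes to the mean (here: normalize their scale)
        aligned = [unit(s) for s in shapes]
        # (b) recalculate the mean shape, and (c) constrain it to unit size
        mean = unit(np.mean(aligned, axis=0))
    return aligned, mean
```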

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape. So, while the profile model tries to find the area of the test image that best fits its profiles, the shape model ensures that the result stays a permissible variation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, i.e., the average of the aligned training shapes xi,

Φ is the matrix of the leading eigenvectors of the shape covariance, and b is a vector of shape parameters.
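The shape model above is a PCA model; a minimal sketch, assuming the shapes are already aligned and stored as rows of 2n-vectors (the function names are hypothetical):

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """PCA shape model sketch: x_hat = x_bar + Phi @ b.

    `shapes` holds one aligned training shape (a 2n-vector) per row.
    Returns the mean shape and the first n_modes eigenvectors.
    """
    x_bar = shapes.mean(axis=0)
    cov = np.cov(shapes.T)
    vals, vecs = np.linalg.eigh(cov)          # ascending eigenvalues
    phi = vecs[:, ::-1][:, :n_modes]          # largest modes first
    return x_bar, phi

def generate_shape(x_bar, phi, b):
    """Generate a new shape from the parameter vector b."""
    return x_bar + phi @ b
```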

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset by 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
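This profile-matching step can be sketched as follows, with g_bar and S_g standing for the mean profile ḡ and the covariance Sg defined above; the helper names are assumptions:

```python
import numpy as np

def mahalanobis_sq(g, g_bar, S_g):
    """Squared Mahalanobis distance between a sampled profile g and
    the mean profile g_bar with profile covariance S_g."""
    d = np.asarray(g, float) - np.asarray(g_bar, float)
    return float(d @ np.linalg.solve(S_g, d))

def best_offset(profiles, g_bar, S_g):
    """Pick the whisker offset whose sampled profile is closest to the
    mean profile, as in the search step described in the text."""
    dists = [mahalanobis_sq(g, g_bar, S_g) for g in profiles]
    return int(np.argmin(dists))
```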

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape remains close to the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.
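A minimal image pyramid can be sketched by repeated 2×2 block averaging; real implementations usually blur before subsampling, and the function name is an assumption:

```python
import numpy as np

def image_pyramid(img, levels=3):
    """Build a simple image pyramid: each level halves the resolution
    by averaging non-overlapping 2x2 blocks of the previous level."""
    pyramid = [np.asarray(img, float)]
    for _ in range(levels - 1):
        a = pyramid[-1]
        h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2
        a = a[:h, :w]                     # trim odd rows/columns
        coarse = (a[0::2, 0::2] + a[1::2, 0::2]
                  + a[0::2, 1::2] + a[1::2, 1::2]) / 4.0
        pyramid.append(coarse)
    return pyramid
```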


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; the aligned shapes are displayed in Figure 4.6b.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone. It is thus unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. Susan – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

Page 25: Anu Document

linear constraints. With proper handling of all these constraints, the inpainting problem can be solved as a linear program. To handle color images, we simply formulate and solve a linear program for each chromatic channel independently.

II. MATERIALS AND METHODS

A. Data Retrieval

In this study, data was collected from the Dr. Siyami Ersek Thoracic and Cardiovascular Surgery Training and Research Hospital. All pulmonary computed tomographic angiography exams were performed with 16-detector CT equipment (Somatom Sensation 16, Siemens AG, Erlangen, Germany). Patients were informed about the examination and about breath holding. Imaging was performed with a bolus-tracking program. After the scanogram, a single slice is taken at the level of the pulmonary truncus. A bolus-tracking region is placed at the pulmonary truncus, and the trigger is adjusted to 100 HU (Hounsfield units). 70 mL of nonionic contrast agent at a rate of 4 mL/sec is delivered with an automated syringe (Optistat Contrast Delivery System, Liebel-Flarsheim, USA). When opacification reaches the pre-adjusted level, the exam is performed from the supraclavicular region to the diaphragms. Contrast injection is performed via an 18-20G intravenous cannula placed in the antecubital vein. Scanning parameters were 120 kV, 80-120 mA, slice thickness 1 mm, pitch 1.0-1.2. Images were reconstructed with 1 mm and 5 mm thickness and evaluated at the mediastinal window (WW 300, WL 50) on an advanced workstation (Wizard, Siemens AG, Erlangen, Germany) in coronal, sagittal, and axial planes. Oblique planes were used if needed. Each exam consists of 400-500 images at 512x512 resolution.

B. Method

The stages followed for lung segmentation from the CTA images in this work are shown in Figure 1.

The CTA dataset at hand consists of 250 2D images. The first step is thresholding the image. A thoracic CT contains two main groups of pixels: 1) high-intensity pixels located in the body (body pixels), and 2) low-intensity pixels in the lung and the surrounding air (non-body pixels). Due to the large difference in intensity between these two groups, thresholding leads to a good separation. In this study, thresholding is carried out first, keeping the parts brighter than 700 HU. At the end of thresholding, the new images are binary (logical):

Thresh = image > 700

In each of these new images, sub-segmental vessels remain in the lung region. In the second step, the following method is used to remove these vessels: each 2D image is considered one by one, and every component in the image is labeled with the "connected component labeling algorithm". Then, looking at the size of each labeled piece, components whose pixel counts are under 1000 are removed from the image (Figure 3).
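The labeling-and-filtering step can be sketched in Python (the study used MATLAB; the function name and the choice of 4-connectivity here are illustrative assumptions):

```python
from collections import deque

def remove_small_components(mask, min_pixels):
    """Remove 4-connected components with fewer than min_pixels pixels.

    mask: 2D list of 0/1 values; returns a new 2D list.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Flood-fill one component, collecting its pixels.
                comp, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                # Keep the component only if it is large enough.
                if len(comp) >= min_pixels:
                    for y, x in comp:
                        out[y][x] = 1
    return out
```

In the MATLAB Image Processing Toolbox, bwlabel plays the same labeling role.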

Next, the image in Figure 3 is labeled with the "connected component labeling algorithm". The biggest component, which is logical 1, is the patient's body. This biggest component is kept and the other parts are removed from the image. The image is then inverted, so every "0" turns into "1" and every "1" turns into "0" (Figure 4).

Since the parts outside the body in the image shown in Figure 4 touch column 1 or column 512 and are logical 1 there, the parts satisfying this condition are removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image is labeled with the "connected component labeling algorithm", and components with fewer than 1000 pixels are identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the boundaries of the lung and airway regions are shown on the original image (Figure 6(b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image is obtained (Figure 6(c)).
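The Sobel edge step works by convolving the image with two 3x3 kernels and thresholding the gradient magnitude; a minimal Python sketch (illustrative, not the study's MATLAB code):

```python
def sobel_edges(img, thresh):
    """Return a binary edge map: 1 where the Sobel gradient magnitude > thresh."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    rows, cols = len(img), len(img[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):                # borders are left as 0
        for c in range(1, cols - 1):
            gx = sum(kx[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(ky[i][j] * img[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            if (gx * gx + gy * gy) ** 0.5 > thresh:
                edges[r][c] = 1
    return edges
```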

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations, morphological operations such as edge detection and noise removal, region-of-interest processing, filtering, basic statistics, curve fitting, FFT, DCT, and the Radon transform. Making graphics objects semitransparent is a useful technique in 3-D visualization, as it furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing, with functions for signal processing, optimization, partial differential equation solving, etc. It provides interactive tools for thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a "voxel" [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge-detection and image segmentation algorithms, image transformation, measuring image features, and statistical functions such as mean, median, standard deviation, range, etc. (Figure 2).

3. PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve was plotted as a magenta line through the data. The Area Graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The "meshgrid" function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
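This row/column replication can be imitated in a few lines of Python (a sketch of the behaviour, not MATLAB's implementation):

```python
def meshgrid(x, y):
    """Replicate x across rows and y down columns, as MATLAB's meshgrid does."""
    X = [list(x) for _ in y]          # each row of X is a copy of x
    Y = [[v] * len(x) for v in y]     # each column of Y is a copy of y
    return X, Y
```

X[i][j] and Y[i][j] then give the Cartesian coordinates at which a function of two variables can be evaluated elementwise.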

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

The 3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples, but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

"Image" creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). "Image" with Colormap Scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
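As a toy illustration of this m-by-3 layout (a hypothetical linear grayscale map, not one of MATLAB's built-in colormap tables):

```python
def gray_colormap(m):
    """Return an m-by-3 colormap: each row an [R, G, B] triple in [0.0, 1.0]."""
    # Row i maps intensity index i to an equal-RGB (gray) color.
    return [[i / (m - 1)] * 3 for i in range(m)]
```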

The Contour Plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4. FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).
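What FVTool evaluates can be sketched directly: the frequency response of the filter is H(e^jw) = B(e^jw) / A(e^jw), where B and A are the polynomials with coefficients b and a. A small Python version (an illustration, not FVTool itself):

```python
import cmath

def magnitude_response(b, a, omegas):
    """Evaluate |H(e^jw)| = |B(e^jw) / A(e^jw)| at the given frequencies (rad/sample)."""
    mags = []
    for w in omegas:
        z = cmath.exp(-1j * w)
        # Evaluate numerator and denominator polynomials at e^{-jw}.
        num = sum(bk * z**k for k, bk in enumerate(b))
        den = sum(ak * z**k for k, ak in enumerate(a))
        mags.append(abs(num / den))
    return mags
```

For example, the two-tap moving average b = [0.5, 0.5], a = [1.0] passes DC unchanged and cancels the highest frequency entirely.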

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3; the experiments performed in this thesis to improve the performance of the model are also described in this section. The problem of initialization of the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function. The performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y coordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y coordinates are listed after the x coordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which will help with the automatic initialization (discussed in Section 4.4).
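These definitions translate directly into code. A Python sketch (the thesis code is not reproduced here; shapes are stored as [x1..xn, y1..yn] vectors, as described above):

```python
import math

def centroid(shape):
    """Centroid of a shape stored as [x1..xn, y1..yn]."""
    n = len(shape) // 2
    xs, ys = shape[:n], shape[n:]
    return sum(xs) / n, sum(ys) / n

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    n = len(shape) // 2
    cx, cy = centroid(shape)
    return math.sqrt(sum((shape[i] - cx) ** 2 + (shape[n + i] - cy) ** 2
                         for i in range(n)) / n)

def shape_distance(s1, s2):
    """Sum of Euclidean distances (Eq. 4.1) between corresponding points."""
    n = len(s1) // 2
    return sum(math.hypot(s1[i] - s2[i], s1[n + i] - s2[n + i])
               for i in range(n))
```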

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and mean shape
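A simplified Python sketch of Algorithm 1 that handles translation and scale only (the full ASM alignment also solves for rotation, omitted here to keep the sketch short):

```python
import math

def center(shape):
    """Translate a shape (list of (x, y) points) so its centroid is the origin."""
    cx = sum(p[0] for p in shape) / len(shape)
    cy = sum(p[1] for p in shape) / len(shape)
    return [(x - cx, y - cy) for x, y in shape]

def to_unit_size(shape):
    """Scale a centered shape so its RMS point distance from the origin is 1."""
    s = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / s, y / s) for x, y in shape]

def align_shapes(shapes, iters=10):
    """Center and scale every shape, then iterate a constrained mean shape."""
    shapes = [to_unit_size(center(s)) for s in shapes]
    for _ in range(iters):
        n = len(shapes[0])
        mean = [(sum(s[i][0] for s in shapes) / len(shapes),
                 sum(s[i][1] for s in shapes) / len(shapes)) for i in range(n)]
        mean = to_unit_size(center(mean))   # constrain the mean shape (step 4c)
        # (With rotation included, each shape would be rotated toward the
        #  mean here, which is what makes the iteration necessary.)
    return shapes, mean
```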

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in 1.2), and those images were then resized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.

Figure 4.3 shows the original image and the manually landmarked image for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape: as the profile model tries to find the area in the test image that best fits it, the shape model ensures that the overall shape is not distorted away from the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models thus correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + P b    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
P is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
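Generating a shape from the model is a matrix-vector operation. A toy Python sketch (the mode vector below is hypothetical, not a trained eigenvector):

```python
def generate_shape(mean_shape, modes, b):
    """x_hat = x_bar + P b: add b-weighted eigenvector modes to the mean shape.

    mean_shape: 2n-element vector; modes: list of 2n-element eigenvectors.
    """
    x_hat = list(mean_shape)
    for weight, mode in zip(b, modes):
        for i, component in enumerate(mode):
            x_hat[i] += weight * component
    return x_hat
```

In the classical formulation, each parameter b_i is kept within about ±3 standard deviations of its mode so that only plausible shapes are generated.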

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
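The Mahalanobis comparison can be sketched for a two-element profile (illustrative only; real profiles are longer and Sg is inverted numerically):

```python
def mahalanobis2(g, g_mean, S):
    """(g - g_mean)^T S^-1 (g - g_mean) for 2-element profiles."""
    d0, d1 = g[0] - g_mean[0], g[1] - g_mean[1]
    # Closed-form inverse of the 2x2 covariance matrix S.
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]
    return (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))
```

With an identity covariance this reduces to the squared Euclidean distance; a larger variance along one component down-weights deviations in that direction.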

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile matches, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone X-ray, is used for illustration.

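An image pyramid of the kind used for the multi-resolution search can be sketched as repeated 2x downsampling; the 2x2 block averaging here is a simplified stand-in for the smoothing-and-subsampling actually used:

```python
def build_pyramid(img, levels):
    """Return [img, half-size, quarter-size, ...] via 2x2 block averaging."""
    pyramid = [img]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        rows, cols = len(prev) // 2, len(prev[0]) // 2
        # Each output pixel is the mean of a 2x2 block of the previous level.
        pyramid.append([[(prev[2 * r][2 * c] + prev[2 * r][2 * c + 1]
                          + prev[2 * r + 1][2 * c] + prev[2 * r + 1][2 * c + 1]) / 4.0
                         for c in range(cols)] for r in range(rows)])
    return pyramid
```

The search starts at the coarsest level, where the bone occupies only a few pixels, and the result at each level initializes the search at the next finer one.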

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, as it will take time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. The ASM starts wherever the mean shape is located, but this may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone. It is therefore unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

Page 26: Anu Document

CTA images which are in hands are 250 as being 2D The first step is thresholding the

image A thoracic CT contains two main groups of pixels 1) highndashintensity pixels located in

the body (body pixels) and 2) lowndashintensity pixels that are in the lung and the surrounding

air (nonndashbody pixels) Due to the large difference in intensity between these two groups

thresholding leads to a good separation In this study thresholding has been tried out for the

first time in a way that contains bigger parts than 700 HU At the end of thresholding the new

images are going to be in logical value

Thresh=imagegt700

In each of these new images subsegment vessels exist in lung region At the second step this

method has been used to get rid of these vessels firstly each of 2D images has been

considered one by one and each of components in the image have labeled with ldquoconnected

component labelling algorithmrdquo Then looking at the size of each labeled piece items whose

pixel numbers are under 1000 were removed from the image Figure 3

Next the image in Figure 3 has been labeled with ldquoconnected component labeling

algorithmrdquo The biggest size

which is logical 1 is the patientrsquos body This biggest size has been taken and the other parts

have been removed from the image And then the opposite of it has been gotten So all of ldquo0rdquo

turn into ldquo1rdquo and all of ldquo1rdquo turn into ldquo0rdquo Figure 4

As the 1 or 512 pixels of the parts out of the body in the image which shown in Figure 4 is

going to be logical 1 the parts that achieve this condition have been removed and lung and

airway are appeared like in Figure 5 Fig5 segmentation of lung and airway Due to the fact

that airway in Figure 5 is going to be very small compared to the lung size each of images

have been labeled with ldquoconnected component labeling algorithmrdquo and the component whose

number of pixels are below 1000 have been determined as airways and then removed from

the image The last image in hand is the segmented form of target lung Before airways

removed finding the edges of the image with sobel algoritm it has been gathered to original

image and the edges of lung and airway region have been shown in the original image Figure

6 (b) Also by multiplying defined lung region with the original CTA lung image original

segmented lung image has been carried out

Figure 6 (c)

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image

processing functions and interactive tools for enhancing and analyzing digital images The

interactive tools allowed us to perform spatial image transformations morphological

operations such as edge detection and noise removal region-of-interest processing filtering

basic statistics curve fitting FFT DCT and Radon Transform Making graphics objects

semitransparent is a useful technique in 3-D visualization which furnishes more information

about spatial relationships of different structures The toolbox functions implemented in the

open MATLAB language has also been used to develop the customized algorithms

MATLAB is a high-level technical language and interactive environment for data analysis

and mathematical computing functions such as signal processing optimization partial

differential equation solving etc It provides interactive tools including threshold

correlation Fourier analysis filtering basic statistics curve fitting matrix analysis 2D and

3D plotting functions The operations for image processing allowed us to perform noise

reduction and image enhancement image transforms colormap manipulation colorspace

conversions region-of interest processing and geometric operation [4] The toolbox

functions implemented in the open MATLAB language can be used to develop the

customized algorithms

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness

corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section which

is called a rsquorsquovoxelrsquorsquo [13] The Pixel Region tool provided by MATLAB 701 superimposes

the pixel region rectangle over the image displayed in the Image Tool defining the group of

pixels that are displayed in extreme close-up view in the Pixel Region tool window The

Pixel Region tool shows the pixels at high magnification overlaying each pixel with its

numeric value [25] For RGB images we find three numeric values one for each band of the

image We can also determine the current position of the pixel region in the target image by

using the pixel information given at the bottom of the tool In this way we found the x- and y-

coordinates of pixels in the target image coordinate system The Adjust Contrast tool displays

a histogram which represents the dynamic range of the X-ray CT image (Figure1)

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provide a reference-standard algorithms and graphical tools for image analysis tasks including edge-detection and image segmentation algorithms image transformation measuring image features and statistical functions such as mean median standard deviation range etc (Figure 2

3 PLOT TOOLS MATLAB provides a collection of plotting tools to generate various types of graphs displaying the image histogram or plotting the profile of intensity values (Fig 3ab) Figure

Figure 3 a - The Histogram of X-ray CT image and the plot fits (significant digits 2) A cubic fitting function is the best-fit model for histogram data plot The fit curve was plotted as a magenta line through the data plot Area Graph of X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot generates a matrix as a surface (Figures 4 a b c d) We can also make the faces of a surface transparent to a varying degree Transparency (referred to as the alpha value) can be specified for the whole 3D-object or can be based on an alphamap which behaves in a way analogous to colormaps (Figures 4 a b)

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The rsquorsquomeshgridrsquo function is extremely useful for computing a function of two Cartesian coordinates It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c)

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
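The behaviour described above can be sketched with NumPy's meshgrid, which works analogously to MATLAB's (an illustrative example, not the thesis code):

```python
import numpy as np

# Two coordinate vectors defining the domain.
x = np.array([1, 2, 3])
y = np.array([10, 20])

# X and Y are matrices suitable for evaluating f(x, y) over the grid.
X, Y = np.meshgrid(x, y)
# X == [[1, 2, 3], [1, 2, 3]]        (each row is a copy of x)
# Y == [[10, 10, 10], [20, 20, 20]]  (each column is a copy of y)

# Evaluate a function of two variables over the whole grid at once.
Z = X**2 + Y
```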

3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image' with Colormap Scaling ('imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
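The m-by-3 colormap structure and imagesc-style scaling can be sketched in NumPy (the tiny 5-row map and the helper name are illustrative, not MATLAB's actual jet table):

```python
import numpy as np

# A tiny 5-row colormap: each row is an [R, G, B] triple in [0.0, 1.0],
# running from blue through cyan and yellow toward red (jet-like).
cmap = np.array([
    [0.0, 0.0, 1.0],  # blue
    [0.0, 1.0, 1.0],  # cyan
    [1.0, 1.0, 0.0],  # yellow
    [1.0, 0.5, 0.0],  # orange
    [1.0, 0.0, 0.0],  # red
])

def apply_colormap(img, cmap):
    """Scale intensities to colormap row indices (imagesc-style) and look up RGB."""
    lo, hi = img.min(), img.max()
    idx = np.round((img - lo) / (hi - lo) * (len(cmap) - 1)).astype(int)
    return cmap[idx]  # shape (..., 3): an RGB image

img = np.array([[0.0, 50.0], [100.0, 200.0]])
rgb = apply_colormap(img, cmap)  # darkest pixel -> blue, brightest -> red
```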

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) (or surfc) function creates a graph of f(x,y), where f is a string representing a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D Lit Surface Plot (surface plot with colormap-based lighting, surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of a Matrix displays a matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator coefficients b and denominator coefficients a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
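As an illustrative sketch (not FVTool itself), the magnitude and phase responses of a filter with numerator b and denominator a can be computed directly from its transfer function H(e^jw) = B(e^jw)/A(e^jw); the helper name and sample filter are assumptions for the example:

```python
import numpy as np

def freq_response(b, a, n=8):
    """Evaluate H(e^jw) = B(e^jw) / A(e^jw) at n frequencies in [0, pi)."""
    w = np.linspace(0, np.pi, n, endpoint=False)
    z = np.exp(1j * w)
    # Polynomials in z^{-1}: B(z) = sum_k b[k] z^{-k}, likewise A(z).
    B = sum(bk * z**-k for k, bk in enumerate(b))
    A = sum(ak * z**-k for k, ak in enumerate(a))
    H = B / A
    return w, np.abs(H), np.angle(H)  # frequency, magnitude, phase

# Example: a 2-point moving-average FIR filter, b = [0.5, 0.5], a = [1].
w, mag, phase = freq_response([0.5, 0.5], [1.0])
# The magnitude falls from 1 at w = 0 toward 0 at w = pi (a lowpass shape).
```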

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, together with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis, and in the code used, a shape will be defined as a 2n × 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((y2 − y1)² + (x2 − x1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used in measuring the size of the test image, which helps with the automatic initialization (discussed in Section 4.4).
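A short sketch of the quantities just defined (Euclidean distance, centroid, and root-mean-square size) for a shape stored as a list of point co-ordinates; the helper names are illustrative:

```python
import math

def euclidean(p1, p2):
    """Equation 4.1: distance between points (x1, y1) and (x2, y2)."""
    return math.sqrt((p2[1] - p1[1])**2 + (p2[0] - p1[0])**2)

def centroid(shape):
    """The mean of the point positions."""
    n = len(shape)
    return (sum(x for x, _ in shape) / n, sum(y for _, y in shape) / n)

def shape_size(shape):
    """Root mean square distance between the points and the centroid."""
    c = centroid(shape)
    return math.sqrt(sum(euclidean(p, c)**2 for p in shape) / len(shape))

def shape_distance(s1, s2):
    """Distance between two shapes: summed distance of corresponding points."""
    return sum(euclidean(p, q) for p, q in zip(s1, s2))

square = [(0, 0), (2, 0), (2, 2), (0, 2)]  # a toy 4-point shape
```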

Algorithm 1 Aligning shapes

Input: a set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape
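The steps of Algorithm 1 can be sketched as follows. This is a simplified version that aligns only by translation and scale (a full implementation would also solve for rotation), and all helper names are illustrative:

```python
import math

def center(shape):
    """Translate a shape (list of (x, y)) so its centroid is at the origin."""
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    return [(x - cx, y - cy) for x, y in shape]

def unit_size(shape):
    """Scale a centered shape to unit RMS size."""
    s = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / s, y / s) for x, y in shape]

def mean_shape(shapes):
    return [(sum(s[i][0] for s in shapes) / len(shapes),
             sum(s[i][1] for s in shapes) / len(shapes))
            for i in range(len(shapes[0]))]

def align_shapes(shapes, iters=10):
    # Steps 1-3: center every shape; the scaled first shape is the initial mean.
    shapes = [unit_size(center(s)) for s in shapes]
    mean = shapes[0]
    for _ in range(iters):                       # step 4: repeat
        shapes = [unit_size(s) for s in shapes]  # (a) align (scale only here)
        new_mean = mean_shape(shapes)            # (b) recalculate the mean
        new_mean = unit_size(center(new_mean))   # (c) constrain the mean
        diff = max(abs(a - b) for p, q in zip(mean, new_mean)
                   for a, b in zip(p, q))
        if diff < 1e-8:                          # step 5: convergence
            break
        mean = new_mean
    return shapes, mean

# Two copies of the same square at different positions and scales
# align to the same normalized shape.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
big = [(10, 10), (14, 10), (14, 14), (10, 14)]
aligned, mean = align_shapes([square, big])
```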

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. Training on the images was done by manually selecting landmarks: landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the model, the shape model ensures that the mean shape is not distorted. The profile model acts on individual landmarks, whereas the shape model acts globally on the image. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape constant.

The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb, where

x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi, and
Φb encodes the permissible variations: Φ holds the eigenvectors of the covariance matrix of the training shapes, and b is a vector of shape parameters.
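The generation step of this classical formulation (x̂ = x̄ + Φb) can be sketched with a toy training set; the shapes, the number of retained modes, and the helper name are all assumptions for illustration:

```python
import numpy as np

# Toy training set: each row is one aligned shape as a 2n-vector
# (x co-ordinates followed by y co-ordinates, as in Section 4.1).
shapes = np.array([
    [0.0, 2.0, 2.0, 0.0,  0.0, 0.0, 2.0, 2.0],
    [0.0, 2.2, 2.2, 0.0,  0.0, 0.0, 1.8, 1.8],
    [0.0, 1.8, 1.8, 0.0,  0.0, 0.0, 2.2, 2.2],
])

x_bar = shapes.mean(axis=0)                 # the mean shape
cov = np.cov(shapes, rowvar=False)          # shape covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)      # eigendecomposition
order = np.argsort(eigvals)[::-1]           # sort modes by decreasing variance
Phi = eigvecs[:, order[:1]]                 # keep only the principal mode

def generate(b):
    """x_hat = x_bar + Phi b: a new allowable shape for parameters b."""
    return x_bar + Phi @ b

x_hat = generate(np.array([0.0]))  # b = 0 reproduces the mean shape exactly
```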

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

D² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
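This profile comparison can be sketched as follows, assuming the mean profile and its covariance have been estimated during training; the profile values here are hypothetical:

```python
import numpy as np

def mahalanobis_sq(g, g_mean, S_g):
    """Squared Mahalanobis distance (g - g_mean)^T S_g^{-1} (g - g_mean)."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))

# Hypothetical mean profile and covariance for one landmark.
g_mean = np.array([0.1, 0.5, 0.9])
S_g = np.array([[0.2, 0.0, 0.0],
                [0.0, 0.1, 0.0],
                [0.0, 0.0, 0.2]])

# Candidate profiles sampled along the whisker; the landmark moves to
# the candidate with the lowest distance.
candidates = [np.array([0.0, 0.0, 0.0]),
              np.array([0.1, 0.5, 0.9]),
              np.array([0.9, 0.5, 0.1])]
best = min(range(len(candidates)),
           key=lambda i: mahalanobis_sq(candidates[i], g_mean, S_g))
```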

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is repeated for every landmark point, and then the shape model confirms that the shape remains consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray); the sizes of the images are given relative to the first image.
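A minimal sketch of such an image pyramid, halving the resolution at each level by 2×2 block averaging (illustrative only; a production pyramid would typically smooth before downsampling):

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Return a list of images, each half the resolution of the previous one."""
    pyramid = [img]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape
        # Crop to even dimensions, then average each 2x2 block.
        coarse = pyramid[-1][:h - h % 2, :w - w % 2]
        coarse = coarse.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(coarse)
    return pyramid

img = np.arange(64, dtype=float).reshape(8, 8)
pyr = build_pyramid(img, levels=3)
# Sizes relative to the first image: 8x8, 4x4, 2x2.
```

The search starts at the coarsest level and refines the landmarks at each finer level, which is what lets the model lock on to the shape from further away.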


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters on which it depends. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images; Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.


As the pixels of the parts outside the body in the image shown in Figure 4 become logical 1, the parts that satisfy this condition have been removed, and the lung and airway appear as in Figure 5 (Fig. 5: segmentation of lung and airway). Because the airway in Figure 5 is very small compared to the lung, each image has been labeled with a connected component labeling algorithm, and components with fewer than 1000 pixels have been identified as airways and removed from the image. The resulting image is the segmented form of the target lung. Before the airways were removed, the edges of the image were found with the Sobel algorithm and added to the original image, so that the edges of the lung and airway region are shown in the original image (Figure 6 (b)). Also, by multiplying the defined lung region with the original CTA lung image, the original segmented lung image has been obtained.

Figure 6 (c)
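The small-component removal described above can be sketched with a simple connected component labeling pass (4-connectivity, flood fill). The 1000-pixel threshold follows the text; the toy mask, the smaller threshold used here, and the helper name are illustrative:

```python
from collections import deque

def remove_small_components(mask, min_size):
    """Keep only connected regions of 1s with at least min_size pixels (4-connectivity)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                # Flood fill one component, collecting its pixels.
                comp, q = [], deque([(i, j)])
                seen[i][j] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) >= min_size:  # small components (the airways) are dropped
                    for y, x in comp:
                        out[y][x] = 1
    return out

# Toy binary mask: one large region (the "lung") and one isolated pixel (an "airway").
mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
clean = remove_small_components(mask, min_size=2)
```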

MATLAB

MATLAB and the Image Processing Toolbox provide a wide range of advanced image processing functions and interactive tools for enhancing and analyzing digital images. The interactive tools allowed us to perform spatial image transformations; morphological operations such as edge detection and noise removal; region-of-interest processing; filtering; basic statistics; curve fitting; and the FFT, DCT, and Radon Transform. Making graphics objects semitransparent is a useful technique in 3-D visualization that furnishes more information about the spatial relationships of different structures. The toolbox functions, implemented in the open MATLAB language, have also been used to develop the customized algorithms.

MATLAB is a high-level technical language and interactive environment for data analysis and mathematical computing functions such as signal processing, optimization, and partial differential equation solving. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction and image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions implemented in the open MATLAB language can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a 'voxel' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by

using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system.

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images into the test

image It creates a mean shape pro_le from all the training images using landmark points But

the ASM starts of where the mean shape is located but it may not be near the bone on a test

image So the model needs to be initialized or started somewhere close to the bone boundary

in the test image Experiments were conducted to see the effect of initialization on the error

and the tracking of the shape It was observed that if the initialization is poor which means

that the mean shape starts away from the bone in test X-ray the model does not lock on to the

bone The shape and profile models fail to perform as the profile model looks for regions

similar to those of the training images in the regions away from the bone So it is unable to

find the bone as it is looking in a different region altogether The error increases considerably

if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows

the initialization The pink contour is the mean shape and it starts away from the bone so the

result is a poor tracking of the bone

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2

[2] A Baumberg Reliable feature matching across widely separated views CVPR pages

774ndash781 2000 2

[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2

[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2

[5] R Deriche and G Giraudon A computational approach for corner and vertex detection

IJCV 10(2)101ndash124 1992 2

[6] T G Dietterich Approximate statistical tests for comparing supervised classification

learning algorithms Neural Computation 10(7)1895ndash1924 1998 6

[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf

pages 147ndash151 1988 2 3

[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object

categories CVPR 290ndash96 2004 1 2

[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001

2

[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-

2)161ndash205 2005 6 7

[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116

1998 2

[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues

from affine deformations of local 2-d brightness structure Image and Vision Computing

pages 415ndash434 1997 2

[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash

110 2004 2 3

[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features

ECCV pages 508ndash521 2006 7

[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from

maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2

[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-

splines CVGIP 39267ndash278 1987 2

[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV

1(1)128ndash142 2002 2

[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV

60(1)63ndash86 2004 2 3

[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T

Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5

[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale

space PAMI 20(12)1376ndash 1381 1998 2

[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on

Artificial Intelligence page 584 1977 2

[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for

generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7

[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence

CVPR pages 976ndash981 1997

[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV

23(1)45ndash78 1997

[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1

3

[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely

invariant regions BMVC pages 412ndash 425 2000 2

Page 28: Anu Document

differential equation solving, etc. It provides interactive tools including thresholding, correlation, Fourier analysis, filtering, basic statistics, curve fitting, matrix analysis, and 2D and 3D plotting functions. The image processing operations allowed us to perform noise reduction, image enhancement, image transforms, colormap manipulation, colorspace conversions, region-of-interest processing, and geometric operations [4]. The toolbox functions, implemented in the open MATLAB language, can be used to develop customized algorithms.

An X-ray Computed Tomography (CT) image is composed of pixels whose brightness corresponds to the absorption of X-rays in a thin rectangular slab of the cross-section, which is called a ''voxel'' [13]. The Pixel Region tool provided by MATLAB 7.0.1 superimposes the pixel region rectangle over the image displayed in the Image Tool, defining the group of pixels that are displayed in extreme close-up view in the Pixel Region tool window. The Pixel Region tool shows the pixels at high magnification, overlaying each pixel with its numeric value [25]. For RGB images we find three numeric values, one for each band of the image. We can also determine the current position of the pixel region in the target image by using the pixel information given at the bottom of the tool. In this way we found the x- and y-coordinates of pixels in the target image coordinate system. The Adjust Contrast tool displays a histogram which represents the dynamic range of the X-ray CT image (Figure 1).

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provides reference-standard algorithms and graphical tools for image analysis tasks, including edge detection and image segmentation algorithms, image transformation, measurement of image features, and statistical functions such as mean, median, standard deviation, and range (Figure 2).

3 PLOT TOOLS

MATLAB provides a collection of plotting tools to generate various types of graphs, such as displaying the image histogram or plotting the profile of intensity values (Figures 3a, b).

Figure 3a - The Histogram of an X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The Area Graph of an X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a-d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The ''meshgrid'' function is extremely useful for evaluating a function of two Cartesian coordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables: the rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh
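The meshgrid behaviour described above, with rows of X copying x and columns of Y copying y, can be reproduced in a few lines of plain Python. This is only an illustrative sketch; MATLAB's own meshgrid is the reference implementation:

```python
def meshgrid(x, y):
    """Mimic MATLAB's meshgrid: X's rows are copies of x, Y's columns are copies of y."""
    X = [list(x) for _ in y]         # one copy of x per element of y
    Y = [[yi] * len(x) for yi in y]  # each row repeats a single y value
    return X, Y

X, Y = meshgrid([1, 2, 3], [10, 20])
print(X)  # [[1, 2, 3], [1, 2, 3]]
print(Y)  # [[10, 10, 10], [20, 20, 20]]

# Evaluate a function of two variables, e.g. f(x, y) = x + y, on the grid:
Z = [[xi + yi for xi, yi in zip(rx, ry)] for rx, ry in zip(X, Y)]
print(Z)  # [[11, 12, 13], [21, 22, 23]]
```

The grids X, Y, Z produced this way are exactly what surface-plot functions such as surf or surfc consume.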

3-D Surface Plot with Contour (''surfc'') displays a matrix as a surface with a contour plot below it. ''Lighting'' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and it can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The ''image'' function creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figure's colormap, or directly as RGB values, depending on the data specified (Figure 5a). ''Image'' with Colormap Scaling (the ''imagesc'' function) displays an X-ray CT image and scales it to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. ''Jet'' ranges from blue to red and passes through cyan, yellow, and orange; it is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
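The m-by-3 colormap layout described above is easy to sketch directly. The following is a hypothetical Python stand-in, not MATLAB's jet implementation: a simple linear grayscale map built with the same convention, each row an [R, G, B] triple in [0.0, 1.0]:

```python
def gray_colormap(m):
    """Build an m-by-3 colormap: each row is an [R, G, B] triple in [0.0, 1.0].

    A linear grayscale ramp, mirroring MATLAB's m-by-3 colormap convention.
    """
    return [[i / (m - 1)] * 3 for i in range(m)]

cmap = gray_colormap(5)
print(cmap[0])   # first row, black: [0.0, 0.0, 0.0]
print(cmap[-1])  # last row, white: [1.0, 1.0, 1.0]
```

An indexed image maps each pixel value to one of these rows; imagesc rescales the data so the smallest value picks the first row and the largest picks the last.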

A Contour Plot is useful for delineating organ boundaries in images. It displays the isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole/zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log
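FVTool itself is interactive, but the responses it plots all follow from the filter's difference equation. As a hedged sketch (plain Python, not the toolbox), the impulse and step responses of a filter with numerator b and denominator a can be computed by direct recursion; the moving-average coefficients below are chosen purely for illustration:

```python
def lfilter(b, a, x):
    """Apply the IIR difference equation
       a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y

b, a = [0.5, 0.5], [1.0]       # hypothetical 2-tap moving-average filter
impulse = [1.0] + [0.0] * 7
step = [1.0] * 8
print(lfilter(b, a, impulse))  # starts [0.5, 0.5, 0.0, ...]
print(lfilter(b, a, step))     # starts [0.5, 1.0, 1.0, ...]
```

The impulse response here is just the coefficients b (an FIR filter); with a non-trivial denominator a the same recursion produces an infinite, decaying response.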

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays will be judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n x 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape will be defined as a 2n x 1 vector where the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape
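The 2n x 1 packing described above (all x co-ordinates followed by all y co-ordinates) can be mirrored in a few lines; this is an illustrative sketch, not the thesis code:

```python
def to_shape_vector(points):
    """Pack n (x, y) points into a 2n-element vector: all x's first, then all y's."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

triangle = [(0, 0), (4, 0), (2, 3)]  # n = 3 points
v = to_shape_vector(triangle)
print(v)  # [0, 4, 2, 0, 0, 3]

# The y co-ordinate of point i sits n positions after its x co-ordinate:
n = len(triangle)
assert (v[1], v[1 + n]) == triangle[1]
```

Storing shapes this way makes the shape-model equations below plain matrix-vector arithmetic on 2n-element vectors.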

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 - x1)² + (y2 - y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid can be useful when aligning shapes or finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the test image, which helps with automatic initialization (discussed in Section 4.4).
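As a sketch of the two definitions above (centroid = mean of the point positions; size = root mean square distance of the points from the centroid):

```python
import math

def centroid(points):
    """Mean of the point positions."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def shape_size(points):
    """Root mean square distance between the points and the centroid."""
    cx, cy = centroid(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(centroid(square))    # (1.0, 1.0)
print(shape_size(square))  # sqrt(2), about 1.414
```

These two quantities are exactly what the alignment algorithm below uses for its translation and scaling steps.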

Algorithm 1 Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and mean shape
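Algorithm 1 can be sketched as follows. For brevity this simplified version aligns shapes by translation and scale only; the full algorithm also resolves rotation (e.g. with a Procrustes fit). Shapes are lists of (x, y) points, and all function names are illustrative:

```python
import math

def center(shape):
    """Translate a shape so its centroid is at the origin (step 2)."""
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    return [(x - cx, y - cy) for x, y in shape]

def to_unit_size(shape):
    """Scale a centered shape to unit RMS size (steps 3 and 4c)."""
    s = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / s, y / s) for x, y in shape]

def mean_shape(shapes):
    """Pointwise average of a set of shapes (step 4b)."""
    n = len(shapes)
    return [(sum(s[i][0] for s in shapes) / n, sum(s[i][1] for s in shapes) / n)
            for i in range(len(shapes[0]))]

def align_shapes(shapes, iters=10):
    shapes = [center(s) for s in shapes]     # step 2
    mean = to_unit_size(shapes[0])           # step 3: initial mean x0
    for _ in range(iters):                   # step 4, fixed iterations for simplicity
        aligned = [to_unit_size(s) for s in shapes]   # (a) scale-only alignment
        mean = to_unit_size(mean_shape(aligned))      # (b) + (c) constrain the mean
    return aligned, mean

squares = [[(0, 0), (2, 0), (2, 2), (0, 2)],
           [(0, 0), (4, 0), (4, 4), (0, 4)]]
aligned, mean = align_shapes(squares)
# Both squares collapse to the same unit-size shape centered on the origin.
```

With rotation included, step (a) would additionally rotate each shape to minimize its distance to the current mean before rescaling.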

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to the area that fits the profile model most closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape: as the profile model tries to find the area in the test image that fits the profiles, the shape model ensures that the mean shape is not changed. The profile model acts on individual landmarks, whereas the shape model acts globally on the shape, so the two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned and a mean shape is formulated together with the permissible variations about it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors (modes of variation) of the covariance of the aligned training shapes, and
b is a vector of shape parameters, one weight per mode.
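The generation step x̂ = x̄ + Φb is a matrix-vector product added to the mean. A sketch with a hypothetical 2-point shape and a hand-chosen Φ, purely for illustration; in practice Φ comes from a PCA of the aligned training shapes:

```python
def generate_shape(mean, Phi, b):
    """x_hat = mean + Phi @ b, with shapes stored as [x1..xn, y1..yn] vectors."""
    return [m + sum(Phi[i][j] * b[j] for j in range(len(b)))
            for i, m in enumerate(mean)]

mean = [0.0, 1.0, 0.0, 1.0]          # hypothetical 2-point mean shape
Phi = [[1.0], [0.0], [0.0], [1.0]]   # one illustrative mode of variation
print(generate_shape(mean, Phi, [0.0]))  # b = 0 reproduces the mean shape
print(generate_shape(mean, Phi, [0.5]))  # [0.5, 1.0, 0.0, 1.5]
```

Setting b to zero always returns the mean shape; sweeping each entry of b over a bounded range generates the family of allowable shapes.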

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines that are perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix S_g.
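The per-landmark statistics described above (mean profile ḡ and covariance S_g across the training images) can be sketched as follows; the three gray-level profiles are hypothetical values sampled along one landmark's whisker:

```python
def profile_stats(profiles):
    """Mean profile and (sample) covariance matrix of equal-length profiles."""
    m, k = len(profiles), len(profiles[0])
    mean = [sum(p[i] for p in profiles) / m for i in range(k)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in profiles) / (m - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov

# Hypothetical gray-level profiles for one landmark, one per training image:
profiles = [[10.0, 50.0, 90.0], [12.0, 52.0, 88.0], [8.0, 48.0, 92.0]]
g_bar, S_g = profile_stats(profiles)
print(g_bar)  # [10.0, 50.0, 90.0]
```

One (ḡ, S_g) pair is stored per landmark; these are exactly the quantities the Mahalanobis distance in the search step consumes.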

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g - ḡ)ᵀ S_g⁻¹ (g - ḡ)
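A sketch of this search step: the Mahalanobis distance (g - ḡ)ᵀ S_g⁻¹ (g - ḡ) picks the candidate profile closest to the mean profile. The 2-element profiles and the hand-built inverse covariance below are purely illustrative:

```python
def mahalanobis_sq(g, g_bar, S_inv):
    """Squared Mahalanobis distance (g - g_bar)^T S_inv (g - g_bar)."""
    d = [gi - mi for gi, mi in zip(g, g_bar)]
    return sum(d[i] * S_inv[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

g_bar = [50.0, 90.0]                 # mean profile for one landmark
S_inv = [[0.25, 0.0], [0.0, 1.0]]    # illustrative inverse covariance (diagonal)
candidates = [[58.0, 91.0], [51.0, 90.0], [40.0, 80.0]]  # profiles along the whisker
best = min(candidates, key=lambda g: mahalanobis_sq(g, g_bar, S_inv))
print(best)  # [51.0, 90.0], the profile with the lowest distance
```

Unlike plain Euclidean distance, the S_g⁻¹ weighting discounts directions in which the training profiles varied a lot, so only genuinely unusual deviations score as far away.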

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, and not a bone image, is used for illustration.
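The image pyramid used by the multi-resolution search can be sketched as repeated 2x downsampling. Here each level simply keeps every second pixel of a nested-list "image"; real implementations smooth before subsampling to avoid aliasing:

```python
def halve(img):
    """Downsample a 2-D image (list of rows) by keeping every second pixel."""
    return [row[::2] for row in img[::2]]

def image_pyramid(img, levels):
    """Level 0 is the full image; each further level halves the resolution."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(halve(pyramid[-1]))
    return pyramid

img = [[c + 8 * r for c in range(8)] for r in range(8)]  # 8x8 test image
pyr = image_pyramid(img, 3)
print([len(level) for level in pyr])  # [8, 4, 2], sizes relative to the first image
```

The search starts on the coarsest level, where the whole bone spans only a few pixels, and the result at each level initializes the search on the next finer one.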


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.
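Because the 60 landmarks are equally spaced along the boundary, the experimental subsets can be formed by simply keeping every k-th point, which preserves the even spacing. A minimal sketch (the landmark list here is a stand-in for the real boundary points):

```python
def landmark_subset(landmarks, step):
    """Keep every step-th landmark, preserving even spacing along the boundary."""
    return landmarks[::step]

landmarks = list(range(60))                # stand-ins for the 60 boundary points
print(len(landmark_subset(landmarks, 2)))  # 30 landmarks
print(len(landmark_subset(landmarks, 4)))  # 15 landmarks
```

Running the ASM once per subset size then gives the computing-time and mean-error curves discussed above.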

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and mean profiles from all the training images using the landmark points. But the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, since the profile model looks for regions similar to those of the training images in regions away from the bone. It is therefore unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415-434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267-278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.
[24] S. Smith and J. M. Brady. SUSAN - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.

Page 29: Anu Document

Figure 1 Pixel Region of an X-ray CT scan and the Adjust Contrast Tool

The Image Processing Toolbox provide a reference-standard algorithms and graphical tools for image analysis tasks including edge-detection and image segmentation algorithms image transformation measuring image features and statistical functions such as mean median standard deviation range etc (Figure 2

3 PLOT TOOLS MATLAB provides a collection of plotting tools to generate various types of graphs displaying the image histogram or plotting the profile of intensity values (Fig 3ab) Figure

Figure 3 a - The Histogram of X-ray CT image and the plot fits (significant digits 2) A cubic fitting function is the best-fit model for histogram data plot The fit curve was plotted as a magenta line through the data plot Area Graph of X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot generates a matrix as a surface (Figures 4 a b c d) We can also make the faces of a surface transparent to a varying degree Transparency (referred to as the alpha value) can be specified for the whole 3D-object or can be based on an alphamap which behaves in a way analogous to colormaps (Figures 4 a b)

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The rsquorsquomeshgridrsquo function is extremely useful for computing a function of two Cartesian coordinates It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c)

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

3-D Surface Plot with Contour (Surfc) displays a matrix as a surface with contour plot below rsquorsquoLightingrsquorsquo is the technique of illuminating an object with a directional light source This technique can make subtle differences in surface shape easier to see rsquorsquoLightingrsquorsquo processing can also be used to add realism to three-dimensional graphs This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d)

Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values lightening

The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)

Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, which also describes the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points, respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can then be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in a test image, which helps with automatic initialization (discussed in Section 4.4).
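These definitions translate directly into code. A minimal Python sketch, with shapes represented as lists of (x, y) points (function names are illustrative):

```python
import math

def euclidean(p1, p2):
    # Equation 4.1: Euclidean distance between (x1, y1) and (x2, y2).
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

def shape_distance(s1, s2):
    # Distance between two shapes: sum of the distances between
    # their corresponding points.
    return sum(euclidean(p, q) for p, q in zip(s1, s2))

def centroid(shape):
    # The centroid is the mean of the point positions.
    xs = [p[0] for p in shape]
    ys = [p[1] for p in shape]
    return (sum(xs) / len(shape), sum(ys) / len(shape))

def shape_size(shape):
    # Root mean square distance of the points from the centroid.
    c = centroid(shape)
    return math.sqrt(sum(euclidean(p, c) ** 2 for p in shape) / len(shape))
```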

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size; call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e., the mean shape no longer changes much).

Output: the set of aligned shapes and the mean shape
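Algorithm 1 can be sketched as follows in Python/numpy. The least-squares similarity alignment used here is one standard choice for step 4(a), and the helper names are illustrative, not the thesis code:

```python
import numpy as np

def center(shape):
    # Translate a shape (n x 2 array) so its centroid is at the origin.
    return shape - shape.mean(axis=0)

def size(shape):
    # Root mean square distance of the points from the centroid.
    return np.sqrt((center(shape) ** 2).sum(axis=1).mean())

def align(shape, target):
    # Rotate and scale a centered shape to best match a centered target
    # (least-squares similarity alignment).
    denom = (shape ** 2).sum()
    a = (shape * target).sum() / denom
    b = (shape[:, 0] * target[:, 1] - shape[:, 1] * target[:, 0]).sum() / denom
    rot = np.array([[a, -b], [b, a]])
    return shape @ rot.T

def align_shapes(shapes, iters=10):
    # Algorithm 1: iteratively align a set of shapes to their evolving mean.
    shapes = [center(s) for s in shapes]          # step 2
    x0 = shapes[0] / size(shapes[0])              # step 3: unit-size reference
    mean = x0
    for _ in range(iters):                        # step 4 / 5
        shapes = [align(s, mean) for s in shapes]
        mean = np.mean(shapes, axis=0)
        mean = align(center(mean), x0)            # constrain the mean shape
        mean = mean / size(mean)
    return shapes, mean
```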

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM builds two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the shape stays within the permissible variations of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations about it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e., the average of the aligned training shapes xi,
Φ holds the principal modes of variation of the aligned shapes, and
b is a vector of shape parameters that weights those modes.
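A minimal sketch of building this linear shape model from aligned shapes and generating new shapes from it (Python/numpy; the modes are obtained here via a singular value decomposition, and the helper names are illustrative):

```python
import numpy as np

def shape_model(aligned, n_modes=2):
    # Stack each aligned shape as a 2n-vector: x co-ordinates first,
    # then y co-ordinates, matching the convention of Section 4.1.
    X = np.array([np.concatenate([s[:, 0], s[:, 1]]) for s in aligned])
    mean = X.mean(axis=0)
    # Principal modes of variation of the aligned training shapes.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_modes].T   # mean shape x_bar and modes Phi

def generate(mean, Phi, b):
    # Equation 4.3: x_hat = x_bar + Phi b
    return mean + Phi @ b
```

Setting b to the zero vector reproduces the mean shape; varying its entries generates the permissible deformations.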

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
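The profile match can be sketched as follows (Python/numpy; assumes a mean profile and covariance already estimated from training, with an illustrative function name):

```python
import numpy as np

def mahalanobis(g, g_mean, S):
    # Mahalanobis distance between a sampled profile g and the model's
    # mean profile g_mean with covariance S:  (g - g_mean)^T S^-1 (g - g_mean).
    d = g - g_mean
    # Solve S x = d instead of forming the inverse explicitly.
    return float(d @ np.linalg.solve(S, d))
```

The candidate offset along the whisker with the lowest distance is taken as the new landmark position.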

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and the shape model then confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, yet the resulting shape could be completely different. So, as mentioned before, the two models restrict each other.

A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid; the resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray, is shown), with the sizes of the images given relative to the first image.
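The image pyramid itself is simple to construct. A pure-Python sketch (illustrative, using 2x2 block averaging to halve the resolution at each level):

```python
def downsample(img):
    # Halve the resolution by averaging each 2x2 block of pixels.
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1] +
              img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]

def image_pyramid(img, levels=3):
    # Level 0 is the original image; each further level is half the size.
    # Searching the coarse levels first lets the ASM lock on to the shape
    # from further away before refining at full resolution.
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```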

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone; images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust and intelligent. The computing time is expected to increase, since it takes time to train and create profile models for each image; however, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis has 12 images: 11 are used to train the ASM, and 1 is used as a test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and the aligned shapes are displayed alongside.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images within the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone since it is searching in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

Page 30: Anu Document

Figure 3a - The histogram of the X-ray CT image and the plot fits (significant digits: 2). A cubic fitting function is the best-fit model for the histogram data; the fit curve is plotted as a magenta line through the data. The area graph of the X-ray CT brain scan displays the elements in a variable as one or more curves and fills the area beneath each curve.

Figure 3b Area Graph of X-ray CT brain scan

The 3-D surface plot renders a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a - 3D Surface Plot of X-ray CT brain scan generated with histogram values, alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The meshgrid function is extremely useful for computing a function of two Cartesian co-ordinates. It transforms the domain specified by a single vector, or two vectors x and y, into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).

Figure 4c - 3D Surface Plot of X-ray CT brain scan generated with histogram values, mesh
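The same convention is followed by numpy's meshgrid, which makes a compact illustration of the behaviour described above (the vectors here are arbitrary examples):

```python
import numpy as np

# numpy's meshgrid mirrors MATLAB's: it expands the vectors x and y
# into co-ordinate matrices X and Y for evaluating f(x, y) on a grid.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
# Rows of X are copies of x; columns of Y are copies of y.
Z = X + Y   # evaluate a function of two variables over the whole grid
```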

The 3-D surface plot with contour (surfc) displays a matrix as a surface with a contour plot below it. Lighting is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of X-ray CT brain scan generated with histogram values, lighting

The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)

Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images into the test

image It creates a mean shape pro_le from all the training images using landmark points But

the ASM starts of where the mean shape is located but it may not be near the bone on a test

image So the model needs to be initialized or started somewhere close to the bone boundary

in the test image Experiments were conducted to see the effect of initialization on the error

and the tracking of the shape It was observed that if the initialization is poor which means

that the mean shape starts away from the bone in test X-ray the model does not lock on to the

bone The shape and profile models fail to perform as the profile model looks for regions

similar to those of the training images in the regions away from the bone So it is unable to

find the bone as it is looking in a different region altogether The error increases considerably

if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows

the initialization The pink contour is the mean shape and it starts away from the bone so the

result is a poor tracking of the bone

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2

[2] A Baumberg Reliable feature matching across widely separated views CVPR pages

774ndash781 2000 2

[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2

[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2

[5] R Deriche and G Giraudon A computational approach for corner and vertex detection

IJCV 10(2)101ndash124 1992 2

[6] T G Dietterich Approximate statistical tests for comparing supervised classification

learning algorithms Neural Computation 10(7)1895ndash1924 1998 6

[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf

pages 147ndash151 1988 2 3

[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object

categories CVPR 290ndash96 2004 1 2

[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001

2

[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-


Page 31: Anu Document

Figure 3b Area Graph of X-ray CT brain scan

The 3-D Surface Plot displays a matrix as a surface (Figures 4a, b, c, d). We can also make the faces of a surface transparent to a varying degree. Transparency (referred to as the alpha value) can be specified for the whole 3-D object or can be based on an alphamap, which behaves in a way analogous to colormaps (Figures 4a, b).

Figure 4a 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(0)

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The 'meshgrid' function is extremely useful for computing a function of two Cartesian coordinates. It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables. The rows of X are copies of the vector x, and the columns of Y are copies of the vector y (Figure 4c).
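The document's examples use MATLAB; as an illustration only, NumPy's `meshgrid` behaves the same way, and shows the "rows of X / columns of Y" behaviour described above:

```python
import numpy as np

# NumPy analogue of MATLAB's meshgrid (default 'xy' indexing):
# rows of X are copies of x, columns of Y are copies of y.
x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y)
# X = [[1, 2, 3], [1, 2, 3]]
# Y = [[10, 10, 10], [20, 20, 20]]

# Evaluate a function of two variables over the whole grid at once.
Z = X**2 + Y
```

The resulting Z can then be handed to any surface-plotting routine, exactly as the surf/surfc examples in this chapter do with their grids.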

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

3-D Surface Plot with Contour ('surfc') displays a matrix as a surface with a contour plot below. 'Lighting' is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see. 'Lighting' processing can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d - Surface Plot of x-ray CT brain scan generated with histogram values, lighting

The 'image' function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). The 'image' function with Colormap Scaling (the 'imagesc' function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. 'Jet' ranges from blue to red and passes through cyan, yellow, and orange. It is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
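The indexed-image mechanism described above (each pixel is an index into an m-by-3 colormap) can be sketched outside MATLAB as well. The tiny colormap and image below are invented for illustration:

```python
import numpy as np

# A colormap is an m-by-3 matrix of RGB rows with values in [0.0, 1.0].
# A hypothetical 3-entry map (blue, green, red):
cmap = np.array([[0.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 0.0, 0.0]])

# An "indexed image": each pixel stores an index into the colormap.
img = np.array([[0, 1],
                [2, 1]])

# Fancy indexing maps each index to its RGB row, giving a (2, 2, 3) image.
rgb = cmap[img]
```

Colormap scaling (the 'imagesc' behaviour) amounts to first rescaling the data range onto the index range 0..m-1 before this lookup.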

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool) The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11, 12, 13, 14, 15, 16, and 17).

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM models work the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the number of points and the two columns represent the x and y co-ordinates of the points respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1 Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
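The two distance notions above (between points, and between shapes via corresponding points) can be sketched in a few lines of Python; the function names are illustrative, not from the thesis code:

```python
import math

def point_distance(p, q):
    """Euclidean distance (Equation 4.1) between points (x1, y1) and (x2, y2)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def shape_distance(s1, s2):
    """Distance between two shapes: summed distance between
    corresponding points. Shapes are equal-length lists of (x, y) pairs."""
    return sum(point_distance(p, q) for p, q in zip(s1, s2))

point_distance((0, 0), (3, 4))   # 5.0
```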

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or devising an automatic initialization technique (discussed in Section 4.4). The size of a shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).
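As a minimal sketch of these two definitions (centroid as the mean point position, size as the RMS distance from the centroid), assuming shapes stored as n × 2 NumPy arrays:

```python
import numpy as np

def centroid(shape):
    """Centroid: the mean of the point positions. shape is an (n, 2) array."""
    return shape.mean(axis=0)

def shape_size(shape):
    """Size: root mean square distance of the points from the centroid."""
    d = shape - centroid(shape)
    return np.sqrt((d ** 2).sum(axis=1).mean())

square = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
centroid(square)     # [1.0, 1.0]
shape_size(square)   # sqrt(2): every corner lies sqrt(2) from the centroid
```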

Algorithm 1 Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the 1st shape).

2. Translate each shape so that it is centered on the origin.

3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.

4. Repeat:

(a) Align all shapes to the mean shape.

(b) Recalculate the mean shape from the aligned shapes.

(c) Constrain the current mean shape (align to x0, scale to unit size).

5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape
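The steps of Algorithm 1 can be sketched as follows. This is a simplified illustration, not the thesis code: the rotation step uses the orthogonal Procrustes solution, and the "constrain" step is reduced to renormalization.

```python
import numpy as np

def normalize(shape):
    """Center an (n, 2) shape on the origin and scale it to unit size."""
    s = shape - shape.mean(axis=0)
    return s / np.linalg.norm(s)

def align_to(shape, ref):
    """Rotate a centered, unit-size shape to best fit ref
    (orthogonal Procrustes; may include a reflection)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def align_shapes(shapes, iters=10):
    """Sketch of Algorithm 1: iteratively align all shapes to the mean."""
    aligned = [normalize(s) for s in shapes]
    mean = aligned[0]                                   # reference: 1st shape
    for _ in range(iters):
        aligned = [align_to(s, mean) for s in aligned]  # step (a)
        mean = np.mean(aligned, axis=0)                 # step (b)
        mean = normalize(mean)                          # step (c), simplified
    return aligned, mean
```

Two copies of the same shape that differ only by translation, scale, and rotation converge to identical aligned shapes under this procedure.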

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2), and those images were then re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training on the images was done by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks. This introduces a constraint on the shape: as the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result remains an allowable shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the image, so the two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations in it [24]:

x̂ = x̄ + Φb    (4.3)

where

x̂ is the shape vector generated by the model,

x̄ is the mean shape, the average of the aligned training shapes xi,

Φ is the matrix of eigenvectors of the shape covariance matrix, and

b is the vector of shape parameters.
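A minimal NumPy sketch of this model, assuming the aligned training shapes are stacked as 2n-vectors (x co-ordinates first, then y, as defined in Section 4.1). The function names are illustrative; Phi here is the eigenvector matrix written as Φ in Equation 4.3:

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """PCA shape model: mean shape x_bar plus the n_modes eigenvectors
    of the covariance matrix with the largest eigenvalues."""
    X = np.asarray(shapes, dtype=float)        # (num_shapes, 2n)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_modes]   # largest eigenvalues first
    return mean, vecs[:, order]                # x_bar, Phi

def generate_shape(mean, phi, b):
    """x_hat = x_bar + Phi @ b: vary b to generate different shapes."""
    return mean + phi @ b
```

Setting b to zero reproduces the mean shape; projecting a training shape onto Phi and regenerating recovers that shape up to the truncated modes.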

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After the training is over, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
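A short sketch of this distance computation, using `np.linalg.solve` rather than an explicit inverse of Sg; the toy profile values are invented for illustration:

```python
import numpy as np

def mahalanobis(g, g_mean, S_g):
    """f(g) = (g - g_mean)^T S_g^{-1} (g - g_mean): the Mahalanobis
    distance between a sampled test profile and a landmark's mean profile."""
    d = g - g_mean
    return float(d @ np.linalg.solve(S_g, d))

g_mean = np.array([0.0, 0.0])
S_g = np.eye(2)  # identity covariance reduces this to squared Euclidean distance
mahalanobis(np.array([3.0, 4.0]), g_mean, S_g)   # 25.0
```

During the search, this value is evaluated at each candidate offset along the whisker, and the offset with the lowest distance is taken as the new landmark position.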

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and then the shape model confirms that the shape is consistent with the mean shape. The shape model ensures that the profile model has not changed the shape: if the shape model were not employed, the profile model might give the best profile results, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is done to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image. A general picture, not a bone X-ray, is used for illustration.
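The image pyramid used by the multi-resolution search can be sketched as repeated downsampling. This is a simplified stand-in, with 2×2 block averaging in place of proper smoothing and subsampling:

```python
import numpy as np

def image_pyramid(img, levels=3):
    """Multi-resolution pyramid: each level halves the previous one.
    Simple 2x2 block averaging stands in for smoothing + subsampling."""
    pyramid = [img]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape
        p = pyramid[-1][:h - h % 2, :w - w % 2]   # crop to even dimensions
        pyramid.append((p[0::2, 0::2] + p[1::2, 0::2] +
                        p[0::2, 1::2] + p[1::2, 1::2]) / 4.0)
    return pyramid
```

The search starts at the coarsest level, where the shape can be found from further away, and the result is refined level by level down to the full resolution.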


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters that it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, it is expected that the computing time increases and the error decreases. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6a shows the unaligned shapes learnt from the training images, together with the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images when placed in the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows the initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. Susan – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

Page 32: Anu Document

Figure 4b - 3D Surface Plot of x-ray CT brain scan generated with histogram values alpha(4)

The rsquorsquomeshgridrsquo function is extremely useful for computing a function of two Cartesian coordinates It transforms the domain specified by a single vector or two vectors x and y into matrices X and Y for use in evaluating functions of two variables The rows of X are copies of the vector x and the columns of Y are copies of the vector y (Figure 4c)

Figure 4c - 3D Surface Plot of x-ray CT brain scan generated with histogram values mesh

3-D Surface Plot with Contour (Surfc) displays a matrix as a surface with contour plot below rsquorsquoLightingrsquorsquo is the technique of illuminating an object with a directional light source This technique can make subtle differences in surface shape easier to see rsquorsquoLightingrsquorsquo processing can also be used to add realism to three-dimensional graphs This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d)

Figure 4d Surface Plot of x-ray CT brain scan generated with histogram values lightening

The rsquorsquoImagersquorsquo creates an X-ray CT image graphics object by interpreting each element in a matrix as an index into the figures colormap or directly as RGB values depending on the data specified (Figure 5a)The rsquorsquoImagersquorsquo with Colormap Scaling (rsquorsquoimagescrsquorsquo function) displays an X-ray CT image and scale to use full colormap MATLAB supports a number of colormaps A colormap is an m-by-3 matrix of real numbers between 00 and 10 Each row is an RGB vector that defines one color Jetrsquorsquo ranges from blue to red and passes through the colors cyan yellow and orange It is a variation of the hsv (hue saturation value) colormap (Figure 5b)

Contour Plot is useful for delineating organ boundaries in images It displays isolines of a surface represented by a matrix (Figure 6) For example Figure

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(xy) where f is a string that represents a mathematical function of two variables such as x and y (Figure 7)

Figure 7 ndash Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks onto the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using landmark points. However, the ASM starts off wherever the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock onto the bone. The shape and profile models fail to perform because the profile model searches for regions similar to those of the training images in areas away from the bone; it cannot find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.
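One possible automatic initialization heuristic, suggested by the centroid discussion in Section 4.1, is to translate the mean shape so that its centroid coincides with the centroid of a roughly segmented bone region. The sketch below is an illustrative NumPy version, not the method used in this thesis; the intensity threshold and the 2n-vector shape layout (all x co-ordinates, then all y co-ordinates) are assumptions.

```python
import numpy as np

def shape_centroid(shape):
    """Centroid of a shape stored as a 2n-vector (all x's, then all y's)."""
    n = len(shape) // 2
    return np.array([shape[:n].mean(), shape[n:].mean()])

def initialize_mean_shape(mean_shape, image, threshold=0.5):
    """Translate the mean shape so its centroid matches the centroid of
    the bright (bone-like) pixels in the test image."""
    ys, xs = np.nonzero(image > threshold)     # crude bone segmentation
    target = np.array([xs.mean(), ys.mean()])  # region centroid as (x, y)
    shift = target - shape_centroid(mean_shape)
    n = len(mean_shape) // 2
    out = mean_shape.copy()
    out[:n] += shift[0]   # shift all x co-ordinates
    out[n:] += shift[1]   # shift all y co-ordinates
    return out
```

Even this crude heuristic would avoid the 40-50 pixel offsets that make the profile search fail.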

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. Van Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.


3-D Surface Plot with Contour (surfc) displays a matrix as a surface with a contour plot below it. "Lighting" is the technique of illuminating an object with a directional light source. This technique can make subtle differences in surface shape easier to see, and lighting can also be used to add realism to three-dimensional graphs. This example uses the same surface as the previous examples but colors it yellow and removes the mesh lines (Figure 4d).

Figure 4d: Surface plot of an X-ray CT brain scan generated with histogram values and lighting

The "image" function creates an X-ray CT image graphics object by interpreting each element in a matrix either as an index into the figure's colormap or directly as RGB values, depending on the data specified (Figure 5a). "Image" with colormap scaling (the "imagesc" function) displays an X-ray CT image scaled to use the full colormap. MATLAB supports a number of colormaps. A colormap is an m-by-3 matrix of real numbers between 0.0 and 1.0; each row is an RGB vector that defines one color. "Jet" ranges from blue to red, passing through cyan, yellow, and orange, and is a variation of the hsv (hue, saturation, value) colormap (Figure 5b).
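The colormap layout described above (one RGB row per color, values in [0.0, 1.0]) can be illustrated with a small jet-like gradient built in NumPy. This is a sketch, not MATLAB's actual jet implementation; the five anchor colors are an assumption chosen to match the blue-to-red progression described in the text.

```python
import numpy as np

def simple_jet(m=64):
    """Build an m-by-3 colormap: each row is an (R, G, B) triple in [0, 1],
    interpolated through blue -> cyan -> yellow -> orange -> red."""
    anchors = np.array([
        [0.0, 0.0, 1.0],  # blue
        [0.0, 1.0, 1.0],  # cyan
        [1.0, 1.0, 0.0],  # yellow
        [1.0, 0.5, 0.0],  # orange
        [1.0, 0.0, 0.0],  # red
    ])
    t = np.linspace(0.0, 1.0, m)             # position of each row in the map
    ta = np.linspace(0.0, 1.0, len(anchors)) # positions of the anchor colors
    # interpolate each RGB channel independently
    return np.column_stack([np.interp(t, ta, anchors[:, c]) for c in range(3)])
```

Indexing row k of this matrix with a scaled pixel value gives that pixel's display color, which is exactly what "imagesc" does with the full colormap.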

The Contour Plot is useful for delineating organ boundaries in images. It displays isolines of a surface represented by a matrix (Figure 6).

Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a graph of f(x, y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 – Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 – Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by numerator b and denominator a. Using FVTool we can display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11 to 17).
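The magnitude response that FVTool computes for a filter given by numerator b and denominator a corresponds to evaluating the transfer function H(e^jw) = B(e^-jw) / A(e^-jw) on a frequency grid. A NumPy sketch of that computation follows; the moving-average example coefficients are an assumption for illustration.

```python
import numpy as np

def freq_response(b, a, n=512):
    """Evaluate H(e^jw) = B(e^-jw) / A(e^-jw) at n frequencies in [0, pi)."""
    w = np.linspace(0.0, np.pi, n, endpoint=False)
    z = np.exp(-1j * w)                 # e^{-jw}
    num = np.polyval(b[::-1], z)        # b[0] + b[1] z + b[2] z^2 + ...
    den = np.polyval(a[::-1], z)        # a[0] + a[1] z + ...
    return w, num / den

# Simple moving-average (low-pass) filter: b = [1/3, 1/3, 1/3], a = [1]
w, h = freq_response([1/3, 1/3, 1/3], [1.0])
magnitude_db = 20 * np.log10(np.abs(h) + 1e-12)  # magnitude response in dB
```

Plotting `magnitude_db` against `w` reproduces the kind of magnitude plot shown in Figure 11.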

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array whose n rows represent the points and whose two columns hold their x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape
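The two representations of a shape described above (an n × 2 array of points and a 2n-vector with the x co-ordinates followed by the y co-ordinates) convert into one another directly; a small NumPy sketch:

```python
import numpy as np

def points_to_vector(points):
    """(n, 2) array of (x, y) points -> 2n-vector [x1..xn, y1..yn]."""
    return np.concatenate([points[:, 0], points[:, 1]])

def vector_to_points(vec):
    """2n-vector [x1..xn, y1..yn] -> (n, 2) array of (x, y) points."""
    n = len(vec) // 2
    return np.column_stack([vec[:n], vec[n:]])

triangle = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
v = points_to_vector(triangle)   # x's first, then y's
```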

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 - x1)² + (y2 - y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid; this can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).
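These definitions, the point-to-point distance of Equation 4.1, the centroid, and the root-mean-square size, are straightforward to compute. A NumPy sketch for shapes stored as (n, 2) point arrays:

```python
import numpy as np

def shape_distance(s1, s2):
    """Mean Euclidean distance between corresponding points of two shapes."""
    return np.mean(np.linalg.norm(s1 - s2, axis=1))

def centroid(shape):
    """Centroid: the mean of the point positions."""
    return shape.mean(axis=0)

def shape_size(shape):
    """Root mean square distance of the points from the centroid."""
    return np.sqrt(np.mean(np.sum((shape - centroid(shape)) ** 2, axis=1)))
```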

Algorithm 1: Aligning shapes
Input: set of unaligned shapes
1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size; call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).
Output: set of aligned shapes and the mean shape
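Algorithm 1 can be sketched in NumPy as follows. The similarity alignment used here is a standard least-squares fit of a combined scale and rotation, which is one common way to realize step 4(a); shapes are (n, 2) point arrays, and the fixed iteration count stands in for a proper convergence test.

```python
import numpy as np

def center(shape):
    """Step 2: translate a shape so it is centered on the origin."""
    return shape - shape.mean(axis=0)

def align_to(shape, target):
    """Least-squares similarity alignment (rotation + scale) of a centered
    shape onto a centered target."""
    a = np.sum(shape * target) / np.sum(shape * shape)
    b = np.sum(shape[:, 0] * target[:, 1]
               - shape[:, 1] * target[:, 0]) / np.sum(shape * shape)
    R = np.array([[a, -b], [b, a]])   # combined scale-rotation matrix
    return shape @ R.T

def align_shapes(shapes, iters=20):
    """Algorithm 1: align a set of shapes and compute their mean shape."""
    shapes = [center(s.astype(float)) for s in shapes]   # step 2
    mean = shapes[0] / np.linalg.norm(shapes[0])         # steps 1 and 3: x0
    x0 = mean.copy()
    for _ in range(iters):                               # step 4
        shapes = [align_to(s, mean) for s in shapes]     # (a)
        mean = np.mean(shapes, axis=0)                   # (b)
        mean = align_to(mean, x0)                        # (c) align to x0
        mean /= np.linalg.norm(mean)                     # (c) unit size
    return shapes, mean
```

With shapes that are exact scaled and rotated copies of one another, the aligned outputs coincide after a single pass; real landmarked bones retain small residual differences, which is exactly the variation the shape model later captures.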

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2), and the resulting images were re-sized to the same dimensions; this ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks, placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for it. When searching for the shape in a test image, the area near the tentative landmarks is examined, and the model moves the shape to the area that best fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which places a constraint on the shape. As the profile model tries to find the area in the test image that best fits its profiles, the shape model ensures that the result stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvement in the match is possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is a vector of shape parameters.
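Equation 4.3 can be sketched directly: the mean shape and the eigenvector matrix come from a principal component analysis of the aligned training shapes, and varying b generates new plausible shapes. In the NumPy sketch below, shapes are 2n-vectors and the number of retained modes is an assumption for illustration.

```python
import numpy as np

def build_shape_model(aligned_shapes, n_modes=2):
    """PCA shape model from aligned shapes stored as 2n-vectors.
    Returns the mean shape, the eigenvector matrix Phi, and eigenvalues."""
    X = np.asarray(aligned_shapes, dtype=float)   # (num_shapes, 2n)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(evals)[::-1][:n_modes]     # keep the largest modes
    return mean, evecs[:, order], evals[order]

def generate_shape(mean, phi, b):
    """Equation 4.3: x_hat = x_bar + Phi b."""
    return mean + phi @ b
```

Setting b to zero reproduces the mean shape; moving each component of b within a few standard deviations (the square roots of the eigenvalues) sweeps out the permissible variations.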

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called "whiskers", and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance:

f(g) = (g - ḡ)ᵀ Sg⁻¹ (g - ḡ)
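The profile search can be sketched as follows: for each candidate offset along the whisker, compute the Mahalanobis distance of the sampled profile from the trained mean profile, and keep the offset with the lowest distance. In this NumPy sketch the candidate profiles are synthetic stand-ins for samples taken from a test image.

```python
import numpy as np

def mahalanobis(g, g_mean, S_inv):
    """Squared Mahalanobis distance (g - g_mean)^T S^-1 (g - g_mean)."""
    d = g - g_mean
    return float(d @ S_inv @ d)

def best_offset(candidates, g_mean, S):
    """Index of the candidate profile that best matches the profile model."""
    S_inv = np.linalg.inv(S)
    scores = [mahalanobis(g, g_mean, S_inv) for g in candidates]
    return int(np.argmin(scores))
```

Repeating this for every landmark gives the suggested shape that the shape model then constrains.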

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, after which the shape model confirms that the result is still consistent with the mean shape. The shape model thus ensures that the profile model has not distorted the shape: if the shape model were not employed, the profile model might give the best profile matches, but the resulting shape could be completely different. As mentioned before, the two models restrict each other.

A multi-resolution search is performed to make the model more robust; it makes the model more accurate because it can lock onto the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid; the image resolutions can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid, with the image sizes given relative to the first image (a general picture is shown rather than a bone).
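The image pyramid used in the multi-resolution search can be built by repeatedly halving the image resolution. The sketch below uses simple 2x2 block averaging as the downsampling step; practical implementations usually smooth the image before downsampling, so this is an illustrative minimum.

```python
import numpy as np

def halve(image):
    """Downsample an image by 2 using 2x2 block averaging."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    im = image[:h, :w]   # crop to even dimensions
    return 0.25 * (im[0::2, 0::2] + im[1::2, 0::2]
                   + im[0::2, 1::2] + im[1::2, 1::2])

def image_pyramid(image, levels=3):
    """Return [full-res, 1/2, 1/4, ...] versions of the image."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(halve(pyramid[-1]))
    return pyramid
```

The search starts on the coarsest level, where the shape can lock on from far away, and the result is refined on each finer level.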


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks.

maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2

[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-

splines CVGIP 39267ndash278 1987 2

[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV

1(1)128ndash142 2002 2

[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV

60(1)63ndash86 2004 2 3

[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T

Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5

[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale

space PAMI 20(12)1376ndash 1381 1998 2

[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on

Artificial Intelligence page 584 1977 2

[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for

generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7

[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence

CVPR pages 976ndash981 1997

[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV

23(1)45ndash78 1997

[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1

3

[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely

invariant regions BMVC pages 412ndash 425 2000 2


Figure 6 - Contour Plot of X-ray CT brain scan

The ezsurfc(f) or surfc function creates a combined surface and contour graph of f(x,y), where f is a string that represents a mathematical function of two variables such as x and y (Figure 7).

Figure 7 - Surfc on X-ray CT brain scan

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8).

Figure 8 - Contour3 on X-ray CT brain scan

The 3-D lit surface plot (a surface plot with colormap-based lighting, the surfl function) displays a shaded surface based on a combination of ambient, diffuse, and specular lighting models (Figure 9).

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D ribbon graph of a matrix displays the matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan
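The plots above are produced with MATLAB's plotting commands. For readers working outside MATLAB, a rough equivalent can be sketched with Python's matplotlib; the scan here is a synthetic stand-in array, since the actual CT slice is not included.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

# Synthetic stand-in for a CT slice: a smooth 2-D intensity array.
y, x = np.mgrid[-2:2:64j, -2:2:64j]
scan = np.exp(-(x ** 2 + y ** 2))

fig = plt.figure(figsize=(9, 3))

ax1 = fig.add_subplot(1, 3, 1)
ax1.contour(x, y, scan)                        # 2-D contour plot, cf. Figure 6
ax1.set_title("contour")

ax2 = fig.add_subplot(1, 3, 2, projection="3d")
ax2.plot_surface(x, y, scan)                   # surface plot, cf. surfc in Figure 7
ax2.contour(x, y, scan, zdir="z", offset=0.0)  # contours beneath the surface
ax2.set_title("surface with contours")

ax3 = fig.add_subplot(1, 3, 3, projection="3d")
ax3.contour3D(x, y, scan, 20)                  # 3-D contour lines, cf. contour3 in Figure 8
ax3.set_title("contour3")

fig.savefig("scan_views.png")
```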

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of a digital filter defined by its numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11-17).

Figure 11 - Magnitude and Phase Response; frequency scale: (a) linear, (b) log

Figure 12 - Group Delay Response; frequency scale: (a) linear, (b) log

Figure 13 - Phase Delay Response; frequency scale: (a) linear, (b) log

Figure 14 - (a) Impulse Response, (b) Pole-Zero Plot

Figure 15 - Step Response: (a) default, (b) specified length 50

Figure 16 - Magnitude Response Estimate; frequency scale: (a) linear, (b) log

Figure 17 - Magnitude Response and Round-off Noise Power Spectrum; frequency scale: (a) linear, (b) log
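FVTool itself is a MATLAB GUI, but the quantities it displays are all computable directly from the coefficient vectors b and a. A sketch using scipy.signal, with an arbitrary Butterworth low-pass standing in for an actual filter:

```python
import numpy as np
from scipy import signal

# Hypothetical example filter: 4th-order Butterworth low-pass (b = numerator, a = denominator).
b, a = signal.butter(4, 0.3)

w, h = signal.freqz(b, a, worN=512)            # complex frequency response
magnitude_db = 20 * np.log10(np.abs(h))        # magnitude response in dB
phase = np.unwrap(np.angle(h))                 # phase response

w_gd, gd = signal.group_delay((b, a), w=512)   # group delay response
z, p, k = signal.tf2zpk(b, a)                  # zeros and poles for a pole-zero plot

# A stable filter keeps every pole strictly inside the unit circle.
print(np.all(np.abs(p) < 1))  # → True
```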

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array, where the n rows represent the number of points and the two columns hold the x and y co-ordinates of the points respectively. In this thesis and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = sqrt((x2 - x1)^2 + (y2 - y1)^2)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the bone in the test image, which helps with automatic initialization (discussed in Section 4.4).
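The point distance, centroid, and size defined above can be written out directly under the 2n × 1 shape-vector convention used in this thesis. A NumPy sketch (the square shape is made-up example data):

```python
import numpy as np

def to_points(shape_vec):
    """Split a 2n x 1 shape vector (x co-ordinates first, then y) into an (n, 2) point array."""
    n = len(shape_vec) // 2
    return np.column_stack([shape_vec[:n], shape_vec[n:]])

def centroid(shape_vec):
    """The centroid is the mean of the point positions."""
    return to_points(shape_vec).mean(axis=0)

def shape_size(shape_vec):
    """Root mean square distance of the points from the centroid."""
    pts = to_points(shape_vec)
    return float(np.sqrt(np.mean(np.sum((pts - pts.mean(axis=0)) ** 2, axis=1))))

def shape_distance(vec_a, vec_b):
    """Distance between two shapes: summed Euclidean distance between corresponding points."""
    diff = to_points(vec_a) - to_points(vec_b)
    return float(np.sum(np.sqrt(np.sum(diff ** 2, axis=1))))

# Made-up 4-point shape: the corners of a unit square, stored as [x1..x4, y1..y4].
square = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0])
print(centroid(square))    # → [0.5 0.5]
print(shape_size(square))  # → 0.7071067811865476
```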

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x̄0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change significantly).

Output: the set of aligned shapes and the mean shape
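Algorithm 1 can be sketched as follows. This simplified version handles translation and scale only, matching steps 2 and 3; a full implementation would also solve for rotation when aligning shapes to the mean.

```python
import numpy as np

def center(points):
    """Step 2: translate an (n, 2) shape so its centroid sits at the origin."""
    return points - points.mean(axis=0)

def to_unit_size(points):
    """Step 3: scale a centered shape so its RMS point distance from the centroid is 1."""
    size = np.sqrt(np.mean(np.sum(points ** 2, axis=1)))
    return points / size

def align_to(points, target):
    """Step 4(a), simplified: least-squares scaling of a centered shape onto the target."""
    s = np.sum(points * target) / np.sum(points * points)
    return s * points

def align_shapes(shapes, n_iter=10):
    """Algorithm 1 with a fixed iteration count standing in for the convergence test."""
    shapes = [center(s) for s in shapes]
    mean = to_unit_size(center(shapes[0]))             # reference shape x0 as initial mean
    for _ in range(n_iter):
        shapes = [align_to(s, mean) for s in shapes]   # (a) align all shapes to the mean
        mean = np.mean(shapes, axis=0)                 # (b) recalculate the mean shape
        mean = to_unit_size(center(mean))              # (c) constrain the mean shape
    return shapes, mean

# Made-up data: three noisy, differently scaled copies of a triangle.
rng = np.random.default_rng(0)
triangle = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.5]])
unaligned = [s * triangle + rng.normal(0.0, 0.01, triangle.shape) for s in (1.0, 2.0, 0.5)]
aligned, mean_shape = align_shapes(unaligned)
```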

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. Training was done by manually selecting landmarks on the images. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result stays consistent with the learned mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area of the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape together with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where

x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance of the training shapes, and
b is the vector of shape parameters (the weights of the modes of variation).

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called "whiskers", and they help the profile model analyze the area around the landmark points.
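Generating shapes by varying b can be illustrated with a principal component analysis of aligned training shapes. The training data below is synthetic, and keeping 3 modes is an arbitrary choice for the sketch:

```python
import numpy as np

# Synthetic stand-ins for 11 aligned training shapes, stored as 2n-vectors (n = 4 points).
rng = np.random.default_rng(1)
base = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0])
X = np.array([base + rng.normal(0.0, 0.05, base.shape) for _ in range(11)])

x_bar = X.mean(axis=0)                 # the mean shape
cov = np.cov((X - x_bar).T)            # covariance of the training shapes
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]      # sort modes by decreasing variance
Phi = eigvecs[:, order[:3]]            # keep the 3 largest modes of variation

# x_hat = x_bar + Phi b: each choice of b generates a new, plausible shape.
b = np.array([0.1, 0.0, 0.0])          # move along the first mode only
x_hat = x_bar + Phi @ b
print(x_hat.shape)  # → (8,)
```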

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A mean profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.
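Building the per-landmark statistics amounts to collecting the sampled whisker profiles across the training images and taking their mean and covariance. A sketch with made-up profile samples:

```python
import numpy as np

# Made-up data: for one landmark, a 7-sample gray-level profile from each of 11 training images.
rng = np.random.default_rng(2)
true_profile = np.linspace(0.0, 1.0, 7)                   # samples along the whisker
profiles = true_profile + rng.normal(0.0, 0.05, (11, 7))  # one row per training image

g_bar = profiles.mean(axis=0)          # mean profile for this landmark
S_g = np.cov(profiles, rowvar=False)   # 7 x 7 covariance matrix
print(g_bar.shape, S_g.shape)  # → (7,) (7, 7)
```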

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the trained profiles [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
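The search step evaluates candidate profiles at offsets along the whisker and keeps the one with the smallest Mahalanobis distance to the mean profile. In this sketch the trained statistics and the candidate profiles are all made up:

```python
import numpy as np

def mahalanobis(g, g_bar, S_g):
    """f(g) = (g - g_bar)^T S_g^{-1} (g - g_bar)."""
    d = g - g_bar
    return float(d @ np.linalg.solve(S_g, d))

# Made-up trained statistics for one landmark (7-sample profile).
g_bar = np.linspace(0.0, 1.0, 7)
S_g = 0.01 * np.eye(7)

# Candidate profiles sampled at offsets of -3..3 pixels along the whisker;
# noise grows with the offset, so the best match tends to lie near offset 0.
rng = np.random.default_rng(3)
candidates = {off: g_bar + rng.normal(0.0, 0.01 + 0.05 * abs(off), 7) for off in range(-3, 4)}

best_offset = min(candidates, key=lambda off: mahalanobis(candidates[off], g_bar, S_g))
```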

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the result is still consistent with the mean shape. The shape model ensures that the profile model has not distorted the shape; if the shape model were not employed, the profile model might give the best individual profile matches, but the resulting shape could be completely different. As mentioned before, the two models restrict each other. A multi-resolution search is performed to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray); the sizes of the images are given relative to the first image.
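An image pyramid of the kind described here is built by repeatedly downsampling the image. The sketch below uses a simple 2x2 block average; real implementations usually smooth with a Gaussian before subsampling:

```python
import numpy as np

def downsample(img):
    """Halve an image by averaging 2x2 blocks (a crude stand-in for blur + subsample)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def image_pyramid(img, levels=3):
    """Full resolution first, then 1/2, 1/4, ...; the ASM search starts at the coarsest level."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

img = np.arange(64.0 * 64.0).reshape(64, 64)  # stand-in for an X-ray image
pyramid = image_pyramid(img)
print([level.shape for level in pyramid])  # → [(64, 64), (32, 32), (16, 16)]
```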

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number.

In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image; however, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6 gives an overview of the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in areas away from the bone; it cannot find the bone because it is searching in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.
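One way to attack the initialization problem, hinted at by the centroid and size definitions in Section 4.1, is to translate the mean shape to a rough estimate of the bone's location. The thresholding heuristic below is purely illustrative and is not the method used in this thesis:

```python
import numpy as np

def initialize_mean_shape(mean_shape, image, threshold=0.5):
    """Translate the mean shape, an (n, 2) array of (x, y) points centered on the origin,
    to the centroid of the bright (bone-like) pixels in the image.
    Illustrative heuristic only: real X-rays need more careful bone segmentation."""
    ys, xs = np.nonzero(image > threshold)
    target = np.array([xs.mean(), ys.mean()])   # centroid of bright pixels, as (x, y)
    return mean_shape + target

# Synthetic test image: a bright vertical strip standing in for the bone.
image = np.zeros((100, 100))
image[20:80, 40:50] = 1.0

mean_shape = np.array([[0.0, -10.0], [5.0, 0.0], [0.0, 10.0], [-5.0, 0.0]])
placed = initialize_mean_shape(mean_shape, image)
print(placed.mean(axis=0))  # → [44.5 49.5], the strip's centroid
```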

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.

Page 36: Anu Document

The contour3 function creates a three-dimensional contour plot of a surface defined on a rectangular grid (Figure 8)

Figure 8 ndash Contour3 on X-ray CT brain scan

3-D Lit Surface Plot (Surface plot with colormap-based lighting surfl function) displays a shaded surface based on a combination of ambient diffuse and specular lighting models (Figure 9)

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing the columns as segmented strips (Figure 10)

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images into the test

image It creates a mean shape pro_le from all the training images using landmark points But

the ASM starts of where the mean shape is located but it may not be near the bone on a test

image So the model needs to be initialized or started somewhere close to the bone boundary

in the test image Experiments were conducted to see the effect of initialization on the error

and the tracking of the shape It was observed that if the initialization is poor which means

that the mean shape starts away from the bone in test X-ray the model does not lock on to the

bone The shape and profile models fail to perform as the profile model looks for regions

similar to those of the training images in the regions away from the bone So it is unable to

find the bone as it is looking in a different region altogether The error increases considerably

if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows

the initialization The pink contour is the mean shape and it starts away from the bone so the

result is a poor tracking of the bone


Page 37: Anu Document

Figure 9 - 3D Lit Surface Plot of X-ray CT brain scan

The 3-D Ribbon Graph of Matrix displays a matrix by graphing its columns as segmented strips (Figure 10).

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISUALIZATION TOOL (FVTool)

The Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined by numerator b and denominator a. Using FVTool we can also display the phase response, group delay response, impulse response, step response, pole-zero plot, filter coefficients, and round-off noise power spectrum (Figures 11 to 17).

Figure 11 Magnitude and Phase Response - Frequency scale: a) linear b) log

Figure 12 Group Delay Response - Frequency scale: a) linear b) log

Figure 13 Phase Delay Response - Frequency scale: a) linear b) log

Figure 14 (a) Impulse Response (b) Pole-Zero Plot

Figure 15 Step Response (a) Default (b) Specified Length 50

Figure 16 Magnitude Response Estimate - Frequency scale: a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale: a) linear b) log
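FVTool itself is MATLAB tooling, but the same responses it plots can be computed directly from the coefficient vectors b and a. Below is a minimal sketch in Python using scipy.signal; the Butterworth filter is an arbitrary example of mine, not a filter from this document.

```python
import numpy as np
from scipy import signal

# Example filter (my assumption, purely for illustration):
# a 4th-order low-pass Butterworth with cutoff at 0.3 * Nyquist.
b, a = signal.butter(4, 0.3)

w, h = signal.freqz(b, a, worN=512)            # complex frequency response H(e^jw)
magnitude_db = 20 * np.log10(np.abs(h))        # magnitude response in dB
phase = np.unwrap(np.angle(h))                 # phase response
w_gd, gd = signal.group_delay((b, a), w=512)   # group delay response
zeros, poles, _ = signal.tf2zpk(b, a)          # data for a pole-zero plot

print(magnitude_db[0])  # ≈ 0 dB at DC for this low-pass filter
```

Plotting magnitude_db, phase, and gd against w reproduces the information in Figures 11, 12, and 14b respectively.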

Chapter 4

This chapter describes the workings of a typical ASM. Although there are many extensions and modifications, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, along with the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold the x and y co-ordinates of each point respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can then be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two shapes, such as the Procrustes distance, but in this thesis distance means the Euclidean distance:

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or devising an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).
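The conventions above can be sketched in a few lines of Python; the function and variable names are mine, not from the thesis code, and a shape with n landmark points is stored as a 2n-vector [x1..xn, y1..yn] as described.

```python
import numpy as np

def to_points(shape_vec):
    # split a 2n-vector into its x and y halves
    n = len(shape_vec) // 2
    return shape_vec[:n], shape_vec[n:]

def shape_distance(s1, s2):
    # distance between two shapes: sum of Euclidean distances
    # between corresponding points (Equation 4.1 applied pointwise)
    x1, y1 = to_points(s1)
    x2, y2 = to_points(s2)
    return float(np.sum(np.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)))

def centroid(shape_vec):
    # mean of the point positions
    x, y = to_points(shape_vec)
    return np.array([x.mean(), y.mean()])

def shape_size(shape_vec):
    # root mean square distance of the points from the centroid
    x, y = to_points(shape_vec)
    cx, cy = centroid(shape_vec)
    return float(np.sqrt(np.mean((x - cx) ** 2 + (y - cy) ** 2)))

square = np.array([0.0, 2, 2, 0, 0, 0, 2, 2])  # 4 points of a 2x2 square
print(centroid(square))    # → [1. 1.]
print(shape_size(square))  # → 1.4142... (√2)
```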

Algorithm 1: Aligning shapes

Input: set of unaligned shapes

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size. Call this shape x0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align to x0, scale to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: set of aligned shapes and the mean shape.
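The steps above can be sketched as follows. This is my simplification, not the thesis code: the "align" step here uses only translation and scaling, ignoring the rotation step that a full Procrustes alignment would include. Shapes are (n, 2) arrays of landmark points.

```python
import numpy as np

def center(shape):
    # translate the shape so its centroid is at the origin (step 2)
    return shape - shape.mean(axis=0)

def to_unit_size(shape):
    # center, then scale so the RMS distance to the centroid is 1
    shape = center(shape)
    size = np.sqrt(np.mean(np.sum(shape ** 2, axis=1)))
    return shape / size

def align_shapes(shapes, iters=10):
    shapes = [center(s) for s in shapes]            # step 2
    mean = to_unit_size(shapes[0])                  # steps 1 and 3: reference -> x0
    for _ in range(iters):                          # step 4
        aligned = [to_unit_size(s) for s in shapes]      # (a) simplified alignment
        new_mean = to_unit_size(np.mean(aligned, axis=0))  # (b) + (c) constrain
        if np.allclose(new_mean, mean):             # step 5: convergence
            break
        mean = new_mean
    return aligned, mean

triangle = np.array([[0.0, 0], [1, 0], [0, 1]])
# a scaled-and-shifted copy aligns to the same normalized shape
aligned, mean = align_shapes([triangle, triangle * 3.0 + 5.0])
```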

4.2 Active Shape Models

The ASM has to be trained using training images. In this project, the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the images were then re-sized to the same dimensions, ensuring uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated or manually landmarked training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM produces two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that point. When searching for the shape in a test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the shape. The two models correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations around it [24]:

x̂ = x̄ + Φb    (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e., the average of the aligned training shapes xi,
Φ is the matrix whose columns are the principal eigenvectors of the covariance of the training shapes, and
b is the vector of shape parameters.
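The mean-plus-modes construction can be sketched as follows. This follows the standard PCA formulation of the shape model, not the thesis code, and the training data here is synthetic; each row of X is an aligned training shape stored as a 2n-vector.

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=8)                      # hypothetical base shape, n = 4 points
X = base + 0.1 * rng.normal(size=(11, 8))      # 11 training shapes, as in the thesis

x_bar = X.mean(axis=0)                         # mean shape x̄
cov = np.cov(X - x_bar, rowvar=False)          # covariance of the training shapes
eigvals, eigvecs = np.linalg.eigh(cov)         # eigen-decomposition
order = np.argsort(eigvals)[::-1]              # largest variance first
Phi = eigvecs[:, order[:3]]                    # keep t = 3 modes of variation

b = np.zeros(3)
b[0] = 2.0 * np.sqrt(eigvals[order[0]])        # vary the first shape parameter
x_hat = x_bar + Phi @ b                        # generated shape: x̂ = x̄ + Φb
```

Setting b = 0 reproduces the mean shape; varying each component of b within a few standard deviations generates the "allowable" shapes the model can take.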

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks.

Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are searched and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

d² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
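A small numeric sketch of this profile-matching step (the values are illustrative, not from the thesis): the candidate profile with the smallest Mahalanobis distance to the mean profile wins.

```python
import numpy as np

def mahalanobis_sq(g, g_bar, S_inv):
    # squared Mahalanobis distance (g - ḡ)ᵀ Sg⁻¹ (g - ḡ)
    d = g - g_bar
    return float(d @ S_inv @ d)

g_bar = np.array([0.0, 1.0, 0.0])                  # hypothetical mean profile ḡ
S_inv = np.linalg.inv(np.diag([1.0, 0.5, 2.0]))    # inverse covariance Sg⁻¹

candidates = [np.array([0.9, 0.1, 0.0]),           # profiles sampled along the whisker
              np.array([0.1, 0.9, 0.1]),
              np.array([0.0, 0.2, 0.8])]
best = min(range(3), key=lambda i: mahalanobis_sq(candidates[i], g_bar, S_inv))
print(best)  # → 1 (the candidate closest to the mean profile)
```

The landmark is then moved to the whisker position of the winning candidate, and the shape model constrains the result.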

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is done for every landmark point, and the shape model then confirms that the shape remains close to the mean shape; it ensures that the profile model has not changed the shape. If the shape model were not employed, the profile model might give the best profile matches while the resulting shape is completely different. So, as mentioned before, the two models restrict each other.

A multi-resolution search is done to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.
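An image pyramid can be sketched as below (my illustration, not the thesis code): each level halves the resolution by 2x2 averaging, and the search starts at the coarsest level before refining at the finer ones.

```python
import numpy as np

def build_pyramid(image, levels=3):
    # level 0 is the full-resolution image; each further level is half size
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        img = img[:h, :w]                      # crop to even dimensions
        coarse = (img[0::2, 0::2] + img[1::2, 0::2]
                  + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0
        pyramid.append(coarse)                 # average each 2x2 block
    return pyramid

image = np.arange(64, dtype=float).reshape(8, 8)
pyr = build_pyramid(image)
print([p.shape for p in pyr])  # → [(8, 8), (4, 4), (2, 2)]
```

In practice a smoothing filter is usually applied before subsampling; plain 2x2 averaging keeps the sketch short.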

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since it takes time to train and create profile models for each image; however, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape profile from all the training images using the landmark points. The ASM starts where the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in areas away from the bone; it is unable to find the bone as it is looking in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267–278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.

[24] S. Smith and J. M. Brady. Susan - a new approach to low level image processing. IJCV, 23(1):45–78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

Page 38: Anu Document

Figure 10 - The 3-D Ribbon Graph of X-ray CT brain scan

4 FILTER VISSUALIZATION TOOL (FVTool) Filter Visualization Tool (FVTool) computes the magnitude response of the digital filter defined with numerator b and denominator a By using FVTool we can display the phase response group delay response impulse response step response polezero plot filter coefficients and round-off noise power spectrum(Figures 11 12 13 14 15 16 and 17)

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images into the test

image It creates a mean shape pro_le from all the training images using landmark points But

the ASM starts of where the mean shape is located but it may not be near the bone on a test

image So the model needs to be initialized or started somewhere close to the bone boundary

in the test image Experiments were conducted to see the effect of initialization on the error

and the tracking of the shape It was observed that if the initialization is poor which means

that the mean shape starts away from the bone in test X-ray the model does not lock on to the

bone The shape and profile models fail to perform as the profile model looks for regions

similar to those of the training images in the regions away from the bone So it is unable to

find the bone as it is looking in a different region altogether The error increases considerably

if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows

the initialization The pink contour is the mean shape and it starts away from the bone so the

result is a poor tracking of the bone

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2

[2] A Baumberg Reliable feature matching across widely separated views CVPR pages

774ndash781 2000 2

[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2

[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2

[5] R Deriche and G Giraudon A computational approach for corner and vertex detection

IJCV 10(2)101ndash124 1992 2

[6] T G Dietterich Approximate statistical tests for comparing supervised classification

learning algorithms Neural Computation 10(7)1895ndash1924 1998 6

[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf

pages 147ndash151 1988 2 3

[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object

categories CVPR 290ndash96 2004 1 2

[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001

2

[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-

2)161ndash205 2005 6 7

[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116

1998 2

[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues

from affine deformations of local 2-d brightness structure Image and Vision Computing

pages 415ndash434 1997 2

[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash

110 2004 2 3

[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features

ECCV pages 508ndash521 2006 7

[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from

maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2

[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-

splines CVGIP 39267ndash278 1987 2

[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV

1(1)128ndash142 2002 2

[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV

60(1)63ndash86 2004 2 3

[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T

Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5

[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale

space PAMI 20(12)1376ndash 1381 1998 2

[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on

Artificial Intelligence page 584 1977 2

[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for

generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7

[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence

CVPR pages 976ndash981 1997

[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV

23(1)45ndash78 1997

[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1

3

[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely

invariant regions BMVC pages 412ndash 425 2000 2

Page 39: Anu Document

Figure 11 Magnitude and Phase Response - Frequency scale a) linear b) log

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM. Although many extensions and modifications have been made, the basic ASM model works the same way; Cootes and Taylor [9] give a complete description of the classical ASM. Section 4.1 introduces shapes and shape models in general. Section 4.2 describes the workings and the components of the ASM. The parameters and variations that affect the performance of the ASM are explained in Section 4.3, which also describes the experiments performed in this thesis to improve the performance of the model. The problem of initializing the model in a test image is tackled in Section 4.4. Section 4.5 elaborates on the training of the ASM and the definition of an error function; the performance of the ASM on bone X-rays is judged according to this error function.

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented either by a diagram showing the points or as an n × 2 array, where the n rows represent the points and the two columns hold the x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape; they are shown only to make the shape and the ordering of the points clearer [24].

Figure 4.1: Example of a shape
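The stacked-vector convention described above can be sketched as follows; this is an illustrative fragment, not the thesis code, and the helper names are made up:

```python
def points_to_shape_vector(points):
    """Flatten an n x 2 list of (x, y) points into a 2n vector,
    x co-ordinates first, then y co-ordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

def shape_vector_to_points(shape):
    """Inverse: split a 2n vector back into a list of (x, y) points."""
    n = len(shape) // 2
    return list(zip(shape[:n], shape[n:]))

# A triangle stored both ways; the round trip recovers the points.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vec = points_to_shape_vector(triangle)
assert shape_vector_to_points(vec) == triangle
```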

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining distances between two points, such as the Procrustes distance, but in this thesis "distance" means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with automatic initialization (discussed in Section 4.4).

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes.

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centered on the origin.
3. Scale the reference shape to unit size; call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the current mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape.
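Algorithm 1 can be sketched as below. For brevity this sketch aligns by translation and scale only; the full ASM alignment also estimates a rotation (a similarity/Procrustes fit), and the helper names are illustrative:

```python
import math

def center(shape):
    """Step 2: translate a shape so its centroid is at the origin."""
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    return [(x - cx, y - cy) for x, y in shape]

def to_unit_size(shape):
    """Step 3: scale a centered shape to unit RMS size."""
    s = math.sqrt(sum(x * x + y * y for x, y in shape) / len(shape))
    return [(x / s, y / s) for x, y in shape]

def mean_shape(shapes):
    """Step 4b: point-wise average of a list of shapes."""
    n = len(shapes[0])
    return [(sum(s[i][0] for s in shapes) / len(shapes),
             sum(s[i][1] for s in shapes) / len(shapes)) for i in range(n)]

def align_shapes(shapes, iterations=10):
    """Iterative alignment (translation + scale only)."""
    shapes = [to_unit_size(center(s)) for s in shapes]      # steps 2-3
    mean = shapes[0]                                        # reference shape
    for _ in range(iterations):                             # step 4
        mean = to_unit_size(center(mean_shape(shapes)))     # steps 4b, 4c
        shapes = [to_unit_size(center(s)) for s in shapes]  # step 4a (simplified)
    return shapes, mean
```

Two copies of the same shape at different positions and scales normalize to the same aligned shape, which is the behaviour the algorithm relies on.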

4.2 Active Shape Models

The ASM has to be trained on training images. In this project the tibia bone was separated from a full-body X-ray (as shown in 1.2) and the resulting images were re-sized to the same dimensions, which ensured uniformity in the quality of the data being used. The training images were prepared by manually selecting landmarks: landmarks were placed at approximately equal intervals and distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images. Figure 4.3 shows the original image and the manually landmarked image used for training. When tests are performed with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in a test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area of the test image that best fits the profiles, the shape model ensures that the result remains an allowable deformation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models therefore correct each other until no further improvements in matching are possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape: it tries to find the area of the image that closely matches the profiles of the individual landmarks while keeping the overall shape plausible.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape is formulated together with the permissible variations about it [24]:

x̂ = x̄ + Φb     (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, the average of the aligned training shapes xi,
Φ holds the modes of variation of the aligned training shapes, and
b is a vector of shape parameters.

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b; the model can be varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.
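Equation 4.3 can be illustrated directly: a new shape is the mean shape plus a weighted sum of modes of variation. In this sketch the mean and the single mode are made-up values (in a real ASM, Φ comes from a principal component analysis of the aligned training shapes):

```python
def generate_shape(mean, modes, b):
    """x_hat = mean + sum_k b[k] * modes[k], with all vectors
    stored as flat 2n lists (x co-ordinates first, then y)."""
    shape = list(mean)
    for weight, mode in zip(b, modes):
        shape = [s + weight * m for s, m in zip(shape, mode)]
    return shape

mean = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]          # a triangle as a 2n vector
width_mode = [-1.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # hypothetical "width" mode
wider = generate_shape(mean, [width_mode], [0.1])  # slightly wider triangle
```

Sweeping b over a small range of values produces the family of allowable shapes that the shape model permits.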

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. It is assumed that the profiles are distributed as a multivariate Gaussian, so they can be described by their mean profile ḡ and covariance matrix Sg.

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

D² = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)
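The Mahalanobis profile distance can be sketched in pure Python for small profiles; a real implementation would use a linear-algebra library, and here the inverse covariance matrix Sg⁻¹ is assumed to be precomputed:

```python
def mahalanobis_sq(g, g_mean, Sg_inv):
    """Squared Mahalanobis distance (g - g_mean)^T Sg_inv (g - g_mean)."""
    d = [a - b for a, b in zip(g, g_mean)]
    # temp = Sg_inv applied to the difference vector d
    temp = [sum(Sg_inv[i][j] * d[j] for j in range(len(d)))
            for i in range(len(d))]
    return sum(d[i] * temp[i] for i in range(len(d)))
```

With the identity matrix as Sg⁻¹ this reduces to the squared Euclidean distance; a non-trivial covariance down-weights directions in which the training profiles varied a lot.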

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the resulting shape is still an allowable shape: it assures that the profile model has not distorted the shape. If the shape model were not employed, the profile model might give the best profile matches individually, but the resulting shape could be completely different. So, as mentioned before, the two models restrict each other. A multi-resolution search is performed to make the model more robust. This enables the model to be more accurate, as it can lock on to the shape from further away: the model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone); the sizes of the images are given relative to the first image.
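The image pyramid used by the multi-resolution search can be sketched as repeated 2×2 downsampling; this toy version stores a grayscale image as a list of rows and averages each 2×2 block (library implementations also smooth before subsampling):

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[(img[y][x] + img[y][x + 1] +
              img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def image_pyramid(img, levels=3):
    """Build a pyramid: level 0 is the original image, each further
    level is half the resolution of the previous one."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid  # the search starts at the coarsest level, pyramid[-1]
```

The ASM search runs at the coarsest level first and then refines the landmark positions level by level, which is what lets it lock on to the shape from further away.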

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct the experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6 gives an overview of the shapes used by the ASM: Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape and profiles from all the training images using the landmark points. However, the ASM starts wherever the mean shape is located, which may not be near the bone in a test image; the model therefore needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

Page 40: Anu Document

Figure 12 Group Delay Response - Frequency scale a) linear b) log

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images into the test

image It creates a mean shape pro_le from all the training images using landmark points But

the ASM starts of where the mean shape is located but it may not be near the bone on a test

image So the model needs to be initialized or started somewhere close to the bone boundary

in the test image Experiments were conducted to see the effect of initialization on the error

and the tracking of the shape It was observed that if the initialization is poor which means

that the mean shape starts away from the bone in test X-ray the model does not lock on to the

bone The shape and profile models fail to perform as the profile model looks for regions

similar to those of the training images in the regions away from the bone So it is unable to

find the bone as it is looking in a different region altogether The error increases considerably

if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows

the initialization The pink contour is the mean shape and it starts away from the bone so the

result is a poor tracking of the bone

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2

[2] A Baumberg Reliable feature matching across widely separated views CVPR pages

774ndash781 2000 2

[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2

[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2

[5] R Deriche and G Giraudon A computational approach for corner and vertex detection

IJCV 10(2)101ndash124 1992 2

[6] T G Dietterich Approximate statistical tests for comparing supervised classification

learning algorithms Neural Computation 10(7)1895ndash1924 1998 6

[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf

pages 147ndash151 1988 2 3

[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object

categories CVPR 290ndash96 2004 1 2

[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001

2

[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-

2)161ndash205 2005 6 7

[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116

1998 2

[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues

from affine deformations of local 2-d brightness structure Image and Vision Computing

pages 415ndash434 1997 2

[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash

110 2004 2 3

[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features

ECCV pages 508ndash521 2006 7

[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from

maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2

[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-

splines CVGIP 39267ndash278 1987 2

[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV

1(1)128ndash142 2002 2

[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV

60(1)63ndash86 2004 2 3

[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T

Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5

[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale

space PAMI 20(12)1376ndash 1381 1998 2

[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on

Artificial Intelligence page 584 1977 2

[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for

generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7

[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence

CVPR pages 976ndash981 1997

[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV

23(1)45ndash78 1997

[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1

3

[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely

invariant regions BMVC pages 412ndash 425 2000 2

Page 41: Anu Document

Figure 13 Phase Delay Response - Frequency scale a) linear b) log

Figure 14 (a) Impulse Response (b) PoleZero Plot

Figure 15 Step Response (a) Default (b) Specify Length 50

Figure 16 Magnitude Response Estimate - Frequency scale a) linear b) log

Figure 17 Magnitude Response and Round-off Noise Power Spectrum - Frequency scale a) linear b) log

Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined, and the model moves the shape to an area that fits the profile model closely. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which constrains the shape. As the profile model tries to find the area in the test image that best fits the profiles, the shape model ensures that the result remains an allowable variation of the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models thus correct each other until no further improvement in the match is possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape with its permissible variations is formulated [24]:

x̂ = x̄ + Φb, (4.3)

where
x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix of eigenvectors of the covariance matrix of the training shapes, and
b is a vector of shape parameters.
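The model x̂ = x̄ + Φb is conventionally built by principal component analysis of the aligned training shapes; the eigendecomposition-based sketch below shows the standard construction, not necessarily the exact code used in this thesis.

```python
import numpy as np

def build_shape_model(shapes, n_modes=2):
    """PCA shape model for x_hat = x_bar + Phi @ b.
    Each shape is a 2n-vector (all x coordinates, then all y coordinates)."""
    X = np.stack(shapes)                 # (num_shapes, 2n)
    x_bar = X.mean(axis=0)
    cov = np.cov(X.T)                    # covariance of the training shapes
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    phi = vecs[:, ::-1][:, :n_modes]     # keep the largest modes first
    return x_bar, phi

def generate_shape(x_bar, phi, b):
    """Generate a new, allowable shape by varying the parameter vector b."""
    return x_bar + phi @ b
```

Setting b = 0 reproduces the mean shape; varying each component of b sweeps the corresponding mode of variation.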

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The profiles perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape formed by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.
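Fitting this per-landmark Gaussian model amounts to computing a mean and a covariance over the training profiles; a minimal sketch (the sampling of profiles from the image itself is omitted):

```python
import numpy as np

def build_profile_model(profiles):
    """Fit the Gaussian profile model for one landmark: the mean profile
    g_bar and the covariance S_g over that landmark's training profiles."""
    P = np.stack(profiles)       # (num_training_images, profile_length)
    g_bar = P.mean(axis=0)
    S_g = np.cov(P.T)            # covariance across training images
    return g_bar, S_g
```

One such (ḡ, Sg) pair is stored per landmark, so the full profile model is just this fit repeated for every landmark point.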

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean profile [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance, given by

f(g) = (g - ḡ)ᵀ Sg⁻¹ (g - ḡ).
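Choosing among the candidate profiles sampled along a whisker can be sketched as follows (a minimal sketch; `candidates` stands for the profiles sampled at the successive offsets, a name introduced here for illustration):

```python
import numpy as np

def mahalanobis(g, g_bar, S_g_inv):
    """f(g) = (g - g_bar)^T S_g^{-1} (g - g_bar): how far a sampled profile
    lies from the landmark's mean profile, in the model's own metric."""
    d = g - g_bar
    return d @ S_g_inv @ d

def best_offset(candidates, g_bar, S_g_inv):
    """Pick the whisker offset whose sampled profile best matches the model."""
    dists = [mahalanobis(g, g_bar, S_g_inv) for g in candidates]
    return int(np.argmin(dists))
```

The landmark is then moved to the offset with the lowest distance, and the shape model subsequently constrains the resulting configuration.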

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the result is still consistent with the mean shape; it ensures that the profile model has not distorted the shape. If the shape model were not employed, the profile model might produce the best individual profile matches, yet the resulting shape could be completely different. As mentioned before, the two models restrict each other.

A multi-resolution search is performed to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The image resolutions can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid (a general picture, not a bone X-ray, is used for illustration). The sizes of the images are given relative to the first image.
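A simple image pyramid like the one in Figure 4.5 can be built by repeatedly halving the resolution. The sketch below uses plain 2x2 block averaging for brevity; practical implementations usually smooth before subsampling.

```python
import numpy as np

def image_pyramid(image, levels=3):
    """Build an image pyramid: each level halves the resolution of the
    previous one by averaging 2x2 pixel blocks."""
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w]                 # trim odd rows/columns
        pyramid.append((img[0::2, 0::2] + img[0::2, 1::2] +
                        img[1::2, 0::2] + img[1::2, 1::2]) / 4.0)
    return pyramid
```

The search starts at the coarsest level, where the shape can be located from further away, and the result is refined level by level at higher resolutions.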


4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since it takes time to train and create profile models for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Figure 4.6(a) shows the unaligned shapes learnt from the training images, and Figure 4.6(b) displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts off wherever the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, i.e., started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone since it is searching in a different region altogether. The error increases considerably if the mean shape is 40-50 pixels away from the bone in the test image. Figure 4.7(a) shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.
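A rough automatic initialization in this spirit simply places the scaled mean shape at an estimated bone location. The sketch below assumes `est_centroid` and `est_size` are hypothetical estimates obtained from the test image, as suggested by the centroid and size measures of Section 4.1.

```python
import numpy as np

def initialize(mean_shape, est_centroid, est_size):
    """Place the mean shape at a rough estimate of the bone's centroid and
    size, so the ASM search starts near the bone boundary.
    est_centroid / est_size are hypothetical estimates from the test image."""
    s = mean_shape - mean_shape.mean(axis=0)          # center on the origin
    s = s / np.sqrt((s ** 2).sum(axis=1).mean())      # scale to unit RMS size
    return s * est_size + np.asarray(est_centroid)
```

The returned point set has the requested centroid and size, which keeps the starting contour within the 40-50 pixel range where the search still locks on.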

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.





Chapter 4

This chapter describes the workings of a typical ASM Although there are many ex- tensions

and modi_cations made the basic ASM model work the same way Cootes and Taylor [9]

gives a complete description of the classical ASM Section 41 in- troduces shapes and shape

models in general Section 42 describes the workings and the components of the ASM The

parameters and variations that affect the performance of the ASM are explained in Section

43 The experiments that are performed in this thesis to improve the performance of the

model are also described

in this section The problem of initialization of the model in a test image is tackled in Section

44 Section 45 elaborates on the training of the ASM and the definition of an error function

The performance of the ASM on bone X-rays will be judged according to this error function

41 Shape Models

A shape is a collection of points As shown in Figure 41 a shape can be represented by a

diagram showing the points or as a n _ 2 array where the n rows represent the number of

points and the two columns represent the x and y co-ordinates of the points respectively In

this thesis and in the code used a shape will be defined as a 2n _ 1 vector where the y co-

ordinates are enlisted after the x co-ordinates as shown in 41c A shape is the basic block of

any ASM as it stays the same even if

it is scaled rotated or translated The lines connecting the points are not part of the shape but

they are shown to make the shape and order of the points more clear [24]

Figure 41 Example of a shape

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters it depends on. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points increases, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, since profile models must be trained and created for each image. However, with more training images the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as the test image. Section 4.2 gives an overview of the ASM. Figure 4.6a shows the unaligned shapes learnt from the training images; Figure 4.6b displays the aligned shapes.
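Choosing a subset of the 60 landmarks for these experiments can be sketched as follows. This is an illustrative helper (not from the thesis code) that keeps the chosen points evenly spaced along the boundary ordering, matching how the full set is distributed.

```python
def landmark_subset(landmarks, k):
    """Pick k of the n landmarks, evenly spaced along the boundary ordering,
    so that smaller subsets remain uniformly distributed over the bone."""
    n = len(landmarks)
    idx = [round(i * n / k) % n for i in range(k)]
    return [landmarks[i] for i in idx]
```

For example, with the 60-point set, k = 30 keeps every second landmark and k = 20 keeps every third, so each experiment still covers the whole boundary.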

4.4 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images in the test image. It creates a mean shape and profile model from all the training images using landmark points. However, the ASM starts off wherever the mean shape is located, which may not be near the bone in a test image. So the model needs to be initialized, i.e. started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone, as it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and since it starts away from the bone, the result is poor tracking of the bone.
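The initialization experiment above can be simulated by shifting the mean shape a known number of pixels before the search starts. This is a minimal sketch using the 2n-vector shape convention from Section 4.1 (x coordinates followed by y coordinates); the function name is illustrative.

```python
import numpy as np

def offset_mean_shape(mean_shape, dx, dy):
    """Shift a shape stored as a 2n-vector (x coordinates first, then y
    coordinates) by (dx, dy) pixels, simulating a poor initialization."""
    n = len(mean_shape) // 2
    shifted = np.asarray(mean_shape, dtype=float).copy()
    shifted[:n] += dx   # shift all x coordinates
    shifted[n:] += dy   # shift all y coordinates
    return shifted
```

Running the search from shapes offset by increasing (dx, dy) and recording the mean error reproduces the observation that errors grow sharply once the start point is roughly 40-50 pixels from the bone.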

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2-14, 1986.

[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774-781, 2000.

[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579-583, 1978.

[4] J. Canny. A computational approach to edge detection. PAMI, 8:679-698, 1986.

[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101-124, 1992.

[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895-1924, 1998.

[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147-151, 1988.

[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90-96, 2004.

[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83-105, 2001.

[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161-205, 2005.

[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79-116, 1998.

[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415-434, 1997.

[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110, 2004.

[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508-521, 2006.

[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761-767, 2004.

[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic b-splines. CVGIP, 39:267-278, 1987.

[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128-142, 2002.

[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63-86, 2004.

[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.

[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376-1381, 1998.

[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.

[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71-84, 2004.

[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976-981, 1997.

[24] S. Smith and J. M. Brady. Susan - a new approach to low level image processing. IJCV, 23(1):45-78, 1997.

[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113-125, 1998.

[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412-425, 2000.




Page 46: Anu Document

4.1 Shape Models

A shape is a collection of points. As shown in Figure 4.1, a shape can be represented by a diagram showing the points, or as an n × 2 array where the n rows represent the points and the two columns hold the x and y co-ordinates respectively. In this thesis, and in the code used, a shape is defined as a 2n × 1 vector in which the y co-ordinates are listed after the x co-ordinates, as shown in Figure 4.1c. A shape is the basic building block of any ASM, as it stays the same even if it is scaled, rotated, or translated. The lines connecting the points are not part of the shape, but they are shown to make the shape and the order of the points clearer [24].

Figure 4.1: Example of a shape
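The 2n-vector representation described above can be sketched in a few lines of Python. This is an illustrative sketch, not the thesis code; the function names are chosen for this example.

```python
def to_shape_vector(points):
    """Stack the x co-ordinates, then the y co-ordinates, into a 2n-vector."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return xs + ys

def to_points(vec):
    """Inverse operation: recover the n (x, y) pairs from a 2n-vector."""
    n = len(vec) // 2
    return list(zip(vec[:n], vec[n:]))

# A triangle with n = 3 points becomes a 6-element vector.
triangle = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
vec = to_shape_vector(triangle)
```

Because scaling, rotation, and translation act on the points, not on the ordering, the same vector layout is preserved under those transformations.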

The distance between two points is the Euclidean distance between them. Equation 4.1 gives the formula for the Euclidean distance between two points (x1, y1) and (x2, y2). The distance between two shapes can be defined as the distance between their corresponding points [24]. There are other ways of defining the distance between two points, such as the Procrustes distance, but in this thesis distance means the Euclidean distance.

d = √((x2 − x1)² + (y2 − y1)²)    (4.1)
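Equation 4.1 and the corresponding-point shape distance can be written directly in Python. The text does not state whether the per-point distances are summed or averaged; summation is assumed in this sketch.

```python
import math

def point_distance(p, q):
    """Euclidean distance between two points (Equation 4.1)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def shape_distance(shape_a, shape_b):
    """Distance between two shapes as the (assumed summed) distance
    between their corresponding points."""
    return sum(point_distance(p, q) for p, q in zip(shape_a, shape_b))
```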

The centroid x̄ of a shape x can be defined as the mean of the point positions [24]. The centroid is useful when aligning shapes or when finding an automatic initialization technique (discussed in Section 4.4). The size of the shape is the root mean square distance between the points and the centroid. This can be used to measure the size of the shape in the test image, which helps with the automatic initialization (discussed in Section 4.4).
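The centroid and the root-mean-square size follow directly from these definitions; a minimal sketch (illustrative names, not the thesis code):

```python
import math

def centroid(points):
    """Mean of the point positions."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def shape_size(points):
    """Root mean square distance from the points to the centroid."""
    cx, cy = centroid(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
                     / len(points))
```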

Algorithm 1: Aligning shapes

Input: a set of unaligned shapes.

1. Choose a reference shape (usually the first shape).
2. Translate each shape so that it is centred on the origin.
3. Scale the reference shape to unit size. Call this shape x̄0, the initial mean shape.
4. Repeat:
   (a) Align all shapes to the mean shape.
   (b) Recalculate the mean shape from the aligned shapes.
   (c) Constrain the current mean shape (align it to x̄0 and scale it to unit size).
5. Until convergence (i.e., the mean shape does not change much).

Output: the set of aligned shapes and the mean shape.
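Algorithm 1 can be sketched as follows. For brevity this sketch reduces "align to the mean shape" to centring and unit scaling only; a full Procrustes alignment would also solve for rotation. All names are illustrative.

```python
import math

def center(points):
    """Translate a shape so its centroid is at the origin (step 2)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(x - cx, y - cy) for x, y in points]

def scale_to_unit(points):
    """Scale a centred shape to unit RMS size (step 3)."""
    s = math.sqrt(sum(x * x + y * y for x, y in points) / len(points))
    return [(x / s, y / s) for x, y in points]

def mean_shape(shapes):
    """Point-wise average of a set of shapes (step 4b)."""
    n = len(shapes[0])
    return [(sum(s[i][0] for s in shapes) / len(shapes),
             sum(s[i][1] for s in shapes) / len(shapes)) for i in range(n)]

def align_shapes(shapes, iters=10):
    """Iterative alignment loop of Algorithm 1 (rotation omitted)."""
    shapes = [scale_to_unit(center(s)) for s in shapes]
    mean = shapes[0]
    for _ in range(iters):
        mean = scale_to_unit(center(mean_shape(shapes)))   # steps 4b, 4c
        shapes = [scale_to_unit(center(s)) for s in shapes]  # step 4a
    return shapes, mean
```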

4.2 Active Shape Models

The ASM has to be trained using training images. In this project the tibia bone was separated from a full-body X-ray (as shown in Figure 1.2) and the resulting images were re-sized to the same dimensions. This ensured uniformity in the quality of the data being used. The training images were annotated by manually selecting landmarks. Landmarks were placed at approximately equal intervals and were distributed uniformly over the bone boundary. Such images are called hand-annotated, or manually landmarked, training images.

Figure 4.3 shows the original image and the manually landmarked image used for training. When performing tests with different numbers of landmark points, a subset of these landmark points is chosen.

After the training images have been landmarked, the ASM builds two types of sub-models [24]: the profile model and the shape model.

1. The profile model analyzes the landmark points and stores the behaviour of the image around them. During training, the algorithm learns the characteristics of the area around each landmark point and builds a profile model for that landmark accordingly. When searching for the shape in the test image, the area near the tentative landmarks is examined and the model moves the shape to an area that closely fits the profile model. The tentative locations of the landmarks are obtained from the suggested shape.

2. The shape model defines the permissible relative positions of the landmarks, which introduces a constraint on the shape. As the profile model tries to find the area of the test image that best fits the profiles, the shape model ensures that the overall shape stays close to the mean shape. The profile model acts on individual landmarks, whereas the shape model acts globally on the whole shape. The two models correct each other until no further improvement in the match is possible.

4.2.1 The ASM Model

The aim of the model is to convert the shape proposed by the individual profiles into an allowable shape. It tries to find the area in the image that most closely matches the profiles of the individual landmarks while keeping the overall shape consistent.

The shape is learnt from the manually landmarked training images. These images are aligned, and a mean shape together with its permissible variations is formulated [24]:

x̂ = x̄ + Φb    (4.3)

where

x̂ is the shape vector generated by the model,
x̄ is the mean shape, i.e. the average of the aligned training shapes xi,
Φ is the matrix whose columns are the principal eigenvectors of the covariance matrix of the training shapes, and
b is the vector of shape parameters.
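Equation 4.3 is a linear combination: the mean shape plus a weighted sum of eigenvector columns. A minimal sketch with plain Python lists (the mean, the eigenvectors, and b would come from PCA on the aligned training shapes, which is not shown here):

```python
def generate_shape(mean, eigvecs, b):
    """Compute x_hat = x_bar + Phi b (Equation 4.3).

    mean    : 2n-vector of the mean shape (x co-ords, then y co-ords)
    eigvecs : list of 2n-vectors, the columns of Phi (one per mode)
    b       : list of shape parameters, one per mode
    """
    out = list(mean)
    for vec, bi in zip(eigvecs, b):
        for i, v in enumerate(vec):
            out[i] += bi * v
    return out
```

Varying one entry of b while holding the others at zero traces out one mode of shape variation, which is how the model is "varied in height and width" in Section 4.2.2.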

4.2.2 Generating shapes from the model

As seen in Equation 4.3, different shapes can be generated by changing the value of b. The model is varied in height and width to find optimum values for the landmarks. Figure 4.4 shows the mean shape and its whisker profiles superimposed on the bone X-ray image. The lines perpendicular to the model are called whiskers, and they help the profile model analyze the area around the landmark points.

The shape created by the landmark points is used for the shape model, and the whisker profiles around the landmark points are used for the profile model. A profile and a covariance matrix are built for each landmark. The profiles are assumed to follow a multivariate Gaussian distribution, so they can be described by their mean profile ḡ and covariance matrix Sg.
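The per-landmark statistics ḡ and Sg are just the sample mean and sample covariance of that landmark's training profiles. A small sketch (illustrative names; each profile is a list of k samples taken along the whisker):

```python
def profile_mean(profiles):
    """Mean profile g_bar over the training profiles at one landmark."""
    k = len(profiles[0])
    return [sum(p[i] for p in profiles) / len(profiles) for i in range(k)]

def profile_covariance(profiles):
    """Sample covariance matrix S_g of the training profiles at one landmark."""
    m = profile_mean(profiles)
    k, n = len(m), len(profiles)
    cov = [[0.0] * k for _ in range(k)]
    for p in profiles:
        d = [p[i] - m[i] for i in range(k)]
        for i in range(k):
            for j in range(k):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov
```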

4.2.3 Searching the test image

After training, the shape is searched for in the test image. The mean shape calculated from the training images is imposed on the image, and the profiles around the landmark points are sampled and examined. The profiles are offset up to 3 pixels along the whisker, which is perpendicular to the shape, to find the area that most closely resembles the mean shape [24]. The distance between a test profile g and the mean profile ḡ is calculated using the Mahalanobis distance:

f(g) = (g − ḡ)ᵀ Sg⁻¹ (g − ḡ)

If the model is initialized correctly (discussed in Section 4.4), one of the profiles will have the lowest distance. This procedure is carried out for every landmark point, and the shape model then confirms that the result is still close to the mean shape; it ensures that the profile model has not distorted the shape. If the shape model were not employed, the profile model might give the best profile matches individually, but the resulting shape could be completely different. As mentioned before, the two models restrict each other.

A multi-resolution search is performed to make the model more robust. This makes the model more accurate, as it can lock on to the shape from further away. The model searches over a series of different resolutions of the same image, called an image pyramid. The resolutions of the images can be set and changed in the algorithm [17, 24]. Figure 4.5 shows a sample image pyramid; the sizes of the images are given relative to the first image (a general picture, not a bone X-ray, is used for illustration).
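The per-landmark search above amounts to evaluating the Mahalanobis distance at each candidate offset along the whisker and keeping the minimum. A minimal sketch, assuming the inverse covariance Sg⁻¹ has already been computed during training (names are illustrative):

```python
def mahalanobis2(g, g_mean, s_inv):
    """Squared Mahalanobis distance (g - g_mean)^T S^-1 (g - g_mean)."""
    d = [gi - mi for gi, mi in zip(g, g_mean)]
    k = len(d)
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(k) for j in range(k))

def best_offset(candidate_profiles, g_mean, s_inv):
    """Pick the whisker offset whose sampled profile is closest to the
    mean profile. candidate_profiles maps pixel offset -> sampled profile."""
    return min(candidate_profiles,
               key=lambda off: mahalanobis2(candidate_profiles[off], g_mean, s_inv))
```

With the identity covariance this reduces to the squared Euclidean distance, which makes the sketch easy to check by hand.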

4.3 Parameters and Variations

The performance of the ASM can be enhanced by optimizing the parameters on which it depends. The number of landmark points and the number of training images are investigated in this thesis.

The number of landmark points is an important variable that affects the ASM. The profile model of the ASM works with these landmark points to create profiles, so the position of the landmark points is as important as their number. In the training images, landmark points are equally spaced along the boundary of the bone. Images are landmarked with 60 points, and subsets of these points are chosen to conduct experiments. The impact of the number of landmark points on computing time and on the mean error (defined in Section 4.5) is tested by running the algorithm with different numbers of landmarks. As the number of landmark points is increased, the computing time is expected to increase and the error to decrease. The results are explained in Chapter 5.

A training set of images is used to train the ASM. As the number of training images increases, the model becomes more robust. The computing time is expected to increase, as it takes time to train and create profile models for each image. However, as the number of training images increases, the mean profile improves and the model performs better, so the error is expected to decrease. The model in this thesis uses 12 images: 11 are used to train the ASM and 1 is used as a test image. Figure 4.6a shows the unaligned shapes learnt from the training images, and Figure 4.6b displays the aligned shapes.

4.4 Initialization Problem

The Active Shape Model locks the shape learnt from the training images onto the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts off where the mean shape is located, which may not be near the bone in a test image, so the model needs to be initialized, or started, somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning that the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform, because the profile model looks for regions similar to those of the training images in regions away from the bone; it is unable to find the bone because it is looking in a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and it starts away from the bone, so the result is poor tracking of the bone.

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN – a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local, affinely invariant regions. BMVC, pages 412–425, 2000.

Page 47: Anu Document

The distance between two points is the Euclidean distance between them Equa-

tion 41 gives the formula for Euclidean distance between two points (x1 y1) and

x2 y2 The distance between two shapes can be de_ned as the distance between

their corresponding points [24] There are other ways of de_ning distances between

two points like the Procrustes distance but in this thesis the distance means the

Euclidean distance

radic (y2 - y1)2 + (x2 - x1)2

The centroid x of a shape x can be de_ned as the mean of the point positions

[24] The centroid can be useful while aligning shapes or _nding an automatic

initialization technique (discussed in 44) The size of the shape is the root mean

distance between the points and the centroid This can be used in measuring the

size of the test image which will help with the automatic initialization (discussed in

44)

Algorithm 1 Aligning shapes

Input set of unaligned shapes

1 Choose a reference shape (usually the 1st shape)

2 Translate each shape so that it is centered on the origin

3 Scale the reference shape to unit size Call this shape x0 the initial mean

shape

4 repeat

(a) Align all shapes to the mean shape

(b) Recalculate the mean shape from the aligned shapes

(c) Constrain the current mean shape (align to x0 scale to unit size)

5 until convergence (ie mean shape does not change much)

output set of aligned shapes and mean shape

42 Active Shape Models

The ASM has to be trained using training images In this project the tibia bone

was separated from a full-body X-ray (as shown in 12) and then those images were

re-sized to the same dimensions This ensured uniformity in the quality of data

being used The training on the images was done by manually selecting landmarks

Landmarks were placed at approximately equal intervals and were distributed uni-

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks on to the shape learnt from the training images into the test

image It creates a mean shape pro_le from all the training images using landmark points But

the ASM starts of where the mean shape is located but it may not be near the bone on a test

image So the model needs to be initialized or started somewhere close to the bone boundary

in the test image Experiments were conducted to see the effect of initialization on the error

and the tracking of the shape It was observed that if the initialization is poor which means

that the mean shape starts away from the bone in test X-ray the model does not lock on to the

bone The shape and profile models fail to perform as the profile model looks for regions

similar to those of the training images in the regions away from the bone So it is unable to

find the bone as it is looking in a different region altogether The error increases considerably

if the mean shape is 40-50 pixels away from the bone in the test image Figure 47a shows

the initialization The pink contour is the mean shape and it starts away from the bone so the

result is a poor tracking of the bone

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H Asada andM Brady The curvature primal sketch PAMI 8(1)2ndash14 1986 2

[2] A Baumberg Reliable feature matching across widely separated views CVPR pages

774ndash781 2000 2

[3] P Beaudet Rotationally invariant image operators ICPR pages 579ndash583 1978 2

[4] J Canny A computational approach to edge detection PAMI 8679ndash698 1986 2

[5] R Deriche and G Giraudon A computational approach for corner and vertex detection

IJCV 10(2)101ndash124 1992 2

[6] T G Dietterich Approximate statistical tests for comparing supervised classification

learning algorithms Neural Computation 10(7)1895ndash1924 1998 6

[7] C Harris and M Stephens A combined corner and edge detector Alvey Vision Conf

pages 147ndash151 1988 2 3

[8] F Jurie and C Schmid Scale-invariant shape features for recognition of object

categories CVPR 290ndash96 2004 1 2

[9] T Kadir and M Brady Scale saliency and image description IJCV 45(2)83ndash105 2001

2

[10] N Landwehr M Hall and E Frank Logistic model trees Machine Learning 59(1-

2)161ndash205 2005 6 7

[11] T Lindeberg Feature detection with automatic scale selection IJCV 30(2)79ndash116

1998 2

[12] T Lindeberg and J Garding Shape-adapted smoothing in estimation of 3-d shape cues

from affine deformations of local 2-d brightness structure Image and Vision Computing

pages 415ndash434 1997 2

[13] D G Lowe Distinctive image features from scale-invariant keypoints IJCV 60(2)91ndash

110 2004 2 3

[14] G Loy and J-O Eklundh Detecting symmetry and symmetric constellations of features

ECCV pages 508ndash521 2006 7

[15] J Matas O Chum M Urban and T Pajdla Robust widebaseline stereo from

maximally stable extremal regions Image and Vision Computing 22(10)761ndash767 2004 2

[16] G Medioni and Y Yasumoto Corner detection and curve representation using cubic b-

splines CVGIP 39267ndash278 1987 2

[17] K Mikolajczyk and C Schmid An affine invariant interest point detector ECCV

1(1)128ndash142 2002 2

[18] K Mikolajczyk and C Schmid Scale and affine invariant interest point detectors IJCV

60(1)63ndash86 2004 2 3

[19] K Mikolajczyk T Tuytelaars C Schmid A Zisserman J Matas F Schaffalitzky T

Kadir and L V Gool A comparison of affine region detectors IJCV 2005 2 3 4 5

[20] F Mokhtarian and R Suomela Robust image corner detection through curvature scale

space PAMI 20(12)1376ndash 1381 1998 2

[21] H Moravec Towards automatic visual obstacle avoidance International Joint Conf on

Artificial Intelligence page 584 1977 2

[22] A Opelt M Fussenegger A Pinz and P Auer Weak hypotheses and boosting for

generic object detection and recognition ECCV pages 71ndash84 2004 5 6 7

[23] E Shilat M Werman and Y Gdalyahu Ridgersquos corner detection and correspondence

CVPR pages 976ndash981 1997

[24] S Smith and J M Brady Susanndasha new approach to low level image processing IJCV

23(1)45ndash78 1997

[25] C Steger An unbiased detector of curvilinear structures PAMI 20(2)113ndash125 1998 1

3

[26] T Tuytelaars and L V Gool Wide baseline stereo matching based on local affinely

invariant regions BMVC pages 412ndash 425 2000 2

Page 48: Anu Document

formly over the bone boundary Such images are called hand annotated or manually

landmarked training images

Figure 43 shows the original image and the manually landmarked image for training

While performing tests using different number of landmark points a subset of these

landmarks points is chosen

After the training images have been landmarked the ASM produces two types of

sub-models [24] These are the profile model and the shape model

1 The profile model analyzes the landmark points and stores the behaviour of the

image around the landmark points So during training the algorithm learns

the characteristics of the area around the landmark points and builds a profile

model for each landmark point accordingly When searching for the shape in

the test image the area near the tentative landmarks is examined and the model moves the

shape to an area that fits closely to the profile model The

tentative location of the landmarks is obtained from the suggested shape

2 The shape model defines the permissible relative positions of landmarks This

introduces a constraint on the shape So as the profile model tries to find the

area in the test image that tries to fit the model the shape model ensures that

the mean shape is not changed The profile model acts on individual landmarks

whereas the shape acts globally on the image So both the models try to correct

each other until no further improvements in matching are possible

421 The ASM Model

The aim of the model is to try to convert the shape proposed by the individual

profiles into an allowable shape So it tries to find the area in the image that closely

matches the profiles of the individual landmarks while keeping the overall shape

constant

The shape is learnt from manually landmarked training images These images are

aligned and a mean shape is formulated with the permissible variations in it [24]

^x = x + ₵b where

^x is the generated shape vector by the model

x is the mean shape the average of the aligned training shapes

xi

422 Generating shapes from the model

As seen in Equation 43 different shapes can be generated by changing the value of

b The model is varied in height and width finding optimum values for landmarks

Figure 44 shows the mean shape and its whisker profiles superimposed on the bone X-ray

image The points that are perpendicular to the model are called _whiskers and they help the

profile model in analyzing the area around the landmark points

The shape created by the landmark points are used for the shape model and the

whisker profiles around the landmark points are used for the profile model A profile

and a covariance matrix is built for each landmark It is assumed that the profiles

are distributed as a multivariate Gaussian and so they can be described by their

mean pro_le g and the covariance matrix Sg

423 Searching the test image

After the training is over the shape is searched in the test image The mean shape

calculated from the training images is imposed on the image and the profiles around

the landmark points are search and examined The profiles are offset 3 pixels

along the whisker which is perpendicular to the shape to get the accurate area

that closely resembles the mean shape [24] The distance between the test profile g

and the mean profile g is calculated using the Mahalanobis distance given by

If the model is initialized correctly (discussed in 44) one of the profiles will have the

lowest distance This procedure is done for every landmark point and then the shape

model confirms that the shape is the same as the mean shape The shape model

assures that the pro_le model has not changed the shape If the shape model were

not employed the pro_le model may give the best pro_le results but the resulting

shape may be completely di_erent So as mentioned before the two models restrict

each other A multi-resolution search is done to make the model more robust This

enables the model to be more accurate as it can lock on to the shape from further

away So the model searches over a series of di_erent resolutions of the same image

called an image pyramid The resolutions of the images can be set and changed

in the algorithm [17 24] Figure 45 shows a sample image pyramid The sizes of

the images are given relative to the _rst image A general picture and not a bone

32

43 Parameters and Variations

The performance of the ASM can be enhanced using optimizing the parameters

that it depends on Number of landmark points and number of training images are

investigated in this thesis

The number of landmark points is an important variable that a_ects the ASM The

pro_le model of the ASM works with these landmark points to create pro_les So

the position of landmark points is as important as the number of landmark points

In the training images landmark points are equally spaced along the boundary of

the bone Images are landmarked with 60 points and subsets of these points are

chosen to conduct experiments The impact of the number of landmark points on computing

time and the mean error (defined in Section 45) is tested by running the algorithm with a

different number of landmarks As the number of landmark points is increased it is expected

that the computing time increases and the error decreases The results are explained in

chapter5 A training set of images is used to train the ASM As the number of training images

increases the model becomes more robust and intelligent The computing time is expected to

increase as it will take time to train and create profile models for each image However as the

number of training images increases the mean profile and the model performs better so the

error is expected to decrease The model in this thesis has 12 images 11 are used to train the

ASM and 1 is used as a test image gives an overview of the ASM Figure 46a shows the

unaligned shape learnt from the training images displays the aligned shapes

44 Initialization Problem

The Active Shape Model locks the shape learnt from the training images on to the test image. It creates a mean shape profile from all the training images using the landmark points. However, the ASM starts from where the mean shape is located, which may not be near the bone in a test image. The model therefore needs to be initialized, that is, started somewhere close to the bone boundary in the test image. Experiments were conducted to see the effect of initialization on the error and on the tracking of the shape. It was observed that if the initialization is poor, meaning the mean shape starts away from the bone in the test X-ray, the model does not lock on to the bone. The shape and profile models fail to perform because the profile model looks for regions similar to those of the training images in regions away from the bone; it cannot find the bone because it is searching a different region altogether. The error increases considerably if the mean shape starts 40-50 pixels away from the bone in the test image. Figure 4.7a shows such an initialization: the pink contour is the mean shape, and because it starts away from the bone, the result is poor tracking of the bone.
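The initialization step described above amounts to translating the learnt mean shape so that it sits at a chosen start position in the test image. A minimal sketch, with illustrative names only: if the chosen position is 40-50 pixels off the bone, every whisker profile searched afterwards samples the wrong region, which is why the error grows so sharply.

```python
# Place the mean shape's centroid at a chosen start position in the test image.

def initialize_shape(mean_shape, start_x, start_y):
    """Translate the mean shape so its centroid sits at (start_x, start_y)."""
    n = len(mean_shape)
    cx = sum(x for x, _ in mean_shape) / n
    cy = sum(y for _, y in mean_shape) / n
    return [(x - cx + start_x, y - cy + start_y) for x, y in mean_shape]

# A toy square "mean shape", placed near a hypothetical bone boundary at (100, 80).
mean_shape = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
placed = initialize_shape(mean_shape, 100.0, 80.0)
print(placed)
```

A good initializer would pick (start_x, start_y) from a coarse cue, e.g. the brightest elongated region at the top of the image pyramid, rather than a fixed image position.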

Chapter 5

OUTPUT SCREENS

REFERENCES

[1] H. Asada and M. Brady. The curvature primal sketch. PAMI, 8(1):2–14, 1986.
[2] A. Baumberg. Reliable feature matching across widely separated views. CVPR, pages 774–781, 2000.
[3] P. Beaudet. Rotationally invariant image operators. ICPR, pages 579–583, 1978.
[4] J. Canny. A computational approach to edge detection. PAMI, 8:679–698, 1986.
[5] R. Deriche and G. Giraudon. A computational approach for corner and vertex detection. IJCV, 10(2):101–124, 1992.
[6] T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1924, 1998.
[7] C. Harris and M. Stephens. A combined corner and edge detector. Alvey Vision Conf., pages 147–151, 1988.
[8] F. Jurie and C. Schmid. Scale-invariant shape features for recognition of object categories. CVPR, 2:90–96, 2004.
[9] T. Kadir and M. Brady. Scale, saliency and image description. IJCV, 45(2):83–105, 2001.
[10] N. Landwehr, M. Hall, and E. Frank. Logistic model trees. Machine Learning, 59(1-2):161–205, 2005.
[11] T. Lindeberg. Feature detection with automatic scale selection. IJCV, 30(2):79–116, 1998.
[12] T. Lindeberg and J. Garding. Shape-adapted smoothing in estimation of 3-d shape cues from affine deformations of local 2-d brightness structure. Image and Vision Computing, pages 415–434, 1997.
[13] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[14] G. Loy and J.-O. Eklundh. Detecting symmetry and symmetric constellations of features. ECCV, pages 508–521, 2006.
[15] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10):761–767, 2004.
[16] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines. CVGIP, 39:267–278, 1987.
[17] K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. ECCV, 1(1):128–142, 2002.
[18] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 60(1):63–86, 2004.
[19] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool. A comparison of affine region detectors. IJCV, 2005.
[20] F. Mokhtarian and R. Suomela. Robust image corner detection through curvature scale space. PAMI, 20(12):1376–1381, 1998.
[21] H. Moravec. Towards automatic visual obstacle avoidance. International Joint Conf. on Artificial Intelligence, page 584, 1977.
[22] A. Opelt, M. Fussenegger, A. Pinz, and P. Auer. Weak hypotheses and boosting for generic object detection and recognition. ECCV, pages 71–84, 2004.
[23] E. Shilat, M. Werman, and Y. Gdalyahu. Ridge's corner detection and correspondence. CVPR, pages 976–981, 1997.
[24] S. Smith and J. M. Brady. SUSAN: a new approach to low level image processing. IJCV, 23(1):45–78, 1997.
[25] C. Steger. An unbiased detector of curvilinear structures. PAMI, 20(2):113–125, 1998.
[26] T. Tuytelaars and L. V. Gool. Wide baseline stereo matching based on local affinely invariant regions. BMVC, pages 412–425, 2000.

