Joan Bie Dig Er Thesis

THE USE OF DIGITAL IMAGE PROCESSING TO FACILITATE DIGITIZING LAND COVER ZONES FROM GRAY LEVEL AERIAL PHOTOS

A THESIS PRESENTED TO THE DEPARTMENT OF GEOLOGY AND GEOGRAPHY

IN CANDIDACY FOR THE DEGREE OF MASTER OF SCIENCE

By JOAN M. BIEDIGER

NORTHWEST MISSOURI STATE UNIVERSITY, MARYVILLE, MISSOURI

April 2012


DIGITAL IMAGE PROCESSING

The Use of Digital Image Processing to Facilitate Digitizing

Land Cover Zones from Gray Level Aerial Photos

Joan Biediger

Northwest Missouri State University

THESIS APPROVED

____________________________ Thesis Advisor, Dr. Ming-Chih Hung    Date

____________________________ Dr. Yi-Hwa Wu    Date

____________________________ Dr. Patricia Drews    Date

____________________________ Dean of Graduate School, Dr. Gregory Haddock    Date


The Use of Digital Image Processing to Facilitate Digitizing

Land Cover Zones from Gray Level Aerial Photos

Abstract

Aerial imagery from the 1930s to the early 1990s was predominantly acquired using

black and white film. Its use in remote sensing applications and GIS analysis is

constrained by its limited spectral information and high spatial resolution. As a historical

record and a source for studying long-term land use/land cover change, this imagery is a valuable but

often underutilized resource. Traditional classification of gray level aerial photos has

primarily relied on visual interpretation and digitizing to obtain land cover classifications

that can be used in a GIS. This is a time-consuming and labor-intensive process that can

often limit the scale of analysis.

This research focused on the use of digital image processing to facilitate visual

interpretation and heads-up digitizing of gray level imagery. Existing remote sensing

software packages have limited functionalities with respect to classifying black and white

aerial photos. Traditional image classification alone provides limited results when

determining land cover types derived from gray level imagery. This research examined

approaching classification as a system which uses digital image processing techniques

such as filtering, texture analysis and principal components analysis to improve

supervised and unsupervised classification algorithms to provide a base for digitizing

land cover types in a GIS. Post processing operations included smoothing the

classification result and converting it to a vector layer that can be further refined in a GIS.


Software tools were developed using ArcObjects to aid the process of refining the vector

classification. These tools improve the usability and accuracy of the digital image

processing results, which in turn facilitate visual interpretation and digitizing to

produce a usable land use/land cover classification from gray level imagery.


TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGMENTS
CHAPTER 1: INTRODUCTION
    1.1 Research Objective
CHAPTER 2: LITERATURE REVIEW
    2.1 Historical Aerial Imagery Uses and Importance
    2.2 Classification Problems of High Resolution Panchromatic Imagery
    2.3 Statistical Texture Indicators
    2.4 Image Enhancements and Filtering
    2.5 Image Segmentation and Object-based Image Analysis
CHAPTER 3: CONCEPTUAL FRAMEWORK AND METHODOLOGY
    3.1 Description of Study Area
    3.2 Description of Data
    3.3 Methodology
        3.3.1 Conceptual Overview
        3.3.2 Software Utilized
        3.3.3 Preliminary Image Processes
        3.3.4 Unsupervised Classification
        3.3.5 Supervised Classification
        3.3.6 Image Enhancement and Texture Analysis
        3.3.7 Object-based Image Analysis
        3.3.8 Post Processing and Automation
        3.3.9 Accuracy Assessment
CHAPTER 4: ANALYSIS RESULTS AND DISCUSSION
    4.1 Manual Digitizing
    4.2 Unsupervised Classification
    4.3 Supervised Classification
    4.4 Image Enhancements and Texture Analysis
    4.5 Object-based Image Analysis
    4.6 Post Processing and Automation
    4.7 Classification Accuracy and Results
CHAPTER 5: CONCLUSION
    5.1 Limitations of the Research
    5.2 Potential Future Developments
APPENDIX 1: ERROR MATRIX TABLES
APPENDIX 2: VECTOR EDITING TOOLBAR .NET CODE
REFERENCES


LIST OF FIGURES

Figure 1 Aerial photo of Ogden study area
Figure 2 Overview of study areas in relationship to the state of Utah
Figure 3 Aerial photo of Salt Lake City study area
Figure 4 Ogden DOQQ study area
Figure 5 Salt Lake City MDOQ study area
Figure 6 Main workflow processes
Figure 7 Ogden dendrogram of ISODATA clustering 10 classes
Figure 8 Ogden dendrogram of ISODATA clustering 25 classes
Figure 9 Ogden dendrogram of ISODATA clustering 100 classes
Figure 10 Distances between classes from Salt Lake City dendrogram
Figure 11 Training sample distribution for the Ogden image
Figure 12 Training sample distribution for Salt Lake City image
Figure 13 Unstretched images compared to contrast stretched images
Figure 14 Post processing ArcGIS Model
Figure 15 Polygon raster to vector, smoothing, and smooth simplify
Figure 16 Classification using visual interpretation of the Ogden image
Figure 17 Classification using visual interpretation of the Salt Lake City image
Figure 18 Ogden image ISODATA classifications
Figure 19 Salt Lake City image ISODATA classifications
Figure 20 Minimum distance and support vector machine classification of the Salt Lake City image
Figure 21 Minimum distance classification of the Ogden image with high pass filter
Figure 22 Minimum distance classification of the Ogden image with low pass filter
Figure 23 ISODATA 10 spectral classes Halounova image
Figure 24 SCRM object-based segmentation images
Figure 25 Ogden object-based classification image and post processing system vectors
Figure 26 Salt Lake City object-based classification image and post processing system vectors
Figure 27 Ogden pixel based classification and post processing system vectors


LIST OF TABLES

Table 1 First level classification Ogden land use/land cover classes
Table 2 First level classification Salt Lake City land use/land cover classes
Table 3 ISODATA overall accuracy results for Ogden and Salt Lake City study areas
Table 4 Training sample statistics from original Ogden image
Table 5 Training sample statistics from original Salt Lake image
Table 6 Ogden image overall accuracy and level 1 completion time
Table 7 Salt Lake City image overall accuracy and level 1 completion time
Table 8 User's accuracies for individual land use/land cover types Ogden study area
Table 9 User's accuracies for individual land use/land cover types Salt Lake City study area
Table 10 Overall accuracy ranges for classification groups


ACKNOWLEDGMENTS

I would like to thank Dr. Ming-Chih Hung for chairing my thesis committee and for

all the support, encouragement and guidance he has given me along the way. I would

also like to thank Dr. Yi-Hwa Wu and Dr. Patricia Drews for serving on my thesis

committee and for their contributions in developing this thesis. Last but certainly not

least I would like to thank my husband Barry for encouraging me through many long

nights and weekends while I completed this work. Without your support and love I

would never have been able to finish this thesis.


CHAPTER 1: INTRODUCTION

Aerial imagery from the 1930s to the present is a primary data source used to study

many natural processes and land use patterns (Carmel and Kadmon 1998, Kadmon and

Harari-Kremer 1999). Early aerial imagery from the 1930s to early 1990s is

predominantly black and white (panchromatic) film photography meaning there is only

one band of data. This type of imagery contains limited spectral information unlike

today's digital satellite sensors, which offer more spectral information even in the

panchromatic band.

The Aerial Photography Field Office (APFO) is a division of the Farm Service

Agency (FSA), of the United States Department of Agriculture (USDA). The APFO,

located in Salt Lake City, Utah, has one of the nation's largest collections of historical

aerial imagery dating back to the 1950s. Film from the 1930s through 1940s was sent to

the National Archives. APFO has over 50,000 rolls of film, of which over 60% is black

and white (Mathews 2005). This historical aerial imagery is a valuable, largely untapped

resource. The film format of the imagery makes it unavailable to GIS and imagery

analysis programs unless it is scanned and processed to digital format. There is

widespread interest from the public and other government agencies in making this

imagery available and usable in digital format.

Recently, more historical imagery from the 1950s to 1990s is being scanned to digital

format for use in change detection projects for the Farm Service Agency (FSA).

According to Brian Vanderbilt (personal communication, 01 Sep 2009) FSA is interested

in studying agricultural loss patterns over long periods so that processes of change can be

more fully understood. One of the challenges with these types of projects is that land


use/land cover classification with the imagery usually involves visual interpretation and

manual digitizing, due to the difficulty of using digital image processing techniques with

the historical panchromatic imagery. Manual digitizing is a time consuming process for

multiple years of imagery, as each photo requires its own analysis. There are not enough

image analysts within FSA to manage the increasing workload for projects requiring the

use of historical imagery. Another concern is that study areas are limited in scale because

of the time and resources needed to digitize land cover types on the imagery. There is

interest and need to explore digital options for land cover classification so that the use of

these historical imagery datasets can be expanded.

The ability to facilitate digitizing of land cover types on historical aerial imagery

would make it a more usable resource to study long term land use/land cover changes.

Classification of this type of imagery is very labor intensive, which often limits the size of

study areas. If the imagery could be utilized on a broader scale, we could gain greater

historical perspective on changes such as agricultural loss over time. Increased accuracy

and repeatability of results obtained through digital image processing could make the

results of long-term change detection projects more valid, because they would no longer

depend on the varying interpretation skills of the several image analysts a project may

require.

Historical aerial imagery offers a unique opportunity to study long-term patterns of

land use/land cover change by offering the analyst a more extensive historical perspective

on geographic processes such as land use/land cover change, urban expansion and

vegetation patterns (Kadmon and Harari-Kremer 1999, Awwad 2003, Alhaddad et al.,

2009). Producing a thematic map through image classification is one of the most


common imagery analysis tasks in remote sensing. Image classification techniques such

as unsupervised and supervised classification, NDVI, spectral signatures, and spectral

band combinations have limited usability with panchromatic aerial imagery as they rely

heavily on spectral information, which is limited with this type of imagery.

Visual interpretation of imagery does not rely on spectral information alone to classify

imagery. Visual interpretation makes use of scene qualities such as texture, shape,

arrangement of objects and context of elements in an image. The human visual system is

very efficient at pattern recognition and in many ways is superior to existing machine

processing methods, but on the other hand inherent subjectivity and the inability of the

eye to extract complex patterns can limit interpretation. Digital image processing

techniques that incorporate the use of texture, tone, shape, pattern recognition and object-

based image analysis can be used to enhance traditional methods of supervised and

unsupervised classification especially with gray level aerial imagery (Caridade et al.,

2008).

A great deal of research has been done on the most effective ways of classifying

multispectral imagery and mapping the results (Jensen, 2005). There is relatively little

research on how digital image processing of historical panchromatic imagery can

improve or reduce manual interpretation for image analysis and GIS analysis. In this

thesis research, digital image processing techniques including texture analysis,

convolution filters, and object-based image analysis were considered with respect to how

they can improve the classification of panchromatic aerial imagery and how this

improvement can facilitate digitizing and in some cases possibly eliminate it. A post

processing system involving image smoothing, raster to vector conversion, polygon


smoothing and simplification, and custom polygon editing tools for use in ESRI's

ArcMap GIS software was used to improve an initial digital image classification. The

post processing system can be used to improve most digital image classifications. The

quality of the baseline land use/land cover classification was the main factor in how

efficient it was to create a usable thematic layer.

Continued study in this area could yield new approaches to land cover classification of

gray level imagery. If historical imagery can be used effectively in a digital

environment, then more of it may be scanned and become more readily available, which

would benefit the geospatial community.

1.1 Research Objective

The objective of this project is to establish a working model that utilizes digital image

processing to facilitate or assist the user with digitizing land cover zones from gray level

aerial photos. This study approaches the problem of digitizing land cover zones by first

classifying the aerial photo and then by establishing a post processing system employing

vector layers for use in a GIS.

There is limited research available on using digital image processing to enhance the

classification process of gray level aerial photos and the digitizing process. Digital image

processing may not be able to completely replace visual interpretation of this type of

imagery, but it may be able to make the process more efficient.


CHAPTER 2: LITERATURE REVIEW

2.1 Historical Aerial Imagery Uses and Importance

Historical imagery, as the term is used in this study, is imagery acquired by an aerial

camera mounted in an airplane. The photography has been directly imaged onto film and

is also referred to as analog photography as opposed to modern digital imagery. This

historical imagery is black and white and may be referred to as either panchromatic or

gray level.

Black and white, gray level, and panchromatic are terms which refer to imagery

composed of shades of gray. The imagery used in this study has a pixel depth of 8 bits,

where the binary representation assumes that 0 is black and 255 is white. Between 0 and

255, raw pixel values are grayscale and the digital numbers correspond to different levels

of gray. For example, a digital number of 127 will correspond to a medium gray in the

photo. This panchromatic imagery has a single band where digital numbers represent the

spectral reflectance from the visible light range. Historical panchromatic imagery

contains brightness values but has limited spectral information available in the visible

wavelengths (0.4-0.7 µm), unlike the panchromatic band of a satellite image such as

Landsat 7, which generally is sensitive into the near infrared wavelengths (0.52-0.9 µm)

(Hoffer 1984).

Historical aerial photographs are a valuable and important data source for studying

long term (20-80 year) change processes such as land use/land cover change and

vegetation and environmental dynamics. These historic photos present a snapshot in time

that may offer insight into the current state of land use/land cover change processes and

what patterns may have affected their growth and stability. Much of the imagery


available for long-term analysis is black and white aerial photography (Carmel and

Kadmon 1998, Hudak and Wessman 1998, Caridade et al. 2008). The historical record

that has been captured from aerial photography provides a long temporal history to work

with and provides an extensive frame of reference in which to assess the magnitude of

land use/land cover change. Advances in GIS, photogrammetry, image analysis, and

digital image processing have increased the potential to use historical aerial photography

for many types of change analysis including land use/land cover change (Okeke and

Karnieli 2006).

Land cover maps produced from gray level historical aerial photos are generally

created through techniques such as visual interpretation and manual digitizing (Carmel

and Kadmon 1998, Kadmon and Harari-Kremer 1999). This is a very time-consuming

and labor-intensive process, which tends to limit analysis to small areas. The

digitizing itself is generally dependent on the ability of the interpreter and may lead to

results that are not objective due to skill level and human bias (Kadmon and Harari-Kremer

1999). The assumption is often made that manual interpretation is 100%

accurate, but assessing the accuracy of this method is difficult according to Congalton and

Green (1993) and Carmel and Kadmon (1998).

2.2 Classification Problems of High Resolution Panchromatic Imagery

The historical aerial imagery analyzed in this project is limited in spectral information

and has high spatial detail. These two variables can present some difficulties with the use

of common digital classification and image processing techniques. The first challenge is

the spectral resolution, which is only one band. This band lacks detailed spectral

information. Most panchromatic aerial films are sensitive to the visible spectrum but also


require filtering to take into account haze and atmospheric conditions. The film is

generally exposed through a filter that passes green and red visible wavelengths but blocks

the blue wavelengths to cut down on atmospheric haze. The resulting image records in black and

white the tonal variations of the landscape in the scene (U.S. Army Corps of Engineers

1995). Common classification methods are limited in accuracy and usability when there

is only one band to work with (Short and Short 1987, Anderson and Cobb 2004, Caridade

et al. 2008).

Research from Carmel and Harari-Kremer (1999) and Carmel and Kadmon (1998)

has approached the limitations of having only one band of information to analyze in

several ways. Carmel and Kadmon (1998) used a combination of illumination

adjustment and a modified maximum likelihood classifier that used neighborhood

statistics to achieve classification accuracies of over 80% for study of long-term

vegetation patterns using gray level aerial imagery. This research showed that the

relationship between neighborhood pixels was an important factor in achieving improved

classification accuracy. Carmel and Harari-Kremer (1999) concentrated on training data

and ancillary data to produce vegetation maps from black and white aerial photos from

1962 and 1992. The accuracy of using a maximum likelihood classifier was about 80%.

Their study stresses the importance of carefully considered training data and the utility of

digital image processing of historical aerial photography in vegetation change detection

studies. Mast et al. (1997) researched long-term change detection of forest ecotones

using gray level aerial imagery from 1937 to 1990. Density slicing was used after

determining the range of brightness values for tree cover across all imagery to get a

classification of tree cover and non-tree cover. Results were satisfactory although no accuracy


assessment was mentioned, but again the significance of object brightness values for gray

level imagery was established.
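
The density slicing idea described above can be illustrated with a minimal sketch. It is not the procedure of Mast et al. (1997); the function name `density_slice`, the sample patch, and the 0-60 brightness range for tree cover are all assumptions chosen for illustration.

```python
def density_slice(image, low, high):
    """Label pixels whose digital number falls in [low, high] as 1, others as 0."""
    return [[1 if low <= dn <= high else 0 for dn in row] for row in image]

# Hypothetical 8-bit gray level patch: dark values stand in for tree cover.
patch = [
    [30, 42, 200, 210],
    [35, 50, 190, 205],
    [120, 60, 180, 45],
]

# Assumed brightness range for tree cover; a real range would be measured
# from sample areas across all of the imagery being compared.
tree_mask = density_slice(patch, 0, 60)
tree_fraction = sum(map(sum, tree_mask)) / (len(patch) * len(patch[0]))
```

Because the slice depends only on each pixel's own brightness, any other feature that happens to fall in the same gray level range would be labeled as tree cover as well, which is the class confusion problem discussed in this section.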

The second challenge when analyzing this imagery is that higher spatial resolution

does not generalize features to the degree coarse or medium scale imagery does, which

allows much more detail to be considered in an image. Individual trees, buildings and

sidewalks become visible when image detail is more perceptible in these 1-meter

resolution images. This factor makes visual interpretation easier but can cause problems

with automated classification, especially when spectral information is limited or non-existent.

High spatial resolution can increase within-class variances, which can cause

uncertainty between classes. Browning et al. (2009) in their study of historical aerial

imagery as a data source emphasized the importance of object scale when analyzing

imagery. Some objects may be larger than a pixel, referred to as H-resolution, and some

objects may be smaller than a pixel, which is referred to as L-resolution. This factor can

make it more difficult to get consistent classification results across a scene when the

imagery contains objects at multiple scales. Spatial autocorrelation is also an important factor when

considering this concept, as all natural scenes in remote sensing will have some type of

spatial autocorrelation to create a scene, so that the image organization is something other

than random noise (Strahler et al. 1986).

The challenges of limited spectral information and high spatial detail can lead to a

number of features in an image having similar gray level signatures and a great deal of

confusion between class types (Fauvel and Chanussot 2007). In turn a per pixel classifier

such as the maximum likelihood classifier has difficulty distinguishing between a

medium gray field and water in a panchromatic image. Panchromatic image


classification can be improved by considering the relationship between neighborhood

pixels as in texture analysis and object-based image analysis (Alhaddad et al. 2009,

Myint and Lam 2005, and Caridade et al., 2008).

2.3 Statistical Texture Indicators

Image texture is one of the most important visual indicators in distinguishing between

homogeneous and heterogeneous regions in a scene. The human interpreter uses shape,

texture, size, pattern, shadow, arrangement and context of elements in an aerial photo to

distinguish between objects in the image (Campbell 2008). According to Tuceryan and

Jain (1998), texture is easy to discern in an image, but it can be a difficult concept to

define and there is not one generally accepted definition. One way to define texture is to

consider it as the spatial variation of the intensity values in a region of an image

(Tuceryan and Jain 1998). This regional variation in intensity values implies that the

evaluation of texture is a neighborhood process and that a single pixel does not create

texture on its own.

Texture is also a quality of an image scene that corresponds to a pattern that is part of

the structure of the image. In a natural scene an area of farmland and a forested area

comprise two separate visual patterns in separable regions. These regions may also

contain secondary patterns having characteristics such as brightness, shape, size, etc. A

field may also have a planting pattern and a forest may be comprised of deciduous and

coniferous trees giving the area a distinctive sub pattern that has its own brightness,

shape, size, etc. (Srinivasan and Shobha 2008). Texture as a property of an object or

regional feature in an image can be described as fine, smooth, coarse, etc. Tone is the

range of shades of gray in an image. According to Haralick (1979), tone and texture are


interdependent concepts in that both are always present in an image to varying degrees.

This interrelationship between tone and texture is explained by Haralick (1979) as

patches in an image that have either little variation in tonal primitives (tone) or

great variation in tonal primitives (texture).

The work of Haralick et al. (1973) was the foundation for most of the later research

relating to image texture analysis. Their work provided a computational method to

determine textural characteristics in an image scene and discussed several widely used

textural statistics used in image texture recognition. These statistics included: contrast,

correlation, angular second moment, inverse difference moment and entropy. Contrast

measures the amount of local variation in an image. Correlation measures the linear

dependency of gray levels in the image. Angular second moment measures local

homogeneity. Inverse difference moment also measures local homogeneity but relates

inversely to contrast. Entropy measures randomness of values. Image analysis may be

performed using these measures either alone or in combination.
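
As a rough illustration of these measures, the sketch below builds a gray level co-occurrence matrix for a single pixel offset and derives four of the five statistics from it using their common textbook forms (correlation is omitted for brevity). The function names, the four gray levels, and the two test patches are assumptions for illustration, not the implementation used in this study.

```python
import math

def glcm(image, dx=1, dy=0, levels=4):
    """Symmetric, normalized gray level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_stats(p):
    """Contrast, angular second moment, inverse difference moment, entropy."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }

# Hypothetical patches already quantized to four gray levels (0-3).
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
checker = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]
flat_stats = texture_stats(glcm(flat))
checker_stats = texture_stats(glcm(checker))
```

The uniform patch yields zero contrast and the maximum angular second moment, while the checkerboard patch yields high contrast and lower homogeneity, which matches the qualitative descriptions of the measures given above.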

There are three main approaches to texture analysis: statistical, spectral and

structural. Statistical methods are based on local statistical

parameters such as the co-occurrence matrix and variability within moving windows.

Spectral methods include analysis using the Fourier transform and structural methods

emphasize the shape of image primitives (Srinivasan and Shobha 2008). This study

utilized statistical methods, including the co-occurrence matrix, the occurrence measures

and moving windows. By evaluating the spatial distribution of gray values using

statistical methods, a set of statistics can be derived from the distributions of neighboring

features throughout the image. There are first order and second order texture statistics.


First order statistics such as mean, standard deviation, and variance analyze pixel

brightness values without analyzing the relationships between the pixels. Second order

statistics on the other hand analyze the relationships between two pixels and these

measures include contrast, dissimilarity, homogeneity, entropy, and angular second

moment (Srinivasan and Shobha 2008). First order and second order statistics are used in

this study as a method to improve the classification accuracy of panchromatic aerial

photos.
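
A minimal sketch of first order statistics computed in a moving window is given here. The function name `window_stats`, the 3 x 3 window, the border handling, and the sample patch are assumptions for illustration and do not reproduce the software settings used in this study.

```python
def window_stats(image, size=3):
    """Return (mean, variance) rasters from a size x size moving window.

    Border pixels use only the neighbors that fall inside the image.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    mean_band = [[0.0] * cols for _ in range(rows)]
    var_band = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            m = sum(values) / len(values)
            mean_band[r][c] = m
            var_band[r][c] = sum((v - m) ** 2 for v in values) / len(values)
    return mean_band, var_band

# Hypothetical patch: smooth field on the left, rougher cover on the right.
patch = [
    [100, 100, 100, 40, 220],
    [100, 100, 100, 230, 30],
    [100, 100, 100, 50, 210],
]
mean_band, var_band = window_stats(patch)
```

In this patch the local means of the smooth and rough areas are close (100 and 120), but the local variance separates them clearly, which is the kind of information a first order texture band contributes.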

The analysis of texture is a technique that has been used to aid and increase

classification accuracy in both gray level image analysis and multispectral analysis.

Haralick et al. (1973) conducted the first major study of texture as an imagery analysis

tool. They demonstrated the utility of the Gray Level Co-occurrence Matrix (GLCM) as

an analysis tool for panchromatic aerial photographs and multispectral imagery even

though computer processing constraints of the time hindered their study. The

classification accuracy in their study was 82% for the panchromatic aerial imagery.

Caridade et al. (2008) used the GLCM and a variety of moving window sizes to achieve

an overall classification accuracy of black and white aerial photos of 83.4% using four

land cover classes. The GLCM tabulates how frequently pairs of gray levels occur at a

given spatial offset in the image; statistics such as dissimilarity, angular second moment,

homogeneity, contrast and entropy are then derived from it. Caridade et al. (2008) also discuss the variation

of land cover type accuracies throughout an image. Their study shows that certain land

cover types such as water may achieve accuracy levels of 100% while others such as bare

ground are much lower at 76.5%. Cots-Folch et al. (2007) used the GLCM to train a

neural network classifier but the highest accuracy obtained was only 74%. Their study


stated that better training data and ancillary data sources could be used to improve the

results. Maillard (2003) compared the GLCM to semi-variogram and Fourier spectra

methods and found that the GLCM works better in areas where textures are easily

distinguished and the semi-variogram is better in areas where texture is more similar.

The Fourier method was less successful than either of the other two methods. Alhaddad

et al. (2009) found that the GLCM and mathematical morphology produced results which

were closer to visual interpretation than other texture analysis methods.

One of the main utilities of texture analysis as it applies to improving the classification

of panchromatic imagery in particular is that it increases the dimensionality of the

imagery from one band to multiple bands. A new band is created for each texture

function. This increased dimensionality can help alleviate some of the problems of class

separability that arise when trying to classify historical aerial photos (Halounova 2009).

Halounova used a combination of texture, filtering and object oriented classification to

achieve overall accuracy levels between 89% and 92%. This methodology of increasing

the dimensionality of panchromatic imagery to try to achieve more separability between

land use/land cover classes was an important influence on this thesis research.
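
The idea of adding a band per texture function can be sketched as follows. The local range texture, the helper names `local_range` and `stack_bands`, and the sample gray levels are assumptions chosen for illustration; they are not Halounova's method or the band combination used in this thesis.

```python
def local_range(image, size=3):
    """Texture band: max minus min of gray levels in a moving window."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(max(values) - min(values))
        out.append(out_row)
    return out

def stack_bands(*bands):
    """Combine single-band rasters into per-pixel feature vectors."""
    rows, cols = len(bands[0]), len(bands[0][0])
    return [[tuple(b[r][c] for b in bands) for c in range(cols)]
            for r in range(rows)]

# Hypothetical gray levels: two pixels share the value 120 but differ in context.
gray = [
    [120, 120, 120, 120, 30, 210],
    [120, 120, 120, 210, 120, 30],
    [120, 120, 120, 30, 210, 120],
]
stacked = stack_bands(gray, local_range(gray))
smooth_pixel = stacked[1][1]   # inside the uniform area
rough_pixel = stacked[1][4]    # same gray level, varied neighborhood
```

Both pixels have the same gray level, so a classifier working on the original band alone could not separate them. In the stacked image their feature vectors differ in the texture band, which is the gain in separability described above.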

In areas of heterogeneous objects, the texture information in neighborhood pixels is a

consideration. Common classification algorithms that rely on spectral information at the

pixel level do not consider spatial information. This spatial information can become very

important when trying to discern land cover types such as urban areas (Myint and Lam

2005). Two types of analysis can assist the classification process: region-based analysis

and window based analysis. Region-based analysis involves using image segmentation,

and window based analysis can be used in pre- or post-classification to filter noise from


the results (Gong et al. 1992). The importance of the spatial aspect of texture analysis is

illustrated in many studies involving texture analysis (Haralick 1973, Gong et al. 1992,

Hudak and Wessman 1998, Myint and Lam 2005, Erener and Duzgun 2009, Pacifici et al.

2009). This study used region-based analysis during object-based image analysis and

window based analysis through the GLCM.
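
One common form of window based post-classification filtering is a majority filter, sketched below. This is offered only as an illustration of the concept; the function name `majority_filter`, the 3 x 3 window, and the sample raster are assumptions, and the filter is not claimed to be the one described by Gong et al. (1992).

```python
from collections import Counter

def majority_filter(classes, size=3):
    """Replace each class label with the most common label in its window.

    Ties resolve to the label encountered first within the window.
    """
    half = size // 2
    rows, cols = len(classes), len(classes[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                classes[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out_row.append(Counter(window).most_common(1)[0][0])
        out.append(out_row)
    return out

# Hypothetical classified raster: class 1 with one isolated pixel of class 2.
classified = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],
    [1, 1, 1, 1],
]
smoothed = majority_filter(classified)
```

The isolated class 2 pixel is absorbed into the surrounding class, which removes noise from the result at the cost of also removing any genuinely small features.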

2.4 Image Enhancements and Filtering

Texture analysis in combination with image pre-processing such as principal

component analysis has been explored by Awwad (2003). His study, which utilized a

1941 gray level photo, used texture analysis windows of different sizes and then

combined the results to create an image with sixteen layers. Principal components

analysis (PCA) was used to reduce the dimensionality of the resulting image. He

combined several digital processing techniques but overall accuracy was only 58%.

Much of the literature on using digital image processing techniques for classifying gray

level aerial photos does not make use of multiple texture window sizes in combination to

return a result. Even though examples are rare in the literature and accuracy was low as

reported by Awwad (2003), the technique has promise. Halounova (2005, 2009) also

combined several texture window sizes but used filtering and object oriented

classification rather than PCA to achieve classification accuracies over 90%. Image

enhancements such as filtering and texture add multiple channels to the one band

panchromatic image and allow the image to be processed in a similar fashion to a

multiple band image. There is room for more research using this type of methodology

with different parameters and different pre- or post-processing steps such as

convolution filtering, edge detection and smoothing windows.
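
The use of principal components analysis to reduce a stack of correlated texture layers can be sketched as follows. This is a simplified illustration that extracts only the first component by power iteration; the function name, the three-band sample pixels, and the starting vector are assumptions, and it is not the PCA routine used by Awwad (2003) or in this study.

```python
import math

def first_principal_component(pixels, iterations=100):
    """Return (axis, scores) for the first principal component.

    pixels is a list of equal-length feature vectors, one per pixel.
    """
    n, d = len(pixels), len(pixels[0])
    means = [sum(p[k] for p in pixels) / n for k in range(d)]
    centered = [[p[k] - means[k] for k in range(d)] for p in pixels]
    # Covariance matrix of the bands.
    cov = [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, assuming the start vector is not orthogonal to it.
    axis = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        axis = [v / norm for v in nxt]
    # Project each centered pixel onto the component axis.
    scores = [sum(row[k] * axis[k] for k in range(d)) for row in centered]
    return axis, scores

# Hypothetical pixels with three strongly correlated texture bands.
pixels = [(10, 20, 30), (20, 40, 60), (30, 60, 90), (40, 80, 120)]
axis, scores = first_principal_component(pixels)
```

Because the three sample bands are perfectly correlated, a single component score per pixel carries all of their variation. Real texture layers are only partly correlated, so several components would normally be retained.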


Edge detection is another important consideration when trying to separate a scene into

distinct objects. A natural scene such as an aerial photo does not necessarily have a clear

relationship between an object and a background. Anderson and Cobb (2004) provided a

new unsupervised hybrid classification algorithm based on edge detection and

thresholding for pixel classification. Nearest edge thresholding outperformed both the

maximum likelihood and ISODATA clustering classification schemes. Their study

illustrated the importance of edge detection between features in gray level aerial photos.

Li et al. (2008) also conducted research that concentrated on the importance of edge

detection and shape characteristics. The process was automated using ArcGIS Model Builder, and results were compared to manual digitizing, with the model correctly

identifying 70% of the manual classifications. Hu et al. (2008) used grayscale

thresholding for image segmentation and emphasized the importance of

transition regions between objects in a scene and the ability to segment objects in an

image. Transition regions between objects can be problematic when classifying complex

scenes, as there can be multiple areas in the image with different gray scales between

objects causing classification errors and a salt and pepper effect.

Texture filters in combination with neural network classifiers are another methodology

that has shown some success in land use/land cover classification of gray level aerial

photos. Ashish (2002) used several artificial neural network (ANN) classifiers based on

histograms, texture and spatial parameters with some success on 1993 gray level aerial

photos. Textural parameters yielded the highest overall accuracy at 92%. His study

further showed the importance of texture parameters for classification of gray level aerial

photos. Another study conducted by Pacifici et al. (2009) used a neural network



classifier and a simplification procedure with some success on the panchromatic bands of

WorldView-1 satellite imagery. After a simplification procedure called network

pruning was applied to the imagery, texture was optimized and input features were

reduced, producing classification accuracy above 90% as measured by the Kappa coefficient.

Their study provided another example of how texture parameters can improve the

classification accuracy of different types of classifiers using high resolution panchromatic

imagery.

2.5 Image Segmentation and Object-based Image Analysis

Considering the high spatial resolution of gray level aerial photos and the lack of spectral information, object-based image analysis is another technique that has been

successful in classifying high spatial resolution imagery. Object-based image analysis

(OBIA) is a method of image analysis that uses objects in a scene rather than individual

pixels to derive information from the imagery. OBIA is a two-part process consisting of

image segmentation and then image classification. The image is first divided into

homogeneous, adjacent regions; the segmentation phase takes into account texture, region context, shape

and spectral information. Image segmentation reduces the

complexity of the image and produces regions that can in turn be

considered meaningful to the image interpreter.

OBIA was compared to pixel based classification in a study by Pillai and Wesberg

(2005) using gray level aerial imagery from 1965 and 1995. Their study illustrated how

scale dependency can affect classification results depending on the objects studied. Scale

dependency of individual landscape elements can also affect the usefulness of texture

parameters as illustrated in Resler et al. (2004). Change at the scale of individual trees



was not statistically significant between pixel based classification and object-based

classification. Object-based classification was more accurate when comparing patches of

trees in high spatial-resolution panchromatic imagery. Their study illustrates the

importance of determining land use categories and object scale when classifying imagery.

Elmqvist et al. (2008) performed OBIA on the panchromatic band of an Ikonos image

and found that spectral information provided the best segmentation results. Classification

accuracies were fairly low for their study but outperformed pixel based classification.

Laliberte et al. (2004) used a combination of low-pass filtering and object-based image

analysis on gray level aerial photos, successfully integrating them with satellite imagery in a change detection study. Middleton et al. (2008) successfully used

feature extraction and a support vector machine (SVM) supervised classifier to extract

features on a 1947 aerial image in a change detection study. One of the main conclusions

of their study was that classification accuracy of the panchromatic image was based on

image quality. Historic panchromatic imagery is not always of good quality due to age or

deterioration of the film. A successful methodology for classifying this type of imagery

needs to be successful for various levels of image quality.

The literature regarding classification of gray level aerial photos concentrates for the

most part on replacing manual digitizing with digital image processing techniques. There

is a gap in the literature in regard to using digital image processing to help facilitate

digitizing. By combining digital image analysis techniques such as texture and object-

based image analysis with GIS vector capabilities, digitizing land cover classification

zones can be enhanced and in some cases possibly eliminated.



CHAPTER 3: CONCEPTUAL FRAMEWORK AND METHODOLOGY

3.1 Description of the Study Area

The study area for this project is near Ogden, Utah (Figure 1). The area is in north

central Utah (Figure 2) and consists of a variety of land cover types including agricultural

land, impervious surfaces, grassland, forest and water. The Ogden study area does not

provide an example of dense urban land cover so a secondary area of interest was chosen

in Salt Lake City, Utah (Figure 3). The Salt Lake City study area includes a park and a

variety of residential and commercial land cover. By using two study areas with a variety

of textures and objects in the scene, this research can show the usefulness of digital image processing across two completely different areas and images.

The classification results concentrate on the Ogden imagery as this imagery has better

defined and larger areas of land class types. The Salt Lake City image is used mainly to

see how the same techniques can be used in an urban area. Urban areas have their own

unique classification challenges that are increased when trying to classify panchromatic

imagery. Another reason the Ogden image was the main focus of this research is that this

imagery was originally flown for FSA for agricultural purposes. It is also likely that

much of the historical imagery in the vault at APFO will be used to further study

historical agricultural change processes.

3.2 Description of Data

The image of Ogden, Utah from 1958 was obtained from the Aerial Photography Field

Office's internal imagery storage network. The Ogden study area was clipped from a

digital orthophoto quarter quadrangle (DOQQ) 4111256ne from 1958 (Figure 4) and

covers approximately 0.5 square miles. The image was scanned from black and white



Figure 1 Aerial photo of Ogden study area

Figure 2 - Overview of study areas in relationship to the state of Utah



and was scanned and orthorectified at APFO using the same parameters and methods as

the 1958 Ogden imagery. Q1219_1977 is a mosaic that was created from original

DOQQs using Socet Set 4x and interactive seaming. The image resolution is 1 meter and

the bit depth is 8 bits.

Figure 4 - Ogden DOQQ Study Area



processing was completed, a number of digital image processing techniques were

performed on the imagery (see Figure 6). The original imagery was classified using

supervised and unsupervised classifiers to form the classification baseline information.

Then four main digital image processing techniques were used to try to improve the

classification. These four processes were: convolution filtering, texture analysis,

principal components analysis, and object-based image classification. Texture analysis

was used to create layer-stacked images which increased the dimensionality of the

original one band image to improve classification results. Principal components analysis

was used to decrease the dimensionality of the multiple layer texture images, and in one case the first principal component image derived from the multi-layer texture image was

layer stacked with the original one band image. The final digital image processing

component in the research was image post-processing to refine the most promising results

for GIS analysis. After image post-processing an accuracy assessment was completed to

compare the results of each classification with the digitized baseline information obtained

by visual interpretation (heads up digitizing).
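
The accuracy assessment described above can be illustrated with a minimal sketch of an error matrix, overall accuracy, and the Kappa coefficient. The function names and the six sample labels are hypothetical and are not taken from the software or data used in this research; the sketch assumes the reference (digitized) labels and the classified labels are available as two equal-length lists.

```python
from collections import Counter


def error_matrix(reference, predicted):
    """Count (reference class, predicted class) pairs over all sample pixels."""
    return Counter(zip(reference, predicted))


def overall_accuracy(matrix):
    """Share of samples on the diagonal of the error matrix."""
    total = sum(matrix.values())
    correct = sum(n for (ref, pred), n in matrix.items() if ref == pred)
    return correct / total


def kappa(matrix):
    """Cohen's Kappa: agreement beyond what the class totals alone would give."""
    total = sum(matrix.values())
    ref_totals, pred_totals = Counter(), Counter()
    for (ref, pred), n in matrix.items():
        ref_totals[ref] += n
        pred_totals[pred] += n
    observed = overall_accuracy(matrix)
    expected = sum(ref_totals[c] * pred_totals[c] for c in ref_totals) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical sample: six pixels, one cropland pixel misclassified as water.
reference = ["crop", "crop", "crop", "water", "water", "forest"]
predicted = ["crop", "crop", "water", "water", "water", "forest"]
matrix = error_matrix(reference, predicted)
print(overall_accuracy(matrix), kappa(matrix))
```

Kappa is lower than overall accuracy here because part of the agreement would be expected by chance from the class totals alone.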

3.3.2 Software Utilized

There were three software programs used in this project as no single software suite

available to me provided all the tools needed for this research. The imagery analysis

programs used were ERDAS Imagine version 11.0, ENVI 4.8 and ENVI EX 4.8. The

GIS software used is ArcMap 10.0. ERDAS Imagine has a good set of texture analysis

and filtering tools. ENVI EX and ENVI have the benefit of integration with the GIS

software and ENVI EX provided a wizard based feature extraction toolset for object-

based image classification. The main interface used to provide the baseline land use/land



Figure 6 Main Workflow Processes



cover zones to aid or facilitate the manual digitizing process is ArcMap 10 as this

software has good vector tools, and the ability to integrate ENVI image analysis tools

into ArcMap Model Builder.

3.3.3 Preliminary Processes

The study areas were clipped from the original DOQQs using the ERDAS Imagine

subset tool. Each study area covers approximately 0.5 square miles to facilitate

digitizing and image processing. Much of the image processing, including the use of

convolution filters, texture analysis and classification methods, required trial and error to

find the best settings and analysis methods for the imagery. The best results were analyzed further using post processing, vector conversion and editing.

Heads up digitizing was performed on the Ogden and Salt Lake City imagery. This

provided the digitized baseline information as ground truth to be used later in the

classification accuracy assessment. Heads up digitizing was performed using ESRI's

ArcMap 10.0 software. A geodatabase was created for both the Ogden imagery and the

Salt Lake City imagery.

One person performed the visual interpretation of the imagery for the sake of

consistency. The interpreter has had eight years of work experience using photo

interpretation to create a variety of map types for the Defense Mapping Agency (now the

National Geospatial-Intelligence Agency). Digitizing times were recorded so that a comparison

could be made between manual digitizing and digital image processing to determine the

efficiency of digital image processing.

The determination of land use classes was an important consideration as it had a great

deal of impact on the final results of image classification, especially for panchromatic



imagery since so many land use/land cover types have similar digital number (DN)

values. Classification schemes in previous studies using black and white aerial imagery

have used relatively limited categories (Kadmon and Harari-Kremer 1999, Laliberte et al.

2004, Okeke and Karnieli 2006, and Pringle et al. 2009). This study includes three levels

of classification detail for the study areas. The approach looked at the classification of

the imagery in a bottom up manner going from a high level of detail in representing the

land cover types existing in the imagery to grouping these types into larger categories.

This strategy was used to determine how useful detailed digital analysis of the imagery

was compared to visual interpretation. The first level of classification of the Ogden imagery was based on eight land use/land cover classes including water, forest, grassland,

dark fields, medium fields, light fields, bare earth and impervious surface (Table 1). At

this level it was too difficult to represent the cropland as one class as there is too much

variation between fallow fields and fields that are growing or wet. There was also

confusion between the most representative digital number values between dark, medium,

and light fields as there are pattern variations in the respective fields.

Table 1 First level classification Ogden land use/land cover classes

Class Name           Description
Water                Lakes, reservoirs, rivers
Forest               Areas of trees with a canopy cover greater than 50%
Grassland            Areas dominated by grasses and herbaceous plants with little or no tree or shrub cover
Dark Fields          Agricultural cropland area characterized by dark gray tone (DN ~ 0-122)
Medium Fields        Agricultural cropland area characterized by medium gray tone (DN ~ 100-188)
Light Fields         Agricultural cropland area characterized by light gray tone (DN ~ 151-200)
Bare Earth           Areas of earth, sand, and rock with little to no vegetation
Impervious Surface   Buildings, roads, parking lots



The second level of classification took the eight classes and combined them into three

larger groups: cropland, vegetation, and other. Finally, the third level of classification

consisted of cropland and non-cropland. The results of these classifications and their

impact on classification accuracy were obtained by combining the results of the initial

classifications rather than running new supervised and unsupervised classifications to

reflect these combined groupings.
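
Combining first-level results into the broader groupings amounts to relabeling each classified pixel through a lookup table. The following is a minimal sketch of that idea. The class names come from Table 1, but the assignment of each class to cropland, vegetation, or other is an assumption made for illustration, as are the function name and the tiny sample raster.

```python
# Assumed grouping of the eight first-level classes (illustrative only).
LEVEL1_TO_LEVEL2 = {
    "Water": "Other",
    "Forest": "Vegetation",
    "Grassland": "Vegetation",
    "Dark Fields": "Cropland",
    "Medium Fields": "Cropland",
    "Light Fields": "Cropland",
    "Bare Earth": "Other",
    "Impervious Surface": "Other",
}

# Third level: cropland versus everything else.
LEVEL2_TO_LEVEL3 = {
    "Cropland": "Cropland",
    "Vegetation": "Non-cropland",
    "Other": "Non-cropland",
}


def aggregate(classified, lookup):
    """Relabel every pixel of a classified raster (a list of rows) via a lookup table."""
    return [[lookup[label] for label in row] for row in classified]


# Hypothetical 2 x 3 first-level classification result.
level1 = [
    ["Water", "Forest", "Dark Fields"],
    ["Bare Earth", "Light Fields", "Grassland"],
]
level2 = aggregate(level1, LEVEL1_TO_LEVEL2)
level3 = aggregate(level2, LEVEL2_TO_LEVEL3)
```

Because the relabeling is applied to an existing result, no new supervised or unsupervised classification run is needed for the coarser levels.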

The classification system used on the Salt Lake City image also used a bottom up

approach starting out with a more detailed classification and then moving to more general

groupings. The first level of classification consisted of five land types including commercial, transportation, trees, grass, and residential (Table 2). The second level

classification was reduced to built up areas, vegetation, and transportation. The third level

of classification consisted of built up areas and non-built up areas. The Salt Lake City

image has entirely different characteristics from the Ogden image, as the Salt Lake City

image is comprised of a mixed type urban area without any agriculture, bare earth, forest,

or large bodies of water. The added classification difficulty in the Salt Lake City image

was that the commercial and residential areas are made up of a mixture of manmade and

natural materials. These areas consisted of thousands of small buildings and may be

surrounded by either grass or concrete, all of which provide a very complex pattern of

shapes and surfaces which were tonally very similar. There were many tonal similarities

existing in the Ogden imagery as well but the land cover types such as dark fields, light

fields, water, etc. are fairly homogenous blocks unlike the patchwork of the urban areas.



Table 2 First level classification Salt Lake City land use/land cover classes

Class Name       Description
Commercial       Built up area consisting of industrial, commercial complexes
Transportation   Transportation network including major streets and highways
Residential      Mixed area that includes single family homes, apartments, trees, and grass
Grass            Areas dominated by grasses and herbaceous plants (yards, fields)
Trees            Woody vegetation < 20 ft tall

3.3.4 Unsupervised Classification

Unsupervised classification was performed on the original subset of the Ogden and

Salt Lake City images to provide the unsupervised classification baseline information for

comparison to digital classifications with image enhancements. This initial classification

was completed using ENVI 4.8 tools for ArcGIS and the ISODATA clustering algorithm.

This clustering algorithm essentially divides the image into naturally occurring groups of

pixels. Similar pixels are grouped together. Three classification sets were used to

process the imagery: 10, 25, and 100 spectral classes. After the imagery was classified,

these groups were interactively assigned an information class by visually comparing the

classified image and/or reference data. Since many of the spectral classes have similar

tonal values and statistics, it was necessary to assign some of these mixed classes to

either the most numerous type or the type with the most concentrated areas of pixels.

There was room for interpretation, and there is a certain amount of subjectivity involved

in assigning these classes. The interpreter needs to be familiar with the study area, and

when some classes are divided between seemingly equal areas, it was difficult to

determine which was the best class to assign the pixels to. In some cases a spectral class

was divided between 3 or 4 information classes. At this stage there was not a method to

split these classes into their respective groups using the ENVI or ArcGIS software. It is



possible to use masking and a technique called cluster busting, but this methodology was

not used in this research, as it requires a significant amount of extra processing.

The unsupervised classification process did provide some useful general information

about the imagery. It was very difficult to assign classes to the detail level land

classification system used for both the Ogden and the Salt Lake City images. After

aggregating classes and assigning them a land use/land cover type from the classification

scheme, there were about five classes that could be distinguished in the Ogden image and

three in the Salt Lake City image. A useful tool to visualize how the clusters in an image

are derived is a dendrogram. Dendrograms were created using the ArcGIS software for the same number of classes and iterations as the unsupervised classifications (Figures 7,

8, 9). A dendrogram is a graphic diagram in the form of a tree that is used to analyze

clusters in a signature file (ESRI 2011). The dendrograms are used to show the clustering

process from individual classes to one large cluster. The dendrogram tool takes an input

signature file created in ArcMap and builds the diagram by hierarchical

clustering. The classes are clusters of pixels and the graph illustrates the

distances between merged classes. The dendrogram helps to illustrate how the 10, 25,

and 100 classes are distributed using the ISODATA classifier. Many of the classes

overlap and are very close together numerically, which is why unsupervised classification

on panchromatic imagery often gives the user unsatisfactory results. The dendrograms

also illustrate the relatively small changes in class distances between having 10, 25, and

100 classes. Dendrograms of the Salt Lake City imagery were very similar except for

slight differences of distances between pairs of combined classes (Figure 10). The



ISODATA classifier only returned 67 classes instead of 100 for the Salt Lake City image

and 93 out of 100 for the Ogden image.

Figure 7 - Ogden dendrogram of ISODATA clustering 10 classes



Figure 10 Distances between classes from Salt Lake City dendrograms (panels: 10 classes, 25 classes, 100 classes)

A K-Means unsupervised classifier was also used to classify an Ogden texture image

incorporating the mean, variance and homogeneity bands. This classifier provided a

more satisfactory result on the texture images than the ISODATA classifier did. The K-

Means classifier in the ENVI software uses a set number of classes provided by the

analyst, and classes are determined after the classifier iterates through the image and the

optimal separability is reached based on the distance to mean (ENVI 2011). The

ISODATA classifier had difficulties with the texture image and returned a completely



gray image unless the classes were increased to well over 25. Considering how time

consuming it was to assign classes to the result, the K-Means classifier was used instead. Ten

classes and 25 classes were used on the texture image.
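
The general idea behind a K-Means classifier with an analyst-supplied number of classes can be sketched as follows. This is a generic one-band illustration and not the ENVI implementation: the function name, the evenly spaced starting means, and the nine sample DN values are all assumptions made for the example.

```python
def kmeans_gray(pixels, k, max_iter=50):
    """Cluster gray levels into k classes by distance to the class mean."""
    lo, hi = min(pixels), max(pixels)
    # Assumed initialisation: k means spaced evenly across the DN range.
    means = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign each pixel to the class with the nearest mean.
        labels = [min(range(k), key=lambda c: abs(p - means[c])) for p in pixels]
        # Recompute each class mean; an empty class keeps its previous mean.
        new_means = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            new_means.append(sum(members) / len(members) if members else means[c])
        if new_means == means:  # stop once the means no longer move
            break
        means = new_means
    return labels, means


# Hypothetical DN values forming dark, medium, and light groups.
pixels = [12, 15, 20, 118, 122, 130, 225, 230, 240]
labels, means = kmeans_gray(pixels, k=3)
```

The analyst still has to assign each resulting spectral class to an information class, which is the time consuming step noted above.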

3.3.5 Supervised Classification

Supervised classification was performed on the original image subsets to create the

supervised classification baseline information. Later on, another supervised classification

was performed on images which had been digitally processed or enhanced (filtering or

texture analysis). Results of the latter supervised classification were compared to the

supervised classification baseline information to determine if these digital image processing enhancements improved classification. Supervised classification was performed using

ENVI and ArcGIS 10 software.

Supervised classification, unlike unsupervised classification, involves the user creating

training samples from land use/land cover classes that are determined to be present in the

imagery. The training sets, called regions of interest (ROIs), were created using ENVI

software. These training data were used throughout the supervised classifications performed

on the original imagery, texture images, PCA images, and the filtered images. The final

training sets for both study areas were determined by trial and error. A training set was

developed which had about twice as many samples, but this set did not significantly

improve classification results for either image. These larger sets did however increase

processing time, so in the interest of efficiency smaller training sets were used throughout

(Figures 11 and 12). Training sets are inherently subjective and do require the analyst to

be able to distinguish land use/land cover types.



Figure 11 Training sample distribution for the Ogden image

Figure 12 Training sample distribution for Salt Lake City Image



Several supervised classifiers were used to evaluate the imagery using ENVI software.

The minimum distance classifier, the maximum likelihood classifier, neural net, and

SVM classifiers were examined. Each classifier provides distinct advantages and

disadvantages. The minimum distance to means classifier determines the mean of each

pre-defined class and then classifies pixels into the appropriate class by using the

Euclidean distance of the closest mean. One of the advantages to this algorithm is that it

classifies all pixels and processes very quickly. The maximum likelihood classifier

assumes that each class is normally distributed and is based on the highest probability

that a pixel will be assigned to a particular class. When classes have a multimodal distribution this classifier will not provide optimum results. An advantage of this method

is that the classifier considers the mean and covariance of the samples. The neural net

classifier provided by ENVI software uses back propagation to determine class

assignment of pixels. An advantage of the neural net classifier is that it does not make

assumptions about the distribution of the data. The Support Vector Machine (SVM)

classifier available in the ENVI software works with any number of bands and has good

accuracy when automatically separating pixels into classes. This classifier also

maximizes the boundary between classes, which may be useful for distinguishing land

use/land cover types with similar characteristics. Another advantage of this classifier is

that it works well on imagery that has a lot of noise (ENVI 2011, Jensen 2005).
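
Of the classifiers described above, the minimum distance to means classifier is the simplest to illustrate. The sketch below is a generic version and not ENVI's implementation. It assumes each training sample and each pixel is a short feature vector (for example, a DN value plus one texture band); the class names are taken from Table 1, while the sample values and function names are hypothetical.

```python
import math


def class_means(training):
    """Mean feature vector per class from {class name: [feature vectors]}."""
    means = {}
    for name, samples in training.items():
        n_bands = len(samples[0])
        means[name] = [
            sum(s[b] for s in samples) / len(samples) for b in range(n_bands)
        ]
    return means


def classify_min_distance(pixel, means):
    """Return the class whose mean is closest to the pixel in Euclidean distance."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))


# Hypothetical two-band training samples (DN, texture variance).
training = {
    "Water": [[20, 2], [30, 4]],
    "Light Fields": [[180, 10], [190, 14]],
}
means = class_means(training)
label = classify_min_distance([40, 5], means)
```

Every pixel receives a label because some class mean is always nearest, which is why this classifier leaves no unclassified pixels.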

3.3.6 Image Enhancement and Texture Analysis

Digital image processing techniques were explored to determine if classification

results could be improved. Texture analysis, convolution filtering, and contrast stretching

enhance some of the spatial characteristics of the imagery. For example, contrast



stretching brings out more differences between light and dark areas of the imagery, and

convolution filters can enhance edges. Low pass filters can smooth out areas of noise in

an image such as the variations found throughout the field areas in the Ogden imagery,

while high pass filters make the image appear more crisp or sharp (Jensen 2005).

Convolution filtering, contrast stretching and texture filtering were used in a variety of

combinations to enhance the study areas and try to improve classification.

A two standard deviation contrast stretch was applied to both study areas to enhance

the contrast and sharpness of the imagery. Both original images lacked definition in the

light and dark areas of the image (Figure 13). The Ogden study area had a DN range of 0-235 and the Salt Lake City study area had a DN range of 0-187. All subsequent

filtering and texture analysis was performed on the stretched images.
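
A two standard deviation contrast stretch can be sketched as below. This is a generic linear stretch under stated assumptions, not the exact routine of the software used: values are clipped at the mean plus or minus two standard deviations and rescaled to 0-255, and the function name and sample DN values are hypothetical.

```python
import statistics


def stretch_two_sd(pixels, n_sd=2.0, out_max=255):
    """Linearly rescale DN values, clipping at mean +/- n_sd standard deviations."""
    mean = statistics.fmean(pixels)
    sd = statistics.pstdev(pixels)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    if high == low:  # flat image: nothing to stretch
        return [0 for _ in pixels]
    stretched = []
    for p in pixels:
        clipped = min(max(p, low), high)
        stretched.append(round((clipped - low) / (high - low) * out_max))
    return stretched


# Hypothetical low-contrast DN values occupying a narrow range.
dull = [100, 110, 120, 130, 140]
bright = stretch_two_sd(dull)
```

The narrow input range is spread over most of the output range, which is what brings out the differences between light and dark areas.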

Figure 13 Unstretched images compared to contrast stretched images (panels: unstretched, stretched)



Convolution filtering was performed on the study areas using ENVI software. High

pass filtering was used to help sharpen the imagery using a variety of kernel sizes: 3x3,

5x5, 7x7, and 11x11. Low pass filtering was applied to the imagery to smooth out noise

in the field areas. Again 3x3, 5x5, 7x7, and 11x11 kernels were examined. As the kernel

gets larger with low pass filtering, the detail becomes more generalized or blurred as this

type of filtering preserves the low frequency parts of the image. A median filter was also

examined using the previously mentioned kernel sizes. This filter has a smoothing effect

on the image but the edges remain somewhat crisper than the low pass filter. ENVI also

provides several edge enhancing filters that were used to process the original study images. The filters used in this study were Laplacian, Roberts and Sobel. The Laplacian

filter has an editable window size whereas the Roberts and Sobel filters do not have

editable kernels or window sizes. Edge filtered images were created using the Laplacian

filter using window sizes of 3x3, 5x5, 7x7 and 11x11. The Laplacian filter was also used

in combination with the Gaussian low pass filter to try and reduce some of the noise that

results when creating the Laplacian filtered images.
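
The difference between low pass and median filtering at an edge can be shown with a small moving-window sketch. It is a generic illustration and not the ENVI filters: the low pass filter is assumed to be a simple mean over the kernel, border pixels are assumed to reuse the nearest image value, and the function names and the 5x5 sample image are hypothetical.

```python
import statistics


def filter_window(image, size, reducer):
    """Apply reducer to every size x size neighbourhood of a list-of-rows image."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Clamp indices so border pixels reuse the nearest image value.
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            out_row.append(reducer(window))
        out.append(out_row)
    return out


def low_pass(image, size=3):
    """Mean filter: smooths noise but also blurs edges."""
    return filter_window(image, size, lambda w: sum(w) / len(w))


def median_filter(image, size=3):
    """Median filter: removes isolated noise while keeping edges crisper."""
    return filter_window(image, size, statistics.median)


# Hypothetical image: dark field (10), light field (200), one noisy pixel (255).
image = [
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 255, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
]
smoothed = low_pass(image)
despeckled = median_filter(image)
```

In this sample the median filter removes the noisy pixel and leaves the field boundary in place, while the mean filter spreads both the noise and the boundary across neighbouring pixels.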

Texture images were created using ENVI software and are based on the GLCM which

includes the following texture characteristics: mean, variance, homogeneity, contrast,

dissimilarity, entropy, second moment and correlation. Another set of texture images

was created using the Occurrence measures, which consist of data range, mean, variance,

entropy, and skewness. Each set of texture images was created using a 3x3, 5x5, 7x7 and

11x11 processing window. The processing window measures the number of times each

gray level occurs in that particular part of the image (ENVI 2011). As the processing

window becomes larger, image detail is lost. The texture images created using the



GLCM are eight band images, and the texture occurrence images are five band images;

thus the dimensionality of the imagery is significantly increased by the use of texture.

These two texture images were also layer stacked with the original imagery to create

nine-band and six-band images. Additional nine-band and six-band images were also

created from these two texture images layer stacked with a filtered original image. The

resulting images were then classified using unsupervised and supervised classifiers. The

accuracy of these classifications was then compared to the classification baseline

information using an error matrix.
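
How a GLCM texture value is derived for one processing window can be sketched as follows. This is a generic illustration and not the ENVI implementation: it assumes the window has already been quantized to a small number of gray levels, uses a single horizontal pixel offset, and computes only homogeneity and contrast; all names and sample windows are hypothetical.

```python
def glcm(window, levels, dr=0, dc=1):
    """Normalised, symmetric gray level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def homogeneity(p):
    """High when co-occurring gray levels are similar."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


def contrast(p):
    """High when co-occurring gray levels differ strongly."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 3x3 windows quantized to four gray levels (0-3).
smooth_field = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
rough_urban = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]
p_smooth = glcm(smooth_field, levels=4)
p_rough = glcm(rough_urban, levels=4)
```

Each texture measure computed this way for every window position becomes one band of the texture image, which is how a one band photo grows into a multi-band image.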

Principal components analysis was used to reduce the number of bands on several composite images. In this way the dimensionality of the imagery is reduced but most of

the information in the imagery is maintained. PCA was performed on a multi-layer

image consisting of images created from variance, mean, and homogeneity texture

operators, plus the original unprocessed image. The result was a two-layer image which

incorporates information from the original image and the texture layers.

ENVI software also provides tools to perform mathematical morphology filtering

which is a non-linear process based on shape. Morphology filtering was performed on

both the original imagery and 5x5 occurrence texture images. Supervised and

unsupervised classification was then performed to determine the accuracy as compared to

the classification baseline information.

3.3.7 Object-based Image Analysis

Another digital image processing technique which was explored in this research was

object-based image analysis. Object-based image analysis is based on regions or groups

of pixels in an image rather than single pixels. Feature extraction was performed using



ENVI EX which provides object-based tools that utilize spatial, spectral, and textural

features. The object-based analysis provided by the ENVI software uses an edge-based

segmentation algorithm and requires only the scale level as an input parameter. The scale

levels range from 0-100 where a high scale level reduces the number of segments that are

defined, and a low scale level increases the number of segments that are defined. There

should be a balance in determining the scale level by trying to choose a scale that

delineates the image object boundaries as well as possible. This level is likely to be

different depending on the characteristics of the imagery being analyzed. ENVI provides

an interactive preview window to help determine an appropriate scale level for an image. The preview window allows you to see what kind of effect changing the scale level of the

segmentation has on the objects of interest in the image scene before the segmentation

runs. This helps to avoid creating numerous unsuccessful segmentation images. After

the initial segmentation has been performed, image segment merging can be done. ENVI

uses the Lambda-Schedule algorithm that iteratively merges segments by using a

combination of spectral and spatial information. This step is especially helpful when an

image has been over segmented as it enables the aggregation of small segments that may

occur from image object variation (ENVI 2011). After segmentation the next step is to

find objects and classify the imagery. Objects were chosen interactively from the

segmented image and the image was then classified. ENVI EX offers either a K-means

classifier or a SVM classifier. Classification and post processing was performed using

both available OBIA classifiers. The final step before classification in the ENVI EX

feature extraction workflow is the refine results window. In this window there are

options to export vectors and smooth the results similar to using a majority filter on a



classified image. The process for using the feature extraction tools in ENVI EX is

designed to make the process of OBIA user friendly.

ENVI 4.8 also offers an OBIA classification method called size-constrained region

merging (SCRM). This tool is an extension that can be added to ENVI. The tool

partitions an image into reasonably homogenous polygons based on a minimum size

threshold. The output of the tool is a vector file and an image file. The vector file can be

used directly as an initial source to assist visual interpretation, and the image can be

further classified using either unsupervised or supervised classification. One of the

limitations of this extension is that there is a size limitation of 2MB for the image (Castilla and Hay 2007). All of the layer stacked imagery exceeded the size limitation for

using this tool. SCRM was used on the original imagery, the one band dissimilarity,

mean, homogeneity, and variance texture images. The second moment, entropy, and

contrast bands were not used, as there appears to be a lot of correlation between them and

the bands that were selected. The correlation band does not have enough usable

information in it to segment it into objects. The output image was then classified using

the SVM classifier.

3.3.8. Post Processing and Automation

The classified images created from the previously mentioned digital processing

techniques and classifiers contained varying quantities of island pixels and salt and

pepper noise. There are numerous methodologies to reduce these types of areas in a

classified image. Majority and minority filtering, clump, sieve, and combine classes are

some of the commonly available tools provided in GIS and image analysis software.

These processes reduce the complexity of the classification and allow a more cohesive


result for further analysis. Post classification processing may also produce error in the

final imagery by smoothing and combining the wrong classes together. It is also not

practical to remove noise pixel by pixel, as there may be thousands of areas to examine.

The next step in this research was to produce a vector polygon layer that can assist in

visual interpretation of the imagery. In order to simplify the procedure of processing the

classified rasters and converting them to a vector layer that facilitates visual

interpretation, a model was developed using ArcGIS Model Builder (Figure 14). This

model allows the user to input a classified image, apply a smoothing kernel, aggregate

island pixels to a specified tolerance, convert the raster to a vector layer, and smooth and

simplify the resulting polygons. For consistency, a majority filter using a 3x3 window

and aggregation using a minimum threshold of 25 were used on all the classified images

examined. The model parameters for smoothing and simplifying polygons were left open

so that adjustments can be made for different images.
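The smoothing step of the model can be illustrated with a minimal sketch of a 3x3 majority filter. This is not the ArcGIS tool used in the model: the function name, the tie rule, and the sample grid are assumptions made only for illustration.

```python
from collections import Counter

def majority_filter(grid):
    """Apply a 3x3 majority filter to a 2-D list of class codes.

    Each cell takes the most common class in its 3x3 neighbourhood.
    Edge cells use only the neighbours that exist. On a tie the original
    value is kept if it is among the tied classes (an assumed rule).
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            counts = Counter(window)
            best = max(counts.values())
            winners = [cls for cls, n in counts.items() if n == best]
            out[r][c] = grid[r][c] if grid[r][c] in winners else min(winners)
    return out

# Hypothetical classified raster: class 2 is a single island pixel.
classified = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],
    [1, 1, 1, 3],
    [1, 1, 3, 3],
]
smoothed = majority_filter(classified)
```

In this sample the isolated class 2 pixel is absorbed by its class 1 surroundings, while the larger class 3 patch in the corner survives.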

Figure 14 Post Processing ArcGIS Model


One of the challenges of using vector files that have been converted from raster files is

that polygons have a stepped appearance that follows pixel boundaries. This

characteristic appearance is much different from a vector file created through heads up

digitizing. A human digitizer classifies an image into recognizable objects using shape,

context, texture, shadows, etc. to help determine the boundaries of objects. It would

be very difficult if not impossible for a human digitizer to create land use/land cover

boundaries at the pixel level. This is one of the main differences between automated

classification and classification performed by visual interpretation.

The polygon smoothing and aggregation steps used in the model help to reduce some

of the stepped appearance created by the raster to vector conversion process (Figure 15).

After polygons underwent smoothing and simplification, the result appeared much closer

to results obtained through visual interpretation. This process was also an advantage if

polygons needed to be reshaped, because each polygon has fewer vertices after

completing these operations.
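To show how simplification reduces the vertex count of a stair-stepped boundary, the following is a minimal sketch of Douglas-Peucker line simplification. It is an assumption that this algorithm resembles the simplification used in the model; the function names and the sample coordinates are hypothetical, and the sketch handles an open polyline rather than a closed polygon ring.

```python
import math

def _point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_line_distance(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    # If every vertex is within tolerance, keep only the end points.
    if farthest <= tolerance:
        return [points[0], points[-1]]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

# A stair-stepped edge traced along 1-unit pixels (hypothetical coordinates).
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (4, 3), (4, 4)]
simplified = simplify(staircase, tolerance=1.0)
```

With a tolerance of one pixel, the nine-vertex staircase collapses to its two end points, which is the kind of vertex reduction that makes later reshaping easier.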

Once the vector layer had been processed through the model, it was edited using a

custom toolbar in ArcGIS 10 software. The custom toolbar includes a combination of

Figure 15 Polygon raster to vector, smoothing, and smooth and simplify


out of the box tools (Selection Tool and Cut Polygon Tool) and several custom tools

created using C#.net and ArcObjects. The purpose of the custom toolbar is to provide

functions to remove small islands by merging them with neighboring polygons. It was

implemented as an Add-in which was easily added to the ArcGIS 10 user interface.

The toolbar includes four custom tools: select by area, merge with smallest neighbor,

merge with largest neighbor, and merge with selected polygon. These tools are very

similar to raster majority and minority filtering except that the user has more control over

them. The tools were then used to further refine the classification using visual

interpretation. The automated classifications in essence become the starting point for the

manual digitizing effort for the study areas.

3.3.9. Accuracy Assessment

One of the most serious limitations of historical imagery is the difficulty of ground-truthing. The

imagery is between 33 and 52 years old, and it is likely that many of the objects in the

imagery have changed or no longer exist today. Ground-truthing was limited to visual

interpretation and image accuracy. The baseline information derived from heads up

digitizing was used as ground truth to evaluate the accuracy of the classification of both

the original images and the images where digital image processing has been used (i.e.

filtering, texture, PCA and segmentation).

A confusion matrix (or error matrix) was used to compare the baseline classification

with the classifications produced after image processing enhancement, so that their

accuracy could be compared. To save time and labor,

only the classifications deemed best were evaluated. The confusion matrix can help to



accuracy, errors of commission, single class accuracy, and the Kappa coefficient. Refer to

Appendix 1 for a sample of the error matrices used in this research.
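The measures derived from an error matrix can be sketched as follows. This is a minimal illustration rather than the procedure used in this research: the row and column orientation, the function name, and the two-class counts are assumptions, and the sketch assumes every class has at least one classified and one reference pixel.

```python
def accuracy_metrics(matrix):
    """Summarise a square confusion matrix.

    Rows are assumed to hold the classified (map) labels and columns the
    reference labels; this orientation is an assumption of the sketch.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = diagonal / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    # User's accuracy = 1 - commission error; producer's = 1 - omission error.
    users = [matrix[i][i] / row_totals[i] for i in range(n)]
    producers = [matrix[j][j] / col_totals[j] for j in range(n)]
    return {"overall": overall, "kappa": kappa, "users": users, "producers": producers}

# Hypothetical two-class counts, not taken from the thesis.
example = [
    [40, 10],
    [20, 30],
]
metrics = accuracy_metrics(example)
```

For these hypothetical counts the overall accuracy is 0.70 and the Kappa coefficient is 0.40, which shows how Kappa discounts the agreement that would be expected by chance.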


CHAPTER 4: ANALYSIS RESULTS AND DISCUSSION

4.1. Manual Digitizing

Heads up digitizing of land cover classes on any type of imagery, whether multispectral

or panchromatic, allows the user more control over the results of the

classification. The results of this method of classification in general do not require

further editing or post processing. On the other hand, the subjectivity of the digitizer

affects the results of the classification. It is unlikely that a digitizer would be able to

classify an image exactly the same every time.

Digitizing took place in two sessions, with the Ogden image taking approximately

five hours to complete and the Salt Lake City image approximately three hours. The

Salt Lake City image has 331 polygons compared to the Ogden image

which has 172 polygons. The Ogden image took much longer to digitize even though

it has approximately half the number of polygons. The polygons and land cover

configurations were more complicated when considering the integration of grassland,

forest and water areas on the image. The features on the Salt Lake City image are laid

out in a grid pattern separated by wide streets, so even though there were almost twice as

many polygons to digitize, the process went more quickly. An important aspect of this

research was to show that digital image processing of historical panchromatic imagery

could enhance and facilitate visual interpretation of the imagery on a variety of terrains

and features.

The visual interpretation of the imagery required a zoom level of between 1:1,500 and

1:3,000 on the Salt Lake City image and 1:1,000 and 1:4,000 on the Ogden image. These

zoom levels were determined by the digitizer by how well they could see the details in


the imagery while still being able to have some reference to the context of objects being

examined. In the experience of the digitizer a more consistent result is also achieved if

there is not a large variance in the viewing scale of the objects in the scene. If an area is

digitized at 1:3,000 and another area at 1:24,000 then the details being observed will not

be consistent throughout the study area. Digital image processing on the other hand

classifies by pixel without involving scale issues. This is a major difference in the

methodology of classification. Digitizing at varying scales is both an advantage and

disadvantage compared to digital classification. When the view is zoomed in to the pixel

level, it is impossible to discern what the objects in the imagery are. A large variance

in scale can lead to inconsistency, but a small variance in digitizing scale can help the

digitizer to consider a feature's relationship to surrounding objects when determining

what the object is, unlike most per pixel digital classifications. By using a small variance

in digitizing scale for land use/land cover classification of panchromatic imagery, both

detail and consistency can be maintained while the expert knowledge of relationships and

contexts of features can be utilized.

This project used relatively small areas of interest. After examining the land use/land

cover classes from the beginning of the project to its conclusion, there were areas of the

initial digitizing which on further analysis could have been refined or changed, especially

in diverse areas containing many intricate changes in the landscape. There was a

tendency to generalize areas where the land use/land cover types are fragmented. This

tendency is most notable in the Ogden image in the southern half of the image where the

forested areas are broken up by water and grasslands. The initial digitizing was not

changed to reflect new perceptions of the land class areas on the imagery. Some of these


inconsistencies have an effect on the final accuracy of the digital classifications, as it was

apparent that at some points the digital classification was more correct than the visual

interpretation. This is a limitation of the research.

One of the major differences found in this research between the manual digitizing

classification and the digital image processing classification was the level of detail

achieved in the classifications. In the Ogden image the total number of polygons

digitized was 172 (Figure 16) and the total number of polygons digitized for the Salt

Lake City image was 331 (Figure 17). The digital classifications in comparison before

post processing yielded several thousand polygons. After post processing, most digital

image classifications still exceeded the polygon count of the digitized baseline, but results

averaged about 500-1000 polygons. It was a difficult task to digitize very detailed areas

on the imagery. This study has shown that by utilizing digital image processing

techniques to help facilitate visual interpretation of land use/land cover classes, the

analyst can take advantage of the detail and repeatability that digital processes provide

while improving the classification accuracy using a GIS in post processing the results.

Results using visual interpretation and heads up digitizing may provide more initial

accuracy, but digital image processing lends some added consistency to the process.

4.2. Unsupervised Classification

Supervised and unsupervised classification results varied depending on the image, the

classification method, pre-processing, and post-processing. Panchromatic imagery

presents many challenges as previously mentioned in this study. The heterogeneity of

the study area also has an effect on how successful classification is. This study has


Figure 16 Classification using visual interpretation of the Ogden image

Figure 17 Classification using visual interpretation of the Salt Lake City image



Although unsupervised classification showed low accuracy in both study areas, the

results showed some important trends in the data. In the Ogden image it was very

difficult to extract more than five classes, which was an indication that land cover types

such as water, medium fields and grassland are very similar in tone. Panchromatic imagery

would require more pre- and post-processing to achieve a more accurate classification

using eight land cover types. As the classes were aggregated into larger parent classes, the

classification accuracy increased accordingly. Unsupervised classification, even on a

small study area such as this, was more time consuming than supervised classification and

provided somewhat unsatisfactory results.

The Salt Lake City image proved difficult in a different way in that the mixed urban

area consisted of commercial, residential and transportation areas which appear very

distinct using visual interpretation but present difficulties for digital classifiers. Urban

areas are uniquely difficult to classify on multispectral imagery, as there is such a mixture

of impervious surfaces. Black and white high spatial resolution imagery complicates this

Figure 18 Ogden image ISODATA classifications (panels: 10, 25, and 100 spectral classes)


situation, as there was an extreme overlap between classes, because features such as

buildings and mixed surfaces like parking lots and vegetation exist in both residential and

commercial areas making it difficult to distinguish these areas. None of the ISODATA

classifications of the Salt Lake City imagery were able to distinguish between all five

detail level land cover types. Trees, transportation, and commercial land cover types

were the only three land cover types that could be classified from the 10, 25, and 100

spectral class ISODATA classifications (Figure 19). Many areas of overlap exist

between the commercial and transportation classes in all three unsupervised

classifications. The transportation network in this image is a very distinct linear feature

when classifying the imagery through visual interpretation, but there are many tonal

variations in the pavement, which cause a great deal of confusion for most traditional

unsupervised classifiers. Grass and residential land cover types could not be

distinguished from commercial, transportation and trees, as there was considerable tonal

overlap between these areas.
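The effect of tonal overlap on an unsupervised classifier can be sketched with a plain one-dimensional k-means on gray levels. This is only an illustration under stated assumptions: ISODATA additionally splits and merges clusters, the function name and initialisation are choices made for this sketch, and the gray levels and their land cover labels are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster gray levels into k groups with plain k-means (Lloyd's algorithm)."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly across the observed range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centre for every value.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    return labels, centres

# Hypothetical gray levels: three dark pavement pixels, two light pavement
# pixels, then two tree pixels.
gray_levels = [60, 62, 65, 180, 185, 58, 61]
labels, centres = kmeans_1d(gray_levels, k=2)
```

Because the clusters are formed from tone alone, the dark pavement and tree pixels fall into one cluster while the light pavement pixels form another, so a single land cover type is split and two different types are merged.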

K-Means unsupervised classifications with 10 and 25 spectral classes were performed on the

Ogden imagery using a layer stacked image consisting of the original image and the

following texture characteristics: mean, variance, and homogeneity. Surprisingly, the use

of texture did not improve the unsupervised classification using the level 1 land use/land

cover types. Overall accuracy was 25% for 10 classes and 34% for 25 classes. This is

most likely because there was little to no textural distinction between the field areas, as

most of them exhibit a smooth surface. The field areas and the water areas were also

confused with each other. Aggregating the classification into the more generalized classes

increased accuracy significantly in the unsupervised classification. This was particularly


Figure 19 Salt Lake City image ISODATA classifications (panels: 10, 25, and 100 spectral classes)

apparent in the texture image. Accuracy increased to 54% for the level 2 classification (3

land use/land cover types classification scheme) and increased to 71% for the level 3

classification (2 land use/land cover types classification scheme). The Halounova image,

which consisted of texture and filtered layers, did not improve the Ogden

image level 1 classification using unsupervised classification, but did slightly

improve the Salt Lake City level 1 overall accuracy. Due to the poor accuracy results

using texture and unsupervised classification, no further unsupervised analysis was performed in either

study area.
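The rise in accuracy when detailed classes are aggregated into parent classes can be sketched by collapsing a confusion matrix. The function names, the class grouping, and the counts below are hypothetical and are not taken from the thesis results.

```python
def aggregate(matrix, groups):
    """Collapse a confusion matrix by summing the rows and columns of merged classes.

    `groups` maps each detailed class index to a parent class index.
    """
    size = max(groups) + 1
    merged = [[0] * size for _ in range(size)]
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            merged[groups[i]][groups[j]] += count
    return merged

def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

# Hypothetical counts for three detailed classes: dark field, light field, forest.
detailed = [
    [30, 15, 5],
    [10, 25, 5],
    [5, 5, 50],
]
# Merge the two field classes (0 and 1) into one parent class; forest stays separate.
parent = aggregate(detailed, groups=[0, 0, 1])
```

Confusion between the two field classes moves onto the diagonal once they share a parent class, so overall accuracy rises from 0.70 to about 0.87 in this hypothetical case. Aggregation can never lower overall accuracy, because merging only moves off-diagonal counts onto the diagonal.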

4.3. Supervised Classification

Supervised classification of panchromatic imagery again presents many challenges.

The SVM classifier was used to perform the supervised classification as it has the ability

to process single band imagery and it provided better results. The supervised classifiers

available in ENVI are limited when using single band data as many options such as

maximum likelihood, spectral angle divergence, and neural net all require more than one



Table 4 Training sample statistics from original Ogden image

Land Cover Type       Min   Max   Mean    StDev   Points
Water                 129   177   151.6   13.1    1611
Forest                  0   162    70.3   33.7    3246
Grassland              81   197   129.9   13.8    1944
Dark Field             53   102    78.2   12      2706
Medium Field           94   150   124.1   13.5    5056
Light Field           148   194   179      6.5    1918
Impervious Surface     74   223   170.7   28.3     732
Bare Earth            180   227   197.8    7.5    1393
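The degree of overlap in Table 4 can be checked with a short sketch that tests which pairs of classes share part of their gray level range. The minimum and maximum values are copied from Table 4; the function name and the use of a simple range test are assumptions of this sketch, not a method used in the research.

```python
# Min and max gray levels per class, copied from Table 4.
ranges = {
    "Water": (129, 177),
    "Forest": (0, 162),
    "Grassland": (81, 197),
    "Dark Field": (53, 102),
    "Medium Field": (94, 150),
    "Light Field": (148, 194),
    "Impervious Surface": (74, 223),
    "Bare Earth": (180, 227),
}

def overlapping_pairs(ranges):
    """Return every pair of classes whose [min, max] intervals intersect."""
    names = list(ranges)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (a_lo, a_hi), (b_lo, b_hi) = ranges[a], ranges[b]
            if a_lo <= b_hi and b_lo <= a_hi:
                pairs.append((a, b))
    return pairs

pairs = overlapping_pairs(ranges)
```

With the Table 4 values, 22 of the 28 possible class pairs overlap. A range test is sensitive to outliers in the training samples, so it overstates the confusion somewhat, but it is consistent with the difficulty of separating these classes on gray level alone.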

to illustrate the overlap which occurs between commercial, residential and transportation

classes throughout this image (Table 5). The histograms were either bimodal or

multimodal. Although the histogram for transportation approached a normal distribution,

there were still many peaks and valleys indicating variations in gray levels in the image

for this land cover type.

Supervised classification results showed that it was very difficult to extract more than

8 classes on the Ogden image and 5 classes on the Salt Lake City image. One of the

limitations of using panchromatic imagery for land use/land cover classification is that

the DN values which make up the signatures of many land use/land cover types overlap

considerably. Real world features may be difficult to identify without

taking into account their spatial context (Hung and Wu 2005). Land use/land cover types

may need to be generalized. For example, detailed classes such as corn or wheat fields may not be

distinguishable using panchromatic imagery, but dark fields, light fields, or cropland

may be. The increased accuracy achieved when aggregating land use/land cover


types into the level 2 and level 3 classification schemes supports this conclusion. Training

samples tested with a greater number of pixels increased the confusion between classes
