8/12/2019 Joan Bie Dig Er Thesis
1/119
THE USE OF DIGITAL IMAGE PROCESSING TO FACILITATE DIGITIZING LAND COVER ZONES FROM GRAY LEVEL AERIAL PHOTOS

A THESIS PRESENTED TO THE DEPARTMENT OF GEOLOGY AND GEOGRAPHY

IN CANDIDACY FOR THE DEGREE OF MASTER OF SCIENCE

By JOAN M. BIEDIGER

NORTHWEST MISSOURI STATE UNIVERSITY
MARYVILLE, MISSOURI

April 2012
DIGITAL IMAGE PROCESSING
The Use of Digital Image Processing to Facilitate Digitizing
Land Cover Zones from Gray Level Aerial Photos
Joan Biediger
Northwest Missouri State University
THESIS APPROVED
____________________________
Thesis Advisor, Dr. Ming-Chih Hung    Date

____________________________
Dr. Yi-Hwa Wu    Date

____________________________
Dr. Patricia Drews    Date

____________________________
Dean of Graduate School, Dr. Gregory Haddock    Date
The Use of Digital Image Processing to Facilitate Digitizing
Land Cover Zones from Gray Level Aerial Photos
Abstract
Aerial imagery from the 1930s to the early 1990s was predominantly acquired using
black and white film. Its use in remote sensing applications and GIS analysis is
constrained by its limited spectral information and high spatial resolution. As a historical
record and as a means of studying long-term land use/land cover change, this imagery is a
valuable but often underutilized resource. Traditional classification of gray level aerial
photos has relied primarily on visual interpretation and digitizing to obtain land cover
classifications that can be used in a GIS. This is a time-consuming and labor-intensive
process that often limits the scale of analysis.
This research focused on the use of digital image processing to facilitate visual
interpretation and heads-up digitizing of gray level imagery. Existing remote sensing
software packages have limited functionality for classifying black and white aerial
photos, and traditional image classification alone provides limited results when
determining land cover types from gray level imagery. This research examined
approaching classification as a system that uses digital image processing techniques
such as filtering, texture analysis, and principal components analysis to improve
supervised and unsupervised classification algorithms and provide a base for digitizing
land cover types in a GIS. Post processing operations included smoothing the
classification result and converting it to a vector layer that can be further refined in a GIS.
Software tools were developed using ArcObjects to aid the process of refining the vector
classification. These tools improve the usability and accuracy of the digital image
processing results, helping to facilitate the visual interpretation and digitizing process and
yield a usable land use/land cover classification from gray level imagery.
TABLE OF CONTENTS
ABSTRACT iii
LIST OF FIGURES vii
LIST OF TABLES viii
ACKNOWLEDGMENTS ix
CHAPTER 1: INTRODUCTION 1
1.1 Research Objective 4
CHAPTER 2: LITERATURE REVIEW 5
2.1 Historical Aerial Imagery Uses and Importance 5
2.2 Classification Problems of High Resolution Panchromatic Imagery 6
2.3 Statistical Texture Indicators 9
2.4 Image Enhancements and Filtering 13
2.5 Image Segmentation and Object-based Image Analysis 15
CHAPTER 3: CONCEPTUAL FRAMEWORK AND METHODOLOGY 17
3.1 Description of Study Area 17
3.2 Description of Data 17
3.3 Methodology 21
3.3.1 Conceptual Overview 21
3.3.2 Software Utilized 22
3.3.3 Preliminary Image Processes 24
3.3.4 Unsupervised Classification 27
3.3.5 Supervised Classification 33
3.3.6 Image Enhancement and Texture Analysis 35
3.3.7 Object-based Image Analysis 38
3.3.8 Post Processing and Automation 40
3.3.9 Accuracy Assessment 43
CHAPTER 4: ANALYSIS RESULTS AND DISCUSSION 46
4.1 Manual Digitizing 46
4.2 Unsupervised Classification 48
4.3 Supervised Classification 53
4.4 Image Enhancements and Texture Analysis 57
4.5 Object-based Image Analysis 63
4.6 Post Processing and Automation 65
4.7 Classification Accuracy and Results 70
CHAPTER 5: CONCLUSION 81
5.1 Limitations of the Research 81
5.2 Potential Future Developments 81
APPENDIX 1: ERROR MATRIX TABLES 84
APPENDIX 2: VECTOR EDITING TOOLBAR .NET CODE 101
REFERENCES 106
LIST OF FIGURES
Figure 1 - Aerial photo of Ogden study area 18
Figure 2 - Overview of study areas in relationship to the state of Utah 18
Figure 3 - Aerial photo of Salt Lake City study area 19
Figure 4 - Ogden DOQQ study area 20
Figure 5 - Salt Lake City MDOQ study area 21
Figure 6 - Main workflow processes 23
Figure 7 - Ogden dendrogram of ISODATA clustering, 10 classes 29
Figure 8 - Ogden dendrogram of ISODATA clustering, 25 classes 30
Figure 9 - Ogden dendrogram of ISODATA clustering, 100 classes 31
Figure 10 - Distances between classes from Salt Lake City dendrogram 32
Figure 11 - Training sample distribution for the Ogden image 34
Figure 12 - Training sample distribution for the Salt Lake City image 34
Figure 13 - Unstretched images compared to contrast stretched images 36
Figure 14 - Post processing ArcGIS model 41
Figure 15 - Polygon raster to vector, smoothing, and smooth simplify 42
Figure 16 - Classification using visual interpretation of the Ogden image 49
Figure 17 - Classification using visual interpretation of the Salt Lake City image 49
Figure 18 - Ogden image ISODATA classifications 51
Figure 19 - Salt Lake City image ISODATA classifications 53
Figure 20 - Minimum distance and support vector machine classification of the Salt Lake City image 57
Figure 21 - Minimum distance classification of the Ogden image with high pass filter 58
Figure 22 - Minimum distance classification of the Ogden image with low pass filter 59
Figure 23 - ISODATA 10 spectral classes Halounova image 61
Figure 24 - SCRM object-based segmentation images 64
Figure 25 - Ogden object-based classification image and post processing system vectors 68
Figure 26 - Salt Lake City object-based classification image and post processing system vectors 69
Figure 27 - Ogden pixel based classification and post processing system vectors 69
LIST OF TABLES
Table 1 - First level classification Ogden land use/land cover classes 25
Table 2 - First level classification Salt Lake City land use/land cover classes 27
Table 3 - ISODATA overall accuracy results for Ogden and Salt Lake City study areas 50
Table 4 - Training sample statistics from original Ogden image 55
Table 5 - Training sample statistics from original Salt Lake image 56
Table 6 - Ogden image overall accuracy and level 1 completion time 72
Table 7 - Salt Lake City image overall accuracy and level 1 completion time 74
Table 8 - User's accuracies for individual land use/land cover types, Ogden study area 76
Table 9 - User's accuracies for individual land use/land cover types, Salt Lake City study area 77
Table 10 - Overall accuracy ranges for classification groups 78
ACKNOWLEDGMENTS
I would like to thank Dr. Ming-Chih Hung for chairing my thesis committee and for
all the support, encouragement and guidance he has given me along the way. I would
also like to thank Dr. Yi-Hwa Wu and Dr. Patricia Drews for serving on my thesis
committee and for their contributions in developing this thesis. Last but certainly not
least I would like to thank my husband Barry for encouraging me through many long
nights and weekends while I completed this work. Without your support and love I
would never have been able to finish this thesis.
CHAPTER 1: INTRODUCTION
Aerial imagery from the 1930s to the present is a primary data source used to study
many natural processes and land use patterns (Carmel and Kadmon 1998, Kadmon and
Harari-Kremer 1999). Early aerial imagery from the 1930s to early 1990s is
predominantly black and white (panchromatic) film photography meaning there is only
one band of data. This type of imagery contains limited spectral information, unlike
today's digital satellite sensors, which offer more spectral information even in the
panchromatic band.
The Aerial Photography Field Office (APFO) is a division of the Farm Service
Agency (FSA), of the United States Department of Agriculture (USDA). The APFO,
located in Salt Lake City, Utah, has one of the nation's largest collections of historical
aerial imagery dating back to the 1950s. Film from the 1930s through 1940s was sent to
the national archives. APFO has over 50,000 rolls of film of which over 60% is black
and white (Mathews 2005). This historical aerial imagery is a valuable, largely untapped
resource. The film format of the imagery makes it unavailable to GIS and imagery
analysis programs unless it is scanned and processed to digital format. There is
widespread interest from the public and other government agencies in making this
imagery available and usable in digital format.
Recently, more historical imagery from the 1950s to 1990s is being scanned to digital
format for use in change detection projects for the Farm Service Agency (FSA).
According to Brian Vanderbilt (personal communication, 01 Sep 2009), FSA is interested
in studying agricultural loss patterns over long periods so that processes of change can be
more fully understood. One of the challenges with these types of projects is that land
use/land cover classification with the imagery usually involves visual interpretation and
manual digitizing, due to the difficulty of using digital image processing techniques with
the historical panchromatic imagery. Manual digitizing is a time consuming process for
multiple years of imagery, as each photo requires its own analysis. There are not enough
image analysts within FSA to manage the increasing workload for projects requiring the
use of historical imagery. Another concern is that study areas are limited in scale because
of the time and resources needed to digitize land cover types on the imagery. There is
interest and need to explore digital options for land cover classification so that the use of
these historical imagery datasets can be expanded.

The ability to facilitate digitizing of land cover types on historical aerial imagery
would make it a more usable resource for studying long-term land use/land cover changes.
Classification of this type of imagery is very labor intensive, which often limits the size of
study areas. If the imagery could be utilized on a broader scale, we could gain a greater
historical perspective on changes such as agricultural loss over time. The increased
accuracy and repeatability of results obtained through digital image processing could also
make long-term change detection projects more reliable than projects that must rely on
the varying interpretation skills of several image analysts.
Historical aerial imagery offers a unique opportunity to study long-term patterns of
land use/land cover change by offering the analyst a more extensive historical perspective
on geographic processes such as land use/land cover change, urban expansion and
vegetation patterns (Kadmon and Harari-Kremer 1999, Awwad 2003, Alhaddad et al.,
2009). Producing a thematic map through image classification is one of the most
common imagery analysis tasks in remote sensing. Image classification techniques such
as unsupervised and supervised classification, NDVI, spectral signatures, and spectral
band combinations have limited usability with panchromatic aerial imagery as they rely
heavily on spectral information, which is limited with this type of imagery.
Visual interpretation of imagery does not rely on spectral information alone to classify
imagery. Visual interpretation makes use of scene qualities such as texture, shape,
arrangement of objects, and context of elements in an image. The human visual system is
very efficient at pattern recognition and in many ways is superior to existing machine
processing methods; on the other hand, its inherent subjectivity and the eye's inability
to extract complex patterns can limit interpretation. Digital image processing
techniques that incorporate the use of texture, tone, shape, pattern recognition and object-
based image analysis can be used to enhance traditional methods of supervised and
unsupervised classification especially with gray level aerial imagery (Caridade et al.,
2008).
A great deal of research has been done on the most effective ways of classifying
multispectral imagery and mapping the results (Jensen 2005). There is relatively little
research on how digital image processing of historical panchromatic imagery can
improve or reduce manual interpretation for image analysis and GIS analysis. In this
thesis research, digital image processing techniques including texture analysis,
convolution filters, and object-based image analysis were considered in respect to how
they can improve the classification of panchromatic aerial imagery and how this
improvement can facilitate digitizing and in some cases possibly eliminate it. A post
processing system involving image smoothing, raster to vector conversion, polygon
smoothing and simplification, and custom polygon editing tools for use in ESRI's
ArcMap GIS software was used to improve an initial digital image classification. The
post processing system can be used to improve most digital image classifications. The
quality of the baseline land use/land cover classification was the main factor in how
efficient it was to create a usable thematic layer.
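As a rough sketch of the post-classification smoothing idea described above, a majority filter replaces isolated misclassified pixels with the dominant class of their neighborhood. This is an illustration only, not the thesis's actual ArcGIS workflow; the function name and class codes are made up for the example.

```python
from collections import Counter

def majority_filter(classified, size=3):
    """Smooth a classified raster by replacing each cell with the most
    common class code in its size x size neighborhood; edge cells use
    the partial neighborhood that fits inside the raster."""
    rows, cols = len(classified), len(classified[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighborhood = [
                classified[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = Counter(neighborhood).most_common(1)[0][0]
    return out

# A lone "water" pixel (code 2) inside a "field" patch (code 1) is reassigned:
raster = [
    [1, 1, 1, 1],
    [1, 2, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 3],
]
smoothed = majority_filter(raster)
```

A smoothed raster of this kind is what would then be converted to a vector layer for further refinement in a GIS.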
Continued study in this area could yield new approaches to land cover classification of
gray level imagery. If historical imagery has the ability to be used effectively in a digital
environment, then more of it may be scanned and become more readily available, which
would benefit the geospatial community.

1.1 Research Objective
The objective of this project is to establish a working model that utilizes digital image
processing to facilitate or assist the user with digitizing land cover zones from gray level
aerial photos. This study approaches the problem of digitizing land cover zones by first
classifying the aerial photo and then by establishing a post processing system employing
vector layers for use in a GIS.
There is limited research on using digital image processing to enhance the
classification and digitizing of gray level aerial photos. Digital image
processing may not be able to completely replace visual interpretation of this type of
imagery, but it may be able to make the process more efficient.
CHAPTER 2: LITERATURE REVIEW
2.1 Historical Aerial Imagery Uses and Importance
Historical imagery, as used in this study, refers to imagery acquired by an aerial
camera mounted in an airplane. The photography was imaged directly onto film and
is also referred to as analog photography, as opposed to modern digital imagery. This
historical imagery is black and white and may be referred to as either panchromatic or
gray level.
Black and white, gray level, and panchromatic are terms that refer to imagery
composed of shades of gray. The imagery used in this study has a pixel depth of 8 bits,
where the binary representation assumes that 0 is black and 255 is white. Raw pixel
values between 0 and 255 are grayscale, with each digital number corresponding to a
different level of gray; for example, a digital number of 127 corresponds to a medium
gray in the photo. This panchromatic imagery has a single band where digital numbers represent the
spectral reflectance from the visible light range. Historical panchromatic imagery
contains brightness values but has limited spectral information available in the visible
wavelengths (0.4-0.7 µm), unlike the panchromatic band of a satellite image such as
Landsat 7, which generally is sensitive into the near infrared wavelengths (0.52-0.90 µm)
(Hoffer 1984).
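The 8-bit convention just described can be made concrete with a minimal sketch; the function name is ours, not from any cited source.

```python
def dn_to_gray_fraction(dn, bits=8):
    """Map a digital number (DN) to a gray fraction where 0.0 is black
    and 1.0 is white, following the 8-bit convention (0 = black,
    255 = white) used for the panchromatic imagery in this study."""
    levels = 2 ** bits - 1          # 255 for 8-bit data
    if not 0 <= dn <= levels:
        raise ValueError("DN out of range for %d-bit imagery" % bits)
    return dn / levels

# A DN of 127 sits almost exactly at medium gray (about 0.498).
medium_gray = dn_to_gray_fraction(127)
```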
Historical aerial photographs are a valuable and important data source for studying
long-term (20-80 year) change processes such as land use/land cover change and
vegetation and environmental dynamics. These historic photos present a snapshot in time
that may offer insight into the current state of land use/land cover change processes and
what patterns may have affected their growth and stability. Much of the imagery
available for long-term analysis is black and white aerial photography (Carmel and
Kadmon 1998, Hudak and Wessman 1998, Caridade et al. 2008). The historical record
that has been captured from aerial photography provides a long temporal history to work
with and provides an extensive frame of reference in which to assess the magnitude of
land use/land cover change. Advances in GIS, photogrammetry, image analysis, and
digital image processing have increased the potential to use historical aerial photography
for many types of change analysis including land use/land cover change (Okeke and
Karnieli 2006).
Land cover maps produced from gray level historical aerial photos are generally created through techniques such as visual interpretation and manual digitizing (Carmel
and Kadmon 1998, Kadmon and Harari-Kremer 1999). This is a very time consuming
and labor intensive process. This fact has a tendency to limit analysis to small areas. The
digitizing itself is generally dependent on the ability of the interpreter and may lead to
results that are not objective due to skill level and human bias (Kadmon and Harari-
Kremer 1999). The assumption is often made that manual interpretation is 100%
accurate but assessing the accuracy of this method is difficult according to Congalton and
Green (1993) and Carmel and Kadmon (1998).
2.2 Classification Problems of High Resolution Panchromatic Imagery
The historical aerial imagery analyzed in this project is limited in spectral information
and has high spatial detail. These two variables can present some difficulties with the use
of common digital classification and image processing techniques. The first challenge is
the spectral resolution, which is only one band. This band lacks detailed spectral
information. Most panchromatic aerial films are sensitive to the visible spectrum but also
require filtering to take into account haze and atmospheric conditions. The film is
generally filter exposed to green and red visible wavelengths and not the blue
wavelengths to cut down on atmospheric haze. The resulting image records in black and
white the tonal variations of the landscape in the scene (U.S. Army Corp of Engineers
1995). Common classification methods are limited in accuracy and usability when there
is only one band to work with (Short and Short 1987, Anderson and Cobb 2004, Caridade
et al. 2008).
Kadmon and Harari-Kremer (1999) and Carmel and Kadmon (1998) have approached
the limitations of having only one band of information to analyze in several ways.
Carmel and Kadmon (1998) used a combination of illumination
adjustment and a modified maximum likelihood classifier that used neighborhood
statistics to achieve classification accuracies of over 80% for study of long-term
vegetation patterns using gray level aerial imagery. This research showed that the
relationship between neighborhood pixels was an important factor in achieving improved
classification accuracy. Kadmon and Harari-Kremer (1999) concentrated on training data
and ancillary data to produce vegetation maps from black and white aerial photos from
1962 and 1992. The accuracy of using a maximum likelihood classifier was about 80%.
Their study stresses the importance of carefully considered training data and the utility of
digital image processing of historical aerial photography in vegetation change detection
studies. Mast et al. (1997) researched long-term change detection of forest ecotones
using gray level aerial imagery from 1937-1990. Density slicing was used after
determining the range of brightness values for tree cover across all imagery to get a
classification of tree cover versus non-tree cover. Results were satisfactory, although no accuracy
assessment was mentioned, but again the significance of object brightness values for gray
level imagery was established.
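The density slicing approach can be sketched as follows. The brightness breakpoints below are hypothetical, since the text does not reproduce Mast et al.'s actual thresholds, and the function is an illustration rather than any published implementation.

```python
def density_slice(image, breakpoints):
    """Classify a gray level image by slicing its brightness range.
    `breakpoints` maps class labels to inclusive (low, high) DN ranges;
    pixels falling outside every range are left unclassified."""
    out = []
    for row in image:
        out_row = []
        for dn in row:
            label = "unclassified"
            for name, (low, high) in breakpoints.items():
                if low <= dn <= high:
                    label = name
                    break
            out_row.append(label)
        out.append(out_row)
    return out

# Hypothetical slices: darker DNs as tree cover, brighter DNs as non-tree.
slices = {"tree": (0, 80), "non-tree": (81, 255)}
image = [[40, 120], [70, 200]]
result = density_slice(image, slices)
```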
The second challenge when analyzing this imagery is that higher spatial resolution
does not generalize features to the degree coarse or medium scale imagery does, which
allows much more detail to be considered in an image. Individual trees, buildings and
sidewalks become visible when image detail is more perceptible in these 1-meter
resolution images. This factor makes visual interpretation easier but can cause problems
with automated classification, especially when spectral information is limited or non-
existent. High spatial resolution can increase within-class variances, which can cause uncertainty between classes. Browning et al. (2009), in their study of historical aerial
imagery as a data source emphasized the importance of object scale when analyzing
imagery. Some objects may be larger than a pixel, referred to as H-resolution, and some
objects may be smaller than a pixel, which is referred to as L-resolution. A scene
containing objects at multiple scales makes it more difficult to obtain consistent
classification results across the image. Spatial autocorrelation is also important when
considering this concept: all natural scenes in remote sensing exhibit some degree of
spatial autocorrelation, which is what makes the image organization something other
than random noise (Strahler et al. 1986).
The challenges of limited spectral information and high spatial detail can lead to a
number of features in an image having similar gray level signatures and a great deal of
confusion between class types (Fauvel and Chanussot 2007). As a result, a per-pixel classifier
such as the maximum likelihood classifier has difficulty distinguishing between a
medium gray field and water in a panchromatic image. Panchromatic image
classification can be improved by considering the relationship between neighborhood
pixels as in texture analysis and object-based image analysis (Alhaddad et al. 2009,
Myint and Lam 2005, and Caridade et al., 2008).
2.3 Statistical Texture Indicators
Image texture is one of the most important visual indicators in distinguishing between
homogenous and heterogeneous regions in a scene. The human interpreter uses shape,
texture, size, pattern, shadow, arrangement and context of elements in an aerial photo to
distinguish between objects in the image (Campbell 2008). According to Tuceryan and
Jain (1998), texture is easy to discern in an image but difficult to define, and there is no single generally accepted definition. One way to define texture is to
consider it as the spatial variation of the intensity values in a region of an image
(Tuceryan and Jain 1998). This regional variation in intensity values implies that the
evaluation of texture is a neighborhood process and that a single pixel does not create
texture on its own.
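The idea that texture is a neighborhood property, not a property of a single pixel, can be made concrete with a first-order measure such as the variance of intensity values in a moving window. This is an illustrative sketch, not code from the thesis.

```python
def local_variance(image, size=3):
    """First-order texture measure: the variance of DNs inside a moving
    size x size window. Homogeneous regions score near 0; textured
    regions score high. Edge pixels use the partial window."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(window) / len(window)
            out[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    return out

flat = [[100] * 3 for _ in range(3)]   # homogeneous patch: variance 0 everywhere
variance_map = local_variance(flat)
```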
Texture is also a quality of an image scene that corresponds to a pattern that is part of
the structure of the image. In a natural scene an area of farmland and a forested area
comprise two separate visual patterns in separable regions. These regions may also
contain secondary patterns having characteristics such as brightness, shape, size, etc. A
field may also have a planting pattern and a forest may be comprised of deciduous and
coniferous trees giving the area a distinctive sub pattern that has its own brightness,
shape, size, etc. (Srinivasan and Shobha 2008). Texture as a property of an object or
regional feature in an image can be described as fine, smooth, coarse, etc. Tone is the
range of shades of gray in an image. According to Haralick (1979), tone and texture are
interdependent concepts in that both are always present in an image to varying degrees.
This interrelationship between tone and texture is explained by Haralick (1979) as
patches in an image that either have little variation in tonal primitives (tone) or a patch
that has a great variation of tonal primitives (texture).
The work of Haralick et al. (1973) was the foundation for most of the later research
relating to image texture analysis. Their work provided a computational method to
determine textural characteristics in an image scene and discussed several widely used
textural statistics used in image texture recognition. These statistics included: contrast,
correlation, angular second moment, inverse difference moment, and entropy. Contrast measures the amount of local variation in an image. Correlation measures the linear
dependency of gray levels in the image. Angular second moment measures local
homogeneity. Inverse difference moment also measures local homogeneity but relates
inversely to contrast. Entropy measures randomness of values. Image analysis may be
performed using these measures either alone or in combination.
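A minimal sketch of how a co-occurrence matrix and two of these statistics (contrast and entropy) might be computed for a single pixel offset follows. Gray levels are assumed to be pre-quantized to a small number of levels, and this is an illustration of the general technique rather than the thesis's implementation.

```python
import math

def glcm(image, dx=1, dy=0, levels=4):
    """Build a normalized Gray Level Co-occurrence Matrix for one pixel
    offset (default: the horizontal neighbor). Entry [i][j] is the
    relative frequency of gray level i occurring next to level j."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                counts[image[r][c]][image[rr][cc]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def glcm_contrast(p):
    """Contrast: co-occurrences weighted by squared gray level
    difference, so it grows with local variation."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))

def glcm_entropy(p):
    """Entropy: randomness of the co-occurrence distribution."""
    return -sum(v * math.log(v) for row in p for v in row if v > 0)

uniform = [[2, 2], [2, 2]]   # no tonal variation: contrast and entropy are 0
p = glcm(uniform)
```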
There are three main approaches to texture analysis. These approaches include
statistical, spectral and structural. Statistical methods are based on local statistical
parameters such as the co-occurrence matrix and variability within moving windows.
Spectral methods include analysis using the Fourier transform and structural methods
emphasize the shape of image primitives (Srinivasan and Shobha 2008). This study
utilized statistical methods to include the co-occurrence matrix, the occurrence measures
and moving windows. By evaluating the spatial distribution of gray values using
statistical methods, a set of statistics can be derived from the distributions of neighboring
features throughout the image. There are first order and second order texture statistics.
First order statistics such as mean, standard deviation, and variance analyze pixel
brightness values without analyzing the relationships between the pixels. Second order
statistics on the other hand analyze the relationships between two pixels and these
measures include contrast, dissimilarity, homogeneity, entropy, and angular second
moment (Srinivasan and Shobha 2008). First order and second order statistics are used in
this study as a method to improve the classification accuracy of panchromatic aerial
photos.
The analysis of texture is a technique that has been used to aid and increase
classification accuracy in both gray level image analysis and multispectral analysis. Haralick et al. (1973) conducted the first major study of texture as an imagery analysis
tool. They demonstrated the utility of the Gray Level Co-occurrence Matrix (GLCM) as
an analysis tool for panchromatic aerial photographs and multispectral imagery even
though computer processing constraints of the time hindered their study. The
classification accuracy in their study was 82% for the panchromatic aerial imagery.
Caridade et al. (2008) used the GLCM and a variety of moving window sizes to achieve
an overall classification accuracy of black and white aerial photos of 83.4% using four
land cover classes. The GLCM uses statistics such as dissimilarity, angular second
moment, homogeneity, contrast, entropy etc. to statistically determine the frequency of
pixel pairs of gray levels in the image. Caridade et al. (2008) also discuss the variation
of land cover type accuracies throughout an image. Their study shows that certain land
cover types such as water may achieve accuracy levels of 100% while others such as bare
ground are much lower at 76.5%. Cots-Folch et al. (2007) used the GLCM to train a
neural network classifier but the highest accuracy obtained was only 74%. Their study
stated that better training data and ancillary data sources could be used to improve the
results. Maillard (2003) compared the GLCM to semi-variogram and Fourier spectra
methods and found that the GLCM works better in areas where textures are easily
distinguished and the semi-variogram is better in areas where texture is more similar.
The Fourier method was less successful than either of the other two methods. Alhaddad
et al. (2009) found that the GLCM and mathematical morphology produced results which
were closer to visual interpretation than other texture analysis methods.
One of the main utilities of texture analysis as it applies to improving the classification
of panchromatic imagery in particular is that it increases the dimensionality of the imagery from one band to multiple bands. A new band is created for each texture
function. This increased dimensionality can help alleviate some of the problems of class
separability that arise when trying to classify historical aerial photos (Halounova 2009).
Halounova used a combination of texture, filtering and object oriented classification to
achieve overall accuracy levels between 89% and 92%. This methodology of increasing
the dimensionality of panchromatic imagery to try to achieve more separability between
land use/land cover classes was an important influence on this thesis research.
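Stacking the original band with derived texture layers into a single multi-band feature image can be sketched as follows; the texture values here are made up, and the function is illustrative only.

```python
def stack_bands(*bands):
    """Stack several aligned single-band layers (the original DNs plus
    derived texture measures) into one multi-band image, so a one-band
    photo can be treated like multispectral data: result[r][c] is the
    per-pixel feature vector."""
    rows, cols = len(bands[0]), len(bands[0][0])
    for b in bands:
        assert len(b) == rows and len(b[0]) == cols, "bands must align"
    return [[[b[r][c] for b in bands] for c in range(cols)]
            for r in range(rows)]

original = [[10, 200], [12, 190]]      # raw DNs
texture  = [[0.5, 8.0], [0.4, 7.5]]    # e.g. a local-variance layer
stacked = stack_bands(original, texture)
# stacked[0][1] is the 2-element feature vector [200, 8.0]
```

A classifier can then separate classes on these multi-dimensional vectors rather than on a single brightness value.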
In areas of heterogeneous objects, the texture information in neighborhood pixels is a
consideration. Common classification algorithms that rely on spectral information at the
pixel level do not consider spatial information. This spatial information can become very
important when trying to discern land cover types such as urban areas (Myint and Lam
2005). Two types of analysis can assist the classification process: region-based analysis
and window-based analysis. Region-based analysis involves using image segmentation,
and window-based analysis can be used in pre- or post-classification to filter noise from
the results (Gong et al. 1992). The importance of the spatial aspect of texture analysis is
illustrated in many studies involving texture analysis (Haralick 1973, Gong et al. 1992,
Hudak and Wessman 1998, Myint and Lam 2005, Erener and Duzgun 2009, Pacifici et al
2009). This study used region-based analysis during object-based image analysis and
window-based analysis through the GLCM.
2.4 Image Enhancements and Filtering
Texture analysis in combination with image pre-processing such as principal
component analysis has been explored by Awwad (2003). His study, which utilized a
1941 gray level photo, used texture analysis windows of different sizes and then combined the results to create an image with sixteen layers. Principal components
analysis (PCA) was used to reduce the dimensionality of the resulting image. He
combined several digital processing techniques but overall accuracy was only 58%.
Much of the literature on using digital image processing techniques for classifying gray
level aerial photos does not make use of multiple texture window sizes in combination to
return a result. Even though examples are rare in the literature and accuracy was low as
reported by Awwad (2003), the technique has promise. Halounova (2009, 2005) also
combined several texture window sizes but used filtering and object oriented
classification rather than PCA to achieve classification accuracies over 90%. Image
enhancements such as filtering and texture add multiple channels to the one band
panchromatic image and allow the image to be processed in a similar fashion to a
multiple band image. There is room for more research using this type of methodology
with different parameters and different pre- or post-processing results such as
convolution filtering, edge detection and smoothing windows.
Edge detection is another important consideration when trying to separate a scene into distinct objects. A natural scene such as an aerial photo does not necessarily have a clear relationship between an object and a background. Anderson and Cobb (2004) provided a new unsupervised hybrid classification algorithm based on edge detection and thresholding for pixel classification. Nearest edge thresholding outperformed both the maximum likelihood and ISODATA clustering classification schemes. Their study illustrated the importance of edge detection between features in gray level aerial photos. Li et al. (2008) also conducted research concentrating on the importance of edge detection and shape characteristics. Their process was automated using ArcGIS ModelBuilder, and the results were compared to manual digitizing, with the model correctly identifying 70% of the manual classifications. Hu et al. (2008) used grayscale thresholding for image segmentation and emphasized the importance of transition regions between objects in a scene and the ability to segment objects in an image. Transition regions between objects can be problematic when classifying complex scenes, as there can be multiple areas in the image with different gray scales between objects, causing classification errors and a salt-and-pepper effect.
Texture filters in combination with neural network classifiers are another methodology
that has shown some success in land use/land cover classification of gray level aerial
photos. Ashish (2002) used several artificial neural network (ANN) classifiers based on
histograms, texture and spatial parameters with some success on 1993 gray level aerial
photos. Textural parameters yielded the highest overall accuracy at 92%. His study
further showed the importance of texture parameters for classification of gray level aerial
photos. Another study conducted by Pacifici et al. (2009) used a neural network
classifier and a simplification procedure with some success on the panchromatic band of WorldView-1 satellite imagery. After the simplification procedure, called network pruning, was applied to the imagery, texture was optimized and input features were reduced, producing classification accuracies above 90% as measured by the Kappa coefficient. Their study provided another example of how texture parameters can improve the classification accuracy of different types of classifiers using high resolution panchromatic imagery.
2.5 Image Segmentation and Object-based Image Analysis
Considering the high spatial resolution of gray level aerial photos and their lack of spectral information, object-based image analysis is another technique that has been successful in classifying high spatial resolution imagery. Object-based image analysis (OBIA) is a method of image analysis that uses objects in a scene rather than individual pixels to derive information from the imagery. OBIA is a two-part process consisting of image segmentation followed by image classification. During the segmentation phase the image is first divided into homogeneous, adjacent regions that take into account texture, region context, shape and spectral information. Image segmentation reduces the complexity of the image and produces regions that can in turn be considered meaningful to the image interpreter.
OBIA was compared to pixel-based classification in a study by Pillai and Wesberg (2005) using gray level aerial imagery from 1965 and 1995. Their study illustrated how
scale dependency can affect classification results depending on the objects studied. Scale
dependency of individual landscape elements can also affect the usefulness of texture
parameters as illustrated in Resler et al. (2004). Change at the scale of individual trees
was not statistically significant between pixel-based classification and object-based classification. Object-based classification was more accurate when comparing patches of trees in high spatial-resolution panchromatic imagery. Their study illustrated the importance of determining land use categories and object scale when classifying imagery.
Elmqvist et al. (2008) performed OBIA on the panchromatic band of an Ikonos image and found that spectral information provided the best segmentation results. Classification accuracies were fairly low for their study but outperformed pixel-based classification. Laliberte et al. (2004) used a combination of low-pass filtering and object-based image analysis on gray level aerial photos, successfully integrating them with satellite imagery in a change detection study. Middleton et al. (2008) successfully used feature extraction and a support vector machine (SVM) supervised classifier to extract features from a 1947 aerial image in a change detection study. One of the main conclusions of their study was that classification accuracy of the panchromatic image depended on image quality. Historic panchromatic imagery is not always of good quality due to age or deterioration of the film. A methodology for classifying this type of imagery therefore needs to be robust across various levels of image quality.
The literature regarding classification of gray level aerial photos concentrates for the most part on replacing manual digitizing with digital image processing techniques. There is a gap in the literature regarding the use of digital image processing to facilitate digitizing. By combining digital image analysis techniques such as texture and object-based image analysis with GIS vector capabilities, the digitizing of land cover classification zones can be enhanced and in some cases possibly eliminated.
CHAPTER 3: CONCEPTUAL FRAMEWORK AND METHODOLOGY
3.1 Description of the Study Area
The study area for this project is near Ogden, Utah (Figure 1). The area is in north central Utah (Figure 2) and consists of a variety of land cover types including agricultural land, impervious surfaces, grassland, forest and water. The Ogden study area does not provide an example of dense urban land cover, so a secondary area of interest was chosen in Salt Lake City, Utah (Figure 3). The Salt Lake City study area includes a park and a variety of residential and commercial land cover. By using two study areas with a variety of textures and objects in the scene, this research can show the usefulness of digital image processing across two completely different areas and images.
The classification results concentrate on the Ogden imagery, as this imagery has better defined and larger areas of land class types. The Salt Lake City image is used mainly to see how the same techniques perform in an urban area. Urban areas have their own unique classification challenges, which are compounded when classifying panchromatic imagery. Another reason the Ogden image was the main focus of this research is that this imagery was originally flown for the Farm Service Agency (FSA) for agricultural purposes. It is also likely that much of the historical imagery in the vault at the Aerial Photography Field Office (APFO) will be used to further study historical agricultural change processes.
3.2 Description of Data
The image of Ogden, Utah from 1958 was obtained from the Aerial Photography Field Office's internal imagery storage network. The Ogden study area was clipped from a digital orthophoto quarter quadrangle (DOQQ), 4111256ne, from 1958 (Figure 4) and covers approximately 0.5 square miles. The image was scanned from black and white
Figure 1 - Aerial photo of Ogden study area
Figure 2 - Overview of study areas in relationship to the state of Utah
and was scanned and orthorectified at APFO using the same parameters and methods as
the 1958 Ogden imagery. Q1219_1977 is a mosaic that was created from original
DOQQs using Socet Set 4x and interactive seaming. The image resolution is 1 meter and
the bit depth is 8 bits.
Figure 4 - Ogden DOQQ Study Area
processing was completed, a number of digital image processing techniques were performed on the imagery (see Figure 6). The original imagery was classified using supervised and unsupervised classifiers to form the classification baseline information. Then four main digital image processing techniques were used to try to improve the classification: convolution filtering, texture analysis, principal components analysis, and object-based image classification. Texture analysis was used to create layer-stacked images, which increased the dimensionality of the original one-band image to improve classification results. Principal components analysis was used to decrease the dimensionality of the multiple-layer texture images, and in one case the first principal component image derived from the multi-layer texture image was layer stacked with the original one-band image. The final digital image processing component in the research was image post-processing to refine the most promising results for GIS analysis. After image post-processing, an accuracy assessment was completed to compare the results of each classification with the digitized baseline information obtained by visual interpretation (heads-up digitizing).
3.3.2 Software Utilized
Several software programs were used in this project, as no single software suite available to me provided all the tools needed for this research. The imagery analysis programs used were ERDAS Imagine version 11.0, ENVI 4.8 and ENVI EX 4.8. The GIS software used was ArcMap 10.0. ERDAS Imagine has a good set of texture analysis and filtering tools. ENVI EX and ENVI have the benefit of integration with the GIS software, and ENVI EX provided a wizard-based feature extraction toolset for object-based image classification. The main interface used to provide the baseline land use/land cover zones to aid or facilitate the manual digitizing process was ArcMap 10, as this software has good vector tools and the ability to integrate ENVI image analysis tools into ArcGIS ModelBuilder.

Figure 6 - Main Workflow Processes
3.3.3 Preliminary Processes
The study area was clipped from the original DOQQs using the ERDAS Imagine subset tool. Each clipped area covers approximately 0.5 square miles to facilitate digitizing and image processing. Much of the image processing, including the use of convolution filters, texture analysis and classification methods, required trial and error to find the best settings and analysis methods for the imagery. The best results were analyzed further using post-processing, vector conversion and editing.
Heads-up digitizing was performed on the Ogden and Salt Lake City imagery. This provided the digitized baseline information used as ground truth later in the classification accuracy assessment. Heads-up digitizing was performed using ESRI's ArcMap 10.0 software. A geodatabase was created for both the Ogden imagery and the Salt Lake City imagery.
One person performed the visual interpretation of the imagery for the sake of consistency. The interpreter has eight years of work experience using photo interpretation to create a variety of map types for the Defense Mapping Agency (now the National Geospatial-Intelligence Agency). Digitizing times were recorded so that a comparison could be made between manual digitizing and digital image processing to determine the efficiency of digital image processing.
The determination of land use classes was an important consideration, as it had a great deal of impact on the final results of image classification, especially for panchromatic imagery, since so many land use/land cover types have similar digital number (DN) values. Classification schemes in previous studies using black and white aerial imagery have used relatively limited categories (Kadmon and Harari-Kremer 1999, Laliberte et al. 2004, Okeke and Karnieli 2006, and Pringle et al. 2009). This study includes three levels of classification detail for the study areas. The approach looked at the classification of the imagery in a bottom-up manner, going from a high level of detail in representing the land cover types existing in the imagery to grouping these types into larger categories. This strategy was used to determine how useful detailed digital analysis of the imagery was compared to visual interpretation. The first level of classification of the Ogden imagery was based on eight land use/land cover classes: water, forest, grassland, dark fields, medium fields, light fields, bare earth and impervious surface (Table 1). At this level it was too difficult to represent the cropland as one class, as there is too much variation between fallow fields and fields that are growing or wet. There was also confusion among the representative digital number values of the dark, medium, and light fields, as there are pattern variations within the respective fields.
Table 1 - First level classification: Ogden land use/land cover classes
Class Name Description
Water Lakes, Reservoirs, Rivers
Forest Areas of trees with a canopy cover greater than 50%
Grassland Areas dominated by grasses and herbaceous plants with little or no tree or shrub cover
Dark Fields Agricultural cropland area characterized by dark gray tone DN ~ 0-122
Medium Fields Agricultural cropland area characterized by medium gray tone DN ~ 100-188
Light Fields Agricultural cropland area characterized by light gray tone DN ~ 151-200
Bare Earth Areas of earth, sand, and rock with little to no vegetation
Impervious Surface Buildings, roads, parking lots
The second level of classification took the eight classes and combined them into three
larger groups: cropland, vegetation, and other. Finally, the third level of classification
consisted of cropland and non-cropland. The results of these classifications and their
impact on classification accuracy were obtained by combining the results of the initial
classifications rather than running new supervised and unsupervised classifications to
reflect these combined groupings.
The classification system used on the Salt Lake City image also used a bottom-up approach, starting with a more detailed classification and then moving to more general groupings. The first level of classification consisted of five land types: commercial, transportation, trees, grass, and residential (Table 2). The second level of classification was reduced to built-up areas, vegetation, and transportation. The third level of classification consisted of built-up areas and non-built-up areas. The Salt Lake City image has entirely different characteristics from the Ogden image, as it consists of a mixed-type urban area without any agriculture, bare earth, forest, or large bodies of water. The added classification difficulty in the Salt Lake City image was that the commercial and residential areas are made up of a mixture of manmade and natural materials. These areas consisted of thousands of small buildings that may be surrounded by either grass or concrete, all of which provide a very complex pattern of shapes and surfaces that were tonally very similar. There were many tonal similarities in the Ogden imagery as well, but its land cover types such as dark fields, light fields, and water are fairly homogeneous blocks, unlike the patchwork of the urban areas.
Table 2 - First level classification: Salt Lake City land use/land cover classes
Class Name Description
Commercial Built up area consisting of industrial, commercial complexes
Transportation Transportation network including major streets and highways
Residential Mixed area that includes single family homes, apartments, trees, and grass
Grass Areas dominated by grasses and herbaceous plants (yards, fields)
Trees Woody vegetation < 20ft tall
3.3.4 Unsupervised Classification
Unsupervised classification was performed on the original subsets of the Ogden and Salt Lake City images to provide the unsupervised classification baseline information for comparison to digital classifications with image enhancements. This initial classification was completed using ENVI 4.8 tools for ArcGIS and the ISODATA clustering algorithm, which essentially divides the image into naturally occurring groups of similar pixels. Three classification sets were used to process the imagery: 10, 25, and 100 spectral classes. After the imagery was classified, these spectral groups were interactively assigned an information class by visually comparing the classified image and/or reference data. Since many of the spectral classes have similar tonal values and statistics, it was necessary to assign some of these mixed classes to either the most numerous type or the type with the most concentrated areas of pixels. There was room for interpretation, and a certain amount of subjectivity is involved in assigning these classes. The interpreter needs to be familiar with the study area, and when some classes were divided between seemingly equal areas, it was difficult to determine the best class to assign the pixels to. In some cases a spectral class was divided between three or four information classes. At this stage there was not a method to split these classes into their respective groups using the ENVI or ArcGIS software. It is
possible to use masking and a technique called "cluster busting," but this methodology was not used in this research, as it requires a significant amount of extra processing.
The unsupervised classification process did provide some useful general information about the imagery. It was very difficult to assign classes at the detailed level of the land classification system used for both the Ogden and the Salt Lake City images. After aggregating classes and assigning them a land use/land cover type from the classification scheme, about five classes could be distinguished in the Ogden image and three in the Salt Lake City image. A useful tool to visualize how the clusters in an image are derived is a dendrogram. Dendrograms were created using the ArcGIS software for the same number of classes and iterations as the unsupervised classifications (Figures 7, 8, 9). A dendrogram is a graphic diagram in the form of a tree that is used to analyze clusters in a signature file (ESRI 2011). The dendrograms show the clustering process from individual classes to one large cluster. The dendrogram tool takes an input signature file created in ArcMap and creates a hierarchical clustering diagram. The classes are clusters of pixels, and the graph illustrates the distances between merged classes. The dendrograms help to illustrate how the 10, 25, and 100 classes are distributed using the ISODATA classifier. Many of the classes overlap and are very close together numerically, which is why unsupervised classification of panchromatic imagery often gives unsatisfactory results. The dendrograms also illustrate the relatively small changes in class distances between having 10, 25, and 100 classes. Dendrograms of the Salt Lake City imagery were very similar except for slight differences in distances between pairs of combined classes (Figure 10). The
ISODATA classifier only returned 67 classes instead of 100 for the Salt Lake City image
and 93 out of 100 for the Ogden image.
Figure 7 - Ogden dendrogram of ISODATA clustering 10 classes
Figure 8 - Ogden dendrogram of ISODATA clustering 25 classes
Figure 9 - Ogden dendrogram of ISODATA clustering 100 classes
Figure 10 - Distances between classes from Salt Lake City dendrograms (10, 25, and 100 classes)
A K-Means unsupervised classifier was also used to classify an Ogden texture image
incorporating the mean, variance and homogeneity bands. This classifier provided a
more satisfactory result on the texture images than the ISODATA classifier did. The K-
Means classifier in the ENVI software uses a set number of classes provided by the
analyst, and classes are determined after the classifier iterates through the image and the
optimal separability is reached based on the distance to mean (ENVI 2011). The
ISODATA classifier had difficulties with the texture image and returned a completely
gray image unless the number of classes was increased to well over 25. Considering how time-consuming it was to assign classes to the result, the K-Means classifier was used, with 10 classes and 25 classes applied to the texture image.
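A minimal sketch of the K-Means procedure, assuming a synthetic three-band stack of pixel vectors in place of the real mean/variance/homogeneity texture layers (SciPy's `kmeans2` stands in for the ENVI implementation; the band values are invented):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

# Synthetic 3-band "texture image" flattened to (n_pixels, 3); the bands
# play the role of the mean, variance, and homogeneity texture layers
rng = np.random.default_rng(1)
smooth = rng.normal([0.2, 0.05, 0.9], 0.02, (500, 3))  # low variance, high homogeneity
rough = rng.normal([0.6, 0.40, 0.3], 0.02, (500, 3))   # high variance, low homogeneity
pixels = np.vstack([smooth, rough])

# K-Means with an analyst-supplied number of classes (here 2): pixels are
# assigned to the nearest cluster mean, iterating until convergence
centroids, labels = kmeans2(pixels, k=2, seed=0, minit="++")
print(centroids.round(2))
```

With texture bands this separable, the two clusters fall out cleanly; on real panchromatic imagery the clusters overlap, which is why interactive class assignment was still required.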
3.3.5 Supervised Classification
Supervised classification was performed on the original image subsets to create the supervised classification baseline information. Later, another supervised classification was performed on images which had been digitally processed or enhanced (filtering or texture analysis). Results of the latter supervised classification were compared to the supervised classification baseline information to determine if these digital image processing enhancements improved classification. Supervised classification was performed using ENVI and ArcGIS 10 software.
Supervised classification, unlike unsupervised classification, involves the user creating training samples from land use/land cover classes that are determined to be present in the imagery. The training sets, called regions of interest (ROIs), were created using ENVI software. This training data was used throughout the supervised classifications performed on the original imagery, texture images, PCA images, and the filtered images. The final training sets for both study areas were determined by trial and error. A training set was developed which had about twice as many samples, but this set did not significantly improve classification results for either image. These larger sets did, however, increase processing time, so in the interest of efficiency the smaller training sets were used throughout (Figures 11 and 12). Training sets are inherently subjective and require the analyst to be able to distinguish land use/land cover types.
Figure 11 - Training sample distribution for the Ogden image
Figure 12 - Training sample distribution for the Salt Lake City image
Several supervised classifiers were used to evaluate the imagery using ENVI software: the minimum distance classifier, the maximum likelihood classifier, a neural net classifier, and an SVM classifier. Each classifier provides distinct advantages and disadvantages. The minimum distance to means classifier determines the mean of each pre-defined class and then assigns each pixel to the class whose mean is closest in Euclidean distance. One of the advantages of this algorithm is that it classifies all pixels and processes very quickly. The maximum likelihood classifier assumes that each class is normally distributed and assigns each pixel to the class with the highest probability. When classes have a multimodal distribution this classifier will not provide optimum results. An advantage of this method is that the classifier considers both the mean and covariance of the samples. The neural net classifier provided by ENVI software uses back-propagation to determine class assignment of pixels. An advantage of the neural net classifier is that it does not make assumptions about the distribution of the data. The Support Vector Machine (SVM) classifier available in the ENVI software works with any number of bands and has good accuracy when automatically separating pixels into classes. This classifier also maximizes the boundary between classes, which may be useful for distinguishing land use/land cover types with similar characteristics. Another advantage of this classifier is that it works well on imagery that has a lot of noise (ENVI 2011, Jensen 2005).
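Of these, the minimum distance to means classifier is simple enough to sketch directly. The training samples and band values below are invented for illustration; each sample is a (DN, texture) vector for a hypothetical two-band stack.

```python
import numpy as np

def minimum_distance_classify(pixels, training):
    """Assign each pixel to the class whose training-sample mean is
    closest in Euclidean distance (minimum distance to means)."""
    names = list(training)
    means = np.array([np.mean(training[c], axis=0) for c in names])
    # distance from every pixel to every class mean: (n_pixels, n_classes)
    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return [names[i] for i in d.argmin(axis=1)]

# Hypothetical training samples: (gray level DN, texture value) pairs
training = {
    "water":  np.array([[15, 2], [18, 3], [12, 2]]),
    "forest": np.array([[60, 20], [65, 25], [58, 22]]),
    "field":  np.array([[140, 8], [150, 10], [145, 9]]),
}
pixels = np.array([[16, 2], [142, 9], [63, 21]])
print(minimum_distance_classify(pixels, training))  # ['water', 'field', 'forest']
```

Note that every pixel gets a label, even one far from all class means, which is both the speed advantage and the main weakness of this classifier.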
3.3.6 Image Enhancement and Texture Analysis
Digital image processing techniques were explored to determine if classification
results could be improved. Texture analysis, convolution filtering, and contrast stretching
enhance some of the spatial characteristics of the imagery. For example, contrast
stretching brings out more differences between light and dark areas of the imagery, and
convolution filters can enhance edges. Low pass filters can smooth out areas of noise in
an image such as the variations found throughout the field areas in the Ogden imagery,
while high pass filters make the image appear more crisp or sharp (Jensen 2005).
Convolution filtering, contrast stretching and texture filtering were used in a variety of
combinations to enhance the study areas and try to improve classification.
A two standard deviation contrast stretch was applied to both study areas to enhance the contrast and sharpness of the imagery. Both original images lacked definition in the light and dark areas (Figure 13). The Ogden study area had a DN range of 0-235 and the Salt Lake City study area had a DN range of 0-187. All subsequent filtering and texture analysis was performed on the stretched images.
Figure 13 - Unstretched images compared to contrast stretched images
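A two standard deviation stretch of the kind applied above can be sketched as follows; the synthetic DN values are illustrative, not taken from the study imagery.

```python
import numpy as np

def two_sd_stretch(img, n_sd=2.0):
    """Linear contrast stretch: map [mean - 2*sd, mean + 2*sd] onto the
    full 0-255 range, clipping DNs that fall outside that interval."""
    lo = img.mean() - n_sd * img.std()
    hi = img.mean() + n_sd * img.std()
    stretched = (np.clip(img, lo, hi) - lo) / (hi - lo) * 255.0
    return stretched.astype(np.uint8)

# Hypothetical low-contrast image: DNs bunched in the middle of 0-255
rng = np.random.default_rng(0)
img = rng.normal(120, 15, (50, 50)).clip(0, 255)
out = two_sd_stretch(img)
print(img.min().round(), img.max().round(), "->", out.min(), out.max())
```

The stretch spreads the bunched middle tones across the full dynamic range, which is what restores definition in the light and dark areas of the photo.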
Convolution filtering was performed on the study areas using ENVI software. High pass filtering was used to help sharpen the imagery using a variety of kernel sizes: 3x3, 5x5, 7x7, and 11x11. Low pass filtering was applied to the imagery to smooth out noise in the field areas; again, 3x3, 5x5, 7x7, and 11x11 kernels were examined. As the kernel gets larger with low pass filtering, the detail becomes more generalized or blurred, as this type of filtering preserves the low frequency parts of the image. A median filter was also examined using the previously mentioned kernel sizes. This filter has a smoothing effect on the image, but edges remain somewhat crisper than with the low pass filter. ENVI also provides several edge enhancing filters that were used to process the original study images: Laplacian, Roberts and Sobel. The Laplacian filter has an editable window size, whereas the Roberts and Sobel filters do not. Edge filtered images were created using the Laplacian filter with window sizes of 3x3, 5x5, 7x7 and 11x11. The Laplacian filter was also used in combination with the Gaussian low pass filter to try to reduce some of the noise that results when creating the Laplacian filtered images.
Texture images were created using ENVI software and are based on the GLCM, which includes the following texture measures: mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment and correlation. Another set of texture images was created using the occurrence measures, which consist of data range, mean, variance, entropy, and skewness. Each set of texture images was created using a 3x3, 5x5, 7x7 and 11x11 processing window. The processing window measures the number of times each gray level occurs in that particular part of the image (ENVI 2011). As the processing window becomes larger, image detail is lost. The texture images created using the
GLCM are eight-band images, and the texture occurrence images are five-band images; thus the dimensionality of the imagery is significantly increased by the use of texture. These two texture images were also layer stacked with the original imagery to create nine-band and six-band images. Additional nine-band and six-band images were created from these two texture images layer stacked with a filtered original image. The resulting images were then classified using unsupervised and supervised classifiers. The accuracy of these classifications was then compared to the classification baseline information using an error matrix.
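For illustration, a GLCM and two of its derived measures can be computed by hand. This toy version uses only the horizontal (0,1) offset and eight gray levels, whereas ENVI averages over configurable offsets and window sizes:

```python
import numpy as np

def glcm(patch, levels=8):
    """Gray level co-occurrence matrix for the horizontal (0,1) offset:
    counts how often level i occurs immediately to the left of level j."""
    q = np.asarray(patch) // (256 // levels)  # quantize 0-255 DNs to 8 levels
    m = np.zeros((levels, levels))
    for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        m[i, j] += 1
    return m / m.sum()

def texture_measures(patch):
    p = glcm(patch)
    i, j = np.indices(p.shape)
    contrast = (p * (i - j) ** 2).sum()           # high for busy texture
    homogeneity = (p / (1.0 + abs(i - j))).sum()  # high for uniform areas
    return contrast, homogeneity

smooth_field = np.full((8, 8), 100)                  # uniform cropland patch
checkerboard = np.indices((8, 8)).sum(0) % 2 * 255   # fine urban-like texture
print(texture_measures(smooth_field))   # (0.0, 1.0)
print(texture_measures(checkerboard))   # (49.0, 0.125)
```

The two patches have indistinguishable mean tone once stretched, yet their contrast and homogeneity differ sharply, which is why these derived bands help separate tonally similar land cover classes.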
Principal components analysis was used to reduce the number of bands in several composite images. In this way the dimensionality of the imagery is reduced while most of the information in the imagery is maintained. PCA was performed on a multi-layer image consisting of images created from the variance, mean, and homogeneity texture operators, plus the original unprocessed image. The result was a two-layer image which incorporates information from the original image and the texture layers.
ENVI software also provides tools to perform mathematical morphology filtering, which is a non-linear process based on shape. Morphology filtering was performed on both the original imagery and the 5x5 occurrence texture images. Supervised and unsupervised classification was then performed to determine the accuracy as compared to the classification baseline information.
3.3.7. Object-based Image Analysis
Another digital image processing technique which was explored in this research was
object-based image analysis. Object-based image analysis is based on regions or groups
of pixels in an image rather than single pixels. Feature extraction was performed using
ENVI EX, which provides object-based tools that utilize spatial, spectral, and textural features. The object-based analysis provided by the ENVI software uses an edge-based segmentation algorithm and requires only the scale level as an input parameter. Scale levels range from 0-100, where a high scale level reduces the number of segments defined and a low scale level increases it. Determining the scale level is a balancing act: the goal is to choose a scale that delineates the image object boundaries as well as possible, and this level is likely to differ depending on the characteristics of the imagery being analyzed. ENVI provides an interactive preview window to help determine an appropriate scale level for an image. The preview window shows what effect changing the scale level of the segmentation has on the objects of interest in the image scene before the segmentation runs, which helps to avoid creating numerous unsuccessful segmentation images. After the initial segmentation has been performed, image segments can be merged. ENVI uses the Lambda-Schedule algorithm, which iteratively merges segments using a combination of spectral and spatial information. This step is especially helpful when an image has been over-segmented, as it enables the aggregation of small segments that may result from image object variation (ENVI 2011). After segmentation, the next step is to find objects and classify the imagery. Objects were chosen interactively from the segmented image, and the image was then classified. ENVI EX offers either a K-Means classifier or an SVM classifier; classification and post-processing were performed using both available OBIA classifiers. The final step before classification in the ENVI EX feature extraction workflow is the refine results window, which includes options to export vectors and smooth the results, similar to using a majority filter on a
classified image. The feature extraction tools in ENVI EX are designed to make the process of OBIA user-friendly.
ENVI 4.8 also offers an OBIA classification method called size-constrained region
merging (SCRM). This tool is an extension that can be added to ENVI. The tool
partitions an image into reasonably homogenous polygons based on a minimum size
threshold. The output of the tool is a vector file and an image file. The vector file can be
used directly as an initial source to assist visual interpretation, and the image can be
further classified using either unsupervised or supervised classification. One of the
limitations of this extension is that there is a 2 MB size limit for the image (Castilla
and Hay 2007). All of the layer stacked imagery exceeded the size limit for
using this tool. SCRM was used on the original imagery, the one band dissimilarity,
mean, homogeneity, and variance texture images. The second moment, entropy, and
contrast bands were not used, as they appeared strongly correlated with the bands that
were selected. The correlation band does not contain enough usable information to
segment into objects. The output image was then classified using
the SVM classifier.
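The size-constrained merging idea can be sketched briefly. This is a deliberate simplification in the spirit of SCRM (Castilla and Hay 2007), not the published algorithm: segments are stored as (pixel count, mean DN) tuples, and each undersized segment is absorbed by its most spectrally similar neighbor:

```python
def scrm_merge(segments, adjacency, min_size):
    """Simplified size-constrained merging: any segment below min_size
    pixels is absorbed by the adjacent segment with the closest mean gray
    level. segments maps id -> (pixel_count, mean_dn); adjacency maps
    id -> set of adjacent segment ids. Illustrative only."""
    segments = dict(segments)
    adjacency = {k: set(v) for k, v in adjacency.items()}
    while True:
        small = [s for s, (n, _) in segments.items() if n < min_size]
        if not small:
            return segments
        s = small[0]
        n, mean = segments[s]
        # the most spectrally similar neighbor absorbs the small segment
        target = min(adjacency[s], key=lambda t: abs(segments[t][1] - mean))
        tn, tmean = segments[target]
        merged_n = tn + n
        segments[target] = (merged_n, (tmean * tn + mean * n) / merged_n)
        del segments[s]
        # rewire adjacency so the absorbed segment's neighbors point at target
        for t in adjacency.pop(s):
            adjacency[t].discard(s)
            if t != target:
                adjacency[t].add(target)
                adjacency[target].add(t)
        adjacency[target].discard(target)

segs = {1: (400, 60.0), 2: (5, 65.0), 3: (300, 180.0)}   # segment 2 is tiny
adj = {1: {2}, 2: {1, 3}, 3: {2}}
result = scrm_merge(segs, adj, min_size=25)
```

Segment 2 (5 pixels) is merged into segment 1, whose mean gray level (60.0) is far closer than segment 3's (180.0); the merged mean is the size-weighted average of the two.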
3.3.8. Post Processing and Automation
The classified images created from the previously mentioned digital processing
techniques and classifiers contained varying quantities of island pixels and
salt-and-pepper noise. There are numerous methods to reduce such artifacts in a
classified image. Majority and minority filtering, clump, sieve, and combine classes are
some of the commonly available tools provided in GIS and image analysis software.
These processes reduce the complexity of the classification and allow a more cohesive
result for further analysis. Post-classification processing may also introduce error into
the final imagery by smoothing and combining the wrong classes. It is also not
practical to remove noise pixel by pixel, as there may be thousands of areas to examine.
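A majority filter, the simplest of these noise-reduction tools, can be written in a few lines. This sketch applies the standard 3x3 moving-window rule, replacing each cell with the most common class in its neighborhood:

```python
from collections import Counter

def majority_filter(classified, window=3):
    """Simple majority filter: each cell takes the most common class in its
    window (3x3 by default), removing isolated 'island' pixels and
    salt-and-pepper noise from a classified raster. Ties keep the original
    class."""
    rows, cols, k = len(classified), len(classified[0]), window // 2
    out = [row[:] for row in classified]
    for r in range(rows):
        for c in range(cols):
            votes = Counter(
                classified[y][x]
                for y in range(max(0, r - k), min(rows, r + k + 1))
                for x in range(max(0, c - k), min(cols, c + k + 1)))
            winner, count = votes.most_common(1)[0]
            # replace only on a clear winner, to avoid tie flip-flopping
            if winner == classified[r][c] or count > votes[classified[r][c]]:
                out[r][c] = winner
    return out

noisy = [[1, 1, 1],
         [1, 2, 1],
         [1, 1, 1]]
smoothed = majority_filter(noisy)
```

The single island pixel (class 2) is outvoted by its eight neighbors and replaced. Keeping the original class on ties is a design choice; production tools typically expose this as a parameter.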
The next step in this research was to produce a vector polygon layer that can assist in
visual interpretation of the imagery. In order to simplify the procedure of processing the
classified rasters and converting them to a vector layer that facilitates visual
interpretation, a model was developed using ArcGIS Model Builder (Figure 14). This
model allows the user to input a classified image, apply a smoothing kernel, aggregate
island pixels to a specified tolerance, convert the raster to a vector layer, and smooth
and simplify the resulting polygons. For consistency, a majority filter using a 3x3 window
and aggregation using a minimum threshold of 25 were applied to all the classified images
examined. The model parameters for smoothing and simplifying polygons were left open
so that adjustments can be made for different images.
Figure 14 - Post Processing ArcGIS Model
One of the challenges of using vector files that have been converted from raster files is
that polygons have a stepped appearance that follows pixel boundaries. This
characteristic appearance is much different from a vector file created through heads up
digitizing. A human digitizer classifies an image into recognizable objects using shape,
context, texture, shadows, etc. to help determine the boundaries of objects. It would
be very difficult, if not impossible, for a human digitizer to create land use/land cover
boundaries at the pixel level. This is one of the main differences between automated
classification and classification performed by visual interpretation.
The polygon smoothing and aggregation steps used in the model help to reduce some
of the stepped appearance created by the raster to vector conversion process (Figure 15).
After polygons underwent smoothing and simplification, the result appeared much closer
to results obtained through visual interpretation. This process was also an advantage if
polygons needed to be reshaped. There are fewer vertices for each polygon after
completing these operations.
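The vertex reduction these steps achieve can be illustrated with the classic Douglas-Peucker algorithm, a standard line-simplification method (ArcGIS exposes a variant of it as the POINT_REMOVE option of Simplify Polygon; the smoothing step, e.g. PAEK, is separate and not shown here):

```python
def simplify(points, tol):
    """Douglas-Peucker simplification: drop vertices that deviate from the
    anchor-endpoint chord by less than tol. A stepped raster-to-vector
    boundary collapses toward the straight edges a human digitizer would
    draw."""
    if len(points) < 3:
        return points[:]
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    length = (dx * dx + dy * dy) ** 0.5 or 1.0
    # perpendicular distance of each interior vertex to the chord
    dists = [abs(dy * (x - x1) - dx * (y - y1)) / length
             for x, y in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] <= tol:
        return [points[0], points[-1]]
    left = simplify(points[:i + 1], tol)     # recurse on each half
    right = simplify(points[i:], tol)
    return left[:-1] + right

# a stepped boundary from raster-to-vector conversion (1-unit pixel steps)
stepped = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (4, 3)]
simplified = simplify(stepped, tol=0.8)
```

Here an eight-vertex pixel staircase collapses to its two endpoints, much as a digitizer would draw a single straight edge, which is why reshaping is easier after simplification.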
Figure 15 - Polygon raster to vector, smoothing, and smooth and simplify

Once the vector layer had been processed through the model, it was edited using a
custom toolbar in ArcGIS 10 software. The custom toolbar includes a combination of
out-of-the-box tools (Selection Tool and Cut Polygon Tool) and several custom tools
created using C#.net and ArcObjects. The purpose of the custom toolbar is to provide
functions to remove small island polygons by merging them into neighboring polygons. It was
implemented as an Add-in which was easily added to the ArcGIS 10 user interface.
The toolbar consists of four custom tools: select by area, merge with smallest neighbor,
merge with largest neighbor, and merge with selected polygon. These tools are very
similar to raster majority and minority filtering except that the user has more control over
them. The tools were then used to further refine the classification using visual
interpretation. The automated classifications, in essence, become the starting point for
the manual digitizing effort for the study areas.
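The logic of the "merge with largest neighbor" tool can be re-expressed in a short sketch. The thesis implemented these tools in C#.net with ArcObjects; the function below is an illustrative stand-in (its name and the dictionary-based polygon representation are assumptions), dissolving a selected island polygon into its largest adjacent polygon:

```python
def merge_with_largest_neighbor(areas, neighbors, poly_id):
    """Sketch of the 'merge with largest neighbor' editing tool: the
    selected island polygon is dissolved into whichever adjacent polygon
    has the greatest area. areas maps id -> area; neighbors maps
    id -> set of adjacent ids. Illustrative re-expression of the thesis's
    C#/ArcObjects tool, not the original code."""
    target = max(neighbors[poly_id], key=lambda n: areas[n])
    areas = dict(areas)
    areas[target] += areas.pop(poly_id)       # absorb the island's area
    # drop the island and redirect its adjacency links to the target
    merged = {k: {t if t != poly_id else target for t in v} - {k}
              for k, v in neighbors.items() if k != poly_id}
    merged[target] |= neighbors[poly_id] - {target}
    return areas, merged

areas = {1: 5000.0, 2: 12.5, 3: 800.0}       # polygon 2 is a small island
neighbors = {1: {2}, 2: {1, 3}, 3: {2}}
new_areas, new_neighbors = merge_with_largest_neighbor(areas, neighbors, 2)
```

Polygon 2 is absorbed by polygon 1 (the largest neighbor), and polygons 1 and 3 become adjacent; "merge with smallest neighbor" would only swap the max for a min.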
3.3.9. Accuracy Assessment
One of the most serious limitations of historical imagery is the difficulty of ground-truthing. The
imagery is between 33 and 52 years old, and it is likely that many of the objects in the
imagery have changed or no longer exist today. Ground-truthing was limited to visual
interpretation and image accuracy. The baseline information derived from heads up
digitizing was used as ground truth to evaluate the accuracy of the classification of both
the original images and the images where digital image processing has been used (i.e.
filtering, texture, PCA and segmentation).
A confusion matrix (or error matrix) was used to evaluate the classifications produced
after image processing enhancement against the baseline classification so that accuracy
results could be compared. To save time and labor,
only the classifications deemed best were evaluated. The confusion matrix can help to
determine overall accuracy, errors of commission, single class accuracy, and the Kappa
coefficient. Refer to Appendix 1 for a sample of the error matrices used in this research.
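These measures follow directly from the error matrix. The sketch below builds the matrix from paired reference and classified labels and computes overall accuracy, producer's (single class) accuracy, errors of commission, and Kappa using the standard formulas (the class names and label lists are invented for illustration):

```python
def confusion_matrix(truth, predicted, classes):
    """Error (confusion) matrix: rows are reference labels, columns are
    classified labels."""
    m = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(truth, predicted):
        m[t][p] += 1
    return m

def summarize(m):
    """Overall accuracy, Kappa, producer's accuracy, and commission error
    from an error matrix."""
    classes = list(m)
    total = sum(sum(row.values()) for row in m.values())
    diag = sum(m[c][c] for c in classes)
    overall = diag / total
    # chance agreement from row/column marginals, for the Kappa coefficient
    pe = sum(sum(m[c].values()) * sum(m[t][c] for t in classes)
             for c in classes) / total ** 2
    kappa = (overall - pe) / (1 - pe)
    producers = {c: m[c][c] / sum(m[c].values()) for c in classes}
    commission = {c: 1 - m[c][c] / sum(m[t][c] for t in classes)
                  for c in classes}
    return overall, kappa, producers, commission

truth     = ["water", "water", "forest", "forest", "grass", "grass"]
predicted = ["water", "forest", "forest", "forest", "grass", "water"]
overall, kappa, producers, commission = summarize(
    confusion_matrix(truth, predicted, ["water", "forest", "grass"]))
```

For these six samples the overall accuracy is 4/6 and Kappa is 0.5, showing how Kappa discounts the agreement expected by chance.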
CHAPTER 4: ANALYSIS RESULTS AND DISCUSSION
4.1. Manual Digitizing
Heads up digitizing of land cover classes on any type of imagery, whether it is
multispectral or panchromatic, allows the user more control over the results of the
classification. The results of this method of classification in general do not require
further editing or post processing. On the other hand, the subjectivity of the digitizer has an
effect on the results of the classification. It is unlikely that a digitizer would be able to
classify an image exactly the same every time.
Digitizing took place in two sessions, with the Ogden imagery taking approximately
five hours and the Salt Lake City image approximately three hours to
complete. The Salt Lake City image has 331 polygons compared to the Ogden image
which has 172 polygons. The Ogden image took much longer to digitize even though it
contains approximately half the number of polygons. The polygons and land cover
configurations were more complicated when considering the integration of grassland,
forest and water areas on the image. The features on the Salt Lake City image are laid
out in a grid pattern separated by wide streets, so even though there were almost twice as
many polygons to digitize, the process went more quickly. An important aspect of this
research was to show that digital image processing of historical panchromatic imagery
could enhance and facilitate visual interpretation of the imagery on a variety of terrains
and features.
The visual interpretation of the imagery required a zoom level of between 1:1,500 and
1:3,000 on the Salt Lake City image and 1:1,000 and 1:4,000 on the Ogden image. These
zoom levels were determined by how well the digitizer could see the details in
the imagery while still being able to have some reference to the context of objects being
examined. In the experience of the digitizer, a more consistent result is also achieved if
there is not a large variance in the viewing scale of the objects in the scene. If an area is
digitized at 1:3,000 and another area at 1:24,000 then the details being observed will not
be consistent throughout the study area. Digital image processing on the other hand
classifies by pixel without involving scale issues. This is a major difference in the
methodology of classification. Digitizing at varying scales is both an advantage and
disadvantage compared to digital classification. If the view is zoomed in to the pixel
level, it is impossible to discern what the objects in the imagery are. A large variance
in scale can lead to inconsistency, but a small variance in digitizing scale can help the
digitizer to consider a feature's relationship to surrounding objects when determining
what the object is, unlike most per pixel digital classifications. By using a small variance
in digitizing scale for land use/land cover classification of panchromatic imagery, both
detail and consistency can be maintained while the expert knowledge of relationships and
contexts of features can be utilized.
This project used relatively small areas of interest. After examining the land use/land
cover classes from the beginning of the project to its conclusion, there were areas of the
initial digitizing that, on further analysis, could have been refined or changed, especially
in diverse areas containing many intricate changes in the landscape. There was a
tendency to generalize areas where the land use/land cover types are fragmented. This
tendency is most notable in the Ogden image in the southern half of the image where the
forested areas are broken up by water and grasslands. The initial digitizing was not
changed to reflect new perceptions of the land class areas on the imagery. Some of these
inconsistencies have an effect on the final accuracy of the digital classifications, as it was
apparent that at some points the digital classification was more correct than the visual
interpretation. This is a limitation of the research.
One of the major differences found in this research between the manual digitizing
classification and the digital image processing classification was the level of detail
achieved in the classifications. In the Ogden image the total number of polygons
digitized was 172 (Figure 16) and the total number of polygons digitized for the Salt
Lake City image was 331 (Figure 17). The digital classifications in comparison before
post processing yielded several thousand polygons. After post processing, most digital
image classifications still contained more polygons than the digitized baseline, but results
averaged about 500-1000 polygons. It was a difficult task to digitize very detailed areas
on the imagery. This study has shown that by utilizing digital image processing
techniques to help facilitate visual interpretation of land use/land cover classes, the
analyst can take advantage of the detail and repeatability that digital processes provide
while improving the classification accuracy by post-processing the results in a GIS.
Results using visual interpretation and heads up digitizing may provide more initial
accuracy, but digital image processing lends some added consistency to the process.
4.2. Unsupervised Classification
Supervised and unsupervised classification results varied depending on the image, the
classification method, pre-processing, and post-processing. Panchromatic imagery
presents many challenges as previously mentioned in this study. The heterogeneity of
the study area also has an effect on how successful classification is.
Figure 16 - Classification using visual interpretation of the Ogden image
Figure 17 - Classification using visual interpretation of the Salt Lake City image
Although unsupervised classification showed low accuracy in both study areas, the
results showed some important trends in the data. In the Ogden image it was very
difficult to extract more than five classes, which indicated that land cover types
such as water, medium fields and grassland are very similar. Panchromatic imagery
would require more pre- and post-processing to achieve a more accurate classification
using eight land cover types. As the classes were aggregated into larger parent classes,
the classification accuracy increased accordingly. Unsupervised classification even on a
small study area such as this was more time consuming than supervised classification and
provided somewhat unsatisfactory results.

The Salt Lake City image proved difficult in a different way in that the mixed urban
area consisted of commercial, residential, and transportation areas that appear very
distinct using visual interpretation but present difficulties for digital classifiers. Urban
areas are uniquely difficult to classify on multispectral imagery, as there is such a
mixture of impervious surfaces.

Figure 18 - Ogden image ISODATA classifications (10, 25, and 100 spectral classes)

Black and white high spatial resolution imagery complicates this situation, as there was
an extreme overlap between classes, because features such as
buildings and mixed surfaces like parking lots and vegetation exist in both residential and
commercial areas making it difficult to distinguish these areas. None of the ISODATA
classifications of the Salt Lake City imagery were able to distinguish all five
detail-level land cover types. Trees, transportation, and commercial land cover types
were the only three land cover types that could be classified from the 10, 25, and 100
spectral class ISODATA classifications (Figure 19). Many areas of overlap exist
between the commercial and transportation classes in all three unsupervised
classifications. The transportation network in this image is a very distinct linear feature
when classifying the imagery through visual interpretation, but there are many tonal
variations in the pavement, which cause a great deal of confusion for most traditional
unsupervised classifiers. Grass and residential land cover types could not be
distinguished from commercial, transportation, and trees, as there was considerable tonal
overlap between these areas.
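Why tonal overlap defeats a per-pixel unsupervised classifier can be seen with a minimal 1-D K-means over gray values (the DN lists below are simulated, not taken from the study imagery): tones from different land cover classes that fall in the same DN range end up in the same cluster no matter how distinct the features look to a human interpreter.

```python
def kmeans_1d(values, k, iterations=20):
    """Minimal 1-D K-means on gray-level (DN) values. A per-pixel
    unsupervised classifier sees only tone, so overlapping DN ranges
    collapse into one cluster."""
    # crude initialization: evenly spaced values from the sorted list
    centers = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# simulated DNs: transportation and commercial tones overlap heavily
transportation = [120, 125, 130, 135]
commercial     = [128, 133, 170, 175]
centers = kmeans_1d(transportation + commercial, k=2)
assign = lambda v: min(range(2), key=lambda i: abs(v - centers[i]))
same_cluster = assign(125) == assign(128)   # tones from different classes
```

The transportation tone 125 and the commercial tone 128 land in the same cluster, while only the bright commercial tones (170, 175) separate out, mirroring the commercial/transportation confusion described above.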
A 10 and 25 spectral class K-Means unsupervised classification was performed on the
Ogden imagery using a layer stacked image consisting of the original image and the
following texture characteristics: mean, variance, and homogeneity. Surprisingly, the use
of texture did not improve the unsupervised classification using the level 1 land use/land
cover types. Overall accuracy was 25% for 10 classes and 34% for 25 classes. This is
most likely due to the fact that there was little to no distinction between the field areas as
most of them exhibit a smooth surface. Also, the field areas and the water areas were
confused as well.

Figure 19 - Salt Lake City image ISODATA classifications (10, 25, and 100 spectral classes)

Aggregating the classification into the more generalized classes increased accuracy
significantly in the unsupervised classification. This was particularly apparent in the
texture image. Accuracy increased to 54% for the level 2 classification (3
land use/land cover types classification scheme) and increased to 71% for the level 3
classification (2 land use/land cover types classification scheme). The Halounova image,
which consisted of texture and filtered layers, did not provide improvement for the Ogden
image level 1 classification scheme using unsupervised classification, but did slightly
improve the Salt Lake City level 1 overall accuracy. Due to the poor accuracy results
using texture and unsupervised classification, no further analysis was performed in either
study area.
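The accuracy gain from aggregation can be demonstrated with a toy rescoring. The parent-class mapping below is hypothetical (the thesis's actual level 2 and level 3 schemes are not reproduced here); the point is that confusions among sibling classes become agreements once labels are generalized:

```python
# Hypothetical parent-class mapping for aggregating detailed classes.
LEVEL2 = {
    "water": "water",
    "dark field": "field", "medium field": "field", "light field": "field",
    "grassland": "vegetation", "forest": "vegetation",
}

def accuracy(truth, predicted, remap=None):
    """Overall accuracy, optionally after remapping both label lists into
    more general parent classes. Aggregation turns inter-field and
    grass/forest confusions into agreements, which is why coarser
    classification schemes score higher."""
    if remap:
        truth = [remap[t] for t in truth]
        predicted = [remap[p] for p in predicted]
    hits = sum(t == p for t, p in zip(truth, predicted))
    return hits / len(truth)

truth     = ["dark field", "medium field", "grassland", "water"]
predicted = ["medium field", "light field", "forest", "water"]
level1 = accuracy(truth, predicted)            # detailed classes
level2 = accuracy(truth, predicted, LEVEL2)    # aggregated classes
```

Only one of four detailed labels matches (25%), but after aggregation every mismatch was between sibling classes, so accuracy rises to 100%; the real level 2 and level 3 gains reported above follow the same mechanism, if less dramatically.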
4.3. Supervised Classification
Supervised classification of panchromatic imagery again presents many challenges.
The SVM classifier was used to perform the supervised classification, as it can process
single band imagery and it provided better results. The supervised classifiers
available in ENVI are limited when using single band data as many options such as
maximum likelihood, spectral angle divergence, and neural net all require more than one
band of data.
Table 4 - Training sample statistics from original Ogden image

Land Cover Type       Min   Max    Mean   StDev   Points
Water                 129   177   151.6    13.1     1611
Forest                  0   162    70.3    33.7     3246
Grassland              81   197   129.9    13.8     1944
Dark Field             53   102    78.2    12.0     2706
Medium Field           94   150   124.1    13.5     5056
Light Field           148   194   179.0     6.5     1918
Impervious Surface     74   223   170.7    28.3      732
Bare Earth            180   227   197.8     7.5     1393
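The overlap in these training statistics can be quantified directly. Using the min/max values from Table 4, the width of the shared DN interval between two classes measures how much of the gray-level axis a single-band classifier cannot use to separate them:

```python
# Training sample gray-level ranges (Min, Max) from Table 4, Ogden image.
ranges = {
    "Water":              (129, 177),
    "Forest":             (0, 162),
    "Grassland":          (81, 197),
    "Dark Field":         (53, 102),
    "Medium Field":       (94, 150),
    "Light Field":        (148, 194),
    "Impervious Surface": (74, 223),
    "Bare Earth":         (180, 227),
}

def overlap(a, b):
    """Width of the shared DN interval between two class ranges; a positive
    value marks gray levels where the two classes cannot be separated on
    tone alone."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0, hi - lo)

water_vs_grass = overlap(ranges["Water"], ranges["Grassland"])
```

Water's range (129-177) lies entirely within Grassland's (81-197), a 48-DN overlap, which is consistent with the water/grassland confusion reported for the Ogden image; by contrast, Dark Field and Bare Earth do not overlap at all.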
Training sample statistics were also compiled for the Salt Lake City image to illustrate
the overlap which occurs between commercial, residential, and transportation
classes throughout this image (Table 5). The histograms were either bimodal or
multimodal. Although the histogram for transportation approached a normal distribution,
there were still many peaks and valleys indicating variations in gray levels in the image
for this land cover type.
Supervised classification results showed that it was very difficult to extract more than
eight classes on the Ogden image and five classes on the Salt Lake City image. One of the
limitations of using panchromatic imagery for land use/land cover classification is that
the DN values that make up the signatures of many land use/land cover types overlap
significantly, causing confusion. Real world features may be difficult to identify without
taking into account their spatial context (Hung and Wu 2005). Land use/land cover types
may need to be generalized. For example, distinctions such as corn or wheat fields may
not be characterized using panchromatic imagery, but dark fields, light fields, or cropland
may be possible. The increased accuracy achieved when aggregating land use/land cover
types into the level 2 and level 3 classification schemes supports this conclusion. Training
samples tested with a greater number of pixels increased the confusion between classes