Joan Bie Dig Er Thesis


  • 8/12/2019 Joan Bie Dig Er Thesis


    THE USE OF DIGITAL IMAGE PROCESSING TO FACILITATE DIGITIZING LAND COVER ZONES FROM GRAY LEVEL AERIAL PHOTOS

    A THESIS PRESENTED TO THE DEPARTMENT OF GEOLOGY AND GEOGRAPHY

    IN CANDIDACY FOR THE DEGREE OF MASTER OF SCIENCE

    By JOAN M. BIEDIGER

    NORTHWEST MISSOURI STATE UNIVERSITY, MARYVILLE, MISSOURI

    April 2012


    DIGITAL IMAGE PROCESSING

    The Use of Digital Image Processing to Facilitate Digitizing

    Land Cover Zones from Gray Level Aerial Photos

    Joan Biediger

    Northwest Missouri State University

    THESIS APPROVED

    ____________________________Thesis Advisor, Dr. Ming-Chih Hung Date

    ____________________________Dr. Yi-Hwa Wu Date

    ____________________________Dr. Patricia Drews Date

    ____________________________Dean of Graduate School, Dr. Gregory Haddock Date


    The Use of Digital Image Processing to Facilitate Digitizing

    Land Cover Zones from Gray Level Aerial Photos

    Abstract

    Aerial imagery from the 1930s to the early 1990s was predominantly acquired using

    black and white film. Its use in remote sensing applications and GIS analysis is

    constrained by its limited spectral information and high spatial resolution. As a historical

    record and a basis for studying long-term land use/land cover change, this imagery is a valuable but

    often underutilized resource. Traditional classification of gray level aerial photos has

    primarily relied on visual interpretation and digitizing to obtain land cover classifications

    that can be used in a GIS. This is a time-consuming and labor-intensive process that can

    often limit the scale of analysis.

    This research focused on the use of digital image processing to facilitate visual

    interpretation and heads-up digitizing of gray level imagery. Existing remote sensing

    software packages have limited functionalities with respect to classifying black and white

    aerial photos. Traditional image classification alone provides limited results when

    determining land cover types derived from gray level imagery. This research examined

    approaching classification as a system which uses digital image processing techniques

    such as filtering, texture analysis and principal components analysis to improve

    supervised and unsupervised classification algorithms to provide a base for digitizing

    land cover types in a GIS. Post processing operations included smoothing the

    classification result and converting it to a vector layer that can be further refined in a GIS.


    Software tools were developed using ArcObjects to aid the process of refining the vector

    classification. These tools improve the usability and accuracy of the digital image

    processing results that help facilitate the visual interpretation and digitizing process to

    gain a usable land use/land cover classification from gray level imagery.


    TABLE OF CONTENTS

    ABSTRACT
    LIST OF FIGURES
    LIST OF TABLES
    ACKNOWLEDGMENTS
    CHAPTER 1: INTRODUCTION
        1.1 Research Objective
    CHAPTER 2: LITERATURE REVIEW
        2.1 Historical Aerial Imagery Uses and Importance
        2.2 Classification Problems of High Resolution Panchromatic Imagery
        2.3 Statistical Texture Indicators
        2.4 Image Enhancements and Filtering
        2.5 Image Segmentation and Object-based Image Analysis
    CHAPTER 3: CONCEPTUAL FRAMEWORK AND METHODOLOGY
        3.1 Description of Study Area
        3.2 Description of Data
        3.3 Methodology
            3.3.1 Conceptual Overview
            3.3.2 Software Utilized
            3.3.3 Preliminary Image Processes
            3.3.4 Unsupervised Classification
            3.3.5 Supervised Classification
            3.3.6 Image Enhancement and Texture Analysis
            3.3.7 Object-based Image Analysis
            3.3.8 Post Processing and Automation
            3.3.9 Accuracy Assessment
    CHAPTER 4: ANALYSIS RESULTS AND DISCUSSION
        4.1 Manual Digitizing
        4.2 Unsupervised Classification
        4.3 Supervised Classification
        4.4 Image Enhancements and Texture Analysis
        4.5 Object-based Image Analysis
        4.6 Post Processing and Automation
        4.7 Classification Accuracy and Results
    CHAPTER 5: CONCLUSION
        5.1 Limitations of the Research
        5.2 Potential Future Developments
    APPENDIX 1: ERROR MATRIX TABLES
    APPENDIX 2: VECTOR EDITING TOOLBAR .NET CODE
    REFERENCES


    LIST OF FIGURES

    Figure 1 - Aerial photo of Ogden study area
    Figure 2 - Overview of study areas in relationship to the state of Utah
    Figure 3 - Aerial photo of Salt Lake City study area
    Figure 4 - Ogden DOQQ study area
    Figure 5 - Salt Lake City MDOQ study area
    Figure 6 - Main workflow processes
    Figure 7 - Ogden dendrogram of ISODATA clustering 10 classes
    Figure 8 - Ogden dendrogram of ISODATA clustering 25 classes
    Figure 9 - Ogden dendrogram of ISODATA clustering 100 classes
    Figure 10 - Distances between classes from Salt Lake City dendrogram
    Figure 11 - Training sample distribution for the Ogden image
    Figure 12 - Training sample distribution for Salt Lake City image
    Figure 13 - Unstretched images compared to contrast stretched images
    Figure 14 - Post processing ArcGIS Model
    Figure 15 - Polygon raster to vector, smoothing, and smooth simplify
    Figure 16 - Classification using visual interpretation of the Ogden image
    Figure 17 - Classification using visual interpretation of the Salt Lake City image
    Figure 18 - Ogden image ISODATA classifications
    Figure 19 - Salt Lake City image ISODATA classifications
    Figure 20 - Minimum distance and support vector machine classification of the Salt Lake City image
    Figure 21 - Minimum distance classification of the Ogden image with high pass filter
    Figure 22 - Minimum distance classification of the Ogden image with low pass filter
    Figure 23 - ISODATA 10 spectral classes Halounova image
    Figure 24 - SCRM object-based segmentation images
    Figure 25 - Ogden object-based classification image and post processing system vectors
    Figure 26 - Salt Lake City object-based classification image and post processing system vectors
    Figure 27 - Ogden pixel based classification and post processing system vectors


    LIST OF TABLES

    Table 1 - First level classification Ogden land use/land cover classes
    Table 2 - First level classification Salt Lake City land use/land cover classes
    Table 3 - ISODATA overall accuracy results for Ogden and Salt Lake City study areas
    Table 4 - Training sample statistics from original Ogden image
    Table 5 - Training sample statistics from original Salt Lake image
    Table 6 - Ogden image overall accuracy and level 1 completion time
    Table 7 - Salt Lake City image overall accuracy and level 1 completion time
    Table 8 - User's accuracies for individual land use/land cover types Ogden study area
    Table 9 - User's accuracies for individual land use/land cover types Salt Lake City study area
    Table 10 - Overall accuracy ranges for classification groups


    ACKNOWLEDGMENTS

    I would like to thank Dr. Ming-Chih Hung for chairing my thesis committee and for

    all the support, encouragement and guidance he has given me along the way. I would

    also like to thank Dr. Yi-Hwa Wu and Dr. Patricia Drews for serving on my thesis

    committee and for their contributions in developing this thesis. Last but certainly not

    least I would like to thank my husband Barry for encouraging me through many long

    nights and weekends while I completed this work. Without your support and love I

    would never have been able to finish this thesis.


    CHAPTER 1: INTRODUCTION

    Aerial imagery from the 1930s to the present is a primary data source used to study

    many natural processes and land use patterns (Carmel and Kadmon 1998, Kadmon and

    Harari-Kremer 1999). Early aerial imagery from the 1930s to early 1990s is

    predominantly black and white (panchromatic) film photography meaning there is only

    one band of data. This type of imagery contains limited spectral information unlike

    today's satellite digital sensors, which offer more spectral information even in the

    panchromatic band.

    The Aerial Photography Field Office (APFO) is a division of the Farm Service

    Agency (FSA), of the United States Department of Agriculture (USDA). The APFO,

    located in Salt Lake City, Utah, has one of the nation's largest collections of historical

    aerial imagery dating back to the 1950s. Film from the 1930s through 1940s was sent to

    the National Archives. APFO has over 50,000 rolls of film of which over 60% is black

    and white (Mathews 2005). This historical aerial imagery is a valuable, largely untapped

    resource. The film format of the imagery makes it unavailable to GIS and imagery

    analysis programs unless it is scanned and processed to digital format. There is

    widespread interest from the public and other government agencies in making this

    imagery available and usable in digital format.

    Recently, more historical imagery from the 1950s to 1990s is being scanned to digital

    format for use in change detection projects for the Farm Service Agency (FSA).

    According to Brian Vanderbilt (personal communication, 01 Sep 2009) FSA is interested

    in studying agricultural loss patterns over long periods so that processes of change can be

    more fully understood. One of the challenges with these types of projects is that land


    use/land cover classification with the imagery usually involves visual interpretation and

    manual digitizing, due to the difficulty of using digital image processing techniques with

    the historical panchromatic imagery. Manual digitizing is a time-consuming process for

    multiple years of imagery, as each photo requires its own analysis. There are not enough

    image analysts within FSA to manage the increasing workload for projects requiring the

    use of historical imagery. Another concern is that study areas are limited in scale because

    of the time and resources needed to digitize land cover types on the imagery. There is

    interest and need to explore digital options for land cover classification so that the use of

    these historical imagery datasets can be expanded.

    The ability to facilitate digitizing of land cover types on historical aerial imagery

    would make it a more usable resource to study long term land use/land cover changes.

    Classification of this type of imagery is very labor intensive which often limits the size of

    study areas. If the imagery could be utilized on a broader scale, we could gain a greater

    historical perspective on changes such as agricultural loss over time. Increased accuracy

    and repeatability of results obtained by using digital image processing could make the

    results of long-term change detection projects more valid than results that depend on the

    varying interpretation skills of the several image analysts a project may require.

    Historical aerial imagery offers a unique opportunity to study long-term patterns of

    land use/land cover change by offering the analyst a more extensive historical perspective

    on geographic processes such as land use/land cover change, urban expansion and

    vegetation patterns (Kadmon and Harari-Kremer 1999, Awwad 2003, Alhaddad et al.,

    2009). Producing a thematic map through image classification is one of the most


    common imagery analysis tasks in remote sensing. Image classification techniques such

    as unsupervised and supervised classification, NDVI, spectral signatures, and spectral

    band combinations have limited usability with panchromatic aerial imagery as they rely

    heavily on spectral information, which is limited with this type of imagery.

    Visual interpretation of imagery does not rely on spectral information alone to classify

    imagery. Visual interpretation makes use of scene qualities such as texture, shape,

    arrangement of objects and context of elements in an image. The human visual system is

    very efficient at pattern recognition and in many ways is superior to existing machine

    processing methods, but on the other hand inherent subjectivity and the inability of the

    eye to extract complex patterns can limit interpretation. Digital image processing

    techniques that incorporate the use of texture, tone, shape, pattern recognition and object-

    based image analysis can be used to enhance traditional methods of supervised and

    unsupervised classification especially with gray level aerial imagery (Caridade et al.,

    2008).

    A great deal of research has been done on the most effective ways of classifying

    multispectral imagery and mapping the results (Jensen, 2005). There is relatively little

    research on how digital image processing of historical panchromatic imagery can

    improve or reduce manual interpretation for image analysis and GIS analysis. In this

    thesis research, digital image processing techniques including texture analysis,

    convolution filters, and object-based image analysis were considered in respect to how

    they can improve the classification of panchromatic aerial imagery and how this

    improvement can facilitate digitizing and in some cases possibly eliminate it. A post

    processing system involving image smoothing, raster to vector conversion, polygon


    smoothing and simplification, and custom polygon editing tools for use in ESRI's

    ArcMap GIS software was used to improve an initial digital image classification. The

    post processing system can be used to improve most digital image classifications. The

    quality of the baseline land use/land cover classification was the main factor in how

    efficient it was to create a usable thematic layer.

    Continued study in this area could yield new approaches to land cover classification of

    gray level imagery. If historical imagery has the ability to be used effectively in a digital

    environment, then more of it may be scanned and become more readily available, which

    would benefit the geospatial community.

    1.1 Research Objective

    The objective of this project is to establish a working model that utilizes digital image

    processing to facilitate or assist the user with digitizing land cover zones from gray level

    aerial photos. This study approaches the problem of digitizing land cover zones by first

    classifying the aerial photo and then by establishing a post processing system employing

    vector layers for use in a GIS.

    There is limited research available in using digital image processing to enhance the

    classification process of gray level aerial photos and the digitizing process. Digital image

    processing may not be able to completely replace visual interpretation of this type of

    imagery, but it may be able to make the process more efficient.


    CHAPTER 2: LITERATURE REVIEW

    2.1 Historical Aerial Imagery Uses and Importance

    Historical imagery, as the term is used in this study, refers to imagery acquired by an aerial

    camera mounted in an airplane. The photography has been directly imaged onto film and

    is also referred to as analog photography as opposed to modern digital imagery. This

    historical imagery is black and white and may be referred to as either panchromatic or

    gray level.

    Black and white, gray level, and panchromatic are terms which refer to imagery

    composed of shades of gray. The imagery used in this study has a pixel depth of 8 bits,

    where the binary representation assumes that 0 is black and 255 is white. Between 0 and

    255, raw pixel values are grayscale and the digital numbers correspond to different levels

    of gray. For example, a digital number of 127 will correspond to a medium gray in the

    photo. This panchromatic imagery has a single band where digital numbers represent the

    spectral reflectance from the visible light range. Historical panchromatic imagery

    contains brightness values but has limited spectral information available in the visible

    wavelengths (0.4-0.7 µm), unlike the panchromatic band of a satellite image such as

    Landsat 7, which generally is sensitive into the near infrared wavelengths (0.52-0.9 µm)

    (Hoffer 1984).

    Historical aerial photographs are a valuable and important data source for studying

    long term (20-80 year) change processes such as land use/land cover change and

    vegetation and environmental dynamics. These historic photos present a snapshot in time

    that may offer insight into the current state of land use/land cover change processes and

    what patterns may have affected their growth and stability. Much of the imagery


    available for long-term analysis is black and white aerial photography (Carmel and

    Kadmon 1998, Hudak and Wessman 1998, Caridade et al. 2008). The historical record

    that has been captured from aerial photography provides a long temporal history to work

    with and provides an extensive frame of reference in which to assess the magnitude of

    land use/land cover change. Advances in GIS, photogrammetry, image analysis, and

    digital image processing have increased the potential to use historical aerial photography

    for many types of change analysis including land use/land cover change (Okeke and

    Karnieli 2006).

    Land cover maps derived from gray level historical aerial photos are generally

    created through techniques such as visual interpretation and manual digitizing (Carmel

    and Kadmon 1998, Kadmon and Harari-Kremer 1999). This is a very time-consuming

    and labor-intensive process, which tends to limit analysis to small areas. The

    digitizing itself is generally dependent on the ability of the interpreter and may lead to

    results that are not objective due to skill level and human bias (Kadmon and

    Harari-Kremer 1999). The assumption is often made that manual interpretation is 100%

    accurate but assessing the accuracy of this method is difficult according to Congalton and

    Green (1993) and Carmel and Kadmon (1998).

    2.2 Classification Problems of High Resolution Panchromatic Imagery

    The historical aerial imagery analyzed in this project is limited in spectral information

    and has high spatial detail. These two variables can present some difficulties with the use

    of common digital classification and image processing techniques. The first challenge is

    the spectral resolution, which is only one band. This band lacks detailed spectral

    information. Most panchromatic aerial films are sensitive to the visible spectrum but also


    require filtering to take into account haze and atmospheric conditions. The film is

    generally exposed through a filter to green and red visible wavelengths but not blue

    wavelengths, to cut down on atmospheric haze. The resulting image records in black and

    white the tonal variations of the landscape in the scene (U.S. Army Corps of Engineers

    1995). Common classification methods are limited in accuracy and usability when there

    is only one band to work with (Short and Short 1987, Anderson and Cobb 2004, Caridade

    et al. 2008).

    Kadmon and Harari-Kremer (1999) and Carmel and Kadmon (1998) have approached

    the limitations of having only one band of information to analyze in

    several ways. Carmel and Kadmon (1998) used a combination of illumination

    adjustment and a modified maximum likelihood classifier that used neighborhood

    statistics to achieve classification accuracies of over 80% for study of long-term

    vegetation patterns using gray level aerial imagery. This research showed that the

    relationship between neighborhood pixels was an important factor in achieving improved

    classification accuracy. Kadmon and Harari-Kremer (1999) concentrated on training data

    and ancillary data to produce vegetation maps from black and white aerial photos from

    1962 and 1992. The accuracy of using a maximum likelihood classifier was about 80%.

    Their study stresses the importance of carefully considered training data and the utility of

    digital image processing of historical aerial photography in vegetation change detection

    studies. Mast et al. (1997) researched long-term change detection of forest ecotones

    using gray level aerial imagery from 1937-1990. Density slicing was used after

    determining the range of brightness values for tree cover across all imagery to get a

    classification of tree cover and non-tree cover. Results were satisfactory although no accuracy


    assessment was mentioned, but again the significance of object brightness values for gray

    level imagery was established.
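    Density slicing of the kind described above amounts to thresholding a single band on a range of brightness values. The following is a minimal Python sketch of that idea, not the procedure used in any of the cited studies. The function name density_slice, the sample patch, and the 0-60 brightness range are illustrative assumptions; a real range would have to be measured from the imagery.

```python
import numpy as np

def density_slice(image, low, high):
    """Return a boolean mask that is True where low <= pixel value <= high."""
    image = np.asarray(image)
    return (image >= low) & (image <= high)

# Hypothetical 8-bit gray level patch (0 = black, 255 = white).
patch = np.array([[30,  45, 120, 200],
                  [40,  50, 130, 210],
                  [35, 140, 150, 220],
                  [60, 135, 145, 230]], dtype=np.uint8)

# Assumed brightness range for dark tree canopy (illustrative only).
tree_mask = density_slice(patch, 0, 60)

# Fraction of the patch assigned to the "tree cover" class.
tree_fraction = tree_mask.mean()
```

    Because the slice depends only on brightness, any other feature that falls in the same gray level range would be assigned to the same class, which is the single-band limitation discussed in this section.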

    The second challenge when analyzing this imagery is that higher spatial resolution

    does not generalize features to the degree coarse or medium scale imagery does, which

    allows much more detail to be considered in an image. Individual trees, buildings and

    sidewalks become visible when image detail is more perceptible in these 1-meter

    resolution images. This factor makes visual interpretation easier but can cause problems

    with automated classification, especially when spectral information is limited or

    non-existent. High spatial resolution can increase within-class variances, which can cause

    uncertainty between classes. Browning et al. (2009) in their study of historical aerial

    imagery as a data source emphasized the importance of object scale when analyzing

    imagery. Some objects may be larger than a pixel, referred to as H-resolution, and some

    objects may be smaller than a pixel, which is referred to as L-resolution. This factor can

    make it more difficult to get consistent classification results across a scene when the

    imagery contains objects at multiple scales. Spatial autocorrelation is also an important factor when

    considering this concept, as all natural scenes in remote sensing will have some type of

    spatial autocorrelation to create a scene, so that the image organization is something other

    than random noise (Strahler et al. 1986).

    The challenges of limited spectral information and high spatial detail can lead to a

    number of features in an image having similar gray level signatures and a great deal of

    confusion between class types (Fauvel and Chanussot 2007). In turn a per pixel classifier

    such as the maximum likelihood classifier has difficulty distinguishing between a

    medium gray field and water in a panchromatic image. Panchromatic image


    classification can be improved by considering the relationship between neighborhood

    pixels as in texture analysis and object-based image analysis (Alhaddad et al. 2009,

    Myint and Lam 2005, and Caridade et al., 2008).

    2.3 Statistical Texture Indicators

    Image texture is one of the most important visual indicators in distinguishing between

    homogeneous and heterogeneous regions in a scene. The human interpreter uses shape,

    texture, size, pattern, shadow, arrangement and context of elements in an aerial photo to

    distinguish between objects in the image (Campbell 2008). According to Tuceryan and

    Jain (1998) texture is easy to discern in an image but it can be a difficult concept to

    define and there is not one generally accepted definition. One way to define texture is to

    consider it as the spatial variation of the intensity values in a region of an image

    (Tuceryan and Jain 1998). This regional variation in intensity values implies that the

    evaluation of texture is a neighborhood process and that a single pixel does not create

    texture on its own.

    Texture is also a quality of an image scene that corresponds to a pattern that is part of

    the structure of the image. In a natural scene an area of farmland and a forested area

    comprise two separate visual patterns in separable regions. These regions may also

    contain secondary patterns having characteristics such as brightness, shape, size, etc. A

    field may also have a planting pattern and a forest may be comprised of deciduous and

    coniferous trees giving the area a distinctive sub pattern that has its own brightness,

    shape, size, etc. (Srinivasan and Shobha 2008). Texture as a property of an object or

    regional feature in an image can be described as fine, smooth, coarse, etc. Tone is the

    range of shades of gray in an image. According to Haralick (1979), tone and texture are


    interdependent concepts in that both are always present in an image to varying degrees.

    This interrelationship between tone and texture is explained by Haralick (1979) as

    patches in an image that either have little variation in tonal primitives (tone) or a patch

    that has a great variation of tonal primitives (texture).

    The work of Haralick et al. (1973) was the foundation for most of the later research

    relating to image texture analysis. Their work provided a computational method to

    determine textural characteristics in an image scene and discussed several widely used

    textural statistics used in image texture recognition. These statistics included: contrast,

    correlation, angular second moment, inverse difference moment and entropy. Contrast

    measures the amount of local variation in an image. Correlation measures the linear

    dependency of gray levels in the image. Angular second moment measures local

    homogeneity. Inverse difference moment also measures local homogeneity but relates

    inversely to contrast. Entropy measures randomness of values. Image analysis may be

    performed using these measures either alone or in combination.
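    The five statistics listed above can be illustrated with a small Python sketch. This is one common formulation, offered as an assumption rather than the implementation used in this study or in any particular software package. The function names glcm and texture_statistics, the single horizontal offset, the symmetric counting, and the base-2 logarithm for entropy are all illustrative choices.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=256):
    """Symmetric, normalized gray level co-occurrence matrix for one pixel offset."""
    image = np.asarray(image, dtype=int)
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=float)
    # Visit every pixel whose neighbor at (row + dy, col + dx) lies inside the image.
    for r in range(max(0, -dy), min(rows, rows - dy)):
        for c in range(max(0, -dx), min(cols, cols - dx)):
            i, j = image[r, c], image[r + dy, c + dx]
            counts[i, j] += 1
            counts[j, i] += 1  # count the pair in both directions
    return counts / counts.sum()

def texture_statistics(p):
    """Five Haralick-style statistics computed from a normalized GLCM p."""
    i, j = np.indices(p.shape)
    mean_i, mean_j = (i * p).sum(), (j * p).sum()
    std_i = np.sqrt(((i - mean_i) ** 2 * p).sum())
    std_j = np.sqrt(((j - mean_j) ** 2 * p).sum())
    nonzero = p[p > 0]
    # Correlation is undefined for a perfectly uniform patch (zero variance).
    if std_i > 0 and std_j > 0:
        correlation = ((i - mean_i) * (j - mean_j) * p).sum() / (std_i * std_j)
    else:
        correlation = float("nan")
    return {
        "contrast": ((i - j) ** 2 * p).sum(),
        "correlation": correlation,
        "angular_second_moment": (p ** 2).sum(),
        "inverse_difference_moment": (p / (1 + (i - j) ** 2)).sum(),
        "entropy": -(nonzero * np.log2(nonzero)).sum(),
    }

# Two hypothetical 2-level patches: a flat one and a checkerboard.
flat = np.zeros((4, 4), dtype=int)
checker = np.indices((4, 4)).sum(axis=0) % 2

flat_stats = texture_statistics(glcm(flat, levels=2))
checker_stats = texture_statistics(glcm(checker, levels=2))
```

    In this sketch the flat patch gives zero contrast and the maximum angular second moment, while the checkerboard gives higher contrast and entropy, which matches the qualitative descriptions of the measures above.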

    There are three main approaches to texture analysis. These approaches include

    statistical, spectral and structural. Statistical methods are based on local statistical

    parameters such as the co-occurrence matrix and variability within moving windows.

    Spectral methods include analysis using the Fourier transform and structural methods

    emphasize the shape of image primitives (Srinivasan and Shobha 2008). This study

    utilized statistical methods to include the co-occurrence matrix, the occurrence measures

    and moving windows. By evaluating the spatial distribution of gray values using

    statistical methods, a set of statistics can be derived from the distributions of neighboring

    features throughout the image. There are first order and second order texture statistics.


    First order statistics such as mean, standard deviation, and variance analyze pixel

    brightness values without analyzing the relationships between the pixels. Second order

    statistics on the other hand analyze the relationships between two pixels and these

    measures include contrast, dissimilarity, homogeneity, entropy, and angular second

    moment (Srinivasan and Shobha 2008). First order and second order statistics are used in

    this study as a method to improve the classification accuracy of panchromatic aerial

    photos.
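    First order (occurrence) statistics within a moving window can be sketched in a few lines of Python. The sketch below assumes NumPy 1.20 or later for sliding_window_view. The function name window_stats, the 3 x 3 window, and the sample image are illustrative assumptions, not the settings used in this study. Only windows that fit entirely inside the image are evaluated, so the outputs are smaller than the input.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_stats(image, size=3):
    """Mean, variance and standard deviation of each size x size window."""
    windows = sliding_window_view(np.asarray(image, dtype=float), (size, size))
    mean = windows.mean(axis=(-2, -1))
    variance = windows.var(axis=(-2, -1))
    return mean, variance, np.sqrt(variance)

# Hypothetical image with a vertical edge: dark on the left, brighter on the right.
edge = np.zeros((4, 4))
edge[:, 2:] = 10

mean, variance, std = window_stats(edge, size=3)
```

    These measures describe how brightness values are distributed inside each window but say nothing about how pixels are arranged relative to one another, which is what the second order statistics add.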

    The analysis of texture is a technique that has been used to aid and increase

    classification accuracy in both gray level image analysis and multispectral analysis.

    Haralick et al. (1973) conducted the first major study of texture as an imagery analysis

    tool. They demonstrated the utility of the Gray Level Co-occurrence Matrix (GLCM) as

    an analysis tool for panchromatic aerial photographs and multispectral imagery even

    though computer processing constraints of the time hindered their study. The

    classification accuracy in their study was 82% for the panchromatic aerial imagery.

    Caridade et al. (2008) used the GLCM and a variety of moving window sizes to achieve

    an overall classification accuracy of black and white aerial photos of 83.4% using four

    land cover classes. The GLCM tabulates the frequency of pixel pairs of gray levels in the

    image, and statistics such as dissimilarity, angular second moment, homogeneity, contrast,

    and entropy are derived from it. Caridade et al. (2008) also discuss the variation

    of land cover type accuracies throughout an image. Their study shows that certain land

    cover types such as water may achieve accuracy levels of 100% while others such as bare

    ground are much lower at 76.5%. Cots-Folch et al. (2007) used the GLCM to train a

    neural network classifier but the highest accuracy obtained was only 74%. Their study


    stated that better training data and ancillary data sources could be used to improve the

    results. Maillard (2003) compared the GLCM to semi-variogram and Fourier spectra

    methods and found that the GLCM works better in areas where textures are easily

    distinguished and the semi-variogram is better in areas where texture is more similar.

    The Fourier method was less successful than either of the other two methods. Alhaddad

    et al. (2009) found that the GLCM and mathematical morphology produced results which

    were closer to visual interpretation than other texture analysis methods.

    One of the main utilities of texture analysis as it applies to improving the classification

    of panchromatic imagery in particular is that it increases the dimensionality of the

    imagery from one band to multiple bands. A new band is created for each texture

    function. This increased dimensionality can help alleviate some of the problems of class

    separability that arise when trying to classify historical aerial photos (Halounova 2009).

    Halounova used a combination of texture, filtering and object oriented classification to

    achieve overall accuracy levels between 89% and 92%. Their methodology of increasing

    the dimensionality of panchromatic imagery to try to achieve more separability between

    land use/land cover classes was an important influence on this thesis research.
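    The idea of adding one band per texture function can be sketched as follows. This is a simplified Python illustration, assuming NumPy 1.20 or later, and is not Halounova's method or the workflow of this thesis. The function name add_texture_bands, the choice of local mean and variance as the two texture functions, the 3 x 3 window, and the reflect padding are all assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def add_texture_bands(image, size=3):
    """Stack the original band with local mean and variance bands: (rows, cols, 3)."""
    image = np.asarray(image, dtype=float)
    pad = size // 2
    # Reflect padding keeps the output the same size as the input.
    padded = np.pad(image, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    local_mean = windows.mean(axis=(-2, -1))
    local_variance = windows.var(axis=(-2, -1))
    return np.dstack([image, local_mean, local_variance])

# Two hypothetical patches with the same average gray level (128):
# a smooth surface and a rough, checkerboard-like surface.
smooth = np.full((6, 6), 128.0)
rough = 64.0 + 128.0 * (np.indices((6, 6)).sum(axis=0) % 2)

smooth_stack = add_texture_bands(smooth)
rough_stack = add_texture_bands(rough)

# The average brightness is identical, but the variance band separates the patches.
same_brightness = smooth.mean() == rough.mean()
variance_gap = rough_stack[..., 2].min() - smooth_stack[..., 2].max()
```

    Each pixel now carries a three-element feature vector instead of a single brightness value, so a classifier that expects multiple bands can be applied to what was originally a one-band image.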

    In areas of heterogeneous objects, the texture information in neighborhood pixels is a

    consideration. Common classification algorithms that rely on spectral information at the

    pixel level do not consider spatial information. This spatial information can become very

    important when trying to discern land cover types such as urban areas (Myint and Lam

    2005). Two types of analysis can assist the classification process: region-based analysis

    and window-based analysis. Region-based analysis involves image segmentation,

    while window-based analysis can be used in pre- or post-classification to filter noise from

    the results (Gong et al. 1992). The importance of the spatial aspect of texture analysis is

    illustrated in many studies involving texture analysis (Haralick 1973, Gong et al. 1992,

    Hudak and Wessman 1998, Myint and Lam 2005, Erener and Duzgun 2009, Pacifici et al.

    2009). This study used region-based analysis during object-based image analysis and

    window-based analysis through the GLCM.

    2.4 Image Enhancements and Filtering

    Texture analysis in combination with image pre-processing such as principal

    component analysis has been explored by Awwad (2003). His study, which utilized a

    1941 gray level photo, used texture analysis windows of different sizes and then

    combined the results to create an image with sixteen layers. Principal components

    analysis (PCA) was used to reduce the dimensionality of the resulting image. He

    combined several digital processing techniques but overall accuracy was only 58%.

    Much of the literature on using digital image processing techniques for classifying gray

    level aerial photos does not make use of multiple texture window sizes in combination to

    return a result. Even though examples are rare in the literature and accuracy was low as

    reported by Awwad (2003), the technique has promise. Halounova (2009, 2005) also

    combined several texture window sizes but used filtering and object oriented

    classification rather than PCA to achieve classification accuracies over 90%. Image

    enhancements such as filtering and texture add multiple channels to the one band

    panchromatic image and allow the image to be processed in a similar fashion to a

    multiple band image. There is room for more research using this type of methodology

    with different parameters and different pre- or post-processing results such as

    convolution filtering, edge detection and smoothing windows.

    Edge detection is another important consideration when trying to separate a scene into

    distinct objects. A natural scene such as an aerial photo does not necessarily have a clear

    relationship between an object and a background. Anderson and Cobb (2004) provided a

    new unsupervised hybrid classification algorithm based on edge detection and

    thresholding for pixel classification. Nearest edge thresholding outperformed both the

    maximum likelihood and ISODATA clustering classification schemes. Their study

    illustrated the importance of edge detection between features in gray level aerial photos.

    Li et al. (2008) also conducted research that concentrated on the importance of edge

    detection and shape characteristics. Their process was automated using ArcGIS Model Builder,

    and results were compared to manual digitizing, with the model correctly

    identifying 70% of the manual classifications. Hu et al. (2008) used grayscale

    thresholding for image segmentation and emphasized the importance of

    transition regions between objects in a scene and the ability to segment objects in an

    image. Transition regions between objects can be problematic when classifying complex

    scenes, as there can be multiple areas in the image with different gray scales between

    objects causing classification errors and a salt and pepper effect.

    Texture filters in combination with neural network classifiers are another methodology

    that has shown some success in land use/land cover classification of gray level aerial

    photos. Ashish (2002) used several artificial neural network (ANN) classifiers based on

    histograms, texture and spatial parameters with some success on 1993 gray level aerial

    photos. Textural parameters yielded the highest overall accuracy at 92%. His study

    further showed the importance of texture parameters for classification of gray level aerial

    photos. Another study conducted by Pacifici et al. (2009) used a neural network

    classifier and a simplification procedure with some success on the panchromatic bands of

    WorldView-1 satellite imagery. After a simplification procedure called network

    pruning was applied, texture was optimized and input features were

    reduced, producing classification accuracy above 90% as measured by the Kappa coefficient.

    Their study provided another example of how texture parameters can improve the

    classification accuracy of different types of classifiers using high resolution panchromatic

    imagery.

    2.5 Image Segmentation and Object-based Image Analysis

    Considering the high spatial resolution of gray level aerial photos and the lack of

    spectral information, object-based image analysis is another technique that has been

    successful in classifying high spatial resolution imagery. Object-based image analysis

    (OBIA) is a method of image analysis that uses objects in a scene rather than individual

    pixels to derive information from the imagery. OBIA is a two-part process consisting of

    image segmentation and then image classification. The image is first divided into

    homogeneous, adjacent regions during the segmentation phase, taking into account texture,

    region context, shape and spectral information. Image segmentation reduces the

    complexity of the image and produces regions that can in turn be

    considered meaningful to the image interpreter.

    OBIA was compared to pixel based classification in a study by Pillai and Wesberg

    (2005) using gray level aerial imagery from 1965 and 1995. Their study illustrated how

    scale dependency can affect classification results depending on the objects studied. Scale

    dependency of individual landscape elements can also affect the usefulness of texture

    parameters as illustrated in Resler et al. (2004). Change at the scale of individual trees

    was not statistically significant between pixel based classification and object-based

    classification. Object-based classification was more accurate when comparing patches of

    trees in high spatial-resolution panchromatic imagery. Their study illustrates the

    importance of determining land use categories and object scale when classifying imagery.

    Elmqvist et al. (2008) performed OBIA on the panchromatic band of an Ikonos image

    and found that spectral information provided the best segmentation results. Classification

    accuracies were fairly low for their study but outperformed pixel based classification.

    Laliberte et al. (2004) used a combination of low-pass filtering and object-based image

    analysis on gray level aerial photos, successfully integrating gray level aerial photos and

    satellite imagery in a change detection study. Middleton et al. (2008) successfully used

    feature extraction and a support vector machine (SVM) supervised classifier to extract

    features on a 1947 aerial image in a change detection study. One of the main conclusions

    of their study was that classification accuracy of the panchromatic image was based on

    image quality. Historic panchromatic imagery is not always of good quality due to age or

    deterioration of the film. A successful methodology for classifying this type of imagery

    needs to work across various levels of image quality.

    The literature regarding classification of gray level aerial photos concentrates for the

    most part on replacing manual digitizing with digital image processing techniques. There

    is a gap in the literature in regard to using digital image processing to help facilitate

    digitizing. By combining digital image analysis techniques such as texture and object-

    based image analysis with GIS vector capabilities, digitizing land cover classification

    zones can be enhanced and in some cases possibly eliminated.

    CHAPTER 3: CONCEPTUAL FRAMEWORK AND METHODOLOGY

    3.1 Description of the Study Area

    The study area for this project is near Ogden, Utah (Figure 1). The area is in north

    central Utah (Figure 2) and consists of a variety of land cover types including agricultural

    land, impervious surfaces, grassland, forest and water. The Ogden study area does not

    provide an example of dense urban land cover so a secondary area of interest was chosen

    in Salt Lake City, Utah (Figure 3). The Salt Lake City study area includes a park and a

    variety of residential and commercial land cover. By using two study areas with a variety

    of textures and objects in the scene, this research can show the usefulness of digital image processing across two completely different areas and images.

    The classification results concentrate on the Ogden imagery as this imagery has better

    defined and larger areas of land class types. The Salt Lake City image is used mainly to

    see how the same techniques can be used in an urban area. Urban areas have their own

    unique classification challenges that are increased when trying to classify panchromatic

    imagery. Another reason the Ogden image was the main focus of this research is that this

    imagery was originally flown for FSA for agricultural purposes. It is also likely that

    much of the historical imagery in the vault at APFO will be used to further study

    historical agricultural change processes.

    3.2 Description of Data

    The image of Ogden, Utah from 1958 was obtained from the Aerial Photography Field

    Office's internal imagery storage network. The Ogden study area was clipped from a

    digital orthophoto quarter quadrangle (DOQQ) 4111256ne from 1958 (Figure 4) and

    covers approximately 0.5 square miles. The image was scanned from black and white

    Figure 1 - Aerial photo of Ogden study area

    Figure 2 - Overview of study areas in relationship to the state of Utah

    and was scanned and orthorectified at APFO using the same parameters and methods as

    the 1958 Ogden imagery. Q1219_1977 is a mosaic that was created from original

    DOQQs using Socet Set 4x and interactive seaming. The image resolution is 1 meter and

    the bit depth is 8 bits.

    Figure 4 - Ogden DOQQ Study Area

    processing was completed, a number of digital image processing techniques were

    performed on the imagery (see Figure 6). The original imagery was classified using

    supervised and unsupervised classifiers to form the classification baseline information.

    Then four main digital image processing techniques were used to try to improve the

    classification. These four processes were: convolution filtering, texture analysis,

    principal components analysis, and object-based image classification. Texture analysis

    was used to create layer-stacked images which increased the dimensionality of the

    original one band image to improve classification results. Principal components analysis

    was used to decrease the dimensionality of the multiple layer texture images and in one

    case the first principal component image derived from the multi-layer texture image was

    layer stacked with the original one band image. The final digital image processing

    component in the research was image post-processing to refine the most promising results

    for GIS analysis. After image post-processing an accuracy assessment was completed to

    compare the results of each classification with the digitized baseline information obtained

    by visual interpretation (heads up digitizing).
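
    An accuracy assessment of this kind is usually summarized in an error matrix. The sketch below is an illustration only, not the procedure run in the image analysis software. It tabulates reference labels against classified labels and derives overall accuracy and the Kappa coefficient. The sample labels and function names are hypothetical.

```python
from collections import Counter

def error_matrix(reference, classified, classes):
    """Rows are reference (digitized) classes, columns are classified classes."""
    counts = Counter(zip(reference, classified))
    return [[counts[(r, c)] for c in classes] for r in classes]

def overall_accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def kappa(matrix):
    """Agreement beyond what the row and column totals would produce by chance."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical labels for ten sample pixels
classes = ["crop", "water", "forest"]
reference = ["crop", "crop", "crop", "crop", "water",
             "water", "forest", "forest", "forest", "forest"]
classified = ["crop", "crop", "crop", "forest", "water",
              "water", "forest", "forest", "crop", "forest"]
matrix = error_matrix(reference, classified, classes)
```

    With these ten samples, eight fall on the diagonal, so overall accuracy is 0.8, while Kappa is lower (0.6875) because part of that agreement is expected by chance.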

    3.3.2 Software Utilized

    There were three software programs used in this project as no single software suite

    available to me provided all the tools needed for this research. The imagery analysis

    programs used were ERDAS Imagine version 11.0, ENVI 4.8 and ENVI EX 4.8. The

    GIS software used is ArcMap 10.0. ERDAS Imagine has a good set of texture analysis

    and filtering tools. ENVI EX and ENVI have the benefit of integration with the GIS

    software and ENVI EX provided a wizard based feature extraction toolset for object-

    based image classification. The main interface used to provide the baseline land use/land

    Figure 6 - Main Workflow Processes

    cover zones to aid or facilitate the manual digitizing process is ArcMap 10 as this

    software has good vector tools, and the ability to integrate ENVI image analysis tools

    into ArcMap Model Builder.

    3.3.3 Preliminary Processes

    The study area was clipped from the original DOQQs using the ERDAS Imagine

    subset tool. The area covers approximately 0.5 square miles in both project areas to facilitate

    digitizing and image processing. Much of the image processing, including the use of

    convolution filters, texture analysis and classification methods, required trial and error to

    find the best settings and analysis methods for the imagery. The best results were

    analyzed further using post processing, vector conversion and editing.

    Heads up digitizing was performed on the Ogden and Salt Lake City imagery. This

    provided the digitized baseline information as ground truth to be used later in the

    classification accuracy assessment. Heads up digitizing was performed using ESRI's

    ArcMap 10.0 software. A geodatabase was created for both the Ogden imagery and the

    Salt Lake City imagery.

    One person performed the visual interpretation of the imagery for the sake of

    consistency. The interpreter has had eight years of work experience using photo

    interpretation to create a variety of map types for the Defense Mapping Agency (now the

    National Geospatial-Intelligence Agency). The digitizing times were recorded so that a comparison

    could be made between manual digitizing and digital image processing to determine the

    efficiency of digital image processing.

    The determination of land use classes was an important consideration as it had a great

    deal of impact in the final results of image classification especially for panchromatic

    imagery since so many land use/land cover types have similar digital number (DN)

    values. Classification schemes in previous studies using black and white aerial imagery

    have used relatively limited categories (Kadmon and Harari-Kremer 1999, Laliberte et al.

    2004, Okeke and Karnieli 2006, and Pringle et al. 2009). This study includes three levels

    of classification detail for the study areas. The approach looked at the classification of

    the imagery in a bottom up manner going from a high level of detail in representing the

    land cover types existing in the imagery to grouping these types into larger categories.

    This strategy was used to determine how useful detailed digital analysis of the imagery

    was compared to visual interpretation. The first level of classification of the Ogden

    imagery was based on eight land use/land cover classes including water, forest, grassland,

    dark fields, medium fields, light fields, bare earth and impervious surface (Table 1). At

    this level it was too difficult to represent the cropland as one class as there is too much

    variation between fallow fields and fields that are growing or wet. There was also

    confusion between the most representative digital number values between dark, medium,

    and light fields as there are pattern variations in the respective fields.

    Table 1 - First level classification Ogden land use/land cover classes

    Class Name           Description
    Water                Lakes, reservoirs, rivers
    Forest               Areas of trees with a canopy cover greater than 50%
    Grassland            Areas dominated by grasses and herbaceous plants with little or no tree or shrub cover
    Dark Fields          Agricultural cropland area characterized by dark gray tone (DN ~ 0-122)
    Medium Fields        Agricultural cropland area characterized by medium gray tone (DN ~ 100-188)
    Light Fields         Agricultural cropland area characterized by light gray tone (DN ~ 151-200)
    Bare Earth           Areas of earth, sand, and rock with little to no vegetation
    Impervious Surface   Buildings, roads, parking lots

    The second level of classification took the eight classes and combined them into three

    larger groups: cropland, vegetation, and other. Finally, the third level of classification

    consisted of cropland and non-cropland. The results of these classifications and their

    impact on classification accuracy were obtained by combining the results of the initial

    classifications rather than running new supervised and unsupervised classifications to

    reflect these combined groupings.
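
    Combining first-level results into broader groups amounts to recoding each label through a lookup table. The sketch below shows one way this could be done. The assignment of each first-level class to a second-level group is an assumption made for illustration, and the names `LEVEL2`, `LEVEL3` and `aggregate` are hypothetical.

```python
# Assumed grouping of the eight first-level Ogden classes (illustrative only)
LEVEL2 = {
    "water": "other",
    "forest": "vegetation",
    "grassland": "vegetation",
    "dark fields": "cropland",
    "medium fields": "cropland",
    "light fields": "cropland",
    "bare earth": "other",
    "impervious surface": "other",
}

# Third level: cropland versus everything else
LEVEL3 = {
    "cropland": "cropland",
    "vegetation": "non-cropland",
    "other": "non-cropland",
}

def aggregate(label_image, lookup):
    """Recode every pixel label in a classified image through a lookup table."""
    return [[lookup[label] for label in row] for row in label_image]

# Hypothetical 2x2 first-level classification result
level1 = [["water", "dark fields"],
          ["forest", "light fields"]]
level2 = aggregate(level1, LEVEL2)
level3 = aggregate(level2, LEVEL3)
```

    Because only the labels change, confusion between classes that end up in the same group (for example dark and medium fields) no longer counts as an error at the coarser level.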

    The classification system used on the Salt Lake City image also used a bottom up

    approach starting out with a more detailed classification and then moving to more general

    groupings. The first level of classification consisted of five land types including

    commercial, transportation, trees, grass, and residential (Table 2). The second level

    classification was reduced to built up areas, vegetation, and transportation. The third level

    of classification consisted of built up areas and non-built up areas. The Salt Lake City

    image has entirely different characteristics from the Ogden image, as the Salt Lake City

    image is comprised of a mixed type urban area without any agriculture, bare earth, forest,

    or large bodies of water. The added classification difficulty in the Salt Lake City image

    was that the commercial and residential areas are made up of a mixture of manmade and

    natural materials. These areas consisted of thousands of small buildings and may be

    surrounded by either grass or concrete, all of which provide a very complex pattern of

    shapes and surfaces which were tonally very similar. There were many tonal similarities

    existing in the Ogden imagery as well but the land cover types such as dark fields, light

    fields, water, etc. are fairly homogeneous blocks unlike the patchwork of the urban areas.

    Table 2 - First level classification Salt Lake City land use/land cover classes

    Class Name       Description
    Commercial       Built up area consisting of industrial, commercial complexes
    Transportation   Transportation network including major streets and highways
    Residential      Mixed area that includes single family homes, apartments, trees, and grass
    Grass            Areas dominated by grasses and herbaceous plants (yards, fields)
    Trees            Woody vegetation < 20 ft tall

    3.3.4 Unsupervised Classification

    Unsupervised classification was performed on the original subset of the Ogden and

    Salt Lake City images to provide the unsupervised classification baseline information for

    comparison to digital classifications with image enhancements. This initial classification

    was completed using ENVI 4.8 tools for ArcGIS and the ISODATA clustering algorithm.

    This clustering algorithm essentially divides the image into naturally occurring groups of

    pixels. Similar pixels are grouped together. Three classification sets were used to

    process the imagery: 10, 25, and 100 spectral classes. After the imagery was classified,

    these groups were interactively assigned an information class by visually comparing the

    classified image and/or reference data. Since many of the spectral classes have similar

    tonal values and statistics, it was necessary to assign some of these mixed classes to

    either the most numerous type or the type with the most concentrated areas of pixels.

    There was room for interpretation, and there is a certain amount of subjectivity involved

    in assigning these classes. The interpreter needs to be familiar with the study area, and

    when some classes were divided between seemingly equal areas, it was difficult to

    determine which was the best class to assign the pixels to. In some cases a spectral class

    was divided between 3 or 4 information classes. At this stage there was not a method to

    split these classes into their respective groups using the ENVI or ArcGIS software. It is

    possible to use masking and a technique called cluster busting, but this methodology was

    not used in this research, as it requires a significant amount of extra processing.

    The unsupervised classification process did provide some useful general information

    about the imagery. It was very difficult to assign classes to the detail level land

    classification system used for both the Ogden and the Salt Lake City images. After

    aggregating classes and assigning them a land use/land cover type from the classification

    scheme, there were about five classes that could be distinguished in the Ogden image and

    three in the Salt Lake City image. A useful tool to visualize how the clusters in an image

    are derived is a dendrogram. Dendrograms were created using the ArcGIS software for

    the same number of classes and iterations as the unsupervised classifications (Figures 7,

    8, 9). A dendrogram is a graphic diagram in the form of a tree that is used to analyze

    clusters in a signature file (ESRI 2011). The dendrograms are used to show the clustering

    process from individual classes to one large cluster. The dendrogram tool takes an input

    signature file created in ArcMap and creates the diagram based on hierarchical

    clustering. The classes are clusters of pixels and the graph illustrates the

    distances between merged classes. The dendrogram helps to illustrate how the 10, 25,

    and 100 classes are distributed using the ISODATA classifier. Many of the classes

    overlap and are very close together numerically, which is why unsupervised classification

    on panchromatic imagery often gives the user unsatisfactory results. The dendrograms

    also illustrate the relatively small changes in class distances between having 10, 25, and

    100 classes. Dendrograms of the Salt Lake City imagery were very similar except for

    slight differences of distances between pairs of combined classes (Figure 10). The

    ISODATA classifier only returned 67 classes instead of 100 for the Salt Lake City image

    and 93 out of 100 for the Ogden image.
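
    The merging behavior that a dendrogram depicts can be sketched with a simple agglomerative procedure. The example below is not the ArcGIS dendrogram tool, which works from a signature file. It assumes a single mean DN value per class and uses the absolute difference between means as the distance. The class means are invented for illustration.

```python
def merge_sequence(class_means):
    """Repeatedly merge the two closest clusters.

    Returns one (members_a, members_b, distance) tuple per merge, in order.
    """
    clusters = [([i], m) for i, m in enumerate(class_means)]  # (class ids, mean)
    merges = []
    while len(clusters) > 1:
        pairs = [(abs(clusters[a][1] - clusters[b][1]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        dist, a, b = min(pairs)  # closest pair of clusters
        ids_a, mean_a = clusters[a]
        ids_b, mean_b = clusters[b]
        n_a, n_b = len(ids_a), len(ids_b)
        # Assumption: merged mean is weighted by the number of classes in each cluster
        merged = (ids_a + ids_b, (mean_a * n_a + mean_b * n_b) / (n_a + n_b))
        merges.append((ids_a, ids_b, dist))
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return merges

# Hypothetical mean DN values for five spectral classes
merges = merge_sequence([20, 25, 120, 130, 200])
distances = [d for _, _, d in merges]
```

    Classes whose means are only a few DN apart merge first and at very small distances, which mirrors the overlap between spectral classes described above for single-band imagery.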

    Figure 7 - Ogden dendrogram of ISODATA clustering 10 classes

    Figure 8 - Ogden dendrogram of ISODATA clustering 25 classes

    Figure 9 - Ogden dendrogram of ISODATA clustering 100 classes

    Figure 10 - Distances between classes from Salt Lake City dendrograms (panels: 10 classes, 25 classes, 100 classes)

    A K-Means unsupervised classifier was also used to classify an Ogden texture image

    incorporating the mean, variance and homogeneity bands. This classifier provided a

    more satisfactory result on the texture images than the ISODATA classifier did. The K-

    Means classifier in the ENVI software uses a set number of classes provided by the

    analyst, and classes are determined after the classifier iterates through the image and the

    optimal separability is reached based on the distance to mean (ENVI 2011). The

    ISODATA classifier had difficulties with the texture image and returned a completely

    gray image unless the classes were increased to well over 25. Considering how time

    consuming it was to assign classes to the result, the K-Means classifier was used. Ten

    classes and 25 classes were used on the texture image.
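
    The general K-Means procedure can be sketched as follows. This is a generic illustration and not ENVI's implementation: each pixel is assumed to be a tuple of band values (for example mean, variance and homogeneity), the starting means are supplied by the caller, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, means, max_iter=50):
    """pixels: list of per-pixel feature tuples; means: one starting mean per class."""
    labels = []
    for _ in range(max_iter):
        # Assign every pixel to the class with the nearest mean
        labels = [min(range(len(means)), key=lambda j: sq_dist(p, means[j]))
                  for p in pixels]
        new_means = []
        for j, old in enumerate(means):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                new_means.append(tuple(sum(band) / len(members)
                                       for band in zip(*members)))
            else:
                new_means.append(old)  # an empty class keeps its previous mean
        if new_means == means:
            break  # class means have stopped moving
        means = new_means
    return labels, means

# Hypothetical two-band pixels forming two obvious groups
pixels = [(10, 1), (12, 2), (11, 1), (100, 9), (102, 8), (98, 10)]
labels, means = kmeans(pixels, [(0, 0), (50, 5)])
```

    The number of classes is fixed by the number of starting means, which matches the analyst-supplied class count described above. The resulting clusters still have to be assigned information classes by the interpreter.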

    3.3.5 Supervised Classification

    Supervised classification was performed on the original image subsets to create the

    supervised classification baseline information. Later on, another supervised classification

    was performed on images which had been digitally processed or enhanced (filtering or

    texture analysis). Results of the latter supervised classification were compared to the

    supervised classification baseline information to determine if these digital image processing

    enhancements improved classification. Supervised classification was performed using

    ENVI and ArcGIS 10 software.

    Supervised classification, unlike unsupervised classification, involves the user creating

    training samples from land use/land cover classes that are determined to be present in the

    imagery. The training sets, called regions of interest (ROIs), were created using ENVI

    software. This training data was used throughout the supervised classifications performed

    on the original imagery, texture images, PCA images, and the filtered images. The final

    training sets for both study areas were determined by trial and error. A training set was

    developed which had about twice as many samples, but this set did not significantly

    improve classification results for either image. These larger sets did however increase

    processing time, so in the interest of efficiency smaller training sets were used throughout

    (Figures 11 and 12). Training sets are inherently subjective and do require the analyst to

    be able to distinguish land use/land cover types.

    Figure 11 - Training sample distribution for the Ogden image

    Figure 12 - Training sample distribution for Salt Lake City Image

    Several supervised classifiers were used to evaluate the imagery using ENVI software.

    The minimum distance classifier, the maximum likelihood classifier, neural net, and

    SVM classifiers were examined. Each classifier provides distinct advantages and

    disadvantages. The minimum distance to means classifier determines the mean of each

    pre-defined class and then classifies pixels into the appropriate class by using the

    Euclidean distance to the closest mean. One of the advantages of this algorithm is that it

    classifies all pixels and processes very quickly. The maximum likelihood classifier

    assumes that each class is normally distributed and is based on the highest probability

    that a pixel will be assigned to a particular class. When classes have a multimodal

    distribution this classifier will not provide optimum results. An advantage of this method

    is that the classifier considers the mean and covariance of the samples. The neural net

    classifier provided by ENVI software uses back propagation to determine class

    assignment of pixels. An advantage of the neural net classifier is that it does not make

    assumptions about the distribution of the data. The Support Vector Machine (SVM)

    classifier available in the ENVI software works with any number of bands and has good

    accuracy when automatically separating pixels into classes. This classifier also

    maximizes the boundary between classes, which may be useful for distinguishing land

    use/land cover types with similar characteristics. Another advantage of this classifier is

    that it works well on imagery that has a lot of noise (ENVI 2011, Jensen 2005).
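
    Of the classifiers described above, minimum distance to means is the simplest to illustrate. The sketch below is a generic version and not ENVI's implementation: it computes one mean per class from training samples and assigns a pixel to the class whose mean is closest in Euclidean distance. The training values and function names are hypothetical.

```python
def class_means(training):
    """training: dict mapping class name to a list of feature tuples (one value per band)."""
    return {name: tuple(sum(band) / len(samples) for band in zip(*samples))
            for name, samples in training.items()}

def minimum_distance(pixel, means):
    """Return the class whose mean is nearest to the pixel (squared Euclidean distance)."""
    def sq_dist(mean):
        return sum((p - m) ** 2 for p, m in zip(pixel, mean))
    return min(means, key=lambda name: sq_dist(means[name]))

# Hypothetical single-band training samples (DN values)
training = {
    "water": [(20,), (30,)],
    "light field": [(180,), (190,)],
}
means = class_means(training)
dark_pixel_class = minimum_distance((60,), means)     # nearer the water mean of 25
bright_pixel_class = minimum_distance((150,), means)  # nearer the light field mean of 185
```

    Every pixel receives a class because some mean is always nearest, which is the property noted above. The classifier ignores the spread of each class, which is what the maximum likelihood classifier adds through the covariance.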

    3.3.6 Image Enhancement and Texture Analysis

    Digital image processing techniques were explored to determine if classification

    results could be improved. Texture analysis, convolution filtering, and contrast stretching

    enhance some of the spatial characteristics of the imagery. For example, contrast

    stretching brings out more differences between light and dark areas of the imagery, and

    convolution filters can enhance edges. Low pass filters can smooth out areas of noise in

    an image such as the variations found throughout the field areas in the Ogden imagery,

    while high pass filters make the image appear more crisp or sharp (Jensen 2005).

    Convolution filtering, contrast stretching and texture filtering were used in a variety of

    combinations to enhance the study areas and try to improve classification.

    A two standard deviation contrast stretch was applied to both study areas to enhance

    the contrast and sharpness of the imagery. Both original images lacked definition in the

    light and dark areas of the image (Figure 13). The Ogden study area had a DN range of

    0-235 and the Salt Lake City study area had a DN range of 0-187. All subsequent

    filtering and texture analysis was performed on the stretched images.
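
    A two standard deviation stretch can be sketched as a linear rescaling in which the mean minus two standard deviations maps to 0, the mean plus two standard deviations maps to the top of the output range, and anything outside is clipped. The code below is a minimal illustration under that assumption. It operates on a flat list of DN values, and the function name and 0-255 output range are hypothetical choices, not settings reported in this study.

```python
from statistics import mean, pstdev

def stretch_two_sd(values, out_max=255):
    """Linear stretch between mean - 2*sd and mean + 2*sd, clipped to 0..out_max."""
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - 2 * sigma, mu + 2 * sigma
    if high == low:
        return [0 for _ in values]  # flat image: nothing to stretch
    stretched = []
    for v in values:
        scaled = (v - low) / (high - low) * out_max
        stretched.append(int(round(min(max(scaled, 0), out_max))))
    return stretched

# Hypothetical DN values: nine similar pixels and one very bright outlier
result = stretch_two_sd([100] * 9 + [1000])
```

    Here the outlier lies above the mean plus two standard deviations and is clipped to 255, while the remaining pixels are rescaled relative to the bulk of the histogram.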

    Figure 13 - Unstretched images compared to contrast stretched images (panels: Unstretched, Stretched)

    Convolution filtering was performed on the study areas using ENVI software. High

    pass filtering was used to help sharpen the imagery using a variety of kernel sizes: 3x3,

    5x5, 7x7, and 11x11. Low pass filtering was applied to the imagery to smooth out noise

    in the field areas. Again 3x3, 5x5, 7x7, and 11x11 kernels were examined. As the kernel

    gets larger with low pass filtering, the detail becomes more generalized or blurred as this

    type of filtering preserves the low frequency parts of the image. A median filter was also

    examined using the previously mentioned kernel sizes. This filter has a smoothing effect

    on the image but the edges remain somewhat crisper than the low pass filter. ENVI also

    provides several edge enhancing filters that were used to process the original study

    images. The filters used in this study were Laplacian, Roberts and Sobel. The Laplacian

    filter has an editable window size whereas the Roberts and Sobel filters do not have

    editable kernels or window sizes. Edge filtered images were created using the Laplacian

    filter with window sizes of 3x3, 5x5, 7x7 and 11x11. The Laplacian filter was also used

    in combination with the Gaussian low pass filter to try to reduce some of the noise that

    results when creating the Laplacian filtered images.

    Texture images were created using ENVI software and are based on the GLCM which

    includes the following texture characteristics: mean, variance, homogeneity, contrast,

    dissimilarity, entropy, second moment and correlation. Another set of texture images

    was created using the Occurrence measures which consist of data range, mean, variance,

    entropy, and skewness. Each set of texture images was created using a 3x3, 5x5, 7x7 and

    11x11 processing window. The processing window measures the number of times each

    gray level occurs in that particular part of the image (ENVI 2011). As the processing

    window becomes larger, image detail is lost. The texture images created using the

    GLCM are eight band images, and the texture occurrence images are five band images;

    thus the dimensionality of the imagery is significantly increased by the use of texture.

    These two texture images were also layer stacked with the original imagery to create

    nine-band and six-band images. Additional nine-band and six-band images were also

    created from these two texture images layer stacked with a filtered original image. The

    resulting images were then classified using unsupervised and supervised classifiers. The

    accuracy of these classifications was then compared to the classification baseline

    information using an error matrix.
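To make the GLCM texture measures more concrete, the following sketch builds a gray level co-occurrence matrix for a single processing window and derives a few of the measures listed above. It is a minimal illustration under stated assumptions: one horizontal pixel offset, no symmetrization, a natural logarithm for entropy, and images quantized to four gray levels. The function names are hypothetical, and ENVI's own computation may differ in these details.

```python
import numpy as np

def glcm(window, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset."""
    counts = np.zeros((levels, levels), dtype=float)
    rows, cols = window.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[window[r, c], window[r2, c2]] += 1
    return counts / counts.sum()

def glcm_measures(p):
    """Haralick-style measures computed from a normalized GLCM."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "dissimilarity": float(np.sum(p * np.abs(i - j))),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "second_moment": float(np.sum(p ** 2)),
        "entropy": float(-np.sum(nonzero * np.log(nonzero))),
    }

# A uniform 3x3 window (smooth field) and a checkerboard window (rough texture).
smooth = np.zeros((3, 3), dtype=int)
rough = np.array([[0, 3, 0], [3, 0, 3], [0, 3, 0]])

smooth_stats = glcm_measures(glcm(smooth, levels=4))
rough_stats = glcm_measures(glcm(rough, levels=4))
```

The smooth window yields zero contrast and a homogeneity of one, while the checkerboard window yields high contrast and low homogeneity. Running such a function over every window position and stacking the outputs gives one band per measure, which is how the texture images become multi-band.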

    Principal components analysis was used to reduce the number of bands in several composite images. In this way the dimensionality of the imagery is reduced but most of

    the information in the imagery is maintained. PCA was performed on a multi-layer

    image consisting of images created from variance, mean, and homogeneity texture

    operators, plus the original unprocessed image. The result was a two-layer image which

    incorporates information from the original image and the texture layers.
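The band reduction step can be sketched as follows. The example stacks four synthetic layers, standing in for the original image and three texture layers, and keeps the first two principal components. The data are randomly generated and the function name is hypothetical; the sketch uses an eigendecomposition of the band covariance matrix, which is one standard way to compute PCA and is not necessarily how ENVI implements it.

```python
import numpy as np

def principal_components(stack, n_components):
    """Project a (bands, rows, cols) stack onto its leading principal components."""
    bands, rows, cols = stack.shape
    pixels = stack.reshape(bands, -1).astype(float)
    centered = pixels - pixels.mean(axis=1, keepdims=True)
    cov = np.cov(centered)                      # bands x bands covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = eigvecs[:, :n_components].T @ centered
    explained = eigvals[:n_components].sum() / eigvals.sum()
    return scores.reshape(n_components, rows, cols), explained

# Synthetic stand-ins: three correlated layers plus one independent layer.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 20))
stack = np.stack([
    base,
    2.0 * base + 0.1 * rng.normal(size=(20, 20)),
    -base + 0.1 * rng.normal(size=(20, 20)),
    rng.normal(size=(20, 20)),
])

pcs, explained = principal_components(stack, n_components=2)
```

Because three of the four synthetic layers are strongly correlated, two components retain nearly all of the variance, which mirrors the idea of keeping most of the information while lowering the band count.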

    ENVI software also provides tools to perform mathematical morphology filtering

    which is a non-linear process based on shape. Morphology filtering was performed on

    both the original imagery and 5x5 occurrence texture images. Supervised and

    unsupervised classification was then performed to determine the accuracy as compared to

    the classification baseline information.

    3.3.7. Object-based Image Analysis

    Another digital image processing technique which was explored in this research was

    object-based image analysis. Object-based image analysis is based on regions or groups

    of pixels in an image rather than single pixels. Feature extraction was performed using

    ENVI EX which provides object-based tools that utilize spatial, spectral, and textural

    features. The object-based analysis provided by the ENVI software uses an edge-based

    segmentation algorithm and requires only the scale level as an input parameter. The scale

    levels range from 0-100 where a high scale level reduces the number of segments that are

    defined, and a low scale level increases the number of segments that are defined. There

    should be a balance in determining the scale level by trying to choose a scale that

    delineates the image object boundaries as well as possible. This level is likely to be

    different depending on the characteristics of the imagery being analyzed. ENVI provides

    an interactive preview window to help determine an appropriate scale level for an image. The preview window allows you to see what kind of effect changing the scale level of the

    segmentation has on the objects of interest in the image scene before the segmentation

    runs. This helps to avoid creating numerous unsuccessful segmentation images. After

    the initial segmentation has been performed, image segment merging can be done. ENVI

    uses the Lambda-Schedule algorithm that iteratively merges segments by using a

    combination of spectral and spatial information. This step is especially helpful when an

    image has been over segmented as it enables the aggregation of small segments that may

    occur from image object variation (ENVI 2011). After segmentation the next step is to

    find objects and classify the imagery. Objects were chosen interactively from the

    segmented image and the image was then classified. ENVI EX offers either a K-means

    classifier or an SVM classifier. Classification and post processing were performed using

    both available OBIA classifiers. The final step before classification in the ENVI EX

    feature extraction workflow is the refine results window. In this window there are

    options to export vectors and smooth the results similar to using a majority filter on a

    classified image. The process for using the feature extraction tools in ENVI EX is

    designed to make the process of OBIA user friendly.

    ENVI 4.8 also offers an OBIA classification method called size-constrained region

    merging (SCRM). This tool is an extension that can be added to ENVI. The tool

    partitions an image into reasonably homogeneous polygons based on a minimum size

    threshold. The output of the tool is a vector file and an image file. The vector file can be

    used directly as an initial source to assist visual interpretation, and the image can be

    further classified using either unsupervised or supervised classification. One of the

    limitations of this extension is that there is a size limitation of 2 MB for the image (Castilla and Hay 2007). All of the layer stacked imagery exceeded the size limitation for

    using this tool. SCRM was used on the original imagery and on the one-band dissimilarity,

    mean, homogeneity, and variance texture images. The second moment, entropy, and

    contrast bands were not used, as there appears to be a lot of correlation between them and

    the bands that were selected. The correlation band does not have enough usable

    information in it to segment it into objects. The output image was then classified using

    the SVM classifier.

    3.3.8. Post Processing and Automation

    The classified images created from the previously mentioned digital processing

    techniques and classifiers contained varying quantities of island pixels and salt and

    pepper noise. There are numerous methodologies to reduce these types of areas in a

    classified image. Majority and minority filtering, clump, sieve, and combine classes are

    some of the commonly available tools provided in GIS and image analysis software.

    These processes reduce the complexity of the classification and allow a more cohesive

    result for further analysis. Post classification processing may also produce error in the

    final imagery by smoothing and combining the wrong classes together. It is also not

    practical to remove noise pixel by pixel, as there may be thousands of areas to examine.

    The next step in this research was to produce a vector polygon layer that can assist in

    visual interpretation of the imagery. In order to simplify the procedure of processing the

    classified rasters and converting them to a vector layer that facilitates visual

    interpretation, a model was developed using ArcGIS Model Builder (Figure 14). This

    model allows the user to input a classified image, apply a smoothing kernel, aggregate

    island pixels to a specified tolerance, convert the raster to a vector layer, and smooth and simplify the resulting polygons. For consistency a majority filter using a 3x3 window

    and aggregation using a minimum threshold of 25 were used on all the classified images

    examined. The model parameters for smoothing and simplifying polygons were left open

    so that adjustments can be made for different images.

    Figure 14 Post Processing ArcGIS Model
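The first two steps of the model, smoothing with a majority filter and finding island regions below a size threshold, can be sketched in plain Python. This is a simplified illustration and not the ArcGIS tools themselves: the tie rule (keep the original class unless another class is strictly more frequent), the 4-connected region definition, and all function names are assumptions. The toy threshold of 4 cells is used only because the example grid is small; the model described above used a threshold of 25.

```python
from collections import Counter, deque

def majority_filter(grid, size=3):
    """Replace each cell with the most frequent class in its size x size window."""
    rows, cols = len(grid), len(grid[0])
    half = size // 2
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            counts = Counter(window)
            best, best_n = counts.most_common(1)[0]
            # Keep the original class unless another class is strictly more frequent.
            if counts[grid[r][c]] < best_n:
                out[r][c] = best
    return out

def region_sizes(grid):
    """Return (class, cell_count) for each 4-connected same-class region."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            cls, size = grid[r][c], 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                size += 1
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == cls):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append((cls, size))
    return regions

# Toy classified raster: class 1 on the left, class 2 on the right, one island pixel.
classified = [[1, 1, 1, 2, 2, 2] for _ in range(6)]
classified[2][1] = 2

min_size = 4  # toy value for this small grid
islands = [reg for reg in region_sizes(classified) if reg[1] < min_size]
filtered = majority_filter(classified)
```

Here the single island pixel is reported as a region below the threshold, and the 3x3 majority filter absorbs it into the surrounding class without moving the boundary between the two large regions.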

    One of the challenges of using vector files that have been converted from raster files is

    that polygons have a stepped appearance that follows pixel boundaries. This

    characteristic appearance is much different from a vector file created through heads up

    digitizing. A human digitizer classifies an image into recognizable objects using shape,

    context, texture, shadows, etc. to help determine the boundaries of objects. It would

    be very difficult, if not impossible, for a human digitizer to create land use/land cover

    boundaries at the pixel level. This is one of the main differences between automated

    classification and classification performed by visual interpretation.

    The polygon smoothing and aggregation steps used in the model help to reduce some of the stepped appearance created by the raster to vector conversion process (Figure 15).

    After polygons underwent smoothing and simplification, the result appeared much closer

    to results obtained through visual interpretation. This process was also an advantage if

    polygons needed to be reshaped. There are fewer vertices for each polygon after

    completing these operations.
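The reduction in vertices can be illustrated with a line simplification routine. The sketch below uses the Ramer-Douglas-Peucker algorithm on a stair-stepped boundary of the kind produced by raster to vector conversion. The text does not state which algorithm the ArcGIS smoothing and simplification steps used, so this is a stand-in technique chosen for illustration, and the function names and tolerance are assumptions. It handles an open line only; closed polygon rings need extra care.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def simplify(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    distances = [point_line_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    idx = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[idx - 1] <= tolerance:
        # Every interior vertex is close to the chord, so drop them all.
        return [points[0], points[-1]]
    left = simplify(points[:idx + 1], tolerance)
    right = simplify(points[idx:], tolerance)
    return left[:-1] + right

# Stair-stepped boundary following pixel edges along a diagonal.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2),
             (3, 2), (3, 3), (4, 3), (4, 4)]
simplified = simplify(staircase, tolerance=1.0)
```

With a tolerance of one pixel the nine-vertex staircase collapses to its two end points, which is the kind of vertex reduction that makes later reshaping of polygons easier.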

    Once the vector layer had been processed through the model, it was edited using a

    custom toolbar in ArcGIS 10 software. The custom toolbar includes a combination of

    Figure 15 Polygon raster to vector, smoothing, and smooth and simplify

    out of the box tools (Selection Tool and Cut Polygon Tool) and several custom tools

    created using C#.net and ArcObjects. The purpose of the custom toolbar is to provide

    functions to remove small islands by merging them with neighboring polygons. It was

    implemented as an Add-in which was easily added to the ArcGIS 10 user interface.

    The toolbar consists of four custom tools: select by area, merge with smallest neighbor,

    merge with largest neighbor, and merge with selected polygon. These tools are very

    similar to raster majority and minority filtering except that the user has more control over

    them. The tools were then used to further refine the classification using visual

    interpretation. The automated classifications in essence become the starting point for the manual digitizing effort for the study areas.

    3.3.9. Accuracy Assessment

    One of the most serious limitations of historical imagery is ground-truthing. The

    imagery is between 33 and 52 years old, and it is likely that many of the objects in the

    imagery have changed or no longer exist today. Ground-truthing was limited to visual

    interpretation and image accuracy. The baseline information derived from heads up

    digitizing was used as ground truth to evaluate the accuracy of the classification of both

    the original images and the images where digital image processing has been used (i.e.

    filtering, texture, PCA and segmentation).

    A confusion matrix (or error matrix) was computed between the classification baseline

    information and each classification produced after image processing enhancement

    so that their accuracy could be compared. To save time and labor,

    only the classifications deemed best were evaluated. The confusion matrix can help to

    accuracy, errors of commission, single class accuracy, and the Kappa coefficient. Refer to

    Appendix 1 for a sample of the error matrices used in this research.
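For readers unfamiliar with these measures, the sketch below derives overall accuracy, errors of commission and omission, and the Kappa coefficient from a small confusion matrix. The counts are hypothetical and are not results from this research. The layout, with rows as classified classes and columns as reference (baseline) classes, is an assumption made for the example.

```python
def accuracy_measures(matrix):
    """Summary measures for a square confusion matrix.

    matrix[i][j] is the number of pixels classified as class i whose
    reference class is j (assumed layout).
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    commission = [1.0 - matrix[i][i] / row_totals[i] for i in range(k)]
    omission = [1.0 - matrix[j][j] / col_totals[j] for j in range(k)]
    return {"overall": overall, "kappa": kappa,
            "commission": commission, "omission": omission}

# Hypothetical two-class matrix (counts are illustrative only).
example = [[40, 10],
           [5, 45]]
measures = accuracy_measures(example)
```

For these illustrative counts the overall accuracy is 0.85 and Kappa is 0.70. Kappa is lower because it discounts the agreement that would be expected by chance given the class proportions.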

    CHAPTER 4: ANALYSIS RESULTS AND DISCUSSION

    4.1. Manual Digitizing

    Heads up digitizing of land cover classes on any type of imagery, whether it is

    multispectral or panchromatic, allows the user more control over the results of the

    classification. The results of this method of classification in general do not require

    further editing or post processing. On the other hand subjectivity of the digitizer has an

    effect on the results of the classification. It is unlikely that a digitizer would be able to

    classify an image exactly the same every time.

    Digitizing took place in two sessions, with the Ogden imagery taking approximately five hours to complete and the Salt Lake City image taking approximately three hours to

    complete. The Salt Lake City image has 331 polygons compared to the Ogden image

    which has 172 polygons. The Ogden image took much longer to digitize even though

    there are approximately half the number of polygons. The polygons and land cover

    configurations were more complicated when considering the integration of grassland,

    forest and water areas on the image. The features on the Salt Lake City image are laid

    out in a grid pattern separated by wide streets so even though there were almost twice as

    many polygons to digitize the process went more quickly. An important aspect of this

    research was to show that digital image processing of historical panchromatic imagery

    could enhance and facilitate visual interpretation of the imagery on a variety of terrains

    and features.

    The visual interpretation of the imagery required a zoom level of between 1:1,500 and

    1:3,000 on the Salt Lake City image and between 1:1,000 and 1:4,000 on the Ogden image. These

    zoom levels were determined by the digitizer by how well they could see the details in

    the imagery while still being able to have some reference to the context of objects being

    examined. In the experience of the digitizer a more consistent result is also achieved if

    there is not a large variance in the viewing scale of the objects in the scene. If an area is

    digitized at 1:3,000 and another area at 1:24,000 then the details being observed will not

    be consistent throughout the study area. Digital image processing on the other hand

    classifies by pixel without involving scale issues. This is a major difference in the

    methodology of classification. Digitizing at varying scales is both an advantage and

    disadvantage compared to digital classification. When the view was zoomed in to the pixel

    level, it was impossible to discern what the objects in the imagery were. A large variance in scale can lead to inconsistency, but a small variance in digitizing scale can help the

    digitizer to consider a feature's relationship to surrounding objects when determining

    what the object is, unlike most per pixel digital classifications. By using a small variance

    in digitizing scale for land use/land cover classification of panchromatic imagery, both

    detail and consistency can be maintained while the expert knowledge of relationships and

    contexts of features can be utilized.

    This project used relatively small areas of interest. After examining the land use/land

    cover classes from the beginning of the project to its conclusion, there were areas of the

    initial digitizing which on further analysis could have been refined or changed, especially

    in diverse areas containing many intricate changes in the landscape. There was a

    tendency to generalize areas where the land use/land cover types are fragmented. This

    tendency is most notable in the Ogden image in the southern half of the image where the

    forested areas are broken up by water and grasslands. The initial digitizing was not

    changed to reflect new perceptions of the land class areas on the imagery. Some of these

    inconsistencies have an effect on the final accuracy of the digital classifications, as it was

    apparent that at some points the digital classification was more correct than the visual

    interpretation. This is a limitation of the research.

    One of the major differences found in this research between the manual digitizing

    classification and the digital image processing classification was the level of detail

    achieved in the classifications. In the Ogden image the total number of polygons

    digitized was 172 (Figure 16) and the total number of polygons digitized for the Salt

    Lake City image was 331 (Figure 17). The digital classifications in comparison before

    post processing yielded several thousand polygons. After post processing most digital image classifications still exceeded the digitized baseline information but results

    averaged about 500-1000 polygons. It was a difficult task to digitize very detailed areas

    on the imagery. This study has shown that by utilizing digital image processing

    techniques to help facilitate visual interpretation of land use/land cover classes, the

    analyst can take advantage of the detail and repeatability that digital processes provide

    while improving the classification accuracy using a GIS in post processing the results.

    Results using visual interpretation and heads up digitizing may provide more initial

    accuracy, but digital image processing lends some added consistency to the process.

    4.2. Unsupervised Classification

    Supervised and unsupervised classification results varied depending on the image, the

    classification method, pre-processing, and post-processing. Panchromatic imagery

    presents many challenges as previously mentioned in this study. The heterogeneity of

    the study area also has an effect on how successful classification is. This study has

    Figure 16 Classification using visual interpretation of the Ogden image

    Figure 17 Classification using visual interpretation of the Salt Lake City image

    Although unsupervised classification showed low accuracy in both study areas, the

    results showed some important trends in the data. In the Ogden image it was very

    difficult to extract more than five classes which was an indication that land cover types

    such as water, medium fields and grassland are very similar. Panchromatic imagery

    would require more pre- and post-processing to achieve a more accurate classification

    using eight land cover types. As the classes are aggregated into larger parent classes the

    classification accuracy increased accordingly. Unsupervised classification even on a

    small study area such as this was more time consuming than supervised classification and

    provided somewhat unsatisfactory results. The Salt Lake City image proved difficult in a different way in that the mixed urban

    area consisted of commercial, residential and transportation areas which appear very

    distinct using visual interpretation but present difficulties for digital classifiers. Urban

    areas are uniquely difficult to classify on multispectral imagery, as there is such a mixture

    of impervious surfaces. Black and white high spatial resolution imagery complicates this

    Figure 18 Ogden image ISODATA classifications (10, 25, and 100 spectral classes)

    situation, as there was an extreme overlap between classes, because features such as

    buildings and mixed surfaces like parking lots and vegetation exist in both residential and

    commercial areas making it difficult to distinguish these areas. None of the ISODATA

    classifications of the Salt Lake City imagery were able to distinguish between all five,

    detail level land cover types. Trees, transportation, and commercial land cover types

    were the only three land cover types that could be classified from the 10, 25, and 100

    spectral class ISODATA classifications (Figure 19). Many areas of overlap exist

    between the commercial and transportation classes in all three unsupervised

    classifications. The transportation network in this image is a very distinct linear feature when classifying the imagery through visual interpretation, but there are many tonal

    variations in the pavement which cause a great deal of confusion for most traditional

    unsupervised classifiers. Grass and residential land cover types could not be

    distinguished from commercial, transportation and trees as there was considerable tonal

    overlap between these areas.
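The effect of tonal overlap on an unsupervised classifier can be shown with a one-dimensional K-means sketch, since a panchromatic pixel carries only a single gray level. The sample values and labels below are hypothetical and chosen only to show overlapping tones. The sketch uses plain K-means with evenly spaced starting centers, not ISODATA, which additionally splits and merges clusters.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster gray levels with K-means; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Evenly spaced starting centers keep the example deterministic.
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    labels = [min(range(k), key=lambda i: abs(v - centers[i])) for v in values]
    return centers, labels

# Hypothetical gray levels: dark field is distinct, medium field and water overlap.
samples = [("dark field", 70), ("dark field", 78), ("dark field", 85),
           ("medium field", 118), ("medium field", 126), ("medium field", 140),
           ("water", 132), ("water", 145), ("water", 152)]

centers, labels = kmeans_1d([value for _, value in samples], k=3)
clusters = list(zip([name for name, _ in samples], labels))
```

In this toy case the dark field samples form their own cluster, but the medium field and water samples are split across the other two clusters by brightness alone, so neither cluster corresponds to a single land cover type.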

    A 10 and 25 spectral class K-Means unsupervised classification was performed on the

    Ogden imagery using a layer stacked image consisting of the original image and the

    following texture characteristics: mean, variance, and homogeneity. Surprisingly the use

    of texture did not improve the unsupervised classification using the level 1 land use/land

    cover types. Overall accuracy was 25% for 10 classes and 34% for 25 classes. This is

    most likely due to the fact that there was little to no distinction between the field areas as

    most of them exhibit a smooth surface. The field areas and the water areas were also

    confused. Aggregating the classification into the more generalized classes

    increased accuracy significantly in the unsupervised classification. This was particularly

    Figure 19 Salt Lake City image ISODATA classifications (10, 25, and 100 spectral classes)

    apparent in the texture image. Accuracy increased to 54% for the level 2 classification (3

    land use/land cover types classification scheme) and increased to 71% for the level 3

    classification (2 land use/land cover types classification scheme). The Halounova image

    which consisted of texture and filtered layers did not provide improvement for the Ogden

    image level 1 classification scheme using unsupervised classification, but did slightly

    improve the Salt Lake City level 1 overall accuracy. Due to the poor accuracy results

    using texture and unsupervised classification no further analysis was performed in either

    study area.
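The reason accuracy rises when classes are aggregated can be shown by collapsing a confusion matrix into parent classes: confusion between two detailed classes that share a parent becomes agreement. The sketch below uses hypothetical counts and a hypothetical parent mapping; neither is taken from the classification schemes or results of this research.

```python
def overall_accuracy(matrix):
    """Fraction of pixels on the diagonal of a confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def aggregate(matrix, classes, parent_of):
    """Collapse a confusion matrix into parent classes."""
    parents = sorted(set(parent_of.values()))
    index = {name: n for n, name in enumerate(parents)}
    merged = [[0] * len(parents) for _ in parents]
    for i, class_i in enumerate(classes):
        for j, class_j in enumerate(classes):
            merged[index[parent_of[class_i]]][index[parent_of[class_j]]] += matrix[i][j]
    return parents, merged

# Hypothetical detailed classes, counts, and parent mapping (illustrative only).
classes = ["dark field", "medium field", "water"]
detailed = [[30, 10, 5],
            [15, 25, 10],
            [5, 5, 20]]
parent_of = {"dark field": "field", "medium field": "field", "water": "water"}

parents, generalized = aggregate(detailed, classes, parent_of)
detailed_accuracy = overall_accuracy(detailed)
generalized_accuracy = overall_accuracy(generalized)
```

In this example the overall accuracy rises from 0.60 to 0.80 after aggregation, purely because the dark field and medium field confusion no longer counts as error.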

    4.3. Supervised Classification

    Supervised classification of panchromatic imagery again presents many challenges.

    The SVM classifier was used to perform the supervised classification as it has the ability

    to process single band imagery and it provided better results. The supervised classifiers

    available in ENVI are limited when using single band data as many options such as

    maximum likelihood, spectral angle divergence, and neural net all require more than one

    Table 4 Training sample statistics from original Ogden image

    Land Cover Type       Min   Max   Mean    StDev   Points
    Water                 129   177   151.6   13.1    1611
    Forest                0     162   70.3    33.7    3246
    Grassland             81    197   129.9   13.8    1944
    Dark Field            53    102   78.2    12      2706
    Medium Field          94    150   124.1   13.5    5056
    Light Field           148   194   179     6.5     1918
    Impervious Surface    74    223   170.7   28.3    732
    Bare Earth            180   227   197.8   7.5     1393

    to illustrate the overlap which occurs between commercial, residential and transportation

    classes throughout this image (Table 5). The histograms were either bimodal or

    multimodal. Although the histogram for transportation approached a normal distribution,

    there were still many peaks and valleys indicating variations in gray levels in the image

    for this land cover type.
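The degree of overlap in the Ogden training samples can be checked directly from the minimum and maximum gray levels reported in Table 4. The sketch below tests every pair of land cover types for overlapping ranges. Using the full minimum to maximum range as the overlap criterion is an assumption made for this example; it is a coarse test and says nothing about how many pixels fall in the shared range.

```python
from itertools import combinations

# (min, max) gray levels per land cover type, taken from Table 4.
gray_ranges = {
    "Water": (129, 177),
    "Forest": (0, 162),
    "Grassland": (81, 197),
    "Dark Field": (53, 102),
    "Medium Field": (94, 150),
    "Light Field": (148, 194),
    "Impervious Surface": (74, 223),
    "Bare Earth": (180, 227),
}

def ranges_overlap(a, b):
    """True when two closed (min, max) intervals share at least one value."""
    return max(a[0], b[0]) <= min(a[1], b[1])

overlapping = [(x, y) for x, y in combinations(gray_ranges, 2)
               if ranges_overlap(gray_ranges[x], gray_ranges[y])]
separable = [(x, y) for x, y in combinations(gray_ranges, 2)
             if not ranges_overlap(gray_ranges[x], gray_ranges[y])]
```

By this criterion 22 of the 28 class pairs overlap, and only six pairs, such as water and dark field, have fully separate gray level ranges.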

    Supervised classification results showed that it was very difficult to extract more than

    8 classes on the Ogden image and 5 classes on the Salt Lake City image. One of the

    limitations of using panchromatic imagery for land use/land cover classification is that

    the DN values which make up the signature for many land use/land cover types contain a

    significant amount of confusion. Real world features may be difficult to identify without

    taking into account their spatial context (Hung and Wu 2005). Land use/land cover types

    may need to be generalized. For example, detail like corn or wheat fields may not be

    characterized using panchromatic imagery, but dark fields and light fields or cropland

    may be possible. The increased accuracy achieved when aggregating land use/land cover

    types into the level 2 and level 3 classification schemes supports this conclusion. Training

    samples tested with a greater number of pixels increased the confusion between classes