Parallel implementations of hyperspectral remote ...PARALLEL IMPLEMENTATIONS OF HYPERSPECTRAL REMOTE

download Parallel implementations of hyperspectral remote ...PARALLEL IMPLEMENTATIONS OF HYPERSPECTRAL REMOTE

of 77

  • date post

    12-Jul-2018
  • Category

    Documents

  • view

    212
  • download

    0

Embed Size (px)

Transcript of Parallel implementations of hyperspectral remote ...PARALLEL IMPLEMENTATIONS OF HYPERSPECTRAL REMOTE

  • PARALLEL IMPLEMENTATIONS OF HYPERSPECTRALREMOTE SENSING ALGORITHMS

    A Thesis Presented

    by

    Cory James Crothers Brett

    to

    The Department of Electrical and Computer Engineering

    in partial fulfillment of the requirementsfor the degree of

    Master of Science

    in

    Electrical Engineering

    Northeastern UniversityBoston, Massachusetts

    May 2014

  • c Copyright 2014 by Cory James Crothers Brett

    All Rights Reserved

  • Parallel Implementations of Hyperspectral Remote

    Sensing Algorithms

    Cory James Crothers Brett

    March 26, 2014

  • Abstract

    Remote sensing of materials and chemical clouds using Hyperspectral imaging sensors has

    many different applications. Some of these applications, such as detecting plumes in the af-

    termath of natural disasters, are time sensitive. As of now, the detection must be done on an

    offline system. In this thesis, we consider the matched filter, the normalized matched filter,

    and sequential maximum angle convex cone detection techniques for parallel implementa-

    tions. The detectors are mapped to a multicore CPU using multithreading, and efficient

    data management to achieve a real time result. A Graphics Processing Unit (GPU) is also

    investigated as a possible architecture for the time sensitive problem. Finally, we assess the

    performance of the implementations in terms of run time, and conclude how the performance

    can be improved further.

    2

  • Acknowledgements

    This thesis was done with the help and guidance of many people. First, I would like to name

    Dimitris G. Manolakis for his immense help and guidance during research, presentation,

    and tasks of the work. I would also like to thank Professor Vinay K. Ingle for allowing

    the opportunity to work with him and Dr. Manolakis over the last two years. I would

    further like to thank Robert DiPietro for his help with understanding these algorithms

    and the background information necessary to complete this thesis. Finally, I would like to

    thank Professor Miriam Leeser for her class, High Performance Computing, which taught

    me different techniques used throughout this thesis, as well as reading this thesis.

    3

  • Contents

    Abstract 2

    Acknowledgements 3

    1 Introduction 8

    1.1 Hyperspectral Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

    1.2 Signal Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    1.2.1 Signal Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

    2 Multicore CPU 15

    2.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

    2.1.1 Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

    2.1.2 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    2.2 Mapping Signal Detection to Multicore CPU . . . . . . . . . . . . . . . . . . 21

    2.2.1 Basic Linear Algebra Routines . . . . . . . . . . . . . . . . . . . . . . 21

    GEMM: Subroutine Optimization . . . . . . . . . . . . . . . . . . . . 22

    2.2.2 OpenMP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

    2.2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

    3 Graphics Processing Unit (GPU) 28

    3.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    3.2 Running the GPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

    3.3 Mapping Signal Detection to GPU . . . . . . . . . . . . . . . . . . . . . . . 33

    4

  • 3.3.1 cuBLAS and MAGMA . . . . . . . . . . . . . . . . . . . . . . . . . . 36

    3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    4 Matched Filter 40

    4.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    4.2.1 Mean and Mean Subtraction . . . . . . . . . . . . . . . . . . . . . . . 43

    4.2.2 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

    4.2.3 Whitening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

    4.2.4 Complete Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    5 Normalized Matched Filter 56

    5.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

    5.2 Whitening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

    5.3 Mahalanobis Distance and NMF scores . . . . . . . . . . . . . . . . . . . . . 58

    5.4 Complete Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

    6 Sequential Maximum Angle Convex Cone: Endmember Extraction 62

    6.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

    6.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

    6.2.1 CPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

    6.2.2 GPU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

    7 Conclusions 71

    7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

    7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

    Bibliography 74

    5

  • List of Figures

    1.1 Example of a graphical approach . . . . . . . . . . . . . . . . . . . . . . . . 12

    1.2 Example of graphical approach: Demean . . . . . . . . . . . . . . . . . . . . 12

    1.3 Example of graphical approach: Whitened . . . . . . . . . . . . . . . . . . . 13

    2.1 Example of CPU Architecture [1] . . . . . . . . . . . . . . . . . . . . . . . . 17

    2.2 Memory Pyramid [2] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    2.3 2-set way associative cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

    2.4 SSE example [3] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

    2.5 Spawning thread in openMP . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

    3.1 Fermi Architecuture: Green is execution units, orange is sechduler and dis-

    patch, dark blue is main memory and light blue is registers L1 cache [4]. . . 29

    3.2 Streaming Multiprocessor [4] . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

    3.3 Thread organization [4] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    3.4 Coalesce vs. Not coalesce memory accesses . . . . . . . . . . . . . . . . . . . 35

    3.5 GEMM GPU breakdown [5]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    4.1 Matched-filter (MF) and Normalized Match-filter (NMF) components . . . . 40

    4.2 cuBLAS implementation of Equations 4.1 and 4.2. . . . . . . . . . . . . . . . 44

    4.3 Technique in [6]. One block of many launched . . . . . . . . . . . . . . . . . 45

    4.4 First Memory Overlap attempt. 256 threads and 160 blocks were spawned . 45

    6

  • 4.5 Second memory overlap attempt. 256 threads and 20 blocks were spawned . 46

    4.6 Third memory overlap attempt. 256 threads and 20 blocks were spawned . . 47

    4.7 Mean Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

    4.8 Fourth memory overlap attempt. 256 threads and 43 blocks were spawned . 48

    4.9 Fifth memory overlap attempt. 256 threads and 16-17 blocks were spawned . 49

    5.1 Timing breakdown CPU vs. GPU . . . . . . . . . . . . . . . . . . . . . . . . 60

    6.1 SMACC End members . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

    6.2 Projection of pixels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    6.3 Oblique projection illistration [7] . . . . . . . . . . . . . . . . . . . . . . . . 65

    6.4 Norm thread configuration on GPU . . . . . . . . . . . . . . . . . . . . . . . 69

    7

  • Chapter 1

    Introduction

    Hyperspectral images (HSI) are similar to a normal camera in that it measures energy off of

    object illuminated by the sun in spectral bands. The difference is in the wavelength of the

    bands captured and the amount of spectral bands. For sensing chemical gas plumes, we look

    at the Long-wave Infrared (LWIR) bands encompassing [8-13]m. We use the LWIR sensor

    range because these gas plumes have a unique spectral signatures in this range as opposed

    to the short-wave infrared (SWIR) sensor range [8]. This technology has a wide range of

    applications such as measuring gas pollutants in the factory settings, response to natural

    disasters to see if there are hazardous chemical gases in the air, and the chemical warfare

    agents in protecting our military. Due to the nature of these applications, implementing

    them efficiently will cut down on crucial response times.

    1.1 Hyperspectral Image

    Before we explain how we implement the remote sensing, we need to know how the image is

    made, and what it comprised of. The sensor is made up of three major components, an HSI

    spectrometer, an image scanning device, and optical system.

    8

  • The optical system is what grabs the radiant energy from the objects and atmosphere

    below it. The image scanning device has different ways of moving the optical system in order

    to collect the data needed to populate the image. The spectrometer is what takes the energy

    that was collected, separates it by wavelength, and gives it an intensity of how much energy

    was collected in each wavelength.