Spatial Data Analysis


Transcript of Spatial Data Analysis

Page 1: Spatial Data Analysis

Spatial Data Analysis

Why Geography is important.

Page 2: Spatial Data Analysis

What is spatial analysis?

• From Data to Information
  – beyond mapping: added value
  – transformations, manipulations and application of analytical methods to spatial (geographic) data
• Lack of locational invariance
  – analyses where the outcome changes when the locations of the objects under study change
    » median center, clusters, spatial autocorrelation
  – where matters
    • In an absolute sense (coordinates)
    • In a relative sense (spatial arrangement, distance)
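
A minimal Python sketch of locational invariance, assuming NumPy. The coordinates, the values, and the helper name `weighted_mean_center` are hypothetical, and a value-weighted mean center stands in for the median center named on the slide because it needs no iteration. The attribute mean is the same wherever the values sit, while the weighted center moves when the same values are attached to different locations.

```python
import numpy as np

def weighted_mean_center(coords, values):
    """Value-weighted average of the x, y coordinates."""
    return (coords * values[:, None]).sum(axis=0) / values.sum()

# Hypothetical data: six locations and one attribute value per location.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0],
                   [3.0, 3.0], [4.0, 1.0], [5.0, 5.0]])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])

# The same values, attached to the locations in reverse order.
permuted = values[::-1]

# Locationally invariant: the attribute mean ignores where values are.
mean_original = values.mean()
mean_permuted = permuted.mean()

# Not locationally invariant: the weighted center depends on where values are.
center_original = weighted_mean_center(coords, values)
center_permuted = weighted_mean_center(coords, permuted)

print(mean_original, mean_permuted)
print(center_original, center_permuted)
```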

Page 3: Spatial Data Analysis

Components of Spatial Analysis

• Visualization
  – Showing interesting patterns
• Exploratory Spatial Data Analysis (ESDA)
  – Finding interesting patterns
• Spatial Modeling, Regression
  – Explaining interesting patterns

Page 4: Spatial Data Analysis

Implementation of Spatial Analysis

• Beyond GIS
  – Analytical functionality not part of typical commercial GIS
    » Analytical extensions
  – Exploration requires interactive approach
    » Training requirements
    » Software requirements
  – Spatial modeling requires specialized statistical methods
    » Explicit treatment of spatial autocorrelation
    » Space-time is not space + time
• ESDA and Spatial Econometrics

Page 5: Spatial Data Analysis

What Is Special About Spatial Data?

• Location, Location, Location
  – “where” matters
• Dependence is the rule
  – spatial interaction, contagion, externalities, spill-overs, copycatting
  – First Law of Geography (Tobler)
    • everything depends on everything else, but closer things more so

Page 6: Spatial Data Analysis

• Spatial heterogeneity
  – Lack of stationarity in first-order statistics
    • Pertains to the spatial or regional differentiation observed in the value of a variable
  – Spatial drift (e.g., a trend surface)
  – Spatial association

Page 7: Spatial Data Analysis

Nature of Spatial Data

• Spatially referenced data “georeferenced”
    » “attribute” data associated with location
    » where matters
• Example: Spatial Objects
  – points: x, y coordinates
    » cities, stores, crimes, accidents
  – lines: arcs, from node, to node
    » road network, transmission lines
  – polygons: series of connected arcs
    » provinces, cities, census tracts

Page 8: Spatial Data Analysis

GIS Data Model

• Discretization of geographical reality necessitated by the nature of computing devices (Goodchild)
  – raster (grid) vs. vector (polygon)
  – field view (regions, segments) vs. object view (objects in a plane)
• Data model implies spatial sampling and spatial errors

Page 9: Spatial Data Analysis

3 Classes of Spatial Data

• Geostatistical Data
  – points as sample locations (“field” data as opposed to “objects”)
    • Continuous variation over space
• Lattice/Regional Data
  – polygons or points (centroids)
    • Discrete variation over space, observations associated with regular or irregular areal units

Page 10: Spatial Data Analysis

• Point Patterns
  – points on a map (occurrences of events at locations in space)
    • Observations of a variable are made at location X
    • Assumption that the spatial arrangement is directly related to the interaction between units of observation

Page 11: Spatial Data Analysis
Page 12: Spatial Data Analysis

Visualization and ESDA

• Objective
  – highlighting and detecting pattern
• Visualization
  – mapping spatial distributions
  – outlier detection
  – smoothing rates
• ESDA
  – dynamically linked windows
  – linking and brushing

Page 13: Spatial Data Analysis

Mapping patterns

http://www.cdc.gov/nchs/data/gis/atmapfh.pdf

Page 14: Spatial Data Analysis

ESDA

http://www.public.iastate.edu/~arcview-xgobi/

Page 15: Spatial Data Analysis

Spatial Process

• Spatial Random Field
  – { Z(s): s ∈ D }
    » s ∈ R^d : generic data location (vector of coordinates)
    » D ⊂ R^d : index set (subset of potential locations)
    » Z(s) random variable at s, with realization z(s)
  – Examples
    • s are x, y coordinates of house sales, Z sales price at s
    • s are counties, Z is crime rate in s

Page 16: Spatial Data Analysis

Point Pattern Analysis

• Objective
  – assessing spatial randomness
• Interest in location itself
  – complete spatial randomness
  – clustering, dispersion
• Distance-based statistics
  – nearest neighbors
  – number of events within given radius
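
The two distance-based statistics listed above can be sketched in Python, assuming NumPy. The function names, the 5 x 5 study area, and both point patterns are hypothetical. The nearest-neighbor ratio compares the observed mean nearest-neighbor distance with the value expected under complete spatial randomness, 0.5 / sqrt(n / area), with no edge correction: values above 1 suggest dispersion, values below 1 suggest clustering.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance between every pair of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def nearest_neighbor_ratio(points, area):
    """Observed mean nearest-neighbor distance over its CSR expectation."""
    d = pairwise_distances(points)
    np.fill_diagonal(d, np.inf)  # ignore each point's distance to itself
    observed = d.min(axis=1).mean()
    expected = 0.5 / np.sqrt(len(points) / area)  # no edge correction
    return observed / expected

def mean_events_within(points, radius):
    """Average number of other events within `radius` of an event."""
    d = pairwise_distances(points)
    np.fill_diagonal(d, np.inf)
    return (d <= radius).sum(axis=1).mean()

# Hypothetical patterns in a 5 x 5 study area, 25 events each.
gx, gy = np.meshgrid(np.arange(5) + 0.5, np.arange(5) + 0.5)
regular = np.column_stack([gx.ravel(), gy.ravel()])
rng = np.random.default_rng(0)
clustered = rng.normal(loc=2.5, scale=0.1, size=(25, 2))

ratio_regular = nearest_neighbor_ratio(regular, area=25.0)
ratio_clustered = nearest_neighbor_ratio(clustered, area=25.0)
neighbors_regular = mean_events_within(regular, radius=1.0)

print(ratio_regular, ratio_clustered, neighbors_regular)
```

A formal test would compare these values with their sampling distribution under randomness, which this sketch leaves out.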

Page 17: Spatial Data Analysis

Point Patterns

• Spatial process
  – index set D is point process, s is random
• Data
  – mapped pattern
    » examples: location of disease, gang shootings
• Research question
  – interest focuses on detecting absence of spatial randomness (cluster statistics)
  – clustered points vs dispersed points

Page 18: Spatial Data Analysis
Page 19: Spatial Data Analysis

Geostatistical Data

• Spatial Process
  – index set D is fixed subset of R^d (continuous)
• Data
  – sample points from underlying continuous surface
    » examples: mining, air quality, house sales price
• Research Question
  – interest focuses on modeling continuous spatial variation
  – spatial interpolation (kriging)

Page 20: Spatial Data Analysis

Variogram Modeling (Geostatistics)

• Objective
  – modeling continuous variation across space
• Variogram
  – estimating how spatial dependence varies with distance
  – modeling distance decay
• Kriging
  – optimal spatial prediction
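
As an illustration of how a variogram summarizes dependence by distance, here is a minimal sketch of the classical empirical semivariogram, assuming NumPy. The function name, the distance bins, and the four-point transect are hypothetical. For each distance bin it takes half the mean squared difference between values at pairs of sample points whose separation falls in that bin.

```python
import numpy as np

def empirical_semivariogram(coords, z, bin_edges):
    """Half the mean squared difference of z over point pairs, per distance bin."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    sqdiff = (z[:, None] - z[None, :]) ** 2

    # Keep each unordered pair once (upper triangle, no diagonal).
    iu = np.triu_indices(len(z), k=1)
    dist, sqdiff = dist[iu], sqdiff[iu]

    gamma = np.full(len(bin_edges) - 1, np.nan)  # NaN marks empty bins
    for k in range(len(bin_edges) - 1):
        in_bin = (dist > bin_edges[k]) & (dist <= bin_edges[k + 1])
        if in_bin.any():
            gamma[k] = 0.5 * sqdiff[in_bin].mean()
    return gamma

# Hypothetical transect: four sample points one unit apart, with a linear trend.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
z = np.array([0.0, 1.0, 2.0, 3.0])
gamma = empirical_semivariogram(coords, z, bin_edges=np.array([0.5, 1.5, 2.5, 3.5]))
print(gamma)
```

Semivariance that rises with distance, as in this toy transect, is the distance decay of similarity that a fitted variogram model would then describe. Fitting that model and kriging are not shown.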

Page 21: Spatial Data Analysis
Page 22: Spatial Data Analysis
Page 23: Spatial Data Analysis

Lattice or Regional Data

• Spatial process
  – index set D is fixed collection of countably many points in R^d
  – finite, discrete spatial units
• Data
  – fixed points or discrete locations (regions)
    » examples: county tax rates, state unemployment
• Research question
  – interest focuses on statistical inference
  – estimation, specification tests

Page 24: Spatial Data Analysis

Spatial Autocorrelation

• Objective
  – hypothesis test on spatial randomness of attributes = value and location
• Global and local autocorrelation statistics: Moran’s I, Geary’s c, G(d), LISA
• Visualization of spatial autocorrelation
  – Moran scatterplot
  – LISA maps
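
A minimal sketch of global Moran's I, assuming NumPy. The slides do not specify a weights matrix for this statistic, so the binary rook-contiguity weights on a regular grid are an assumption, as are the names `rook_weights` and `morans_i` and the two 4 x 4 test patterns. The statistic is I = (n / S0) · (z'Wz) / (z'z), where z holds deviations from the mean and S0 is the sum of all weights.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary contiguity weights: cells sharing an edge are neighbors."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:              # neighbor below
                j = (r + 1) * ncols + c
                W[i, j] = W[j, i] = 1.0
            if c + 1 < ncols:              # neighbor to the right
                j = r * ncols + c + 1
                W[i, j] = W[j, i] = 1.0
    return W

def morans_i(values, W):
    """Global Moran's I for one attribute and a spatial weights matrix."""
    z = values - values.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

W = rook_weights(4, 4)

# Checkerboard: every neighbor has the opposite value (negative autocorrelation).
checkerboard = (np.indices((4, 4)).sum(axis=0) % 2).ravel().astype(float)

# Two halves: like values sit next to each other (positive autocorrelation).
halves = np.tile([1.0, 1.0, 0.0, 0.0], 4)

print(morans_i(checkerboard, W), morans_i(halves, W))
```

The hypothesis test on the slide would compare the observed value with its distribution under spatial randomness, for example by permuting the values over the locations. That step is omitted here.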

Page 25: Spatial Data Analysis
Page 26: Spatial Data Analysis
Page 27: Spatial Data Analysis

Spatial process models

• How is the spatial association generated?
  – Spatial autoregressive process (SAR)
    • Y = ρWY + ε
  – Spatial moving average process (SMA)
    • Y = (I + ρW) ε
  – ε = vector of independent errors
  – W = distance weights matrix
  – In SAR, correlation is fairly persistent with increasing distance, whereas with SMA it decays to zero fairly quickly.
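
The contrast between the two processes can be checked numerically. The sketch below assumes NumPy, unit-variance independent errors, ρ = 0.5, and a hypothetical row-standardized adjacency matrix for seven units on a line. That adjacency matrix is a simplification of the distance weights matrix on the slide. Solving the SAR equation gives Y = (I − ρW)⁻¹ε, so its covariance involves the full inverse, while the SMA covariance (I + ρW)(I + ρW)' only reaches second-order neighbors.

```python
import numpy as np

n, rho = 7, 0.5

# Hypothetical W: seven units on a line, adjacent units are neighbors,
# rows standardized to sum to one.
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W = W / W.sum(axis=1, keepdims=True)

I = np.eye(n)

# SAR: Y = rho W Y + eps  =>  Y = (I - rho W)^{-1} eps
A_sar = np.linalg.inv(I - rho * W)
cov_sar = A_sar @ A_sar.T      # Var(eps) = I assumed

# SMA: Y = (I + rho W) eps
A_sma = I + rho * W
cov_sma = A_sma @ A_sma.T

def to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Correlation between the first unit and units 0, 1, ..., 6 steps away.
corr_sar = to_corr(cov_sar)[0]
corr_sma = to_corr(cov_sma)[0]

print(np.round(corr_sar, 3))
print(np.round(corr_sma, 3))
```

In this setup the SMA correlations are exactly zero from three steps away onward, while the SAR correlations stay positive at every separation.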

Page 28: Spatial Data Analysis

• Spatial process—the rule governing the trajectory of the system as a chain of changes in state.

• Spatial pattern—the map of a single realization of the underlying spatial process (the data available for analysis).

• Say you conduct a regression analysis. If the residuals do not display spatial autocorrelation, then there is no need to add “space” to the model. Examine spatial autocorrelation in the residuals using Moran’s I, Geary’s c, or G(d).
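
A minimal sketch of that residual check, assuming NumPy and a hypothetical 5 x 5 grid of areal units with rook contiguity. The data are constructed so that a west-to-east trend is left out of the regression. The coefficients are recovered, yet the residuals inherit the omitted spatial pattern and Moran's I flags it.

```python
import numpy as np

# Hypothetical 5 x 5 grid of areal units, indexed row by row.
rows, cols = np.indices((5, 5))
rows, cols = rows.ravel(), cols.ravel()

# Explanatory variable (checkerboard) and a response that also
# depends on a west-east trend omitted from the regression.
x = ((rows + cols) % 2).astype(float)
y = 1.0 + 2.0 * x + cols

# Ordinary least squares of y on a constant and x.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Rook contiguity: units exactly one grid step apart are neighbors.
step = np.abs(rows[:, None] - rows[None, :]) + np.abs(cols[:, None] - cols[None, :])
W = (step == 1).astype(float)

# Moran's I of the residuals.
z = residuals - residuals.mean()
moran_residuals = (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

print(beta, moran_residuals)
```

A value this far above zero would argue for bringing space into the model. Formal inference on regression residuals needs more care than this sketch shows, because residuals are not independent draws.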

Page 29: Spatial Data Analysis

Perspectives on spatial process models

• Finding out how the variable Y relates to its value in surrounding locations (the spatial lag) while controlling for the influence of other explanatory variables.

• When the interest is in the relation between the explanatory variables X and the dependent variable, after the spatial effect has been controlled for (this is referred to as spatial filtering or spatial screening).

Page 30: Spatial Data Analysis

• The expected value of the dependent variable at each location is a function not only of explanatory variables at that location, but of the explanatory variables at all other locations as well.
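
One way to see this is to add explanatory variables to the SAR process Y = ρWY + ε from the earlier slide. The Xβ term, zero-mean errors, and an invertible (I − ρW) are assumptions made here for illustration (the series expansion holds, for example, with |ρ| < 1 and a row-standardized W). Solving for Y:

```latex
\begin{aligned}
Y &= \rho W Y + X\beta + \varepsilon \\
(I - \rho W)\,Y &= X\beta + \varepsilon \\
E[Y] &= (I - \rho W)^{-1} X\beta
      = \left(I + \rho W + \rho^{2} W^{2} + \cdots\right) X\beta
\end{aligned}
```

The powers of W carry the explanatory variables of neighbors, neighbors of neighbors, and so on into the expected value at each location, with weights that shrink as the power of ρ grows.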