Sql Server 2008 Spatial Analysis

SQL Server 2008 Spatial Analysis Dan Crawford Integrated Network Strategies [email protected] http://www.insindy.com


Presented at December 2010 IndyPASS


Page 1: Sql Server 2008 Spatial Analysis

SQL Server 2008 Spatial Analysis

Dan Crawford

Integrated Network Strategies

[email protected]

http://www.insindy.com

Page 2: Sql Server 2008 Spatial Analysis

What is spatial data?

• Geometric

Represents data in a 2D plane, similar to graph paper in high school. Units are user-defined and could be inches, miles, pixels, etc.

Page 3: Sql Server 2008 Spatial Analysis

What is spatial data?

• Geographic

Represents data points using angles of latitude and longitude. Latitude measures degrees north/south of the equator, and longitude measures degrees east/west of the Prime Meridian.

Page 4: Sql Server 2008 Spatial Analysis

System Requirements

• SQL Server 2008 Express or higher – recommend R2 to use maps in SSRS

• Dev Tools

– Visual Studio 2005, 2008, or 2010

– SQL Server Management Studio 2008

• Now supported on SQL Azure

Page 5: Sql Server 2008 Spatial Analysis

Uses of spatial data

• Used by central cancer registries for statistical analysis with other geography-specific data sources, such as census data

• Integrated route mapping with MapPoint, Google Maps, etc.

• Geographical business intelligence analytics

Page 6: Sql Server 2008 Spatial Analysis

Geometry data type

• Geometry data type stores points, lines, polygons, and collections of geometric objects

• Represent using WKT (well-known text), WKB (well-known binary), or GML (geography markup language)

• WKT seems to be the most common

Page 7: Sql Server 2008 Spatial Analysis

WKT Markup

• POINT(x y)

• LINESTRING(x1 y1,x2 y2)

• POLYGON((x1 y1,x2 y2,x3 y3,x4 y4,x1 y1))

• GEOMETRYCOLLECTION(Geo1, Geo2, …)
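
The WKT shapes above are plain strings, so they are easy to build from coordinate lists. The following is a minimal Python sketch, not part of the original deck; the helper names (`point_wkt`, `linestring_wkt`, `polygon_wkt`, `parse_point`) are hypothetical. Note that for the SQL Server geography type, WKT coordinates are written longitude first, then latitude.

```python
def point_wkt(x, y):
    """Format a single coordinate pair as WKT, e.g. POINT(1 2)."""
    return f"POINT({x} {y})"


def linestring_wkt(coords):
    """Format a sequence of (x, y) pairs as a WKT LINESTRING."""
    body = ",".join(f"{x} {y}" for x, y in coords)
    return f"LINESTRING({body})"


def polygon_wkt(ring):
    """Format one exterior ring as a WKT POLYGON, closing the ring if needed."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT polygons repeat the first point at the end
    body = ",".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({body}))"


def parse_point(wkt):
    """Parse 'POINT(x y)' back into a tuple of floats."""
    inner = wkt.strip()[len("POINT"):].strip().strip("()")
    x, y = inner.split()
    return float(x), float(y)


print(polygon_wkt([(0, 0), (4, 0), (4, 3), (0, 3)]))
# prints POLYGON((0 0,4 0,4 3,0 3,0 0))
```

The resulting string is the kind of text that would be passed to a WKT constructor on the database side.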

Page 8: Sql Server 2008 Spatial Analysis

Spatial Expressions

Method          Purpose
STDistance      Distance between two geographical points
STBuffer        Creates buffer around a geographical region. Useful for making points more easily seen
STIntersects    Do two objects intersect?
STIntersection  Creates geographical object defined by intersection of two geographies
STLength        Line segment length
STDifference    Returns all points from the base object that do not intersect with the parameter object
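
To make a few of these methods concrete, here is a small Python sketch of what the distance, length, and buffer ideas mean for planar (geometry-style) coordinates. The function names only echo the SQL Server method names and are hypothetical; this is not how SQL Server implements them, and the geography type measures over an ellipsoid rather than a flat plane.

```python
import math


def st_distance(p, q):
    """Euclidean distance between two planar points (x, y)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def st_length(line):
    """Total length of a polyline given as a list of (x, y) vertices."""
    return sum(st_distance(a, b) for a, b in zip(line, line[1:]))


def within_buffer(p, center, radius):
    """True if p falls inside a circular buffer of the given radius around center."""
    return st_distance(p, center) <= radius


print(st_distance((0, 0), (3, 4)))            # prints 5.0
print(st_length([(0, 0), (3, 4), (3, 10)]))   # prints 11.0
print(within_buffer((1, 1), (0, 0), 2))       # prints True
```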

Page 9: Sql Server 2008 Spatial Analysis

More Spatial Expressions

Method          Purpose
STUnion         All points of base and parameter instance merged together
STCentroid      Middle of object
STWithin        Is the object within the parameter object?
STContains      Does the object contain the parameter object?
STX             X coordinate for geometry object
STY             Y coordinate for geometry object
Lat             Latitude for geography object
Long            Longitude for geography object
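
As an illustration of the "middle of object" and "contains" ideas, the sketch below computes the area-weighted centroid of a simple planar polygon with the shoelace formula and tests containment by ray casting. Both functions are hypothetical Python helpers written for this example under the assumption of flat, non-self-intersecting polygons; they are not the SQL Server implementations.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon (shoelace formula).

    ring is a list of (x, y) vertices; the closing vertex may be omitted.
    """
    pts = list(ring)
    if pts[0] == pts[-1]:
        pts = pts[:-1]
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def polygon_contains(ring, p):
    """Ray-casting test: True if p is inside (boundary points not handled specially)."""
    x, y = p
    pts = list(ring)
    inside = False
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray through p
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


rect = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(rect))           # prints (2.0, 1.0)
print(polygon_contains(rect, (2, 1)))   # prints True
```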

Page 10: Sql Server 2008 Spatial Analysis

Geocoding

• Geography data type does not directly understand mailing address data

• Mailing addresses must be converted to latitude/longitude coordinates

• Geocoding = conversion of geographic data like address or zip code to geographic coordinates

• Options – MapPoint/Bing Map Services, Google Maps API, many others

Page 11: Sql Server 2008 Spatial Analysis

Rendering Options

• SQL Management Studio 2008 – very basic for query testing

• VirtualEarth

• Google Maps or similar

• 3rd party mapping component (e.g. Dundas)

• SSRS/Report Builder in R2

Page 12: Sql Server 2008 Spatial Analysis

Spatial Indexing

Images from Microsoft Technet

Page 13: Sql Server 2008 Spatial Analysis

Spatial Indexing

Grid Type   Grid Density   Cells
LOW         4 x 4          16
MEDIUM      8 x 8          64
HIGH        16 x 16        256

CREATE SPATIAL INDEX SPATIAL_Hospitals
    ON dbo.Hospitals(LocationGeography)
    USING GEOGRAPHY_GRID
    WITH (
        GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
        CELLS_PER_OBJECT = 16,
        STATISTICS_NORECOMPUTE = OFF,
        ALLOW_ROW_LOCKS = ON,
        ALLOW_PAGE_LOCKS = ON
    )
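
The four grid levels nest: each cell at one level is subdivided again at the next. The Python sketch below shows that nesting for a flat bounding box. It is a simplified illustration only; the names `GRID_SIDE`, `cells_per_level`, and `grid_path` are hypothetical, and the projection and cell-numbering details of the real SQL Server index are ignored.

```python
GRID_SIDE = {"LOW": 4, "MEDIUM": 8, "HIGH": 16}  # cells per side at one level


def cells_per_level(density):
    """Number of cells a single grid level divides its parent cell into."""
    side = GRID_SIDE[density]
    return side * side


def grid_path(x, y, bbox, densities):
    """Return the (column, row) cell containing (x, y) at each grid level.

    bbox is (xmin, ymin, xmax, ymax); densities lists one keyword per level.
    Each level subdivides the cell chosen at the previous level.
    """
    xmin, ymin, xmax, ymax = bbox
    path = []
    for density in densities:
        side = GRID_SIDE[density]
        width = (xmax - xmin) / side
        height = (ymax - ymin) / side
        col = min(int((x - xmin) / width), side - 1)   # clamp points on the max edge
        row = min(int((y - ymin) / height), side - 1)
        path.append((col, row))
        # Shrink the bounding box to the chosen cell for the next level.
        xmin, ymin = xmin + col * width, ymin + row * height
        xmax, ymax = xmin + width, ymin + height
    return path


print(cells_per_level("MEDIUM") ** 4)                   # prints 16777216
print(grid_path(5, 5, (0, 0, 64, 64), ["LOW", "LOW"]))  # prints [(0, 0), (1, 1)]
```

With four MEDIUM levels, as in the index definition above, the finest level holds 64 x 64 x 64 x 64 cells in total.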

Page 14: Sql Server 2008 Spatial Analysis

Spatial Indexing - Utilization

SELECT *
FROM Hospitals WITH (INDEX(SPATIAL_Hospitals))
WHERE LocationGeography.STIntersects(@P.STBuffer(@eps * 1609.344)) = 1
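
The factor 1609.344 is the number of meters in a mile, so the query appears to treat @eps as a radius in miles around the point @P. The Python sketch below mimics that filter outside the database. It assumes a spherical Earth (haversine formula), so results will differ slightly from SQL Server's ellipsoidal calculation; the row layout with "lat" and "lon" keys and the sample names are made up for illustration.

```python
import math

METERS_PER_MILE = 1609.344
EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_miles(rows, center, eps_miles):
    """Keep the rows whose (lat, lon) lies within eps_miles of center."""
    radius_m = eps_miles * METERS_PER_MILE
    return [
        row for row in rows
        if haversine_m(center[0], center[1], row["lat"], row["lon"]) <= radius_m
    ]


# Hypothetical sample rows: A is about 11 km from the center, B about 111 km.
rows = [
    {"name": "A", "lat": 0.0, "lon": 0.1},
    {"name": "B", "lat": 0.0, "lon": 1.0},
]
print([r["name"] for r in within_miles(rows, (0.0, 0.0), 10)])  # prints ['A']
```

This brute-force version checks every row; the spatial index hint in the query exists precisely to avoid that scan.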

Page 15: Sql Server 2008 Spatial Analysis

Goal of Geographic Analysis

“I want SQL Server to tell me when there are clusters of geographic data points and where they are located.”

- Dan Crawford, 2010

Page 16: Sql Server 2008 Spatial Analysis

It’s easy to see points on a map with SQL Server

Page 17: Sql Server 2008 Spatial Analysis

Why use cluster analysis?

• Analysis of injury severity and hospital resource use in a regional health care system

• Customer purchasing patterns

• Choosing a business or advertising location

• Crime analysis

• Easy visualization for dashboard

Page 18: Sql Server 2008 Spatial Analysis

What is a geographic cluster?

For our purposes, a cluster is a group containing a significant number of data points that are geographically close to each other.

There are two variables:

- The number of data points required for a group to be considered a cluster

- The distance that defines being “geographically close”

Page 19: Sql Server 2008 Spatial Analysis

What we want…

Page 20: Sql Server 2008 Spatial Analysis

Or better yet…

Page 21: Sql Server 2008 Spatial Analysis

DBSCAN

DBSCAN(D, eps, MinPts)
   C = 0
   for each unvisited point P in dataset D
      mark P as visited
      N = getNeighbors (P, eps)
      if sizeof(N) < MinPts
         mark P as NOISE
      else
         C = next cluster
         expandCluster(P, N, C, eps, MinPts)

Source: http://en.wikipedia.org/wiki/DBSCAN

Page 22: Sql Server 2008 Spatial Analysis

DBSCAN (cont’d)

expandCluster(P, N, C, eps, MinPts)
   add P to cluster C
   for each point P' in N
      if P' is not visited
         mark P' as visited
         N' = getNeighbors(P', eps)
         if sizeof(N') >= MinPts
            N = N joined with N'
      if P' is not yet member of any cluster
         add P' to cluster C

Source: http://en.wikipedia.org/wiki/DBSCAN
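
The pseudocode can be turned into a short runnable sketch. The Python version below is one possible reading of it, not the implementation used in the presentation: it merges the "visited" flag and the cluster label into a single list, uses plain Euclidean distance on (x, y) tuples (requires Python 3.8+ for `math.dist`), and finds neighbors by brute force. For latitude/longitude data, `get_neighbors` would need a great-circle distance or a database query such as the STBuffer/STIntersects filter.

```python
import math

NOISE = -1


def get_neighbors(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster number (1, 2, ...) or NOISE."""
    labels = [None] * len(points)   # None means "not yet visited"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = get_neighbors(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:                 # corresponds to expandCluster
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = get_neighbors(points, j, eps)
            if len(more) >= min_pts:
                queue.extend(more)   # "N = N joined with N'"
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
# prints [1, 1, 1, 2, 2, 2, -1]
```

The two parameters map directly onto the two variables in the cluster definition: `min_pts` is the number of points required, and `eps` is the distance that counts as "geographically close".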

Page 23: Sql Server 2008 Spatial Analysis
Page 24: Sql Server 2008 Spatial Analysis

To make life easier

• Report Builder 3.0

• SQL Server Spatial Tools: http://sqlspatialtools.codeplex.com