SQL Server 2008 Spatial Analysis
Dan Crawford
Integrated Network Strategies
[email protected]
http://www.insindy.com
What is spatial data?
• Geometric
Represents data in a 2D plane, similar to graph paper in high school. Units are user-defined and could be inches, miles, pixels, etc.
What is spatial data?
• Geographic
Represents data points using angles of latitude and longitude. Latitude measures degrees north/south of the Equator, and longitude measures degrees east/west of the Prime Meridian.
System Requirements
• SQL Server 2008 Express or higher – recommend R2 to use maps in SSRS
• Dev Tools
– Visual Studio 2005, 2008, or 2010
– SQL Management Studio 2008
• Now supported on SQL Azure
Uses of spatial data
• Used by central cancer registries for statistical analysis with other geography-specific data sources, such as census data
• Integrated route mapping with MapPoint, Google Maps, etc.
• Geographical business intelligence analytics
Geometry data type
• Geometry data type stores points, lines, polygons, and collections of geometric objects
• Represented using WKT (Well-Known Text), WKB (Well-Known Binary), or GML (Geography Markup Language)
• WKT seems to be the most common
WKT Markup
• POINT(x y)
• LINESTRING(x1 y1,x2 y2)
• POLYGON((x1 y1,x2 y2,x3 y3,x4 y4,x1 y1))
• GEOMETRYCOLLECTION(Geo1, Geo2, …)
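The WKT forms above are plain strings, so they can be assembled with a few lines of code before being handed to SQL Server. The following Python sketch is illustrative only: the helper names (wkt_point, wkt_linestring, wkt_polygon, wkt_collection) are hypothetical, and it handles a single outer ring with no holes. It closes the polygon ring automatically by repeating the first point, as the POLYGON form requires.

```python
def _pair(point):
    """Format one (x, y) pair the way WKT expects: 'x y'."""
    x, y = point
    return f"{x} {y}"


def wkt_point(x, y):
    return f"POINT({x} {y})"


def wkt_linestring(points):
    points = list(points)
    if len(points) < 2:
        raise ValueError("a linestring needs at least two points")
    return "LINESTRING(" + ",".join(_pair(p) for p in points) + ")"


def wkt_polygon(ring):
    """Build a polygon from one outer ring, closing the ring if needed."""
    ring = list(ring)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])  # the last point must repeat the first
    if len(ring) < 4:
        raise ValueError("a polygon ring needs at least three distinct points")
    return "POLYGON((" + ",".join(_pair(p) for p in ring) + "))"


def wkt_collection(*members):
    """Wrap already-formatted WKT strings in a GEOMETRYCOLLECTION."""
    return "GEOMETRYCOLLECTION(" + ", ".join(members) + ")"


print(wkt_point(1, 2))
print(wkt_linestring([(0, 0), (3, 4)]))
print(wkt_polygon([(0, 0), (4, 0), (4, 3), (0, 3)]))
print(wkt_collection(wkt_point(1, 2), wkt_linestring([(0, 0), (3, 4)])))
```

Building the strings in one place makes it harder to forget the closing point of a polygon, which is a common cause of invalid WKT.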
Spatial Expressions
Method          Purpose
STDistance      Distance between two geographical points
STBuffer        Creates buffer around a geographical region. Useful for making points more easily seen
STIntersects    Do two objects intersect?
STIntersection  Creates geographical object defined by intersection of two geographies
STLength        Line segment length
STDifference    Returns all points from the base object that do not intersect with the parameter object
More Spatial Expressions
Method          Purpose
STUnion         All points of base and parameter instance merged together
STCentroid      Geometric center of the object
STWithin        Is the object within the parameter object?
STContains      Does the object contain the parameter object?
STX             X coordinate for geometry object
STY             Y coordinate for geometry object
Lat             Latitude for geography object
Long            Longitude for geography object
Geocoding
• Geography data type does not directly understand mailing address data
• Mailing addresses must be converted to latitude/longitude coordinates
• Geocoding = conversion of geographic data like address or zip code to geographic coordinates
• Options – MapPoint/Bing Map Services, Google Maps API, many others
Rendering Options
• SQL Management Studio 2008 – very basic for query testing
• VirtualEarth
• Google Maps or similar
• 3rd party mapping component (e.g. Dundas)
• SSRS/Report Builder in R2
Spatial Indexing
Images from Microsoft Technet
Spatial Indexing
Grid Type   Grid Density   Cells
LOW         4 x 4          16
MEDIUM      8 x 8          64
HIGH        16 x 16        256
CREATE SPATIAL INDEX SPATIAL_Hospitals
ON dbo.Hospitals (LocationGeography)
USING GEOGRAPHY_GRID
WITH (GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
      CELLS_PER_OBJECT = 16, STATISTICS_NORECOMPUTE = OFF,
      ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
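To get a feel for the numbers behind these settings, recall that the spatial index uses a four-level grid hierarchy in which every cell of one level is subdivided again by the grid of the next level. The rough Python sketch below (the function name is made up for illustration) works through that arithmetic for the density keywords in the table:

```python
# Cells in a single grid for each density keyword.
CELLS_PER_GRID = {"LOW": 4 * 4, "MEDIUM": 8 * 8, "HIGH": 16 * 16}


def cells_at_each_level(densities):
    """Return the cumulative number of cells at each level of the hierarchy.

    Each level subdivides every cell of the level above it, so the totals
    multiply from one level to the next.
    """
    totals = []
    running = 1
    for density in densities:
        running *= CELLS_PER_GRID[density]
        totals.append(running)
    return totals


levels = ["MEDIUM", "MEDIUM", "MEDIUM", "MEDIUM"]
for number, total in enumerate(cells_at_each_level(levels), start=1):
    print(f"LEVEL_{number}: {total:,} cells")
```

With MEDIUM at all four levels the deepest level has 64 to the fourth power, or 16,777,216 cells, which is why CELLS_PER_OBJECT caps how many cells are recorded for any one object.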
Spatial Indexing - Utilization
-- 1609.344 is the number of meters in a mile, so @eps is presumably expressed in miles
SELECT *
FROM Hospitals WITH (INDEX(SPATIAL_Hospitals))
WHERE LocationGeography.STIntersects(@P.STBuffer(@eps * 1609.344)) = 1
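The query above keeps every hospital that falls inside a buffer of @eps * 1609.344 around the point @P; since 1609.344 is the number of meters in a mile, that reads as a radius of @eps miles when the geography data is measured in meters. The Python sketch below performs the same kind of radius test without a database, which can be handy for checking results by hand. It assumes a spherical Earth and the haversine formula, so its distances will differ slightly from SQL Server's ellipsoidal calculation; the function names and sample coordinates are illustrative and not taken from the deck.

```python
import math

EARTH_RADIUS_M = 6_371_008.8   # mean Earth radius (spherical approximation)
METERS_PER_MILE = 1609.344


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_miles(center, points, eps_miles):
    """Return the (lat, lon) points that lie within eps_miles of center."""
    limit_m = eps_miles * METERS_PER_MILE
    return [p for p in points
            if haversine_m(center[0], center[1], p[0], p[1]) <= limit_m]


# Made-up sample data: one nearby point and one far-away point.
center = (39.77, -86.16)
candidates = [(39.78, -86.17), (41.88, -87.63)]
print(within_miles(center, candidates, eps_miles=5))
```

Unlike the SQL version, this scans every candidate point, which is exactly the work the spatial index is there to avoid on large tables.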
Goal of Geographic Analysis
“I want SQL Server to tell me when there are clusters of geographic data points and where they are located.”
- Dan Crawford, 2010
It’s easy to see points on a map with SQL Server
Why use cluster analysis?
• Analysis of injury severity and hospital resource use in a regional health care system
• Customer purchasing patterns
• Choosing a business or advertising location
• Crime analysis
• Easy visualization for dashboard
What is a geographic cluster?
For our purposes a cluster is a group of a significant number of data points which are geographically close to each other.
There are two variables:
• The number of data points required in order to be considered a cluster
• The distance that defines being “geographically close”
What we want…
Or better yet…
DBSCAN
DBSCAN(D, eps, MinPts)
   C = 0
   for each unvisited point P in dataset D
      mark P as visited
      N = getNeighbors (P, eps)
      if sizeof(N) < MinPts
         mark P as NOISE
      else
         C = next cluster
         expandCluster(P, N, C, eps, MinPts)
Source: http://en.wikipedia.org/wiki/DBSCAN
DBSCAN (cont’d)
expandCluster(P, N, C, eps, MinPts)
   add P to cluster C
   for each point P' in N
      if P' is not visited
         mark P' as visited
         N' = getNeighbors(P', eps)
         if sizeof(N') >= MinPts
            N = N joined with N'
      if P' is not yet member of any cluster
         add P' to cluster C
Source: http://en.wikipedia.org/wiki/DBSCAN
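The DBSCAN pseudocode translates almost line for line into a general-purpose language. Below is a minimal Python sketch of it, not the implementation used in this talk: it works on a plain list of (x, y) tuples, uses straight-line distance with a brute-force neighbor scan, and the sample points are made up. Against SQL Server, the get_neighbors step could instead be served by a radius query like the STIntersects/STBuffer one shown under Spatial Indexing.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def get_neighbors(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def expand_cluster(points, labels, visited, i, neighbors, cluster, eps, min_pts):
    labels[i] = cluster
    queue = list(neighbors)
    seen = set(neighbors)
    k = 0
    while k < len(queue):
        j = queue[k]
        k += 1
        if not visited[j]:
            visited[j] = True
            more = get_neighbors(points, j, eps)
            if len(more) >= min_pts:
                # "N = N joined with N'": grow the frontier with new points only
                for m in more:
                    if m not in seen:
                        seen.add(m)
                        queue.append(m)
        if labels[j] is None or labels[j] == NOISE:
            labels[j] = cluster  # unassigned or former noise joins the cluster


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster number (1, 2, ...) or NOISE."""
    labels = [None] * len(points)
    visited = [False] * len(points)
    cluster = 0
    for i in range(len(points)):
        if visited[i]:
            continue
        visited[i] = True
        neighbors = get_neighbors(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE
        else:
            cluster += 1
            expand_cluster(points, labels, visited, i, neighbors,
                           cluster, eps, min_pts)
    return labels


sample = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(sample, eps=1.5, min_pts=3))
```

The two parameters are the two variables from the cluster definition: min_pts is the number of points required to form a cluster, and eps is the distance that counts as "geographically close".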
To make life easier
• Report Builder 3.0
• SQL Server Spatial Tools – http://sqlspatialtools.codeplex.com