SQLBits X SQL Server 2012 Spatial Indexing

Taking SQL Server Beyond Relational: Deep Dive into Spatial Performance and Spatial Indexing Michael Rys Principal Program Manager @SQLServerMike

SQLBits X Training Day Presentation on SQL Server 2012 Spatial Indexing. Copyright (c) Microsoft Corp.

Transcript of SQLBits X SQL Server 2012 Spatial Indexing

Page 1: SQLBits X SQL Server 2012 Spatial Indexing

Taking SQL Server Beyond Relational: Deep Dive into Spatial Performance and Spatial Indexing

Michael Rys, Principal Program Manager, @SQLServerMike

Page 2: SQLBits X SQL Server 2012 Spatial Indexing

Q: Why is my Query so Slow?

A: Usually because the index isn’t being used.

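Q: How do I tell?

A: Look at the query plan for a query such as:

SELECT * FROM T WHERE g.STIntersects(@x) = 1

[Figure: two query plans for this query, one labeled "NO INDEX" and one labeled "INDEX!"]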

Page 3: SQLBits X SQL Server 2012 Spatial Indexing

Hinting the Index

Spatial indexes can be forced if needed.

SELECT * FROM T WITH (INDEX(T_g_idx)) WHERE g.STIntersects(@x) = 1

Use SQL Server 2008 SP1 or later!

Page 4: SQLBits X SQL Server 2012 Spatial Indexing

But Why Isn't My Index Used?

Plan choice is cost-based
  QO uses various information, including cardinality

When can we estimate cardinality?
  Variables: never
  Literals: not for spatial, since they are not literals under the covers
  Parameters: yes, but cached, so first call matters

Variable:
DECLARE @x geometry = 'POINT (0 0)'
SELECT * FROM T WHERE T.g.STIntersects(@x) = 1

Literal:
SELECT * FROM T WHERE T.g.STIntersects('POINT (0 0)') = 1

Parameter:
EXEC sp_executesql N'SELECT * FROM T WHERE T.g.STIntersects(@x) = 1', N'@x geometry', N'POINT (0 0)'

Page 5: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Indexing Basics

In general, split predicates in two
  Primary filter finds all candidates, possibly with false positives (but never false negatives)
  Secondary filter removes false positives

The index provides our primary filter
Original predicate is our secondary filter
Some tweaks to this scheme
  Sometimes possible to skip secondary filter

[Figure: objects A, B, C, D, and E pass through the primary filter (index lookup) and then the secondary filter (original predicate).]

Page 6: SQLBits X SQL Server 2012 Spatial Indexing

Using B+-Trees for Spatial Index

SQL Server has B+-Trees
Spatial indexing is usually done through other structures
  Quad tree, R-Tree
Challenge: How do we repurpose the B+-Tree to handle spatial queries?
  Add a level of indirection!

Page 7: SQLBits X SQL Server 2012 Spatial Indexing

Mapping to the B+-Tree

B+-Trees handle linearly ordered sets well
We need to somehow linearly order 2D space
  Either the plane or the globe
We want a locality-preserving mapping from the original space to the line
  i.e., close objects should be close in the index
  Can’t be done exactly, but we can approximate it

Page 8: SQLBits X SQL Server 2012 Spatial Indexing

SQL Server Spatial Indexing Story

Planar Index
  Requires bounding box
  Only one grid

Geographic Index
  No bounding box
  Two top-level projection grids

Indexing Phase
  1. Overlay a grid on the spatial object
  2. Identify grids for spatial object to store in index

Primary Filter
  3. Identify grids for query object(s)
  4. Intersecting grids identifies candidates

Secondary Filter
  5. Apply actual CLR method on candidates to find matches

[Figure: the same 4x4 grid shown for steps 1 to 3, with cells numbered:
   1  2 15 16
   4  3 14 13
   5  8  9 12
   6  7 10 11]
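The five steps on this slide can be sketched in a few lines of Python. This is only a toy illustration, not SQL Server's implementation: it assumes a single-level uniform grid, axis-aligned rectangles as the spatial objects, and hypothetical helper names (`cells_for_rect`, `build_index`, `query`) that do not appear in the talk.

```python
from collections import defaultdict

CELL = 10  # side length of one grid cell (assumed for this sketch)

def cells_for_rect(rect):
    """Grid cells touched by rect = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return {(cx, cy)
            for cx in range(int(xmin // CELL), int(xmax // CELL) + 1)
            for cy in range(int(ymin // CELL), int(ymax // CELL) + 1)}

def build_index(objects):
    """Indexing phase (steps 1-2): record every cell each object touches."""
    index = defaultdict(set)
    for key, rect in objects.items():
        for cell in cells_for_rect(rect):
            index[cell].add(key)
    return index

def intersects(a, b):
    """Exact predicate used as the secondary filter."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def query(objects, index, window):
    # Primary filter (steps 3-4): objects sharing at least one cell with the window.
    candidates = set()
    for cell in cells_for_rect(window):
        candidates |= index.get(cell, set())
    # Secondary filter (step 5): exact test removes the false positives.
    matches = {k for k in candidates if intersects(objects[k], window)}
    return candidates, matches

objects = {"A": (1, 1, 4, 4), "B": (6, 6, 8, 8), "C": (25, 25, 28, 28)}
index = build_index(objects)
candidates, matches = query(objects, index, (0, 0, 5, 5))
print(sorted(candidates), sorted(matches))
```

Here B shares a grid cell with the window but does not actually intersect it, so it survives the primary filter as a false positive and is removed by the secondary filter. C is never examined at all.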

Page 9: SQLBits X SQL Server 2012 Spatial Indexing

SQL Server Spatial Indexing Story

Multi-Level Grid
  Much more flexible than a simple grid
  Hilbert numbering
  Modified adaptable QuadTree

Grid index features
  4 levels
  Customizable grid subdivisions
  Customizable maximum number of cells per object (default 16)
  NEW IN SQL Server 2012: New default tessellation with 8 levels of cell nesting
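As an illustration of Hilbert numbering, the sketch below uses the standard iterative conversion from a position along the Hilbert curve to (x, y) grid coordinates. Whether SQL Server uses exactly this orientation internally is an assumption; the function names are hypothetical. For a 4x4 grid it reproduces the cell numbering shown on the preceding grid slide.

```python
def hilbert_d2xy(side, d):
    """Map position d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two; d runs from 0 to side*side - 1.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_grid(side):
    """Grid of 1-based Hilbert numbers, indexed as grid[y][x]."""
    grid = [[0] * side for _ in range(side)]
    for d in range(side * side):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = d + 1
    return grid

for row in hilbert_grid(4):
    print(" ".join(f"{n:2d}" for n in row))
```

Consecutive numbers always land in neighboring cells, which is the approximate locality preservation that makes the numbering usable as a B+-Tree key.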

Page 10: SQLBits X SQL Server 2012 Spatial Indexing

Multi-Level Grid

Deepest-cell Optimization: Only keep the lowest level cell in index
Covering Optimization: Only record higher level cells when all lower cells are completely covered by the object
Cell-per-object Optimization: User restricts max number of cells per object

[Figure: multi-level grid showing the root cell / ("cell 0") and the nested cell path /4/2/3/1]

Page 11: SQLBits X SQL Server 2012 Spatial Indexing

Implementation of the Index

Persist a table-valued function
Internally rewrite queries to use the table

Base Table T

Prim_key  geography
1         g1
2         g2
3         g3

CREATE SPATIAL INDEX sixd ON T(geography)

Internal Table for sixd

Prim_key  cell_id  srid  cell_attr
1         0x00007  42    0
3         0x00007  42    1
3         0x0000A  42    2
3         0x0000B  42    0
3         0x0000C  42    1
1         0x0000D  42    0
2         0x00014  42    1

Prim_key: limited to 15 columns and 895 bytes
cell_id: Varbinary(5) encoding of grid cell id
srid: Spatial Reference ID; has to be the same to produce a match
cell_attr:
  0: cell at least touches the object (but not 1 or 2)
  1: guarantee that object partially covers cell
  2: object covers cell

Page 12: SQLBits X SQL Server 2012 Spatial Indexing

Auto Grid Spatial Index

New spatial index tessellations:
  geometry_auto_grid
  geography_auto_grid

Uses 8 grid levels instead of the previous 4
No GRIDS parameter needed (or available)
  Fixed at HLLLLLLL
Default number of cells per object:
  8 for geometry
  12 for geography

More stable performance
  for windows of different size
  for data with different spatial density

For default values:
  Up to 2x faster for longer queries (> 500 ms)
    More efficient primary filter
    Fewer rows returned
  10 ms slower for very fast queries (< 50 ms)
    Increased tessellation time, which is constant

Page 13: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Index Performance

New grid gives much more stable performance for query windows of different size
Better grid coverage gives fewer high peaks

Page 14: SQLBits X SQL Server 2012 Spatial Indexing

Index Creation and Maintenance

Create index example GEOMETRY:

CREATE SPATIAL INDEX sixd ON spatial_table(geom_column)
USING GEOMETRY_GRID
WITH (BOUNDING_BOX = (0, 0, 500, 500),
      GRIDS = (LOW, LOW, MEDIUM, HIGH),
      CELLS_PER_OBJECT = 20)

Create index example GEOGRAPHY:

CREATE SPATIAL INDEX sixd ON spatial_table(geogr_column)
USING GEOGRAPHY_GRID
WITH (GRIDS = (LOW, LOW, MEDIUM, HIGH),
      CELLS_PER_OBJECT = 20)

NEW IN SQL Server 2012 (equivalent to default creation):

CREATE SPATIAL INDEX sixd ON spatial_table(geogr_column)
USING GEOGRAPHY_AUTO_GRID
WITH (CELLS_PER_OBJECT = 20)

Use ALTER INDEX and DROP INDEX for maintenance.

Page 15: SQLBits X SQL Server 2012 Spatial Indexing

DEMO: Indexing and Performance

Page 16: SQLBits X SQL Server 2012 Spatial Indexing

Spatial queries supported by index in SQL Server

Geometry:
• STIntersects() = 1
• STOverlaps() = 1
• STEquals() = 1
• STTouches() = 1
• STWithin() = 1
• STContains() = 1
• STDistance() < val
• STDistance() <= val
• Nearest Neighbor
• Filter() = 1

Geography:
• STIntersects() = 1
• STOverlaps() = 1
• STEquals() = 1
• STWithin() = 1
• STContains() = 1
• STDistance() < val
• STDistance() <= val
• Nearest Neighbor
• Filter() = 1

[Slide legend: entries that are new in SQL Server 2012 are highlighted]

Page 17: SQLBits X SQL Server 2012 Spatial Indexing

How Costing is Done

• The stats on the index contain a trie constructed on the string form of the packed binary(5) typed Cell ID.

• When a window query is compiled with a sniffable window object, the tessellation function on the window object is run at compile time. The results are used to construct a trie for use during compilation.
  • May lead to wrong compilation for later objects

• No costing on:
  • Local variables, constants, results of expressions

• Use different indices and different stored procs to account for different query characteristics

Page 18: SQLBits X SQL Server 2012 Spatial Indexing

Understanding the Index Query Plan

Page 19: SQLBits X SQL Server 2012 Spatial Indexing

Seeking into a Spatial Index

Minimize I/O and random I/O
Intuition: small windows should touch small portions of the index
A cell 7.2.4 matches
  Itself
  Ancestors
  Descendants

[Figure: Spatial Index S, showing cells 7, 7.2, and 7.2.4]
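The matching rule for a cell such as 7.2.4 can be sketched as follows. This is a simplified model, not the actual seek logic: cell ids are represented as tuples of per-level numbers and the index as a sorted Python list, standing in for the varbinary cell ids and the B+-Tree. With keys in this order, each ancestor is a single point lookup and the cell plus all of its descendants form one contiguous key range, which is why a small window touches only a small part of the index.

```python
from bisect import bisect_left

def matching_cells(sorted_cells, q):
    """Return index cells that are ancestors of q, q itself, or descendants of q."""
    hits = []
    # Ancestors: every proper prefix of q is one point lookup.
    for n in range(1, len(q)):
        prefix = q[:n]
        i = bisect_left(sorted_cells, prefix)
        if i < len(sorted_cells) and sorted_cells[i] == prefix:
            hits.append(prefix)
    # q and its descendants: one contiguous range of keys that start with q.
    i = bisect_left(sorted_cells, q)
    while i < len(sorted_cells) and sorted_cells[i][:len(q)] == q:
        hits.append(sorted_cells[i])
        i += 1
    return hits

index_cells = sorted([(7,), (7, 1), (7, 2), (7, 2, 3), (7, 2, 4),
                      (7, 2, 4, 1), (7, 3), (8,)])
hits = matching_cells(index_cells, (7, 2, 4))
print(hits)
```

Sibling cells such as 7.2.3 and 7.3 are skipped without being read, even though they sit next to the matching range in key order.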

Page 20: SQLBits X SQL Server 2012 Spatial Indexing

Understanding the Index Query Plan

[Figure: query plan with the elements T(@g), Ranges, Remove dup ranges, Spatial Index Seek, and Optional Sort]

Page 21: SQLBits X SQL Server 2012 Spatial Indexing

Spatial index tessellation

Better and more continuous coverage

[Figure: tessellation of the same object with 64, 128, and 256 cells. Legend: fully contained cells, partially contained cells]

Page 22: SQLBits X SQL Server 2012 Spatial Indexing

Query window number of cells

Typical spatial query performance
Optimal value (theoretical) is somewhere between two extremes

[Figure: query time against number of query window cells; one curve is labeled "Time needed to process false positives"]

Default values:
  512 - Geometry AUTO grid
  768 - Geography AUTO grid
  1024 - MANUAL grids

SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=256)
WHERE t.geom.STIntersects(@window)=1;

Page 23: SQLBits X SQL Server 2012 Spatial Indexing

Query Window Hinting (SQL Server 2012)

• SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=1024)
  WHERE t.geom.STIntersects(@window)=1
• Used if an index is chosen (does not force an index)
• Overrides the default (512 for geometry, 768 for geography)
• Rule of thumb:
  • Higher value makes primary filter phase longer but reduces work in secondary filter phase
  • Set higher for dense spatial data
  • Set lower for sparse spatial data

Page 24: SQLBits X SQL Server 2012 Spatial Indexing

Index Hinting

• FROM T WITH (INDEX (<Spatial_idxname>))
• Spatial index is treated the same way a non-clustered index is
  • the order of the hint is reflected in the order of the indexes in the plan
  • multiple index hints are concatenated
  • no duplicates are allowed
• The following restrictions exist:
  • The spatial index must be either first in the first index hint or last in the last index hint for a given table.
  • Only one spatial index can be specified in any index hint for a given table.

Page 25: SQLBits X SQL Server 2012 Spatial Indexing

Query Hinting

demo

Page 26: SQLBits X SQL Server 2012 Spatial Indexing

Additional Query Processing Support

• Index intersection
  • Enables efficient mixing of spatial and non-spatial predicates
• Matching
  • New in SQL Server 2012: Nearest Neighbor query
  • Distance queries: convert to STIntersects
  • Commutativity: a.STIntersects(b) = b.STIntersects(a)
  • Dual: a.STContains(b) = b.STWithin(a)
• Multiple spatial indexes on the same column
  • Various bounding boxes, granularities
• Outer references as window objects
  • Enables spatial join to use one index

Page 27: SQLBits X SQL Server 2012 Spatial Indexing

Other Spatial Performance Improvements in SQL Server 2012

• Spatial index build time for point data can be as much as four to five times faster

• Optimized spatial query plan for STDistance- and STIntersects-like queries

• Faster point data queries

• Optimized STBuffer, lower memory footprint

Page 28: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Nearest Neighbor

Main scenario
  Give me the closest 5 Italian restaurants

Execution plan
  SQL Server 2008/2008 R2: table scan
  SQL Server 2012: uses spatial index

Specific query pattern required

SELECT TOP(5) *
FROM Restaurants r
WHERE r.type = 'Italian'
  AND r.pos.STDistance(@me) IS NOT NULL
ORDER BY r.pos.STDistance(@me)

Page 29: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Performance in SQL Server 2012

demo

Page 30: SQLBits X SQL Server 2012 Spatial Indexing

Nearest Neighbor Performance

NN query vs best current workaround (sort all points in 10km radius)

*Average time for NN query is ~236ms

Find the closest 50 business points to a specific location (out of 22 million in total)

Page 31: SQLBits X SQL Server 2012 Spatial Indexing

Limitations of Spatial Plan Selection

• Off whenever window object is not a parameter:
  • Spatial join (window is an outer reference)
  • Local variable, string constant, or complex expression

• Has the classic SQL Server parameter-sensitivity problem
  • SQL compiles once for one parameter value and reuses the plan for all parameter values
  • Different plans for different sizes of window require application logic to bucketize the windows
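One possible shape for the bucketizing logic is sketched below. Everything in it is an assumption made for illustration: the stored procedure names are hypothetical, and the area thresholds are arbitrary and would have to be tuned against real data. The idea is only that each size class is routed to its own procedure, so each procedure gets a plan compiled for windows of that size class.

```python
# Hypothetical stored procedure names, one per window-size bucket.
BUCKET_PROCS = ["usp_find_small_window", "usp_find_medium_window", "usp_find_large_window"]

# Hypothetical upper area limits (squared coordinate units) for the first two buckets.
BUCKET_LIMITS = [1e4, 1e6]

def proc_for_window(width, height):
    """Pick the stored procedure whose cached plan suits a window of this size."""
    area = width * height
    for limit, proc in zip(BUCKET_LIMITS, BUCKET_PROCS):
        if area < limit:
            return proc
    # Anything at or above the last limit falls into the largest bucket.
    return BUCKET_PROCS[-1]

print(proc_for_window(10, 10))
print(proc_for_window(500, 500))
print(proc_for_window(5000, 5000))
```

The application would then call the returned procedure with the window as its parameter, so that the first call in each bucket determines that bucket's plan rather than the plan for every window size.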

Page 32: SQLBits X SQL Server 2012 Spatial Indexing

Error 8635: Cannot find a plan

Error: The query processor could not produce a query plan for a query with a spatial index hint. Reason: %S_MSG. Try removing the index hints or removing SET FORCEPLAN.

Possible Reasons (%S_MSG):
  The spatial index is disabled or offline
  The spatial object is not defined in the scope of the predicate
  Spatial indexes do not support the comparand supplied in the predicate
  Spatial indexes do not support the comparator supplied in the predicate
  Spatial indexes do not support the method name supplied in the predicate
  The comparand references a column that is defined below the predicate
  The comparand in the comparison predicate is not deterministic
  The spatial parameter references a column that is defined below the predicate
  Could not find required binary spatial method in a condition
  Could not find required comparison predicate

Page 33: SQLBits X SQL Server 2012 Spatial Indexing

Index Support

• Can be built in parallel
• Can be hinted
• File groups/Partitioning
  • Aligned to base table or separate file group
  • Full rebuild only
• New catalog views, DDL Events
• DBCC Checks
• Supportability stored procedures
• New in SQL Server 2012: Index Page and Row Compression
  • Ca. 50% smaller indices, 0-15% slower queries
• Not supported
  • Online rebuild
  • Database Tuning Advisor

Page 34: SQLBits X SQL Server 2012 Spatial Indexing

SET Options

Spatial indexes require:
  ANSI_NULLS: ON
  ANSI_PADDING: ON
  ANSI_WARNINGS: ON
  CONCAT_NULL_YIELDS_NULL: ON
  NUMERIC_ROUNDABORT: OFF
  QUOTED_IDENTIFIER: ON

Page 35: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Indices and Partitions and Filegroups

By default, a spatial index is partitioned to the same filegroups as the base table. Override with: [ ON { filegroup_name | "default" } ]

If filegroup_name is specified, the index will be placed on the specified filegroup regardless of the table’s partitioning scheme. If “default” is specified, the base table’s default filegroup/partitioning scheme is applied.

Altering the base table’s partition scheme is not allowed unless the spatial index was created with the “ON filegroup” option (and is hence not aligned with the partitioning anyway). The index has to be dropped and then the base table repartitioned.

Page 36: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Catalog Views

• sys.spatial_indexes catalog view
• sys.spatial_index_tessellations catalog view
• Entries in sys.indexes for a spatial index:
  • A clustered index on the internal table of the spatial index
  • A spatial index (type = 4) for the spatial index itself
• An entry in sys.internal_tables
• An entry in sys.index_columns

Page 37: SQLBits X SQL Server 2012 Spatial Indexing

New Spatial Histogram Helpers

sp_help_spatial_geometry_histogram
sp_help_spatial_geography_histogram
Used for spatial data and index analysis

[Figure: Histogram of 22 million business points over the US. Left: SSMS view of a histogram. Right: Custom drawing on top of Bing Maps]

Page 38: SQLBits X SQL Server 2012 Spatial Indexing

Indexing Support Procedures

• sys.sp_help_spatial_geometry_index
• sys.sp_help_spatial_geometry_index_xml
• sys.sp_help_spatial_geography_index
• sys.sp_help_spatial_geography_index_xml

• Provide information about index:
  • 64 properties
  • 10 of which are considered core

Page 39: SQLBits X SQL Server 2012 Spatial Indexing

sys.sp_help_spatial_geometry_index

Arguments

Parameter  Type  Description

@tabname  nvarchar(776)  The name of the table for which the index has been specified

@indexname  sysname  The index name to be investigated

@verboseoutput  tinyint  0: core set of properties is reported; 1: all properties are reported

@query_sample  geometry  A representative query sample that will be used to test the usefulness of the index. It may be a representative object or a query window.

Results in a property name/value pair table of the format:

PropName: nvarchar(256)  PropValue: sql_variant

Page 40: SQLBits X SQL Server 2012 Spatial Indexing

sys.sp_help_spatial_geography_index_xml

Arguments

Parameter  Type  Description

@tabname  nvarchar(776)  The name of the table for which the index has been specified

@indexname  sysname  The index name to be investigated

@verboseoutput  tinyint  0: core set of properties is reported; 1: all properties are reported

@query_sample geography A representative query sample that will be used to test the usefulness of the index. It may be a representative object or a query window.

@xml_output xml This is an output parameter that contains the returned properties in an XML fragment

Page 41: SQLBits X SQL Server 2012 Spatial Indexing

Some of the returned Properties

Property (Type, Core/All): Description

Base_Table_Rows (bigint, All): Number of rows in the base table

Index properties (-, All): Index properties: bounding box, grid densities, cells per object

Total_Primary_Index_Rows (bigint, All): Number of rows in the index

Total_Primary_Index_Pages (bigint, All): Number of pages in the index

Total_Number_Of_ObjectCells_In_Level0_For_QuerySample (bigint, Core): Indicates whether the representative query sample falls outside of the bounding box of the geometry index and into the root cell (level 0 cell). This is either 0 (not in level 0 cell) or 1. If it is in the level 0 cell, then the investigated index is not an appropriate index for the query sample.

Total_Number_Of_ObjectCells_In_Level0_In_Index (bigint, Core): Number of cell instances of indexed objects that are tessellated in level 0. For geometry indexes, this will happen if the bounding box of the index is smaller than the data domain. A high number of objects in level 0 may require a costly application of secondary filters if the query window falls partially outside the bounding box. If the query window falls inside the bounding box, having a high number of objects in level 0 may actually improve the performance.

Page 42: SQLBits X SQL Server 2012 Spatial Indexing

Some of the returned Properties

Property (Type, Core/All): Description

Number_Of_Rows_Selected_By_Primary_Filter (bigint, Core): P = Number of rows selected by the primary filter.

Number_Of_Rows_Selected_By_Internal_Filter (bigint, Core): S = Number of rows selected by the internal filter. For these rows, the secondary filter is not called.

Number_Of_Times_Secondary_Filter_Is_Called (bigint, Core): Number of times the secondary filter is called.

Percentage_Of_Rows_NotSelected_By_Primary_Filter (float, Core): Suppose there are N rows in the base table and P are selected by the primary filter. This is (N-P)/N as a percentage.

Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter (float, Core): This is S/P as a percentage. The higher the percentage, the better the index is at avoiding the more expensive secondary filter.

Number_Of_Rows_Output (bigint, Core): O = Number of rows output by the query.

Internal_Filter_Efficiency (float, Core): This is S/O as a percentage.

Primary_Filter_Efficiency (float, Core): This is O/P as a percentage. The higher the efficiency, the fewer false positives have to be processed by the secondary filter.
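The derived percentages on this slide follow directly from N, P, S, and O. The sketch below only restates those formulas; the function name `filter_metrics` is hypothetical and the input numbers are made up for illustration, not measurements from the talk.

```python
def filter_metrics(n, p, s, o):
    """Derived index metrics as percentages.

    n: rows in the base table
    p: rows selected by the primary filter
    s: rows selected by the internal filter
    o: rows output by the query
    """
    return {
        "Percentage_Of_Rows_NotSelected_By_Primary_Filter": 100.0 * (n - p) / n,
        "Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter": 100.0 * s / p,
        "Internal_Filter_Efficiency": 100.0 * s / o,
        "Primary_Filter_Efficiency": 100.0 * o / p,
    }

# Made-up example values.
m = filter_metrics(n=10000, p=500, s=300, o=400)
for name, value in m.items():
    print(f"{name}: {value:.1f}%")
```

In this made-up case the primary filter discards 95% of the table, and 80% of the rows it keeps are real matches, so one in five candidates is a false positive for the secondary filter to remove.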

Page 43: SQLBits X SQL Server 2012 Spatial Indexing

Spatial Tips on index settings

Some best practice recommendations (YMMV):
• Start out with the new default tessellation
• Point data: always use HIGH for all 4 levels. CELLS_PER_OBJECT is not relevant in this case.
• Simple, relatively consistent polygons: set all levels to LOW or MEDIUM, MEDIUM, LOW, LOW
• Very complex LineString or Polygon instances:
  • High number of CELLS_PER_OBJECT (often 8192 is best)
  • Setting all 4 levels to HIGH may be beneficial
• Polygons or line strings which have highly variable sizes: experimentation is needed.
• Rule of thumb for GEOGRAPHY: if MMMM is not working, try HHMM

Page 44: SQLBits X SQL Server 2012 Spatial Indexing

What to do if my Spatial Query is slow?

• Make sure you are running SQL Server 2008 SP1, 2008 R2 or 2012
• Check query plan for use of index
• Make sure it is a supported operation
• Hint the index (and/or a different join type)
• Do not use a spatial index when there is a highly selective non-spatial predicate
• Run above index support procedure:
  • Assess effectiveness of primary filter (Primary_Filter_Efficiency)
  • Assess effectiveness of internal filter (Internal_Filter_Efficiency)
• Redefine or define a new index with better characteristics
  • More appropriate bounding box for GEOMETRY
  • Better grid densities

Page 45: SQLBits X SQL Server 2012 Spatial Indexing

Summary: Spatial Index Improvements in SQL Server 2012

Auto Grid Spatial Index

Spatial Index Hint

More supported Operations

Spatial Index Compression

Improved “Create Spatial Index” Time For Point Data

Page 46: SQLBits X SQL Server 2012 Spatial Indexing

Related Content

SQL Server and SQL Azure Whitepapers and information:
  http://www.sqlserverlaunch.com/
  http://sqlcat.com/sqlCat/b/whitepapers/archive/2011/08/08/new-spatial-features-in-sql-server-code-named-denali-community-technology-preview-3.aspx
  http://social.technet.microsoft.com/wiki/contents/articles/4136.aspx
  http://social.technet.microsoft.com/wiki/contents/articles/updated-spatial-features-in-the-sql-azure-q4-2011-service-release.aspx
  SIGMOD 2008 Paper: Spatial Indexing in Microsoft SQL Server 2008

Spatial Tools:
  SQL Server Spatial Codeplex site: http://sqlspatialtools.codeplex.com/
  http://www.sharpgis.net/page/SQL-Server-2008-Spatial-Tools.aspx
  http://www.codeplex.com/ProjNET
  http://www.geoquery2008.com/

Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1

Find Us Later At…
  On Twitter: @SQLServerMike, @Spatial_Ed
  Blogs: http://sqlblog.com/blogs/michael_rys, http://blogs.msdn.com/b/edkatibah/