Spatial databases

28
SPATIAL DATABASES By- Neha Kulkarni ME Computer Pune Insitute of Computer Technology

Transcript of Spatial databases

Page 1: Spatial databases

SPATIAL DATABASES

By-Neha KulkarniME Computer

Pune Insitute of Computer Technology

Page 2: Spatial databases

What is spatial data? Types of spatial data Types of queries Applications Indexing Techniques Comparison of Indexing techniques GiST Indexing High-dimensional data Conclusion

Agenda

Page 3: Spatial databases

Spatial Data• Spatial data represent the location ,size

and shape of an object on earth• Ex. Building, lake

Page 4: Spatial databases

Point data: Line data: polygon data:

Representation of Spatial Data

Page 5: Spatial databases

Point Data: Simplest form of representing spatial data No space and has no associated area or volume Consists of collection of points Ex. Raster data

Types of Spatial Data

Page 6: Spatial databases

Region data: Has spatial extend with location and boundaries Represented using of points, line, polygons Ex. roads, rivers: line data

Page 7: Spatial databases
Page 8: Spatial databases

1) Spatial range queries: Related with region dataEx. “Find all cities within 50 miles of Pune”

2)Nearest Neighbor queries:Related with point dataEx. “Find 10 cities nearest to Pune”In ordered citiesUse in multimedia database

Types Of Queries

Page 9: Spatial databases

3) Spatial Join Queries:- use both point and region data.- Ex. “Find pairs of cities within 200 miles of each other AND “ Find all cities near a lake”- More complex- Expensive to evaluate

Page 10: Spatial databases

1) Geographic Information System(GIS) Ex. MAP2) Computer Aided Design/Manufacturing(CAD/CAM) Ex. Surface of design object Range and Spatial join queries used3) Multimedia Database video, audio, image, text also required spatial data Nearest neighbor queries and point data

Applications of Spatial Data

Page 11: Spatial databases

Point Data: Grid files, ḥE trees, Kdtrees, point quad trees Region data: Quad trees, R trees, SKD trees, -Yet no best indexing technique- R trees are commonly used : due to simplicity, ability to handle both data performance to complex queries

Indexing Spatial Data

Page 12: Spatial databases

Three main indexing techniques :

Region Quad-Trees and Z-Ordering – handle both point and region data

Grid Files – only point data R-Trees – handle both point and region data

Indexing techniques

Page 13: Spatial databases

Z-ordering gives us a way to group points according to spatial proximity.

Consider X-01 and Y-11 Z-value is 0111 by interleaving X and Y values. This gives us the value for the point 7.

Region Quad-Trees and Z-Ordering

Space filling curves

Page 14: Spatial databases

The Region Quad tree structure corresponds directly to the recursive decomposition of the data space.

Each node in the tree corresponds to a square-shaped region of the data space.

Page 15: Spatial databases

Grid files rely upon a grid directory to identify the data page containing a desired point.

The Grid file partitions space into rectangular regions using lines that are parallel to the axes.

If the X axis is cut into i segments and the Y axis is cut into j segments, we have a total of i x j partitions. The grid directory is an i by j array with one entry per partition.

This description is maintained in an array called a linear scale; there is one linear scale per axis.

Grid Files

Page 16: Spatial databases

Searching for a point in a grid file

Page 17: Spatial databases

Inserting points in a Grid File

Page 18: Spatial databases

Adaptation of B+ Tree Height-balanced data structure Search key values are referred to as

Bounding Boxes A data entry consists of a pair (n-

dimensional box, Rid) Rid – object Identifier N-dimensional box is the smallest box that

contains the object

R-Tree

Page 19: Spatial databases

An example R-Tree

Page 20: Spatial databases

Search for Objects Overlapping Box Q

Start at root.1. If current node is non-leaf, for each entry <E, ptr>, if box E overlaps Q, search subtree identified by ptr.2. If current node is leaf, for each entry <E, rid>, if E overlaps Q, rid identifies an object that might overlap Q.

Searching in a R-Tree

Page 21: Spatial databases

Insert Entry <B, ptr> Start at root and go down to “best-fit” leaf L. Go to child whose box needs least enlargement to cover B; resolve ties by going to smallest area child. If best-fit leaf L has space, insert entry and stop. Otherwise, split L into L1 and L2. Adjust entry for L in its parent so that the box now covers (only) L1. Add an entry (in the parent node of L) for L2. (This could cause the parent node to recursively split.)

Insertion in a R-Tree

Page 22: Spatial databases

Region Quad Trees

Grid Files(point data)

R-Trees

Range Queries Easily handled

Easily handled for point data.

Handled by calculating bounding box

Nearest Neighbour Queries

Can be handled.Sometimes tricky due to long diagonal jumps

Easily handled for point data.

Handled well by traversing for the point or region

Spatial Joins Can be handled with some extension to range queries

Easily handled for point data.

Handled very well

Comparison between indexing techniques wrt. queries

Page 23: Spatial databases

The Generalized Search Tree (GiST) abstracts the “tree” nature of a class of indexes including B+ trees and R-tree variants. Striking similarities in insert/delete/search and even concurrency control algorithms make it possible to provide “templates” for these algorithms that can be customized to obtain the many different tree index structures. GiST provides an alternative for implementing other tree indexes in an ORDBMS.

GiST

Page 24: Spatial databases

Typically, high-dimensional datasets are collections of points, not regions. E.g., Feature vectors in multimedia applications. Very sparse Nearest neighbor queries are common. R-tree becomes worse than sequential scan for most datasets with more than a dozen dimensions. As dimensionality increases contrast (ratio of distances between nearest and farthest points) usually decreases; “nearest neighbor” is not meaningful.

Indexing High-dimensional data

Page 25: Spatial databases

Spatial data management has many applications, including GIS, CAD/CAM, multimedia indexing, Point and region data

R-tree approach is widely used in GIS systems

Used in spatial data mining approaches. Popular SDBMS : MySQL(geometry

datatype), Neo4j, AllegroGraph, SpaceBase, CouchDB, PostGreSQL, SpatialDB

Conclusion

Page 26: Spatial databases

“Database Management Systems” by Raghu Ramakrishnan, 3rd Edition

www.techopedia.com/definition dna.fernuni-hagen.de/IntroSpatialDBMS www.geol-amu.org/notes

References

Page 27: Spatial databases

Thank you!!

Page 28: Spatial databases