CSIS 7101: Spatial Data (Part 1)
The R*-tree: An Efficient and Robust Access Method for Points and Rectangles
Rollo Chan, Chu Chung Man, Mak Wai Yip, Vivian Lee, Eric Lo, Sindy Shou, Hugh Wang

Spatial Access Method (SAM)
- Handle spatial data efficiently
  - Query
  - Build index
  - Retrieve data items from a database system quickly
  - Dynamic update
- Why not use a B-tree?
  - The B-tree is one-dimensional
  - A SAM is designed for multi-dimensional points, e.g. 2D for a map

R-tree and R*-tree
- R-tree [Guttman84]
- R*-tree [Beckmann90]
- Height-balanced tree (similar to the B-tree)
- Leaf nodes have the format <I, tuple-identifier>
  - I is the Minimum Bounding Rectangle of a spatial object
  - tuple-identifier is the id used to retrieve the spatial object in the database (name, address, etc.)

The Spatial Data
[Figure: Minimum Bounding Box]
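
A minimal sketch of how a minimum bounding box could be computed, assuming a spatial object is given as a list of 2D vertices. The rectangle layout `(xmin, ymin, xmax, ymax)` and the name `mbr_of_points` are illustrative choices, not taken from the slides.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def mbr_of_points(points: Iterable[Point]) -> Rect:
    """Return the smallest axis-aligned rectangle enclosing all points."""
    pts = list(points)
    if not pts:
        raise ValueError("need at least one point")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# A triangular object is represented in the index by one rectangle.
triangle = [(1.0, 1.0), (4.0, 2.0), (2.0, 5.0)]
print(mbr_of_points(triangle))  # (1.0, 1.0, 4.0, 5.0)
```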

R-tree and R*-tree properties
- Leaf: <I, tuple-identifier>
- Non-leaf: <I’, child-pointer>
  - I’ covers all rectangles in the child node's entries
- Parameters:
  - M (max no. of entries per node)
  - m (min no. of entries per node), m <= M/2
- Root has at least two children (unless it is a leaf)
- All leaves are on the same level
- 1 node = 1 disk page (minimizes the no. of I/Os)
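
The properties above can be sketched as a small data structure plus a checker. This is an illustration under stated assumptions: the class names `Entry` and `Node`, the helper `union`, and the function `check_node` are hypothetical, and the check covers only the fill limits and the covering-rectangle property (it does not verify that all leaves are on the same level).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


@dataclass
class Entry:
    rect: Rect                      # I in a leaf entry, I' in a non-leaf entry
    tuple_id: Optional[int] = None  # set only in leaf entries
    child: Optional["Node"] = None  # set only in non-leaf entries


@dataclass
class Node:
    is_leaf: bool
    entries: List[Entry] = field(default_factory=list)


def union(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def check_node(node: Node, M: int, m: int, is_root: bool = False) -> bool:
    """Check the fill limits and that each I' tightly covers its child's entries."""
    # A non-leaf root needs at least two children; other nodes need at least m entries.
    low = (0 if node.is_leaf else 2) if is_root else m
    if not (low <= len(node.entries) <= M):
        return False
    if node.is_leaf:
        return True
    return all(
        e.child is not None
        and e.rect == union([c.rect for c in e.child.entries])
        and check_node(e.child, M, m)
        for e in node.entries
    )


leaf_a = Node(True, [Entry((0, 0, 1, 1), tuple_id=1), Entry((2, 2, 3, 3), tuple_id=2)])
leaf_b = Node(True, [Entry((5, 5, 6, 6), tuple_id=3)])
root = Node(False, [Entry((0, 0, 3, 3), child=leaf_a), Entry((5, 5, 6, 6), child=leaf_b)])
print(check_node(root, M=3, m=1, is_root=True))  # True
```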

Outline
- Introduction
  - Motivation
  - R-tree and R*-tree structure
- Searching of R*-tree
- Construction of R*-tree
- Conclusions
- References

Searching
- May search more than one sub-tree (why?)
- To search with a query rectangle S, Search(S):
  - Start from the root
  - Find all index records that overlap S
  - If the node is not a leaf, check each entry for overlap with S; for each overlapping entry, invoke Search(subTree)
  - Else the node is a leaf: check all entries in that leaf and report those that overlap S
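
The search procedure above can be sketched as follows. The node representation (a dict with `leaf` and `entries` keys) and the function names are assumptions made for this illustration. The example also hints at the "why?": sibling bounding rectangles may overlap, so a query that falls in the overlap has to descend into both sub-trees.

```python
from typing import Any, Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Assumed node layout: {"leaf": bool, "entries": [(rect, payload), ...]}
# where payload is a tuple-identifier in a leaf and a child node otherwise.
def search(node: Dict[str, Any], s: Rect, out: Optional[List[Any]] = None) -> List[Any]:
    if out is None:
        out = []
    for rect, payload in node["entries"]:
        if not overlaps(rect, s):
            continue
        if node["leaf"]:
            out.append(payload)      # qualifying data entry
        else:
            search(payload, s, out)  # descend into every overlapping sub-tree
    return out


leaf1 = {"leaf": True, "entries": [((0, 0, 2, 2), "a"), ((3, 3, 5, 5), "b")]}
leaf2 = {"leaf": True, "entries": [((4, 4, 6, 6), "c"), ((8, 8, 9, 9), "d")]}
root = {"leaf": False, "entries": [((0, 0, 5, 5), leaf1), ((4, 4, 9, 9), leaf2)]}

# The query lies where the two directory rectangles overlap, so both leaves are visited.
print(search(root, (4.5, 4.5, 4.8, 4.8)))  # ['b', 'c']
```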

Searching examples

R*-tree: Optimization Criteria
- Minimize the area covered by an index rectangle
- Minimize overlap between bounding rectangles
  - Minimizes the number of paths to be traversed
- Minimize the margin of a directory rectangle
  - Creates less overlap, using the same amount of area
  - Allows for better, more structured clustering
- Optimize the storage utilization
  - Nodes in the tree should be filled as much as possible
- Sometimes it is impossible to optimize all of the above criteria at the same time!
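
The three geometric measures behind these criteria can be sketched as below, assuming rectangles are `(xmin, ymin, xmax, ymax)` tuples and taking the margin to be the perimeter. The function names are illustrative. The example shows why minimizing margin favors square-like rectangles: for the same area, the square has the smaller margin.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def margin(r: Rect) -> float:
    """Perimeter of the rectangle."""
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of a and b (0 if they do not intersect)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


square, strip = (0, 0, 2, 2), (0, 0, 4, 1)
print(area(square), area(strip))      # 4 4
print(margin(square), margin(strip))  # 8 10
print(overlap_area((0, 0, 2, 2), (1, 1, 3, 3)))  # 1
```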

R*-tree Insertion
- To insert a new entry, you need to choose which leaf to insert it into
- ChooseSubTree: select a leaf in which to place a new index entry E:
  - Start from the root
  - If the node is a non-leaf whose children are leaves, choose the entry using the following criteria, each one breaking ties in the previous one:
    1) Least overlap enlargement
    2) Least area enlargement
    3) Smaller area
  - If the node is a non-leaf whose children are not leaves, use criteria 2 and 3
  - Invoke ChooseSubTree recursively
  - If the node is a leaf, return it as the node to insert into
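
A sketch of the choice made at a single node during ChooseSubTree, following [Beckmann90]: overlap enlargement is consulted only when the children are leaves, otherwise area enlargement decides, with the smaller area as the final tie-break. The function `choose_entry` and its helpers are hypothetical names, and rectangles are assumed to be `(xmin, ymin, xmax, ymax)` tuples. The example is built so that the two rules pick different entries.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def union2(a: Rect, b: Rect) -> Rect:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def overlap_area(a: Rect, b: Rect) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def area_enlargement(r: Rect, new: Rect) -> float:
    return area(union2(r, new)) - area(r)


def overlap_enlargement(rects: List[Rect], i: int, new: Rect) -> float:
    """Increase in overlap with the sibling rectangles if rects[i] absorbs `new`."""
    grown = union2(rects[i], new)
    others = [r for j, r in enumerate(rects) if j != i]
    before = sum(overlap_area(rects[i], r) for r in others)
    after = sum(overlap_area(grown, r) for r in others)
    return after - before


def choose_entry(rects: List[Rect], new: Rect, children_are_leaves: bool) -> int:
    """Index of the entry to descend into; later key components break ties."""
    def key(i: int):
        k = (area_enlargement(rects[i], new), area(rects[i]))
        if children_are_leaves:
            return (overlap_enlargement(rects, i, new),) + k
        return k
    return min(range(len(rects)), key=key)


rects = [(0, 20, 8, 28), (12, 0, 14, 10), (9, 26, 30, 28)]
new = (8, 20, 10, 22)
print(choose_entry(rects, new, children_are_leaves=True))   # 1
print(choose_entry(rects, new, children_are_leaves=False))  # 0
```

In the example, entry 0 needs the least area enlargement but would start to overlap entry 2, so at the level just above the leaves entry 1 is preferred.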

Splitting Node
- What if a new entry E is to be added to a node N that is full?
  - Split the full node?
  - Reinsert entries?
- How to split?
  1. Determine the axis
  2. Distribute the entries into 2 groups along that axis
  3. The distribution may not be even!

1. Determine the axis
- (M+1) entries
- For each axis (i.e. the x and y axes), sort the entries once by lower value and once by upper value
- E.g. on the x axis, sort by lower value, then generate M-2m+2 = 3 distributions (M=3, m=1)
  - k-th distribution: [first (m-1)+k entries] [the rest]
  - E.g. 2nd distribution, (1-1)+2 = 2: [E1 E2] [E3 E4]
  - 3rd distribution, (1-1)+3 = 3: [E1 E2 E3] [E4]
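
The enumeration of candidate distributions can be sketched in a few lines. The function name `distributions` is an assumption; it takes the M+1 entries already sorted along one axis and returns the M-2m+2 ways of cutting them so that each group keeps at least m entries.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def distributions(ordered: Sequence[T], m: int) -> List[Tuple[List[T], List[T]]]:
    """k-th distribution: the first (m-1)+k entries versus the rest, k = 1 .. M-2m+2."""
    big_m = len(ordered) - 1  # the overflowing node holds M+1 entries
    return [
        (list(ordered[: m - 1 + k]), list(ordered[m - 1 + k:]))
        for k in range(1, big_m - 2 * m + 3)
    ]


for first, rest in distributions(["E1", "E2", "E3", "E4"], m=1):
    print(first, rest)
# ['E1'] ['E2', 'E3', 'E4']
# ['E1', 'E2'] ['E3', 'E4']
# ['E1', 'E2', 'E3'] ['E4']
```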

1. Determining split axis (cont.)
- Compute S, the sum of the margin-values of all (1, 2, …, M-2m+2) distributions
  - Margin-value = sum of the perimeters of the two groups' bounding rectangles
- Choose the axis with the lower S
- E.g. if S over the 6 x-axis distributions (3 sorted by lower value, 3 sorted by upper value) is less than that of the y-axis, return the x-axis as the splitting axis
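
A sketch of the split-axis choice, assuming 2D rectangles stored as `(xmin, ymin, xmax, ymax)` so that axis 0 is x and axis 1 is y. All function names are illustrative. For each axis, S sums the margin-values over the distributions of both sort orders, and the axis with the smaller S wins.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def bounding_box(rects: Sequence[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def margin(r: Rect) -> float:
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))


def distributions(ordered: Sequence[Rect], m: int) -> List[Tuple[Sequence[Rect], Sequence[Rect]]]:
    big_m = len(ordered) - 1
    return [(ordered[: m - 1 + k], ordered[m - 1 + k:]) for k in range(1, big_m - 2 * m + 3)]


def margin_sum(rects: Sequence[Rect], axis: int, m: int) -> float:
    """S for one axis: margin-values summed over both sort orders."""
    s = 0.0
    for bound in (axis, axis + 2):  # lower value first, then upper value
        ordered = sorted(rects, key=lambda r: r[bound])
        for g1, g2 in distributions(ordered, m):
            s += margin(bounding_box(g1)) + margin(bounding_box(g2))
    return s


def choose_split_axis(rects: Sequence[Rect], m: int) -> int:
    return min((0, 1), key=lambda axis: margin_sum(rects, axis, m))


# Four entries (M=3, m=1) spread out along x with little variation in y.
rects = [(0, 1, 1, 2), (10, 0, 11, 1), (20, 1, 21, 2), (30, 0, 31, 1)]
print(margin_sum(rects, 0, 1), margin_sum(rects, 1, 1))  # 304.0 456.0
print(choose_split_axis(rects, 1))  # 0
```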

2. Distribute entries along axis
- How to split?
  1. Determine the axis
  2. Distribute the entries into 2 groups along that axis
  3. The distribution may not be even!
- Along that axis, choose the distribution (out of the 3 per sort order in the example) with the minimum overlap-value
  - Overlap-value: area[rect(group1) ∩ rect(group2)]
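
A sketch of the distribution choice along the selected axis, again assuming `(xmin, ymin, xmax, ymax)` rectangles and hypothetical function names. Candidates from both sort orders are compared by overlap-value; the tie-break on the summed area of the two bounding rectangles is taken from [Beckmann90] rather than from the slide.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def bounding_box(rects: Sequence[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def overlap_area(a: Rect, b: Rect) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def distributions(ordered: List[Rect], m: int) -> List[Tuple[List[Rect], List[Rect]]]:
    big_m = len(ordered) - 1
    return [(ordered[: m - 1 + k], ordered[m - 1 + k:]) for k in range(1, big_m - 2 * m + 3)]


def choose_distribution(rects: Sequence[Rect], axis: int, m: int) -> Tuple[List[Rect], List[Rect]]:
    """Pick the split with minimum overlap-value (ties: minimum total area)."""
    candidates: List[Tuple[List[Rect], List[Rect]]] = []
    for bound in (axis, axis + 2):  # sorted by lower value, then by upper value
        ordered = sorted(rects, key=lambda r: r[bound])
        candidates.extend(distributions(ordered, m))

    def cost(dist: Tuple[List[Rect], List[Rect]]) -> Tuple[float, float]:
        b1, b2 = bounding_box(dist[0]), bounding_box(dist[1])
        return (overlap_area(b1, b2), area(b1) + area(b2))

    return min(candidates, key=cost)


a, b, c, d = (0, 0, 4, 2), (3, 0, 6, 2), (10, 0, 13, 2), (12, 0, 15, 2)
group1, group2 = choose_distribution([c, a, d, b], axis=0, m=1)
print(group1, group2)
# [(0, 0, 4, 2), (3, 0, 6, 2)] [(10, 0, 13, 2), (12, 0, 15, 2)]
```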

Who calls Split? R*-tree Insertion
Algorithm Insert: add a new entry at the specified level
1. Find appropriate node
   Invoke ChooseSubtree to find a node N in which to place the new entry E.
2. Check for space in node to insert entry
   If N has fewer than M entries, insert E. Else:
3. Split or reinsert
   Invoke OverflowTreatment.
4. Propagate changes upward
   If a split was performed, propagate it upward. If the root node was split, create a new root.
5. Adjust covering rectangles
   Adjust all rectangles in the insertion path to be minimum bounding boxes.

R*-tree Insertion (cont.)
Algorithm OverflowTreatment: determine whether to split the current node or try reinsertion.
1. Check condition
   If the level is not the root level and this is the first call of OverflowTreatment at the given level during the insertion of one data rectangle,
2. Do reinsert
   invoke ReInsert. Else:
3. Do split
   invoke Split.
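
The decision made by OverflowTreatment can be sketched as a small function. How the "first call at this level" is tracked is an assumption here: a set of levels, created empty at the start of each data-rectangle insertion, records where a reinsertion has already been tried. The names `overflow_treatment` and `reinserted_levels` are hypothetical.

```python
from typing import Set


def overflow_treatment(level: int, root_level: int, reinserted_levels: Set[int]) -> str:
    """Return "reinsert" or "split" for a node that overflowed at `level`.

    `reinserted_levels` must be a fresh empty set for each inserted data rectangle.
    """
    if level != root_level and level not in reinserted_levels:
        reinserted_levels.add(level)  # later overflows at this level will split
        return "reinsert"
    return "split"


treated: Set[int] = set()
print(overflow_treatment(0, root_level=2, reinserted_levels=treated))  # reinsert
print(overflow_treatment(0, root_level=2, reinserted_levels=treated))  # split
print(overflow_treatment(2, root_level=2, reinserted_levels=treated))  # split
```

Allowing only one reinsertion per level per insertion keeps the reinserted entries from triggering reinsertions at the same level indefinitely.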

R*-tree Insertion (cont.)
Algorithm ReInsert
1. Compute distance
   For all M+1 entries of a node N, compute the distance between the center of each entry's rectangle and the center of the bounding rectangle of N.
2. Sort entries
   Sort the entries in decreasing order of the distances computed in step 1.
3. Remove entries
   Remove the first p entries from N and adjust the bounding rectangle of N.
4. Reinsert entries
   Invoke Insert on the removed entries, in the order defined in step 2, starting with either the maximum or the minimum distance.
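
Steps 1 to 3 of ReInsert, plus the ordering used in step 4, can be sketched as below. The function `pick_for_reinsert` and the flag `close_reinsert` are hypothetical names, and rectangles are assumed to be `(xmin, ymin, xmax, ymax)` tuples. The caller would recompute the bounding rectangle of N from the kept entries and then invoke Insert on each removed entry in the returned order.

```python
import math
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)


def bounding_box(rects: Sequence[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def center(r: Rect) -> Tuple[float, float]:
    return ((r[0] + r[2]) / 2, (r[1] + r[3]) / 2)


def pick_for_reinsert(rects: Sequence[Rect], p: int,
                      close_reinsert: bool = True) -> Tuple[List[Rect], List[Rect]]:
    """Return (kept, removed); `removed` is in the order it should be reinserted."""
    cx, cy = center(bounding_box(rects))

    def dist(r: Rect) -> float:  # step 1: distance between centers
        x, y = center(r)
        return math.hypot(x - cx, y - cy)

    by_distance = sorted(rects, key=dist, reverse=True)  # step 2: decreasing distance
    removed, kept = by_distance[:p], by_distance[p:]     # step 3: remove the first p
    if close_reinsert:
        removed.reverse()  # step 4 variant: start with the minimum distance
    return kept, removed


entries = [(0, 0, 2, 2), (4, 4, 6, 6), (5, 3, 7, 5), (7, 7, 10, 10)]
kept, removed = pick_for_reinsert(entries, p=2)
print(kept)     # [(5, 3, 7, 5), (4, 4, 6, 6)]
print(removed)  # [(7, 7, 10, 10), (0, 0, 2, 2)]
```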

R*-tree Split Example
[Figure: two panels, "R-tree, Quadratic Split, m = 40%" and "R*-tree, m = 40%"]

R*-tree: Forced Reinsert
- When an R*-tree node overflows, instead of splitting it immediately, try to see whether some of its entries could fit better in another node
  - Splitting only contributes to a local reorganization of the directory rectangles
- Reinsertion slightly increases the construction time, BUT the resulting lower overlap improves query response time
- Removing 30% of the entries (the parameter p) yields the best performance

Performance Comparison
- Using forced reinsert increases storage efficiency, decreases overlap, causes fewer splits, and makes rectangles more quadratic (square).
- CPU cost is higher with forced reinsert, but due to fewer splits, the increase in disk accesses for insertions is only 4% (it remains the lowest of all R-tree variants)!

Conclusions
- The R*-tree performs significantly better than the other R-tree variants.
- It is the most robust of the trees: it requires fewer disk accesses
- Gain is higher for smaller query rectangles, because storage utilization is more important for larger query rectangles
- 400% gain over Linear split, 180% gain over Quadratic split in the R-tree
- The best storage utilization
- Even with forced reinsertion, insertion cost is decreased, due to fewer splits
- Spatial join has the highest gain

References
- [Guttman84] Guttman, A., "R-Trees: A Dynamic Index Structure for Spatial Searching", Proceedings, ACM SIGMOD, pp. 47-57, June 1984.
- [Beckmann90] Beckmann, N., Kriegel, H.P., Schneider, R., Seeger, B., "The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles", Proceedings, ACM SIGMOD International Conference on Management of Data, May 23-25, 1990.