Graduate Course Spatial Data
Page 1KUT
Graduate Course: Spatial Data
Korea University of Technology (한국기술대학교), Jun-Ki Min (민준기)
Page 2KUT
Spatial Data
• Traditional data
  – Single dimension
  – Value, text
• New applications
  – GIS
  – CAD
  – LBS
  – Multimedia data
  – Multi-dimensional data
Page 3KUT
Spatial Access Method(SAM)
• Supports efficient access to spatial data
• B-Tree
  – Handles only one-dimensional data
  – Not appropriate for multi-dimensional data
• One well-known spatial index
  – R-Tree
Page 4KUT
R-Trees : A Dynamic Index Structure for Spatial Searching
• R-Tree
  – A height-balanced tree with index records in its leaf nodes containing pointers to data objects.
  – Dynamic structure: inserts and deletes can be intermixed with searches, and no periodic reorganization is required.
Page 5KUT
R-Trees : A Dynamic Index Structure for Spatial Searching
• R-Tree
  – Exact spatial geometry is difficult to handle directly
  – Based on the MBR (minimum bounding rectangle) approximation
[Figure: example R-tree with root R1, MBRs A1 and A2, and data objects a1 to a4]
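A minimal sketch of the MBR approximation, assuming 2-D objects given as lists of (x, y) vertices. The names `mbr` and `mbrs_intersect` are illustrative and do not come from the slides. It also shows why the MBR test is only a filter: two MBRs can intersect even though the objects themselves do not.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def mbr(points: List[Point]) -> Rect:
    """Smallest axis-aligned rectangle that contains every vertex."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

def mbrs_intersect(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# The triangle lies in the region x + y <= 4, the square in x, y >= 3,
# so the shapes are disjoint, yet their MBRs overlap.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
square = [(3.0, 3.0), (5.0, 3.0), (5.0, 5.0), (3.0, 5.0)]

print(mbr(triangle), mbr(square))
print(mbrs_intersect(mbr(triangle), mbr(square)))
```

An index built on MBRs therefore returns candidates, which a later step has to verify against the exact geometry.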
Page 6KUT
R-Tree Structure
• Node = (E_1, ..., E_M)
• E_i = (I, pointer), where I = (I_0, ..., I_{d-1}), d is the number of dimensions, and I_j = [a, b] is the extent of the rectangle along dimension j
• Let M be the maximum number of entries in a node, and m <= M/2 be the minimum number of entries in a node
Page 7KUT
Properties of R-Tree
• Every leaf node contains between m and M index records unless it is the root.
• For each index record (I, pointer) in a leaf node, I is the smallest rectangle that spatially contains the n-dimensional data object represented by the indicated tuple.
• Every non-leaf node has between m and M children unless it is the root.
• For each entry (I, pointer) in a non-leaf node, I is the smallest rectangle that spatially contains the rectangles in the child node.
• The root node has at least two children unless it is a leaf.
• All leaves appear on the same level.
Page 8KUT
Property of R-Tree
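• The height of an R-Tree containing N index records is at most $\lceil \log_m N \rceil - 1$
  – The maximum number of nodes is $\lceil N/m \rceil + \lceil N/m^2 \rceil + \dots + 1$, where the first term $\lceil N/m \rceil$ is the number of leaf nodes
  – Worst-case space utilization for all nodes except the root node is m/M.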
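The bounds above can be evaluated with integer arithmetic. This is a small sketch, reading the square brackets on the slide as ceilings; the function names `max_height` and `max_nodes` are illustrative.

```python
def max_height(n: int, m: int) -> int:
    """ceil(log_m n) - 1, computed without floating point."""
    k, power = 0, 1
    while power < n:          # smallest k with m**k >= n
        power *= m
        k += 1
    return max(k - 1, 0)

def max_nodes(n: int, m: int) -> int:
    """ceil(n/m) + ceil(n/m^2) + ... + 1 (the sum stops at the root)."""
    total, power = 0, m
    while True:
        term = -(-n // power)  # ceiling division
        total += term
        if term <= 1:
            return total
        power *= m

# Example: N = 1000 index records, minimum fill m = 10.
print(max_height(1000, 10))   # 2
print(max_nodes(1000, 10))    # 100 + 10 + 1 = 111
```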
Page 9KUT
R-Tree Search
• Because MBRs may overlap, a search may have to visit many index nodes.
Search(MBR)
if (leaf node) {
    report all entries in this node that overlap MBR
} else {
    for each child node nx whose rectangle overlaps MBR
        nx.Search(MBR)
}
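The Search pseudocode could be made runnable roughly as follows. This is a sketch under assumptions: rectangles are 2-D tuples, and the `Node` class with its `entries` list of (rectangle, reference) pairs is a hypothetical layout chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def overlaps(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Node:
    is_leaf: bool
    # Each entry is (MBR, ref): ref is an object id in a leaf,
    # or a child Node in a non-leaf node.
    entries: List[Tuple[Rect, Any]] = field(default_factory=list)

def search(node: Node, window: Rect) -> List[Any]:
    """Return the ids of all leaf entries whose MBR overlaps the window."""
    hits = []
    for rect, ref in node.entries:
        if overlaps(rect, window):
            if node.is_leaf:
                hits.append(ref)
            else:
                hits.extend(search(ref, window))  # descend into the child
    return hits

leaf1 = Node(True, [((0, 0, 1, 1), "a1"), ((2, 2, 3, 3), "a2")])
leaf2 = Node(True, [((5, 5, 6, 6), "a3"), ((7, 7, 8, 8), "a4")])
root = Node(False, [((0, 0, 3, 3), leaf1), ((5, 5, 8, 8), leaf2)])

print(search(root, (0.5, 0.5, 2.5, 2.5)))  # ['a1', 'a2']
```

Every child whose rectangle overlaps the window is followed, so a query may descend several subtrees when MBRs overlap.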
Page 10KUT
R-Tree Insertion
• Algorithm Insert(newMBR)
  – Find the position for the new record
    • Call ChooseLeaf to select a leaf node
  – Add the record to the leaf node
    • If the leaf is full, call SplitNode
  – Propagate changes upward
    • AdjustTree
  – Grow the tree taller (if the root was split, create a new root)
Page 11KUT
R-Tree Insert
• Algorithm ChooseLeaf
CL1 Set N to be the root.
CL2 If N is a leaf, return N.
    Otherwise, choose the entry in N whose rectangle needs the least area enlargement to include the new data. Resolve ties by choosing the entry with the smallest rectangle.
CL3 Set N to be the child node pointed to by the child pointer of the chosen entry.
CL4 Repeat from CL2.
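Steps CL1 to CL4 can be sketched as a short loop. The `Node` layout and the helper names `area`, `union` and `choose_leaf` are assumptions made for this example, with 2-D rectangles as tuples.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

@dataclass
class Node:
    is_leaf: bool
    entries: List[Tuple[Rect, Any]] = field(default_factory=list)

def choose_leaf(root: Node, new_rect: Rect) -> Node:
    node = root                                   # CL1
    while not node.is_leaf:                       # CL2
        # Least enlargement first, smallest area as the tie-breaker.
        _, node = min(
            node.entries,
            key=lambda e: (area(union(e[0], new_rect)) - area(e[0]), area(e[0])),
        )                                         # CL3, then repeat (CL4)
    return node

leaf1 = Node(True, [((0, 0, 1, 1), "a1"), ((2, 2, 3, 3), "a2")])
leaf2 = Node(True, [((5, 5, 6, 6), "a3"), ((7, 7, 8, 8), "a4")])
root = Node(False, [((0, 0, 3, 3), leaf1), ((5, 5, 8, 8), leaf2)])

# (4, 4, 4.5, 4.5) enlarges the first MBR by 11.25 and the second by 7.
print(choose_leaf(root, (4, 4, 4.5, 4.5)) is leaf2)  # True
```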
Page 12KUT
R-Tree Insert
• If there is no room, invoke SplitNode
  – Split the entries so that the resulting MBRs are as small as possible
    • Optimal SplitNode -> try every way of dividing the M+1 entries into two subsets -> $O(2^{M-1})$
[Figure: a bad split compared with a good split]
Page 13KUT
R-Tree Insert
• Approximation (see details)
  – Quadratic ($O(M^2)$)
  – Linear
    • Select the two entries that are farthest apart
    • Insert the remaining entries into the two groups
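The linear approximation could be sketched as below. This is a simplified illustration loosely following the linear-cost split of Guttman's R-tree paper, not a faithful reproduction: seeds are the pair with the greatest separation normalized by the extent of the whole set, and each remaining entry goes to the group whose MBR grows least. The names `linear_pick_seeds`, `linear_split` and the `min_fill` parameter (standing for m) are assumptions.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def union(a: Rect, b: Rect) -> Rect:
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def linear_pick_seeds(rects: List[Rect]) -> Tuple[int, int]:
    """Pick the two entries that are farthest apart along some dimension."""
    best = None
    for lo, hi in ((0, 2), (1, 3)):               # x, then y
        high_low = max(range(len(rects)), key=lambda i: rects[i][lo])
        low_high = min(range(len(rects)), key=lambda i: rects[i][hi])
        width = max(r[hi] for r in rects) - min(r[lo] for r in rects)
        if high_low == low_high or width == 0:
            continue
        sep = (rects[high_low][lo] - rects[low_high][hi]) / width
        if best is None or sep > best[0]:
            best = (sep, low_high, high_low)
    return (best[1], best[2]) if best else (0, 1)

def linear_split(rects: List[Rect], min_fill: int) -> Tuple[List[int], List[int]]:
    """Split entry indices into two groups, each with at least min_fill entries."""
    a, b = linear_pick_seeds(rects)
    groups: Tuple[List[int], List[int]] = ([a], [b])
    mbrs = [rects[a], rects[b]]
    rest = [i for i in range(len(rects)) if i not in (a, b)]
    for pos, i in enumerate(rest):
        remaining = len(rest) - pos
        if len(groups[0]) + remaining <= min_fill:
            g = 0                                  # group 0 needs all that is left
        elif len(groups[1]) + remaining <= min_fill:
            g = 1                                  # group 1 needs all that is left
        else:
            grow = [area(union(m, rects[i])) - area(m) for m in mbrs]
            g = 0 if grow[0] <= grow[1] else 1
        groups[g].append(i)
        mbrs[g] = union(mbrs[g], rects[i])
    return groups

rects = [(0, 0, 1, 1), (0.5, 0.5, 1.5, 1.5), (9, 0, 10, 1), (8.5, 0.2, 9.5, 1.2)]
print(linear_split(rects, min_fill=2))  # ([0, 1], [2, 3])
```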
Page 14KUT
R-Tree Insertion
• Adjust covering rectangles and propagate node splits as necessary
• Ascend from leaf node L to the root
AdjustTree Algorithm
• [Initialize] N = L
• [Check if done] If N is the root, stop
• [Adjust covering rectangle in parent entry]
  – Let P be the parent of N, and E_N be N's entry in P
  – Modify the MBR of E_N to enclose all MBRs in N
• [Propagate node split upward]
  – If N has a partner NN resulting from an earlier split, create a new entry E_NN and add E_NN to P
  – If P has no room, invoke SplitNode
• [Move up to next node]
  – Set N = P and NN = PP (the second node, if P was split), then go to step 2 ([Check if done])
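The covering-rectangle part of AdjustTree can be sketched on its own. This example deliberately omits split propagation (the NN and PP handling), and the `Node` fields `parent` and `slot` are hypothetical bookkeeping introduced for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def union_all(rects: List[Rect]) -> Rect:
    """Smallest rectangle enclosing every rectangle in the list."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

@dataclass
class Node:
    rects: List[Rect] = field(default_factory=list)  # MBR of each entry
    parent: Optional["Node"] = None
    slot: int = -1            # index of this node's entry in parent.rects

def adjust_covering_rectangles(leaf: Node) -> None:
    n = leaf                                   # [Initialize]
    while n.parent is not None:                # [Check if done]
        p = n.parent
        p.rects[n.slot] = union_all(n.rects)   # [Adjust covering rectangle]
        n = p                                  # [Move up to next node]

root = Node(rects=[(0, 0, 1, 1)])
mid = Node(rects=[(0, 0, 1, 1)], parent=root, slot=0)
leaf = Node(rects=[(0, 0, 1, 1)], parent=mid, slot=0)

leaf.rects.append((2, 2, 3, 3))                # a new record was added
adjust_covering_rectangles(leaf)
print(mid.rects[0], root.rects[0])             # both become (0, 0, 3, 3)
```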
Page 15KUT
Processing and Optimization of Multiway Spatial Joins Using R-trees
• Cost-based query optimizer
  – Join selectivity
    • Probability that a tuple is in the result
  – Used to generate the most efficient query execution plan
• Spatial join selectivity
  – Multi-dimensional attributes
    • Commonly 2-dimensional
• This work focuses on computing the cost of the filter step (i.e., it considers only MBRs)
Page 16KUT
Previous Work
• Assumptions
  – Work space is $[0,1)^d$
    • d-dimensional work space
    • Data is uniformly distributed
    • Each dimension is independent
Page 17KUT
Previous Work
• Window query
  – Find all points that lie inside window q
  – $S(q) = |q_i|^d$
    where $|q_i|$ = size of q along dimension i
[Figure: window q with extents q_x and q_y]
Page 18KUT
Previous Work
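• 2-way join query
  – Find the pairs of Ra and Rb that intersect
  – $S(R_a, R_b) = (|S_a| + |S_b|)^d$
    (where $|S_i|$ = average size of $R_i$ on one dimension, d = dimension)
[Figure: two intersecting rectangles, with combined extents $(|S_{a,x}| + |S_{b,x}|)$ and $(|S_{a,y}| + |S_{b,y}|)$]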
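The intuition for the 2-way formula: along one dimension, two intervals of lengths |Sa| and |Sb| intersect exactly when their centers are at most (|Sa| + |Sb|)/2 apart, so one center must fall in a range of length |Sa| + |Sb|. With independent dimensions this is raised to the power d. A small sketch, where the function name and the example sizes are made up for illustration:

```python
import math

def join_selectivity(size_a: float, size_b: float, d: int = 2) -> float:
    """(|Sa| + |Sb|)^d for average side lengths in the unit work space."""
    return (size_a + size_b) ** d

# Hypothetical data sets: average sides of 0.01 and 0.02 in 2-D.
s = join_selectivity(0.01, 0.02, d=2)
print(math.isclose(s, 0.0009))  # True
```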
Page 19KUT
Previous Work
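• M-way linear queries (acyclic queries)
  – Ra intersects Rb, and Rb intersects Rc
    $S(R_a, R_b, R_c) = (|S_a| + |S_b|)^d \, (|S_b| + |S_c|)^d$
  – Generalization
    $\prod_{\forall i,j:\, Q(i,j) = \text{TRUE}} (|S_i| + |S_j|)^d$
[Figure: a chain of three rectangles with sides |Sa|, |Sb| and |Sc|]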
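The generalization is a product over the edges of the query graph. The sketch below assumes the graph is given as a list of (i, j) pairs for which Q(i, j) is TRUE; the function name and the sizes are illustrative only.

```python
import math
from typing import Dict, List, Tuple

def acyclic_selectivity(sizes: Dict[str, float],
                        edges: List[Tuple[str, str]],
                        d: int = 2) -> float:
    """Product of (|Si| + |Sj|)^d over every join condition Q(i, j)."""
    result = 1.0
    for i, j in edges:
        result *= (sizes[i] + sizes[j]) ** d
    return result

# Chain query Ra - Rb - Rc with made-up average side lengths.
sizes = {"a": 0.1, "b": 0.2, "c": 0.3}
chain = [("a", "b"), ("b", "c")]
print(math.isclose(acyclic_selectivity(sizes, chain, d=2), 0.0225))  # True
```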
Page 20KUT
Previous Work
• M-way clique join query (M ≥ 3)
  – Papadias, Mamoulis, Theodoridis (ACM PODS 99)
  – Clique: if a set of rectangles mutually intersect, then they must share a common area
[Figure: query graph over R1, R2, R3 and the corresponding spatial relationship of rectangles S1, S2, S3]
Page 21KUT
Previous Work
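– Common area ($q_n$)
  $|q_n| = \dfrac{\prod_{i=1}^{n} |S_i|}{\sum_{i=1}^{n} \prod_{j=1,\, j \neq i}^{n} |S_j|}$
– Proof (by induction), case n = 2:
  $|q_2| = \dfrac{|s_1|\,|s_2|}{|s_1| + |s_2|}$
[Figure: the three relative positions of s1 and s2 along one dimension]
  Probability: $\dfrac{|s_1|}{|s_1| + |s_2|}$, $\dfrac{|s_2| - |s_1|}{|s_1| + |s_2|}$, $\dfrac{|s_1|}{|s_1| + |s_2|}$
  Representative value: $\dfrac{|s_1|}{2}$, $|s_1|$, $\dfrac{|s_1|}{2}$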
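The n = 2 case can be checked numerically by weighting each representative value with its probability, and the general formula can be compared against folding one more rectangle into the common area at a time. This is a sketch using exact fractions; it assumes |s1| <= |s2|, and `common_side` is an illustrative name.

```python
from fractions import Fraction
from math import prod
from typing import List

def common_side(sides: List[Fraction]) -> Fraction:
    """|q_n| = prod(|S_i|) / sum over i of prod(|S_j| for j != i)."""
    numerator = prod(sides)
    denominator = sum(
        prod(s for j, s in enumerate(sides) if j != i)
        for i in range(len(sides))
    )
    return numerator / denominator

s1, s2, s3 = Fraction(1, 4), Fraction(1, 2), Fraction(1, 8)

# Case n = 2: expected overlap length from the three cases.
total = s1 + s2
probabilities = [s1 / total, (s2 - s1) / total, s1 / total]
values = [s1 / 2, s1, s1 / 2]
expected = sum(p * v for p, v in zip(probabilities, values))

print(sum(probabilities))                  # 1
print(expected, common_side([s1, s2]))     # 1/6 1/6

# Folding s3 into q_2 gives the same result as the n = 3 formula.
q2 = common_side([s1, s2])
print(common_side([q2, s3]) == common_side([s1, s2, s3]))  # True
```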
Page 22KUT
Previous Work
– Selectivity of M-way clique join query
  Prob(s2 intersects s1) * Prob(s3 intersects s1 ∧ s3 intersects s2 | s1, s2 mutually intersect)
  = Prob(s2 intersects s1) * Prob(s3 intersects the common intersection area of s1, s2)
  $= (|s_1| + |s_2|)^d \left( |s_3| + \dfrac{|s_1|\,|s_2|}{|s_1| + |s_2|} \right)^d = \left( |s_1||s_2| + |s_1||s_3| + |s_2||s_3| \right)^d$
– General case:
  $\left( \sum_{i=1}^{n} \prod_{j=1,\, j \neq i}^{n} |S_j| \right)^d$
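The general formula can be compared with the step-by-step product for three rectangles. This is a sketch with exact fractions; the side lengths are arbitrary example values and `clique_selectivity` is an illustrative name.

```python
from fractions import Fraction
from math import prod
from typing import List

def clique_selectivity(sides: List[Fraction], d: int) -> Fraction:
    """(sum over i of prod(|S_j| for j != i)) ** d."""
    total = sum(
        prod(s for j, s in enumerate(sides) if j != i)
        for i in range(len(sides))
    )
    return total ** d

s1, s2, s3 = Fraction(1, 4), Fraction(1, 2), Fraction(1, 8)
d = 2

# Step by step: s2 meets s1, then s3 meets their common area q2.
q2 = s1 * s2 / (s1 + s2)
chained = (s1 + s2) ** d * (s3 + q2) ** d

print(chained, clique_selectivity([s1, s2, s3], d))  # 49/1024 49/1024
```

For n = 2 the same function reduces to the 2-way join selectivity (|s1| + |s2|)^d.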