CSIS 7101: Spatial Data (Part 3)
Distance Browsing in Spatial Database
GÍSLI R. HJALTASON and HANAN SAMET
Rollo Chan, Chu Chung Man, Mak Wai Yip, Vivian Lee, Eric Lo, Sindy Shou, Hugh Wang
What is Distance Browsing?
- Browsing through the database on the basis of distances from an arbitrary spatial query object
  - Ranking data objects in order of their distance from a given query object
  - E.g. find the nearest person to me who is sleeping
- 2 different techniques:
  - k-nearest neighbor algorithm (k-NN)
  - Incremental nearest neighbor algorithm (INN)
- A collection of spatial objects stored in an R-tree spatial data structure
Before All of Them
Requirement: Consistency
Definition: Let $d$ be the combination of the functions $d_o$ and $d_n$, and let $e \sqsubset N$ denote the fact that item $e$ is contained in exactly the set of nodes $N$. The functions $d_o$ and $d_n$ are consistent iff for any query object $q$ and any object or node $e$ in the hierarchical data structure there exists $n \in N$, where $e \sqsubset N$, such that $d(q, n) \le d(q, e)$.
[Figure: the circle around query object q depicts the search region after reporting o as the next nearest object.]
Example
[Figure: line segments a to i and query point q, enclosed by R-tree bounding rectangles R0 to R6]
R-tree structure:
- R0: R1, R2
- R1: R3, R4
- R2: R5, R6
- R3: a, b
- R4: d, g, h
- R5: c, i
- R6: e, f
Find the THREE nearest neighbors to query point q in the R-tree given.
k-Nearest Neighbor Search
Incremental Nearest Neighbor Search
k-Nearest Neighbor Search
- Applicable only when k is fixed in advance
- Maintains a global list of the k candidate nearest neighbors while traversing the tree depth-first
- Makes only local decisions
  - The next node to visit must be a child of the current node
- Makes use of the nearest list
  - Candidates are compared with the max. distance in the list
Pruning Strategies
- Strategy 1: prune an entry whose bounding rectangle r1 is such that MINDIST(q, r1) > MINMAXDIST(q, r2), where r2 is some other bounding rectangle
- Strategy 2: prune an object o when DIST(q, o) > MINMAXDIST(q, r), where r is some bounding rectangle
[Figure: two panels showing query point q, bounding rectangle r, and objects a, b, o; one panel illustrates MINDIST (optimistic), the other MINMAXDIST (pessimistic)]
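The two bounds can be sketched for a point query and an axis-aligned rectangle as follows. This is a minimal illustration, not the presenters' code: the function names, the corner-pair representation of a rectangle, and the sample coordinates are assumptions, and the MINMAXDIST guarantee only holds when the rectangle is a minimum bounding rectangle (every face touches some object).

```python
import math

def mindist(q, lo, hi):
    """Optimistic bound: distance from point q to the closest point of the box."""
    return math.sqrt(sum(max(l - x, 0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

def minmaxdist(q, lo, hi):
    """Pessimistic bound: some object inside the box is at most this far from q."""
    # Per axis: the coordinate of the nearer face and of the farther face.
    near = [l if x <= (l + h) / 2 else h for x, l, h in zip(q, lo, hi)]
    far = [l if x >= (l + h) / 2 else h for x, l, h in zip(q, lo, hi)]
    all_far = sum((x - f) ** 2 for x, f in zip(q, far))
    # Swap one axis at a time from its far face to its near face; keep the minimum.
    best = min(all_far - (x - f) ** 2 + (x - n) ** 2
               for x, n, f in zip(q, near, far))
    return math.sqrt(best)

q = (0, 0)
box = ((3, 4), (6, 8))  # hypothetical rectangle: lower-left and upper-right corners
print(mindist(q, *box))     # 5.0
print(minmaxdist(q, *box))  # about 7.21 (the square root of 52)
```

Strategy 1 then amounts to discarding r1 whenever `mindist(q, *r1) > minmaxdist(q, *r2)` for some other rectangle r2.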
Pruning Strategies (con’t)
- Strategies 1 & 2 are useful only when k = 1
- Strategy 3: prune any node whose bounding rectangle r is such that MINDIST(q, r) > NearestList.MaxDist
- MINDIST() alone is sufficient for pruning
Example – k-NN (k = 3)
Distance from q to each segment (Seg. Dist.) and to its bounding rectangle (BR Dist.):

| Segment | a | b | c | d | e | f | g | h | i |
|---|---|---|---|---|---|---|---|---|---|
| Seg. Dist. | 17 | 48 | 57 | 59 | 48 | 86 | 81 | 17 | 21 |
| BR Dist. | 13 | 27 | 53 | 30 | 45 | 74 | 74 | 17 | 0 |

Distance from q to each node's bounding rectangle:

| Node | R0 | R1 | R2 | R3 | R4 | R5 | R6 |
|---|---|---|---|---|---|---|---|
| BR Dist. | 0 | 0 | 0 | 13 | 11 | 0 | 44 |

- Nearest List: empty, Max Dist. = ∞
- Descending through R1 reaches the leaves R4 (d, g, h) and R3 (a, b)
Example – k-NN (k = 3)
- Visit R4: Nearest List = d (59), g (81), h (17); Max Dist. = 81
- Visit R3: a (17) replaces g, Max Dist. = 59; b (48) replaces d, Max Dist. = 48
- Visit R5: i (21) replaces b, Max Dist. = 21
- Result: a (17), h (17), i (21)
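The k-NN walkthrough can be replayed with a short sketch. It is an illustration under stated assumptions rather than the presenters' implementation: the tree and all distances are copied from the example tables instead of being computed from geometry, only Strategy 3 is applied, and the names `TREE`, `NODE_MINDIST`, `SEG_DIST` and `knn` are made up for this example.

```python
import heapq

# Example R-tree from the slides: each node maps to its children.
TREE = {"R0": ["R1", "R2"], "R1": ["R3", "R4"], "R2": ["R5", "R6"],
        "R3": ["a", "b"], "R4": ["d", "g", "h"], "R5": ["c", "i"], "R6": ["e", "f"]}
# Distances from q, copied from the example tables (not computed from geometry).
NODE_MINDIST = {"R0": 0, "R1": 0, "R2": 0, "R3": 13, "R4": 11, "R5": 0, "R6": 44}
SEG_DIST = {"a": 17, "b": 48, "c": 57, "d": 59, "e": 48,
            "f": 86, "g": 81, "h": 17, "i": 21}

def knn(k, root="R0"):
    nearest = []   # max-heap via negated distances: entries are (-dist, name)
    visited = []   # order in which nodes are read

    def max_dist():
        # NearestList.MaxDist; infinite until the list holds k candidates.
        return -nearest[0][0] if len(nearest) == k else float("inf")

    def visit(node):
        visited.append(node)
        children = TREE[node]
        if children[0] in SEG_DIST:            # leaf node: children are objects
            for obj in children:
                d = SEG_DIST[obj]
                if d < max_dist():
                    if len(nearest) == k:
                        heapq.heapreplace(nearest, (-d, obj))  # evict the farthest
                    else:
                        heapq.heappush(nearest, (-d, obj))
        else:                                  # inner node: nearest child first
            for child in sorted(children, key=NODE_MINDIST.get):
                if NODE_MINDIST[child] <= max_dist():          # Strategy 3
                    visit(child)

    visit(root)
    return sorted((-nd, name) for nd, name in nearest), visited

result, visited = knn(3)
print(result)   # [(17, 'a'), (17, 'h'), (21, 'i')]
print(visited)  # ['R0', 'R1', 'R4', 'R3', 'R2', 'R5']
```

R6 is never read because its MINDIST of 44 exceeds the final Max Dist. of 21, while d, g and b are examined and later evicted.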
Problems with k-NN
- Nodes/objects are not visited in order of distance.
- Non-optimal objects may be accessed and then have to be pruned.
- k must be known in advance, which makes it difficult to combine with other predicates.
Incremental Nearest Neighbor Search
- Top-down tree traversal
  - Depth-first traversal
  - Breadth-first traversal
Incremental Nearest Neighbor Search
- INN uses best-first traversal
  - Pick the node with the least distance among all nodes that have yet to be visited
- Uses a priority queue
  - Distance from the query object is the key
- Makes global decisions (k-NN makes local decisions)
  - Based on the priority queue
  - Chooses among the child nodes of all visited nodes
Example – INN
- Priority Queue: R0 (0)
Example – INN
- Dequeue R0 (0); enqueue R1 (0), R2 (0)
- Priority Queue: R1 (0), R2 (0)
Example – INN
- Dequeue R1 (0); enqueue R3 (13), R4 (11)
- Priority Queue: R2 (0), R4 (11), R3 (13)
Example – INN
- Dequeue R2 (0); enqueue R5 (0), R6 (44)
- Priority Queue: R5 (0), R4 (11), R3 (13), R6 (44)
Example – INN
- Dequeue R5 (0); enqueue [c] (53), [i] (0)
- Priority Queue: [i] (0), R4 (11), R3 (13), R6 (44), [c] (53)
- Notation: [x] is the bounding rectangle of segment x, keyed by its BR Dist.; x alone is the segment itself, keyed by its Seg. Dist.
Example – INN
- Dequeue [i] (0); the actual distance of i is 21, which exceeds the key of the next entry R4 (11), so i (21) is put back in the queue
- Priority Queue: R4 (11), R3 (13), i (21), R6 (44), [c] (53)
Example – INN
- Dequeue R4 (11); enqueue [h] (17), [g] (74), [d] (30)
- Priority Queue: R3 (13), [h] (17), i (21), [d] (30), R6 (44), [c] (53), [g] (74)
Example – INN
- Dequeue R3 (13); enqueue [b] (27), [a] (13)
- Priority Queue: [a] (13), [h] (17), i (21), [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)
Example – INN
- Dequeue [a] (13); the actual distance of a is 17, which does not exceed the key of the next entry [h] (17), so a is reported
- Priority Queue: [h] (17), i (21), [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)
- Output: a (17)
Example – INN
- Dequeue [h] (17); report h (17)
- Priority Queue: i (21), [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)
- Output: a (17), h (17)
Example – INN
- Dequeue i (21); report i (21)
- Priority Queue: [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)
- Output: a (17), h (17), i (21)
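The INN walkthrough can be replayed with a generator built on a binary heap. This is a sketch under assumptions, not the authors' code: the tree and distances are copied from the example tables, ties are broken by insertion order (a choice made here for determinism), and the names `TREE`, `NODE_DIST`, `BR_DIST`, `SEG_DIST` and `inn` are invented for the illustration.

```python
import heapq
from itertools import islice

# Example R-tree and distances from q, copied from the example tables.
TREE = {"R0": ["R1", "R2"], "R1": ["R3", "R4"], "R2": ["R5", "R6"],
        "R3": ["a", "b"], "R4": ["d", "g", "h"], "R5": ["c", "i"], "R6": ["e", "f"]}
NODE_DIST = {"R0": 0, "R1": 0, "R2": 0, "R3": 13, "R4": 11, "R5": 0, "R6": 44}
BR_DIST = {"a": 13, "b": 27, "c": 53, "d": 30, "e": 45,
           "f": 74, "g": 74, "h": 17, "i": 0}
SEG_DIST = {"a": 17, "b": 48, "c": 57, "d": 59, "e": 48,
            "f": 86, "g": 81, "h": 17, "i": 21}

def inn(root="R0"):
    """Yield (segment, distance) pairs in non-decreasing order of distance."""
    seq = 0  # insertion counter; breaks ties between equal keys
    queue = [(NODE_DIST[root], seq, "node", root)]
    while queue:
        key, _, kind, name = heapq.heappop(queue)
        if kind == "object":                 # exact distance already known
            yield name, key
        elif kind == "box":                  # bounding rectangle [x] of a segment
            d = SEG_DIST[name]
            if queue and d > queue[0][0]:    # something else may still be closer
                seq += 1
                heapq.heappush(queue, (d, seq, "object", name))
            else:
                yield name, d
        else:                                # R-tree node: enqueue its children
            for child in TREE[name]:
                seq += 1
                if child in TREE:
                    heapq.heappush(queue, (NODE_DIST[child], seq, "node", child))
                else:
                    heapq.heappush(queue, (BR_DIST[child], seq, "box", child))

print(list(islice(inn(), 3)))  # [('a', 17), ('h', 17), ('i', 21)]
```

Because the generator is lazy, asking for a fourth neighbor simply resumes from the current queue instead of restarting the search.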
Variants
- Find Farthest Object:
  - Queue sorted in descending order of distance
  - Replace <= by >=
- Min and Max Distance:
  - E.g. find all cities between 100 and 200 miles from Hong Kong
  - Prune unqualified nodes
- Solve the Traditional k-NN Problem
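The variants above can be layered on any incremental distance stream. In the sketch below, `browse` is only a stand-in that streams the example's segment distances from a heap; it is not an R-tree search, and the distance-range variant filters the stream rather than pruning nodes as the slide suggests. All names are assumptions made for the illustration.

```python
import heapq
from itertools import dropwhile, islice, takewhile

# Segment distances from the example, used as a stand-in data set.
SEG_DIST = {"a": 17, "b": 48, "c": 57, "d": 59, "e": 48,
            "f": 86, "g": 81, "h": 17, "i": 21}

def browse(dist, farthest=False):
    """Stand-in incremental stream: yields (name, distance) one at a time."""
    sign = -1 if farthest else 1          # negate keys for descending order
    heap = [(sign * d, name) for name, d in dist.items()]
    heapq.heapify(heap)
    while heap:
        key, name = heapq.heappop(heap)
        yield name, sign * key

def k_nearest(dist, k):
    # Traditional k-NN: take the first k results and stop.
    return list(islice(browse(dist), k))

def within(dist, dmin, dmax):
    # Min and max distance: skip results below dmin, stop once past dmax.
    in_range = dropwhile(lambda r: r[1] < dmin, browse(dist))
    return list(takewhile(lambda r: r[1] <= dmax, in_range))

print(k_nearest(SEG_DIST, 3))                # [('a', 17), ('h', 17), ('i', 21)]
print(within(SEG_DIST, 40, 60))              # [('b', 48), ('e', 48), ('c', 57), ('d', 59)]
print(next(browse(SEG_DIST, farthest=True))) # ('f', 86)
```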
Priority Queue
- Plays a key role in performance
- In 2 dimensions:
  - The worst case is unlikely to arise in practice
  - Expected number of points in the queue = $O(\sqrt{k})$
  - The queue usually fits in memory
- In higher dimensions:
  - The higher the dimension, the larger the queue size
Priority Queue (con’t)
- Idea: the priority queue is split into three tiers
  - The first tier is kept in memory, the 2nd and 3rd in a disk file
  - The tiers cover a set of distance ranges: the first tier stores the nearest range, the 3rd tier the farthest
  - When the 1st tier is exhausted, move elements from the 2nd tier
  - When the 2nd tier is exhausted, scan the elements and rebuild the 1st and 2nd tiers with new ranges
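The tiering idea can be sketched in memory. This is a loose illustration only: plain lists stand in for the disk file, the fixed `width` used to choose ranges is a hypothetical policy (the slides do not say how ranges are picked), and the class name `TieredQueue` is invented here.

```python
import heapq

class TieredQueue:
    """Tier 1 is a heap in memory; tiers 2 and 3 are unsorted lists
    standing in for disk storage."""

    def __init__(self, width):
        self.width = width        # hypothetical width of each distance range
        self.d1 = width           # tier 1 holds keys below d1
        self.d2 = 2 * width       # tier 2 holds d1 <= key < d2, tier 3 the rest
        self.tier1, self.tier2, self.tier3 = [], [], []

    def __len__(self):
        return len(self.tier1) + len(self.tier2) + len(self.tier3)

    def push(self, key, item):
        if key < self.d1:
            heapq.heappush(self.tier1, (key, item))
        elif key < self.d2:
            self.tier2.append((key, item))
        else:
            self.tier3.append((key, item))

    def pop(self):
        while not self.tier1:
            if self.tier2:
                # Tier 1 exhausted: move the elements of tier 2 into it.
                self.tier1, self.tier2 = self.tier2, []
                heapq.heapify(self.tier1)
                self.d1 = self.d2
            elif self.tier3:
                # Tiers 1 and 2 exhausted: scan tier 3 and rebuild with new ranges.
                pending, self.tier3 = self.tier3, []
                self.d1 = min(key for key, _ in pending) + self.width
                self.d2 = self.d1 + self.width
                for key, item in pending:
                    self.push(key, item)
            else:
                raise IndexError("pop from an empty queue")
        return heapq.heappop(self.tier1)

q = TieredQueue(width=10)
for key in [5, 12, 25, 31, 7, 18, 40]:
    q.push(key, f"item{key}")
print([q.pop()[0] for _ in range(len(q))])  # [5, 7, 12, 18, 25, 31, 40]
```

Only tier 1 pays the cost of heap maintenance; the other tiers are appended to and scanned in bulk, which is what makes them suitable for a disk file.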
Comparison of k-NN and INN
- k-NN
  - Depth-first recursion
  - Makes local decisions
  - k is fixed
  - If used with k unknown:
    - Pick a fixed k', do k-NN
    - If k gradually grows beyond k', pick an m >= k and re-apply k-NN
    - Drawback: wastes computation if the chosen m is too large
- INN
  - Priority queue
  - Makes global decisions
  - Number of neighbors not known in advance
Experiment Dataset
- Real-world data: TIGER/Line File
  - Howard: 17,421 line segments
  - Water: 37,495 line segments
  - PG: 59,551 line segments
  - Roads: 200,482 line segments
- Synthetic data
- Hierarchical data structure: R*-tree
- Utilizing buffered I/O
- Three measures: execution time, R-tree node I/O, object distance calculations
Cumulative Cost of Distance Browsing
Incremental Cost of Distance Browsing
k-Nearest Neighbor Queries
Experimental Result
- INN outperforms k-NN in distance browsing
- In k-NN queries, the INN algorithm is better than the k-NN algorithm
- For a large number of neighbors, the priority queue for INN is smaller than the NearestList maintained by k-NN
References
- Gísli R. Hjaltason, Hanan Samet, “Distance Browsing in Spatial Databases”, ACM TODS, Volume 24, Number 2, pp. 265-318, June 1999
~ THE END ~