
Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks

Jie Bao, Chi-Yin Chow, Mohamed F. Mokbel

Department of Computer Science and Engineering

University of Minnesota – Twin Cities

Wei-Shinn Ku

Department of Computer Science and Software Engineering

Auburn University

2

What Are Range NN Queries?

• k-Range NN Queries in Euclidean Space
  – Given a spatial region, find the k nearest objects to every point within the region
  – E.g., find the nearest hotel to a shopping mall

• k-Range NN Queries in Road Networks
  – Given a set of road segments, find the k nearest objects to every point on the road segments


3

Uses of Range NN Queries

• Uncertain locations
  – Measurement imprecision: due to the limitations of the underlying positioning techniques, e.g., 2G/3G and Wi-Fi
  – Sampling imprecision: due to continuous motion, network delays, and location update frequency

• Privacy-preserving queries
  – Users do not want to reveal their exact location information to service providers
  – Their locations are blurred into spatial areas

[Figures: iPhone's 3G positioning; 5-anonymous area]

4

Related Works for k-RNN Queries

• K-Nearest Neighbor in Road Networks
  – Query processing without pre-computed information
    Incremental Network Expansion (INE): a best-first expansion over the road networks [Papadias et al., VLDB 2003]
  – Query processing with pre-computed information
    Use extra pre-computed quad-tree indexes to calculate the distances [Samet et al., SIGMOD 2008]

• K-Range Nearest Neighbor in Euclidean Space
  – Pre-computed Voronoi diagrams [Chow et al., SSTD 2009]

• K-Range Nearest Neighbor in Road Networks
  – Range Query + INE for every boundary node [Wang and Liu, PVLDB 2009]

5

Motivating Example

• Computational redundancy in the existing solution
  – Range Query + Multiple kNN Queries [Wang and Liu, PVLDB 2009]

Total number of road segments searched: 3 + 2 + 5 + 6 = 17

Total number of the road segments in the map: 6

Redundancy ratio: (17 - 6) / 6 = 183% (Worse if more boundary points)

• Can we provide the results without the computational redundancy?

[Figure panels: Range Search; k-NN for D; k-NN for B; k-NN for F]
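The redundancy ratio on this slide can be reproduced with a few lines of Python. This is only a worked check of the arithmetic, using the two totals quoted above (17 segments searched, 6 segments in the map); the function name `redundancy_ratio` is a hypothetical helper, not something from the talk.

```python
def redundancy_ratio(segments_searched, segments_in_map):
    """Fraction of the search work beyond visiting each map segment once."""
    return (segments_searched - segments_in_map) / segments_in_map


# Totals taken from the slide: 17 segments searched, 6 segments in the map.
ratio = redundancy_ratio(17, 6)
print(f"{ratio:.0%}")  # 183%
```

A ratio of 0% would mean that every road segment was searched at most once.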

6

Problem Definition

• Given:
  – An undirected graph G = (V, E) representing the road network
  – A set of objects O
  – A query region R (a set of road segments)
  – A value K

• Find:
  – An answer set A from O such that A contains the K nearest objects of every point in R, based on the network distance in G

• Objective:
  – Provide A without computational redundancy

7

Efficient k-RNN Query Processing

• Step 1: Inside Query Step
• Step 2: Outside Network Expansion Step
  – Multiple searching queues
  – Stop after the closest node is searched
  – Switch to the queue with the smallest searched distance
  – Termination condition: the search covers the distance of its kth object

Example: 2-RNN

Iteration   Search from   Answer set
1st         A             P1, P2
2nd         B             P1, P2
3rd         C             P1, P2
4th         C             P1, P2, P3
5th         B             P1, P2, P3

[Figure: road segment set (range) with nodes A, B, C and objects P1, P2, P3]
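The expansion step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: objects are assumed to sit on graph nodes, the objects inside the region (Step 1) are passed in as given, and each boundary node runs its own Dijkstra-style queue. The sketch shows the queue switching and the termination condition, but it does not share work between searches, so it does not remove the redundancy that the talk targets. The function name `k_rnn_sketch` and the example network are hypothetical.

```python
import heapq


def k_rnn_sketch(graph, objects_at, boundary, inside_objects, k):
    """Union of the in-region objects and each boundary node's k nearest objects.

    graph:      {node: [(neighbor, edge_length), ...]}, both directions listed
    objects_at: {node: [object_id, ...]}  (assumption: objects sit on nodes)
    """
    # Step 1 (inside query): objects inside the region are taken as given here.
    answer = set(inside_objects)

    # Step 2 (outside expansion): one search queue per boundary node.
    queues = {b: [(0.0, b)] for b in boundary}
    settled = {b: set() for b in boundary}
    found = {b: [] for b in boundary}  # objects in order of discovery, nearest first

    while queues:
        # Switch to the queue whose closest unsearched node is nearest.
        b = min(queues, key=lambda q: queues[q][0][0])
        dist, node = heapq.heappop(queues[b])  # search only that closest node
        if node not in settled[b]:
            settled[b].add(node)
            found[b].extend(objects_at.get(node, []))
            for neighbor, length in graph.get(node, []):
                if neighbor not in settled[b]:
                    heapq.heappush(queues[b], (dist + length, neighbor))
        # Terminate this search once it covers its k-th object (or runs out of road).
        if len(found[b]) >= k or not queues[b]:
            answer.update(found[b][:k])
            del queues[b]
    return answer


# Hypothetical line-shaped network: A -1- B -1- C -1- D -5- E, region = segment B-C.
graph = {
    "A": [("B", 1.0)],
    "B": [("A", 1.0), ("C", 1.0)],
    "C": [("B", 1.0), ("D", 1.0)],
    "D": [("C", 1.0), ("E", 5.0)],
    "E": [("D", 5.0)],
}
objects_at = {"A": ["P1"], "D": ["P2"], "E": ["P3"]}

print(sorted(k_rnn_sketch(graph, objects_at, ["B", "C"], [], k=1)))  # ['P1', 'P2']
```

With k = 1, the search from B stops at P1 and the search from C stops at P2, so the far object P3 is never reached.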

8

Distance Calculation

• Case 1: By a pre-computed shortest path table
  – Fast but more storage

• Case 2: Calculation on the fly
  – Keep the distance information as the searching expands

• Tradeoff between storage and speed

      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C
D
P1
P2

      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     3   4   5
D
P1
P2

      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     3   2   5
D
P1    2   1   4
P2

Search collision!

      A   B   E
A     0   1   2
B     1   0   3
E     2   3   0
C     3   2   5
D     5   4   6
P1    2   1   4
P2    4   3   5
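The storage-versus-speed tradeoff between the two cases can be illustrated with a small Python sketch. It is a simple stand-in rather than the method from the talk: Case 1 fills every row of a shortest path table up front, while Case 2 here computes a row on first use and keeps it, instead of recording distances during the k-RNN expansion itself. The class name `DistanceTable` and the three-node graph (chosen so that the A, B, E distances match the table above) are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest network distances from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


class DistanceTable:
    def __init__(self, graph, precompute=False):
        self.graph = graph
        self.rows = {}  # source node -> {target node: distance}
        if precompute:  # Case 1: pay the storage up front
            for node in graph:
                self.rows[node] = dijkstra(graph, node)

    def distance(self, source, target):
        if source not in self.rows:  # Case 2: compute when first needed, then keep
            self.rows[source] = dijkstra(self.graph, source)
        return self.rows[source].get(target, float("inf"))


# Hypothetical graph: edges A-B (1) and A-E (2), so the B-E distance is 3.
graph = {"A": [("B", 1.0), ("E", 2.0)], "B": [("A", 1.0)], "E": [("A", 2.0)]}
full = DistanceTable(graph, precompute=True)
lazy = DistanceTable(graph)
print(full.distance("B", "E"), len(lazy.rows))  # 3.0 0
```

The pre-computed table answers every lookup immediately but stores one row per node, whereas the lazy table starts empty and grows only with the sources that are actually queried.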

9

Experimental Results

• Evaluate our algorithm without pre-computed results (KRNN-E) and with pre-computed results (KRNN-F)
• Baseline algorithm: [Wang and Liu, PVLDB 2009]
• Road networks (Hennepin County, Minnesota, US)
  – 39,513 nodes and 54,444 road segments

Parameter settings

Parameter                                     Default value   Range
K value                                       10              1 to 20
Number of objects                             600             200 to 1000
Query region size (ratio over total space)    0.018           0.002 to 0.050

10

Comparison with baseline (1/2)

a) Impact of different k values

b) Impact of different total objects on the map

c) Impact of different query region size

11

Comparison with baseline (2/2)

• Impact of different distributions of the data objects
  – Uniform distribution
  – Normal distribution

• SD is the standard deviation used to simulate hot-spot locations such as a downtown area

[Figure: Different POI distributions. x-axis: Uniform, SD=1, SD=0.1, SD=0.01, SD=0.001; y-axis: Query Processing Time (s), 0 to 80000; series: Baseline, KRNN-F, KRNN-E]

12

Tradeoff between storage and performance

• Tuning parameter P
  – The percentage of the shortest distance table that is stored
  – Warm-up process with 1000 k-RNN queries
  – Full size of the table is 980 MB

13

Conclusion

• An efficient algorithm for k-Range Nearest Neighbor (k-RNN) queries in road networks without computational redundancy

• Experimental evaluation
  – Our solution outperforms the baseline algorithm
  – Tuning parameter P achieves a tradeoff between storage and performance

[Figure labels: privacy-preserving applications; uncertain locations]

14

Q&A