Fast, precise and dynamic distance queries

Yair Bartal, Hebrew U.
Lee-Ad Gottlieb, Weizmann → Hebrew U.
Liam Roditty, Bar Ilan
Tsvi Kopelowitz, Bar Ilan → Weizmann
Moshe Lewenstein, Bar Ilan


Distance oracles

A distance oracle for a point set S with distance function d() preprocesses S so that, given any two points x,y in S, d(x,y) (or an approximation thereof) can be retrieved quickly.

Interesting cases
  Expensive to store all ~n^2 point pairs
    Sublinear space
  Expensive to query distance function d()
    For example, when d() is graph-induced


Preliminaries: Doubling dimension

Definition: Ball B(x,r) = all points within distance r from x.

The doubling constant λ (of a metric M) is the minimum value λ such that every ball can be covered by λ balls of half the radius.
  First used by [Assouad ’83], algorithmically by [Clarkson ’97].
  The doubling dimension is ddim(M) = log2 λ(M).
  Euclidean: ddim(R^d) = O(d)

Packing property of doubling spaces

A set with diameter diam and minimum inter-point distance a contains at most (diam/a)^O(ddim) points.

[Figure: an example in which λ ≥ 7.]


Survey of oracle results

Reference        Setting                    Distortion    Query time   Space
TZ-05            Weighted graph             2k-1 (k>1)    O(k)         n^(1+1/k)
MN-06            Metric                     O(k)          O(1)         n^(1+1/k)
Kle-02, Tho-04   Planar graph               1+ε           O(ε^-1)      O(n log n / ε)
HM-06            Doubling metric            1+ε           O(ddim)      ε^(-O(ddim)) n
BGKRL-11         Doubling metric, dynamic   1+ε           O(1)         ε^(-O(ddim)) n + 2^O(ddim log ddim) n

Caveat: word RAM model, and assuming a word is sufficient to store any single interpoint distance.

Related model: Distance labeling [Tal-04, Sli-05]


Overview of techniques

Some tools we’ll need (both static and dynamic versions):

Point hierarchies for doubling spaces
  By now a standard construction…
Metric embeddings
  Into trees
  Into Euclidean space
Tree search structures
  Level ancestor queries in O(1) time
  Least common ancestor (LCA) queries in O(1) time


Preliminaries: Spanners

Oracle central idea: motivated by an observation originally made in the context of low-stretch spanners [GGN-04, GR-08a, GR-08b].

A spanner of G is a subgraph H:
  H contains all vertices of G
  H contains a subset of the edges of G

Interesting properties of H: stretch, degree, hop diameter

[Figure: a weighted graph G and a spanner H of G.]


Point hierarchies

[Figure: a hierarchy of nets at scales 1, 2, 4 and 8 (1-net, 2-net, 4-net, 8-net).]


1-net (radius = 1)
  Covering: all points are covered
  Packing: net points are pairwise far apart (at least the net radius)


2-net
  Covering: all 1-net points are covered
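The covering and packing steps above can be sketched as a greedy construction. This is only a minimal illustration, not the construction used in the talk: the names `greedy_net` and `build_hierarchy` are hypothetical, the points are a toy set on a line, and the quadratic-time greedy pass is chosen for clarity rather than efficiency.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def greedy_net(points: Sequence[Point], r: float,
               dist: Callable[[Point, Point], float]) -> List[Point]:
    """Greedily pick net points that are pairwise at least r apart (packing).

    Every skipped point lies within distance < r of some chosen point,
    which gives the covering property.
    """
    net: List[Point] = []
    for p in points:
        if all(dist(p, q) >= r for q in net):
            net.append(p)
    return net


def build_hierarchy(points: Sequence[Point],
                    radii: Sequence[float] = (1, 2, 4, 8),
                    dist: Callable[[Point, Point], float] = math.dist
                    ) -> Dict[float, List[Point]]:
    """Build nested nets: each net is chosen from the previous, finer net."""
    levels: Dict[float, List[Point]] = {}
    current = list(points)
    for r in radii:
        current = greedy_net(current, r, dist)
        levels[r] = current
    return levels


# Toy input (an assumption for illustration): eight points on a line.
points = [(float(i), 0.0) for i in range(8)]
hierarchy = build_hierarchy(points)
for r, net in hierarchy.items():
    print(f"{r}-net:", [int(x) for x, _ in net])
```

Because each net is drawn from the one below it, the 2-net covers the 1-net points, the 4-net covers the 2-net points, and so on, which mirrors the slides' picture.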


Another perspective

[Figure: the hierarchy drawn as a DAG over the 1-net, 2-net, 4-net and 8-net levels.]

Number of levels: log(aspect ratio)


Make arbitrary parent-child assignments: DAG → spanning tree.
The number of levels remains log(aspect ratio).


Towards an oracle

Oracle stores all tree parent-child links: O(n) space.

Define c-neighbors: r-net point pairs within distance cr = 3r/ε (that is, c = 3/ε).
Store all distances between c-neighbors, and between their children: ε^(-O(ddim)) n space.

Note that the c-neighbor property is hereditary:
  If nodes a,b are c-neighbors in tree level r,
  then the ancestors a’,b’ of a,b in any tree level r+i are c-neighbors as well (or are the same node).
  Proof: d(a’,b’) ≤ d(a’,a) + d(a,b) + d(b,b’)
                  ≤ 2(r+i) + cr + 2(r+i)
                  < c(r+i)
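One way to read the hereditary proof is spelled out below. It is a sketch under assumptions that the slides do not state explicitly: tree level j is taken to be a net of radius 2^j, every node at level j-1 is assumed to lie within distance 2^j of its parent at level j, and "c-neighbors at level j" is taken to mean distance at most c·2^j.

```latex
% Climbing from level r to level r+i (i >= 1) costs a geometric sum:
\[
  d(a,a') \;\le\; \sum_{j=r+1}^{r+i} 2^{j} \;<\; 2\cdot 2^{\,r+i},
  \qquad
  d(b,b') \;<\; 2\cdot 2^{\,r+i}.
\]
% Triangle inequality, using d(a,b) <= c 2^r:
\[
  d(a',b') \;\le\; d(a',a) + d(a,b) + d(b,b')
           \;<\; 2\cdot 2^{\,r+i} + c\,2^{r} + 2\cdot 2^{\,r+i}.
\]
% Since i >= 1, c 2^r <= (c/2) 2^{r+i}, so
\[
  d(a',b') \;<\; \Bigl(4 + \tfrac{c}{2}\Bigr) 2^{\,r+i}
           \;\le\; c\,2^{\,r+i}
  \quad\text{whenever } c \ge 8 .
\]
```

Under this reading, c = 3/ε satisfies c ≥ 8 as soon as ε ≤ 3/8.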


c-neighbors

[Figure: c-neighbors in the 1-net, 2-net, 4-net, 8-net hierarchy.]


Spanner observation

Let x,y denote two points in S, and by extension their corresponding tree leaf nodes.

Let x’,y’ be the highest tree ancestors of x,y that are not c-neighbors.
  Note that d(x’,y’) is stored by the oracle, since the parents of x’,y’ are c-neighbors.

Spanner Theorem: d(x,y) = (1±ε) d(x’,y’)
  Proof by illustration…


[Figure: leaves x, y and their ancestors x’, y’ in the 1-net, 2-net, 4-net, 8-net hierarchy.]


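[Figure: the same hierarchy, annotated with the distances below.]

Distance from each of x, y to its ancestor x’, y’: ≤ 6
d(x’,y’) > 12/ε
Distortion: (12/ε + 12)/(12/ε) ≤ 1 + ε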
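The arithmetic in the illustration can be checked directly. The sketch below assumes the numbers from the figure (each point is within 6 of its ancestor, and d(x’,y’) is just above 12/ε) and uses exact fractions; the function name `spanner_bounds` and the parameter `climb` are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def spanner_bounds(eps: Fraction, climb: Fraction = Fraction(6)) -> Tuple[Fraction, Fraction]:
    """Worst-case ratio d(x,y) / d(x',y') at the non-neighbor threshold.

    climb bounds d(x,x') and d(y,y'); the threshold is 2*climb/eps,
    which is 12/eps for the figure's value climb = 6.
    """
    threshold = 2 * climb / eps   # d(x',y') exceeds this value
    slack = 2 * climb             # d(x,x') + d(y,y') <= 12
    upper = (threshold + slack) / threshold
    lower = (threshold - slack) / threshold
    return lower, upper


lower, upper = spanner_bounds(Fraction(1, 10))
print(lower, upper)
```

Any larger d(x’,y’) only moves the ratio closer to 1, so the threshold value is the worst case for both directions of the (1±ε) bound.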


Oracle query

Oracle query: for x,y in S, find d(x,y).

Oracle does this instead:
  For x,y in S, find x’,y’ (the highest ancestors that are not c-neighbors)
  Return stored d(x’,y’)

Left with the following question:
  Ancestral non-neighbors query: find the highest tree ancestors that are not c-neighbors.
  We could view this as an abstract problem on trees and ignore the metric…


Ancestral non-neighbors query

Some ideas (static case). Recall that neighborliness is hereditary.
  Brute force → try all ancestors: O(log aspect ratio)
  Binary search → using level ancestor queries: O(log log aspect ratio)
  Balanced tree + brute force: O(log n)
  Balanced tree + binary search: O(log log n)

But we can do better:
  Make use of the tree structure
  Get some help from the metric structure
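The first two ideas can be sketched as follows. Because neighborliness is hereditary, "the ancestors at level j are c-neighbors" is a monotone predicate in j, so a binary search over levels finds the highest non-neighbor level. Everything concrete here is an assumption made for illustration: points are integers 0..63, the level-j ancestor of p is p rounded down to a multiple of 2^j (standing in for an O(1) level ancestor query), and the helper names are hypothetical.

```python
from typing import Callable, Optional

NeighborTest = Callable[[int, int, int], bool]

C = 3     # toy neighbor constant
TOP = 6   # root level: every point in 0..63 has ancestor 0


def toy_is_neighbor(x: int, y: int, level: int) -> bool:
    """Are the level-`level` ancestors of x and y c-neighbors (or equal)?"""
    a = (x >> level) << level
    b = (y >> level) << level
    return abs(a - b) <= C * (1 << level)


def brute_force_level(x: int, y: int, top: int,
                      is_neighbor: NeighborTest) -> Optional[int]:
    """Scan levels bottom-up: one probe per level."""
    best = None
    for level in range(top + 1):
        if is_neighbor(x, y, level):
            break
        best = level
    return best


def binary_search_level(x: int, y: int, top: int,
                        is_neighbor: NeighborTest) -> Optional[int]:
    """O(log(number of levels)) probes; assumes neighbors at level `top`."""
    if is_neighbor(x, y, 0):
        return None  # x and y are already c-neighbors at the leaf level
    lo, hi = 0, top  # invariant: non-neighbors at lo, neighbors at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_neighbor(x, y, mid):
            hi = mid
        else:
            lo = mid
    return lo


print(binary_search_level(0, 40, TOP, toy_is_neighbor))
```

With log(aspect ratio) levels, the binary search uses O(log log aspect ratio) probes, matching the second bullet above.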


Ancestral non-neighbors query

Lemma: d(x,y) is closely related to the tree level r of the ancestors x’,y’:
  r = log d(x,y) – log c ± O(1)

Corollary: a b-approximation to d(x,y) pinpoints the level of x’,y’ to log b + O(1) possible tree levels.
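The Corollary can be turned into a small calculation. The sketch below assumes base-2 logarithms, an estimate D with d(x,y) ≤ D ≤ b·d(x,y), and a hypothetical integer `slack` standing in for the ±O(1) term of the Lemma; `candidate_levels` is an illustrative name, not one from the talk.

```python
import math


def candidate_levels(estimate: float, b: float, c: float, slack: int = 2) -> range:
    """Levels that can contain x', y' given a b-approximate distance estimate.

    From r = log d - log c ± slack and estimate/b <= d <= estimate.
    """
    high = math.floor(math.log2(estimate) - math.log2(c)) + slack
    low = math.ceil(math.log2(estimate) - math.log2(b) - math.log2(c)) - slack
    return range(low, high + 1)


levels = candidate_levels(estimate=1024, b=16, c=4)
print(list(levels), len(levels))
```

The window has about log b + 2·slack + 1 levels, independent of the total number of levels in the tree.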


Oracle query

Oracle Step 1: Run the oracle of MS-09 (similar in flavor to TZ-05, MN-06) on x,y with parameter k = O(log n)
  Approximation ratio: O(k) = O(log n)
  Query time: O(1)
  Space: n^(1+1/k) = O(n)

By the Corollary, an approximation ratio of O(log n) to d(x,y) limits the tree level of x’,y’ to O(log log n) possible levels.


[Figure: the candidate levels for x’,y’ form a band of O(log log n) levels in the hierarchy.]


Oracle query

Snowflake embedding of [Ass-04] and [GKL-03]
  Given a set S in a metric space
  Embed S into O(ddim log ddim)-dimensional Euclidean space
  Distortion O(ddim) into the snowflake d^(1/2)

Oracle Step 2:
  Recall that the level of x’,y’ has been narrowed down to O(log log n) candidate levels.
  Embed neighborhoods of O(log log n) levels into Euclidean space.


Oracle query

What’s going on?
  We’ve narrowed down the level of x’,y’ to O(log log n) levels
  These neighborhoods are small
  Build a snowflake for each neighborhood
    O(ddim) = O(log^(1/3) n) dimensions
    O(log ddim + log log n) bits per dimension
  So the Euclidean representation of each point fits into o(log^(1/2) n) bits (into a word)

Lemma: The embedded (snowflake) distance between two points can be returned in O(1) time.
Proof outline:
  The squared distance between two vectors w,z is w·w - 2w·z + z·z.
  A dot product can be computed in O(1) time by manipulating the multiplication operator.
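One standard way to get a dot product out of a single multiplication is sketched below; the slides do not say which trick is used, so treat this as an assumption. Coordinates are taken to be non-negative integers of at most `coord_bits` bits, one vector is packed in reverse order, and each field is wide enough that no field of the product overflows. All function names are hypothetical, and Python's unbounded integers only illustrate the arithmetic; the O(1) claim relies on the packed vectors fitting into a machine word as stated above.

```python
from typing import Sequence


def field_width(k: int, coord_bits: int) -> int:
    """Bits per field so that a sum of k products of coord_bits-bit values fits."""
    return 2 * coord_bits + max(1, (k - 1).bit_length())


def pack(vec: Sequence[int], field_bits: int, reverse: bool = False) -> int:
    """Pack coordinates into one integer, coordinate i in field i."""
    coords = list(reversed(vec)) if reverse else list(vec)
    word = 0
    for i, v in enumerate(coords):
        word |= v << (i * field_bits)
    return word


def packed_dot(w_word: int, z_rev_word: int, k: int, field_bits: int) -> int:
    """Dot product via one multiplication: read field k-1 of the product.

    Field k-1 collects exactly the terms w_i * z_i, because w_i sits in
    field i and z_i (packed in reverse) sits in field k-1-i.
    """
    product = w_word * z_rev_word
    return (product >> ((k - 1) * field_bits)) & ((1 << field_bits) - 1)


def squared_distance(w: Sequence[int], z: Sequence[int], coord_bits: int) -> int:
    """w.w - 2 w.z + z.z, using three packed dot products."""
    k = len(w)
    f = field_width(k, coord_bits)
    w_fwd, w_rev = pack(w, f), pack(w, f, reverse=True)
    z_fwd, z_rev = pack(z, f), pack(z, f, reverse=True)
    return (packed_dot(w_fwd, w_rev, k, f)
            - 2 * packed_dot(w_fwd, z_rev, k, f)
            + packed_dot(z_fwd, z_rev, k, f))


w, z = [3, 1, 4, 1], [2, 7, 1, 8]
print(squared_distance(w, z, coord_bits=4))
```

In a preprocessing setting both packed forms of each point would be stored once, so a query costs a constant number of multiplications, shifts and masks.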


Oracle query

Result of Step 2:
  O(ddim) approximation to the snowflake distance between x,y (or rather, their ancestors in the appropriate neighborhood)
  By the Corollary, this restricts the candidate levels of x’,y’ to O(log ddim) levels

Oracle Step 3:
  Preprocessing: in neighborhoods of O(log ddim) levels, store a pointer from each pair to the highest ancestors which are not c-neighbors
  Space 2^O(ddim log ddim) per neighborhood or point
  O(1) query time


Dynamic oracle

Steps that needed to be made dynamic:
  Hierarchy: already done [CG-06]
  MS-09 oracle: problem! Answer: tree embedding [Bar96]
  Level ancestor query: problem! Answer: jump trees
  Snowflake embedding: problem! Extension of above techniques…

Conclusion: There exists a dynamic (1+ε)-approximate distance oracle for doubling spaces with O(1) query time, which uses ε^(-O(ddim)) n + 2^O(ddim log ddim) n space and can be updated in time 2-O(ddim) log n + 2^O(ddim log ddim).