High Concurrent R-tree Operations when Tracking Continuous Movement

Cezar Chitac, Robertas Kerpys, Raluca Marcuta


Page 1: High Concurrent R-tree Operations when Tracking Continuous Movement

High Concurrent R-tree Operations when Tracking Continuous Movement

Cezar Chitac, Robertas Kerpys, Raluca Marcuta

Page 2: High Concurrent R-tree Operations when Tracking Continuous Movement

Motivation

• Need for tracking moving objects in real time
  – concurrency
• Organize and access positional information
• Queries
  – search query
  – range query

Page 3: High Concurrent R-tree Operations when Tracking Continuous Movement

Overview

• Problem Formulation
• Queries vs. Updates
• First Approach: Split-Supporting Index
• Second Approach: Split-Free Index
• Related Work
• Project Status
• Conclusions
• Future Work

Page 4: High Concurrent R-tree Operations when Tracking Continuous Movement

Problem Formulation

• Frequent updates
  – Efficient R-tree index structure
  – Concurrency between queries and updates
• Objectives
  – Query performance
  – Update performance
  – Data freshness
• Challenges
  – Structural modifications during concurrent tree operations (queries and updates)
  – Avoid locks

Page 5: High Concurrent R-tree Operations when Tracking Continuous Movement

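Range Queries

• Objects send updates when moving δ units, so each object has:
  – a last reported position
  – a real position at the current time
• A query has a range and an expanded range
• Current time in [ts, te): time for which the results are returned

[Figure: query range and expanded range, with an object's last reported position and its real position at the current time, separated by up to δ]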

Page 6: High Concurrent R-tree Operations when Tracking Continuous Movement

Semantics: objects that may have been in range at time ts

• Updates used to construct the answer to a range query (freshest possible):
  – all that finish before ts
  – some that finish after ts
• Constructing the result set
  – roll time back or forward to ts
  – area where the object may have been at ts:
    • circle of radius min(vmax · |ts − tu|, δ), centered at (x, y)
    • intersect with the original, unexpanded range
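The candidate test on this slide can be sketched as follows. This is a minimal illustration, assuming a rectangular query range and a reported entry (x, y, tu); the names `Rect`, `uncertainty_radius` and `may_have_been_in_range` are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned query range (assumed shape of the unexpanded range)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def uncertainty_radius(ts, tu, vmax, delta):
    """Radius of the area where the object may have been at ts."""
    return min(vmax * abs(ts - tu), delta)


def circle_intersects_rect(cx, cy, r, rect):
    """True if the circle (cx, cy, r) overlaps or touches the rectangle."""
    # Closest point of the rectangle to the circle centre.
    nx = min(max(cx, rect.xmin), rect.xmax)
    ny = min(max(cy, rect.ymin), rect.ymax)
    return (cx - nx) ** 2 + (cy - ny) ** 2 <= r * r


def may_have_been_in_range(x, y, tu, ts, vmax, delta, query):
    """Roll time back or forward to ts and intersect with the original range."""
    r = uncertainty_radius(ts, tu, vmax, delta)
    return circle_intersects_rect(x, y, r, query)
```

An object reported just outside the range is kept only if its uncertainty circle at ts reaches the unexpanded range, which is why the radius is capped at δ.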

Page 7: High Concurrent R-tree Operations when Tracking Continuous Movement

Background: Bottom-up Updates

• Efficient updates: no top-down traversal
• Secondary index on oid; each entry stores:
  – oid: object identifier
  – P: pointer to leaf node
  – idx: offset of entry in leaf node
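A bottom-up update through the secondary index might look like the sketch below. It assumes a hash table keyed by oid and a leaf that stores `[oid, x, y, tu]` entries; the `Leaf` class and both function names are hypothetical.

```python
class Leaf:
    """A leaf node holding [oid, x, y, tu] entries (assumed layout)."""

    def __init__(self):
        self.entries = []


# Secondary index: oid -> (P, idx), i.e. the leaf and the offset in that leaf.
secondary_index = {}


def insert(leaf, oid, x, y, tu):
    """Append an entry to the leaf and record where it lives."""
    leaf.entries.append([oid, x, y, tu])
    secondary_index[oid] = (leaf, len(leaf.entries) - 1)


def bottom_up_update(oid, x, y, tu):
    """Overwrite the object's entry in place, with no top-down traversal."""
    leaf, idx = secondary_index[oid]
    entry = leaf.entries[idx]
    entry[1], entry[2], entry[3] = x, y, tu
    return leaf
```

The lookup goes straight from the oid to the leaf entry, so the cost of an update does not depend on the height of the tree.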

Page 8: High Concurrent R-tree Operations when Tracking Continuous Movement

Types of Update: Local & Non-local

[Figure: example R-tree over points p1–p13 with MBRs R1–R7 and a query rectangle; the moving objects p5 and p10 are highlighted]

Page 9: High Concurrent R-tree Operations when Tracking Continuous Movement

Types of Update: Local & Non-local

• Local updates
  – modify position coordinates
  – no structural modifications
• Non-local updates
  – move object to another leaf node => delete + insert
  – problem: concurrent query
• Solution
  – insert + logical delete (negative tu)
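One way to picture "insert + logical delete" is sketched here. It assumes leaves are plain lists of `[oid, x, y, tu]` entries, that update timestamps are strictly positive, and that negating tu is the deletion mark; the function names are hypothetical.

```python
def non_local_update(old_leaf, idx_old, new_leaf, oid, x, y, tu):
    """Move an object to another leaf without physically deleting the old entry."""
    # 1. Insert the new position first, so the object is never missing.
    new_leaf.append([oid, x, y, tu])
    # 2. Mark the old entry as logically deleted by making its timestamp negative.
    old_entry = old_leaf[idx_old]
    old_entry[3] = -abs(old_entry[3])
    return len(new_leaf) - 1  # offset of the new entry in new_leaf


def is_logically_deleted(entry):
    """An entry with a negative timestamp is only logically present."""
    return entry[3] < 0
```

Because the old entry stays in place, a concurrent query that has already passed the new leaf can still find the object at its old position.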

Page 10: High Concurrent R-tree Operations when Tracking Continuous Movement

Query vs. Update

[Figure: R-tree over points p1–p13 with MBRs R1–R7 and a query rectangle; p10 appears at both its old and its new position]

• Query retrieves:
  – the old position, if the new one is inserted into an already scanned leaf node
  – both => query chooses the freshest
• Pold in hash table: used by the next update
  – delete the logically marked entry
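The "query chooses freshest" rule can be sketched as a de-duplication pass over the entries a query has scanned. This assumes that a logically deleted entry carries a negative tu whose absolute value is its update time; `freshest_per_object` is a hypothetical name.

```python
def freshest_per_object(scanned_entries):
    """Keep one position per oid: the one with the latest update time.

    scanned_entries: iterable of (oid, x, y, tu); a negative tu marks a
    logically deleted entry whose update time is abs(tu) (assumption).
    """
    best = {}
    for oid, x, y, tu in scanned_entries:
        if oid not in best or abs(tu) > abs(best[oid][2]):
            best[oid] = (x, y, tu)
    # Report positive update times in the result.
    return {oid: (x, y, abs(tu)) for oid, (x, y, tu) in best.items()}
```

If only the old, logically deleted entry was seen, it is still returned, which matches the case where the new position landed in a leaf the query had already scanned.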

Page 11: High Concurrent R-tree Operations when Tracking Continuous Movement

General Index Structure

Each secondary-index entry stores:
• oid: object identifier
• Pnew: pointer to new leaf node
• idxnew: offset of entry in the new leaf node
• Pold: pointer to old leaf node
• idxold: offset of entry in the old leaf node

Page 12: High Concurrent R-tree Operations when Tracking Continuous Movement

Overview of Update Process

Page 13: High Concurrent R-tree Operations when Tracking Continuous Movement

Overview

• Problem Formulation
• Queries vs. Updates
• First Approach: Split-Supporting Index
• Second Approach: Split-Free Index
• Related Work
• Project Status
• Conclusions
• Future Work

Page 14: High Concurrent R-tree Operations when Tracking Continuous Movement

Split-Supporting Index

• Algorithm is based on atomic operations and versioning of the items

• Latching is minimal

Page 15: High Concurrent R-tree Operations when Tracking Continuous Movement

Node Split

[Figure: R-tree with points p1–p13 and MBRs R1–R8; the new point p14 is to be inserted into the leaf holding p3, p4, p5]

Legend:
- exclusive latch between non-local updates
- marks logically deleted items

Page 16: High Concurrent R-tree Operations when Tracking Continuous Movement

Node Split

[Figure: the same R-tree during the split; new entries R3, R4, R5, R9 are shown next to the original tree, with leaf contents p3 p4 p5 p14 and p1 p2 p6 p7]

Legend:
- exclusive latch between non-local updates
- marks logically deleted items

Page 17: High Concurrent R-tree Operations when Tracking Continuous Movement

Node Split

[Figure: the same R-tree at a later step of the split; new entries R3, R4, R5, R9 and R10 are shown next to the original tree, with leaf contents p3 p4 p5 p14 and p1 p2 p6 p7]

Legend:
- exclusive latch between non-local updates
- marks logically deleted items

Page 18: High Concurrent R-tree Operations when Tracking Continuous Movement

Local Updates

• Local updates are allowed during splits and merges

[Figure: secondary index with columns oid, Pnew, idxnew, Pold, idxold and rows for p1, p2, p3, …, shown next to leaf nodes N1, N2, N3 and their copies N1’, N2’, N3’ created during a split]

Page 19: High Concurrent R-tree Operations when Tracking Continuous Movement

Non-Local Updates

• Non-local updates are not allowed to change items that are involved in a split

• Such updates are put into a priority queue and retried later
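The deferral of blocked non-local updates can be sketched with a heap. The slide does not say what the priority is; this sketch assumes the update timestamp tu, so the oldest deferred update is retried first. `RetryQueue` and `try_apply` are hypothetical names.

```python
import heapq
import itertools


class RetryQueue:
    """Holds non-local updates that hit an item involved in a split."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def defer(self, tu, update):
        """Queue an update; priority is its timestamp (assumption)."""
        heapq.heappush(self._heap, (tu, next(self._seq), update))

    def retry(self, try_apply):
        """Retry every queued update once; keep those that are still blocked."""
        applied, still_blocked = [], []
        while self._heap:
            tu, seq, update = heapq.heappop(self._heap)
            if try_apply(update):
                applied.append(update)
            else:
                still_blocked.append((tu, seq, update))
        for item in still_blocked:
            heapq.heappush(self._heap, item)
        return applied

    def __len__(self):
        return len(self._heap)
```

Here `try_apply` stands for whatever routine attempts the non-local update and reports whether the target items are still involved in a split.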

Page 20: High Concurrent R-tree Operations when Tracking Continuous Movement

Merge

Merge an underflowing node with one of its sibling nodes. Two cases:

1. The sibling node has space for all entries
2. The sibling node would overflow after insertion

Page 21: High Concurrent R-tree Operations when Tracking Continuous Movement

Sibling node has space for all entries

1. Sibling and underflow nodes are latched
2. A new empty node is created
3. Entries from the sibling and underflow nodes are copied into the new node
4. The new node is introduced into the structure by an atomic swap of the pointers
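The four steps can be sketched as below. Python has no pointer compare-and-swap, so this sketch stands in for the atomic swap by replacing the parent's immutable child tuple with a single reference assignment; that substitution, the `Node` and `Parent` classes, and the capacity value are assumptions for illustration only.

```python
import threading


class Node:
    def __init__(self, entries=(), capacity=4):
        self.entries = list(entries)
        self.capacity = capacity
        self.latch = threading.Lock()


class Parent:
    def __init__(self, children):
        # Replaced wholesale, never mutated, so readers see old or new state.
        self.children = tuple(children)


def merge_into_new_node(parent, underflow, sibling):
    """Merge two distinct nodes into a fresh node; return it, or None if it would overflow."""
    # 1. Latch both nodes (fixed order to avoid deadlock between mergers).
    first, second = sorted((underflow, sibling), key=id)
    with first.latch, second.latch:
        if len(sibling.entries) + len(underflow.entries) > sibling.capacity:
            return None
        # 2 + 3. Create a new node and copy the entries of both nodes into it.
        merged = Node(sibling.entries + underflow.entries, sibling.capacity)
        # 4. Publish the new node with one reference assignment.
        parent.children = tuple(
            merged if c is sibling else c
            for c in parent.children
            if c is not underflow
        )
        return merged
```

The original nodes are left untouched, so a query that still holds a reference to them continues to see consistent contents.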

Page 22: High Concurrent R-tree Operations when Tracking Continuous Movement

Sibling node would overflow

1. A split of the sibling node is performed
2. The split function accepts all entries from the underflow node instead of one entry
3. Entries are distributed between two new nodes
4. The two new nodes are introduced into the structure in two atomic operations

Page 23: High Concurrent R-tree Operations when Tracking Continuous Movement

Summary

• Advantages:
  – Local updates are permitted during node splits and merges
  – Queries can execute concurrently
• Disadvantages:
  – High complexity due to avoidance of locks
  – Creation of artificial updates

Page 24: High Concurrent R-tree Operations when Tracking Continuous Movement

Overview

• Problem Formulation
• Queries vs. Updates
• First Approach: Split-Supporting Index
• Second Approach: Split-Free Index
• Related Work
• Project Status
• Conclusions
• Future Work

Page 25: High Concurrent R-tree Operations when Tracking Continuous Movement

Second Approach – Main Idea

• Splits and merges:
  – Time consuming
  – Increase complexity
  – Artificial updates
• Goals:
  – Objects update only when they move
  – No splits and no merges

Page 26: High Concurrent R-tree Operations when Tracking Continuous Movement

Parameters

Page 27: High Concurrent R-tree Operations when Tracking Continuous Movement

Logically Overfull Node

[Figure: nodes R1 and R2 with points p1–p7; the overfull node is divided at cut_val into a persistent part and an evacuating part]

• Node is logically overfull: LO = 6
• Create new node
• Algorithm: choose cut value
• Change nodes’ states
• Store pointer to new node

Page 28: High Concurrent R-tree Operations when Tracking Continuous Movement

Node Structure

• split_ptr: pointer to the newly created node
• state: represents a node’s state: Normal, Evacuating, Populating or New
• cut: stores the axis by which the node was “divided”
• cutval: stores the cut value along that axis
• ev_part: indicates the part that is evacuating
• iNeed: indicates a node’s desire to attract or repel objects
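The node fields could be laid out as in this sketch. The field names and the four states come from the slide; the value encodings (axis as 0 or 1, ev_part as "low" or "high", sign of iNeed) and the helper `in_evacuating_part` are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    NEW = "New"
    NORMAL = "Normal"
    EVACUATING = "Evacuating"
    POPULATING = "Populating"


@dataclass
class SplitFreeNode:
    state: State = State.NEW
    split_ptr: Optional["SplitFreeNode"] = None  # newly created node
    cut: Optional[int] = None        # axis index: 0 = x, 1 = y (assumed)
    cutval: Optional[float] = None   # cut value along that axis
    ev_part: Optional[str] = None    # "low" or "high" side (assumed)
    iNeed: int = 0                   # > 0 attract, < 0 repel (assumed)


def in_evacuating_part(node, point):
    """True if the point lies in the part of an evacuating node that is leaving."""
    if node.state is not State.EVACUATING or node.cut is None:
        return False
    side = "low" if point[node.cut] < node.cutval else "high"
    return side == node.ev_part
```

For example, a node cut on the x axis at 5.0 with ev_part "high" would report points with x ≥ 5.0 as belonging to the evacuating part, and split_ptr would name the node meant to receive them.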

Page 29: High Concurrent R-tree Operations when Tracking Continuous Movement

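State Diagram

[Figure: node state diagram with the states New, Normal, Evacuating and Populating, entered at Creation and left at Deletion. Labels as extracted: Creation (NR = 0); Insert(obj); Insert(obj) & NR = LU; Total Evacuation; Delete(obj) & NR = 1; Delete(obj) & NR = PU+1; LU+1 ≤ NR ≤ LO; PU ≤ NR ≤ LU; NR ≤ PU; NR ≤ LU; NR ≥ LO/2]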

Page 30: High Concurrent R-tree Operations when Tracking Continuous Movement

Find Node Heuristics

• Search parent node first
  – Sibling node in need of objects
• Top-down tree traversal based on:
  – MBR area enlargement
  – iNeed values
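The traversal criterion can be sketched as the classic least-enlargement choice, with iNeed used as a tie-breaker. How the presentation actually combines the two criteria is not stated, so the tie-breaking rule, the `(name, mbr, iNeed)` tuple layout and the function names are assumptions.

```python
def area(mbr):
    xmin, ymin, xmax, ymax = mbr
    return (xmax - xmin) * (ymax - ymin)


def enlargement(mbr, point):
    """Extra area needed for the MBR to cover the point."""
    xmin, ymin, xmax, ymax = mbr
    x, y = point
    grown = (min(xmin, x), min(ymin, y), max(xmax, x), max(ymax, y))
    return area(grown) - area(mbr)


def choose_child(children, point):
    """Pick the child with the least enlargement; prefer higher iNeed on ties.

    children: list of (name, mbr, iNeed) tuples.
    """
    best = min(children, key=lambda c: (enlargement(c[1], point), -c[2]))
    return best[0]
```

With two children whose MBRs both already cover the point, the one that signals a stronger need for objects wins under this rule.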

Page 31: High Concurrent R-tree Operations when Tracking Continuous Movement

Local and Non-local

[Figure: R-tree with root entries R1, R2, R12, lower-level entries R3–R11 and points p1–p5]

• p3: local update
• p5: non-local update to R3

Page 32: High Concurrent R-tree Operations when Tracking Continuous Movement

Summary

• Advantages:
  – Algorithmic simplicity
  – No artificial updates
  – Novelty
• Disadvantages:
  – Setting heuristic parameters
  – Logical complexity

Page 33: High Concurrent R-tree Operations when Tracking Continuous Movement

Related Work

• Logical and Physical Versioning in Main Memory Databases [Rastogi et al. 1997]
• Trees or Grids: Main-memory Indexing [Šidlauskas et al. 2009]
• High-Concurrency Locking in R-trees: R-link [Kornacker & Banks 1995]
• Existing concurrent approaches:
  – An Enhanced Concurrency Control Scheme for Multi-dimensional Index Structures [Song et al. 2004]
  – CGiST: Concurrency and Recovery in Generalized Search Trees [Kornacker et al. 1997]

Page 34: High Concurrent R-tree Operations when Tracking Continuous Movement

Status of the Project

• Semantics of application domain
• Concurrent queries and updates
• An approach based on copying on demand:
  – Create minimal structure on the side
  – Integrate using atomic operations
• A new approach:
  – A tree structure with no splits or merges
  – Necessary heuristics to compensate

Page 35: High Concurrent R-tree Operations when Tracking Continuous Movement

Conclusion

• Addresses concurrency issues when minimizing locking/latching

• Two approaches debated (one novel)

• Focus on concurrency while maintaining structure integrity

Page 36: High Concurrent R-tree Operations when Tracking Continuous Movement

Future Work

• Next semester:
  – Implementation of the second approach
  – Comparison with relevant existing approaches
• Additional work:
  – Implementation of the first approach
  – Comparison between the two

Page 37: High Concurrent R-tree Operations when Tracking Continuous Movement

Feedback

• What parts of the presentation:
  – needed more focus?
  – were unnecessary?
  – were too detailed?
• Was the flow of the presentation natural?
• Any thoughts about our two presented methods?