High Concurrent R-tree Operations when Tracking Continuous Movement

Cezar Chitac, Robertas Kerpys, Raluca Marcuta


Motivation

• Need for tracking moving objects in real time:
  – concurrency
• Organize and access positional information
• Queries:
  – search query
  – range query


Overview

• Problem Formulation
• Queries vs. Updates
• First Approach: Split-Supporting Index
• Second Approach: Split-Free Index
• Related Work
• Project Status
• Conclusions
• Future Work


Problem Formulation

• Frequent updates
  – Efficient R-tree index structure
  – Concurrency between queries and updates
• Objectives
  – Query performance
  – Update performance
  – Data freshness
• Challenges
  – Structural modifications during concurrent tree operations: queries and updates
  – Avoid locks


Range Queries

• Objects send updates when moving δ units:
  – reported position
  – real position
• Current time in [ts, te): time for which the results are returned

[Figure: query range and expanded range; labels: δ, last reported position, real position at current time]


Semantics - may have been in range at time ts

• Updates used to construct answer to range query, freshest possible:
  – all that finish before ts
  – some that finish after ts
• Constructing the resulting set
  – roll time back or forward to ts
  – area where object may have been at ts:
    • circle of radius min(vmax |ts - tu|, δ), center (x, y)
    • intersect with original unexpanded range
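The containment test described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Rect`, `uncertainty_radius`, `circle_intersects_rect` and `may_have_been_in_range` are hypothetical, and the query range is assumed to be an axis-aligned rectangle.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Axis-aligned query range (assumed shape; the slides only say "range")."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def uncertainty_radius(v_max, t_s, t_u, delta):
    """Radius of the area the object may occupy at t_s: min(v_max * |t_s - t_u|, delta)."""
    return min(v_max * abs(t_s - t_u), delta)


def circle_intersects_rect(cx, cy, r, rect):
    """True if the circle centred at (cx, cy) with radius r overlaps rect."""
    # Closest point of the rectangle to the circle centre.
    nx = min(max(cx, rect.xmin), rect.xmax)
    ny = min(max(cy, rect.ymin), rect.ymax)
    return (cx - nx) ** 2 + (cy - ny) ** 2 <= r * r


def may_have_been_in_range(x, y, t_u, t_s, v_max, delta, query):
    """Roll time back or forward from t_u to t_s and test against the unexpanded range."""
    r = uncertainty_radius(v_max, t_s, t_u, delta)
    return circle_intersects_rect(x, y, r, query)
```

An object reported just outside the range qualifies only if enough time separates tu from ts for it to have covered the gap, and never if the gap exceeds δ.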


Background: Bottom-up Updates

• Efficient updates: no top-down traversal
• Secondary index on oid

Secondary index entry: oid, P, idx
  – oid: object identifier
  – P: pointer to leaf node
  – idx: offset of entry in leaf node
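A bottom-up update through such a secondary index might look like the sketch below. The classes `Leaf` and `SecondaryIndex` and the entry layout `[oid, x, y, tu]` are assumptions made for illustration; a Python dict stands in for the hash table.

```python
class Leaf:
    """Leaf node; each entry is a mutable [oid, x, y, tu] list (assumed layout)."""

    def __init__(self):
        self.entries = []


class SecondaryIndex:
    """Maps oid -> (P, idx): the leaf holding the object and its offset in that leaf."""

    def __init__(self):
        self._table = {}

    def insert(self, leaf, oid, x, y, tu):
        leaf.entries.append([oid, x, y, tu])
        self._table[oid] = (leaf, len(leaf.entries) - 1)

    def local_update(self, oid, x, y, tu):
        # Jump straight to the leaf entry: no top-down traversal of the tree.
        leaf, idx = self._table[oid]
        leaf.entries[idx][1:] = [x, y, tu]
```

The update touches only the hash table and one leaf entry, which is what makes frequent position updates cheap.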

Types of Update: Local & Non-local

[Figure: R-tree with root entries R1 and R2, entries R3 to R7, objects p1 to p13, and a query rectangle; updates to p5 and p10 are highlighted, with p10 drawn at two positions]

• Local updates
  – modify position coordinates
  – no structural modifications
• Non-local updates
  – move object to another leaf node => delete + insert
  – problem: concurrent query
• Solution
  – insert + logical delete (negative tu)
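The "insert + logical delete" idea can be sketched as below. The dict-based entry layout and the function names are hypothetical, and the sketch assumes update timestamps are strictly positive so that the sign of tu can carry the deletion mark.

```python
def logically_delete(entry):
    """Mark an entry as deleted by negating its update time (assumes tu > 0)."""
    entry["tu"] = -abs(entry["tu"])


def is_deleted(entry):
    return entry["tu"] < 0


def non_local_update(old_leaf, old_idx, new_leaf, oid, x, y, tu):
    """Move an object to another leaf without physically removing the old entry."""
    # Insert first, so the object is never absent from the tree.
    new_leaf.append({"oid": oid, "x": x, "y": y, "tu": tu})
    # Then mark the old entry; it stays in place for queries already running.
    logically_delete(old_leaf[old_idx])
    return len(new_leaf) - 1
```

Because the old entry keeps its position and magnitude of tu, a concurrent query that has already passed the new leaf can still report the object from the old one.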

Query vs. Update

[Figure: the R-tree (R1, R2; R3 to R7; p1 to p13) with a query in progress; p10 appears at two positions]

• Query retrieves:
  – old position if new is inserted in the already scanned leaf nodes
  – both => query chooses freshest
• Pold in hash-table: used by next update
  – delete logically marked entry
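When a query meets both the old and the new entry of the same object, picking the freshest one could be done as in this sketch. The tuple layout `(oid, x, y, tu)` and the function name are assumptions; |tu| is compared because logically deleted entries carry a negative tu.

```python
def freshest_per_object(scanned_entries):
    """Keep one (oid, x, y, tu) entry per oid, preferring the largest |tu|."""
    best = {}
    for oid, x, y, tu in scanned_entries:
        if oid not in best or abs(tu) > abs(best[oid][3]):
            best[oid] = (oid, x, y, tu)
    return list(best.values())
```

If only the logically deleted entry was scanned, it is still returned, which matches the case where the new position landed in an already scanned leaf.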


General Index Structure

Secondary index entry: oid, Pnew, idxnew, Pold, idxold
  – oid: object identifier
  – Pnew: pointer to new leaf node
  – idxnew: offset of entry in new leaf node
  – Pold: pointer to old leaf node
  – idxold: offset of entry in old leaf node
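One way to picture this record and its use by the next update is the sketch below. `IndexRecord` and `record_move` are hypothetical names, leaves are modelled as plain lists, and clearing a slot to `None` stands in for physically deleting the logically marked entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexRecord:
    p_new: list                    # leaf holding the current entry
    idx_new: int                   # offset of the current entry in p_new
    p_old: Optional[list] = None   # leaf holding the logically deleted entry, if any
    idx_old: Optional[int] = None  # offset of that entry in p_old


def record_move(index, oid, new_leaf, new_idx):
    """Update the record of oid after a non-local update placed it in new_leaf."""
    rec = index[oid]
    # The next update removes the entry left behind by the previous move.
    if rec.p_old is not None:
        rec.p_old[rec.idx_old] = None
    # The entry that was current until now becomes the old (logically deleted) one.
    rec.p_old, rec.idx_old = rec.p_new, rec.idx_new
    rec.p_new, rec.idx_new = new_leaf, new_idx
```

At most two entries per object exist at any time: the current one and the one kept for queries that may still be scanning.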

Overview of Update Process



Split-Supporting Index

• Algorithm is based on atomic operations and versioning of the items

• Latching is minimal

Node Split

[Figure, shown in three steps: inserting p14 triggers a node split in an R-tree with root entries R1, R2 and entries R3 to R8; new nodes (p3 p4 p5 p14; p1 p2 p6 p7) and new entries R9 and R10 appear alongside the existing nodes]

– exclusive latch between non-local updates
– marks logically deleted items

Local Updates

• Are allowed during splits and merges

[Figure: secondary index (columns oid, Pnew, idxnew, Pold, idxold; rows p1, p2, p3, …) pointing into leaf nodes N1, N2, N3 and their copies N1', N2', N3' of a tree with entries R1 to R5; unused pointers are Nil]

Non-Local Updates

• Are not allowed to modify items that are involved in a split

• Such updates are put into a priority queue and retried later
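Deferring and retrying blocked non-local updates could be sketched as follows. The slides do not say what the queue is ordered by; ordering by update timestamp is an assumption here, as are the class name `RetryQueue` and the `try_apply` callback.

```python
import heapq
import itertools


class RetryQueue:
    """Holds non-local updates that hit a node involved in a split."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker so payloads are never compared

    def defer(self, tu, update):
        # Priority = update timestamp (assumed), oldest first.
        heapq.heappush(self._heap, (tu, next(self._seq), update))

    def retry(self, try_apply):
        """Retry deferred updates in priority order; re-queue those still blocked."""
        still_blocked = []
        while self._heap:
            tu, seq, update = heapq.heappop(self._heap)
            if not try_apply(update):
                still_blocked.append((tu, seq, update))
        for item in still_blocked:
            heapq.heappush(self._heap, item)
        return len(still_blocked)
```

`try_apply` is expected to return False while the target node is still part of a split, so the update simply stays queued for the next round.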

Merge

Merge the underflow node with one of its sibling nodes. Two cases:

1. Sibling node has space for all entries
2. Sibling node would overflow after insertion

Sibling node has space for all entries

1. Sibling and underflow nodes are latched
2. New empty node is created
3. Entries from sibling and underflow nodes are copied into the new node
4. New node is introduced into structure by atomic swap of the pointers
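The four steps above can be sketched as follows. This is an illustration under assumptions: `threading.Lock` stands in for a latch, the parent's child list is modelled as an immutable tuple, and replacing that tuple in a single reference assignment stands in for the atomic pointer swap. The names `Leaf`, `Inner` and `merge` are hypothetical.

```python
import threading


class Leaf:
    def __init__(self, entries=()):
        self.entries = list(entries)
        self.latch = threading.Lock()


class Inner:
    def __init__(self, children):
        self.children = tuple(children)  # replaced wholesale, never mutated in place


def merge(parent, under, sib):
    """Merge the underflow node into a fresh node together with its sibling."""
    # 1. Latch both nodes, in a fixed order to avoid deadlock.
    first, second = sorted((under, sib), key=id)
    with first.latch, second.latch:
        # 2-3. Create a new node and copy the entries of both nodes into it.
        merged = Leaf(sib.entries + under.entries)
        # 4. Publish the new node with one reference assignment.
        parent.children = tuple(
            merged if c is sib else c for c in parent.children if c is not under
        )
    return merged
```

The old nodes are left untouched, so a query that already holds a reference to them keeps reading a consistent, if slightly stale, state.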

Sibling node would overflow

1. Split of the sibling node is performed
2. Split function accepts all entries from the underflow node instead of one entry
3. Entries are distributed between two new nodes
4. Two new nodes are introduced into structure in two atomic operations

Summary

• Advantages:
  – Local updates are permitted during node splits and merges
  – Queries can execute concurrently
• Disadvantages:
  – High complexity due to avoidance of locks
  – Creation of artificial updates



Second Approach – Main Idea

• Splits and merges:
  – Time consuming
  – Increase complexity
  – Artificial updates
• Goals:
  – Objects update only when they move
  – No splits and no merges

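Parameters

[Figure: nodes R1 and R2]

Logically Overfull Node

• Node is logically overfull: LO = 6
• Create new node
• Algorithm: choose cut value (cut_val)
  – persistent part
  – evacuating part
• Change nodes' states
• Store pointer to new node

[Figure: node containing p1 to p7, divided at cut_val into a persistent part and an evacuating part]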
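The slides do not state how the cut value is chosen. Purely as an illustration, the sketch below assumes a simple rule, the median coordinate on the axis with the widest extent, and assumes that the part at or above the cut is the evacuating one. Both choices, and the names `choose_cut` and `partition`, are hypothetical.

```python
def choose_cut(points):
    """Pick a cut axis and value (assumed rule: median on the widest axis)."""
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    axis = 0 if xs[-1] - xs[0] >= ys[-1] - ys[0] else 1
    coords = xs if axis == 0 else ys
    return axis, coords[len(coords) // 2]


def partition(points, axis, cut_val):
    """Divide the node's objects into the persistent and the evacuating part."""
    persistent = [p for p in points if p[axis] < cut_val]
    evacuating = [p for p in points if p[axis] >= cut_val]  # side chosen by assumption
    return persistent, evacuating
```

Nothing is moved at this point: objects in the evacuating part would relocate to the new node only when they send their next update.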

Node Structure

• split_ptr – pointer to newly created node
• state – represents a node's state: Normal, Evacuating, Populating or New
• cut – stores the axis by which the node was "divided"
• cutval – stores the value on that axis at which the node was divided
• ev_part – indicates the part that is evacuating
• iNeed – indicates a node's desire to attract or repel objects
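The node fields listed above could be modelled as in this sketch. The field encodings are assumptions: `cut` as an axis number (0 = x, 1 = y), `ev_part` as "low" or "high", and `i_need` as a signed integer. The helper `in_evacuating_part` is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    NEW = "New"
    NORMAL = "Normal"
    EVACUATING = "Evacuating"
    POPULATING = "Populating"


@dataclass
class Node:
    entries: list = field(default_factory=list)
    state: State = State.NEW
    split_ptr: Optional["Node"] = None  # pointer to newly created node
    cut: Optional[int] = None           # axis of the cut: 0 = x, 1 = y (assumed encoding)
    cutval: Optional[float] = None      # coordinate of the cut on that axis
    ev_part: Optional[str] = None       # "low" or "high" side evacuates (assumed encoding)
    i_need: int = 0                     # sign convention for attract/repel is assumed


def in_evacuating_part(node, point):
    """True if point lies in the part of the node that is being evacuated."""
    if node.state is not State.EVACUATING:
        return False
    on_low_side = point[node.cut] < node.cutval
    return on_low_side if node.ev_part == "low" else not on_low_side
```

A check like this is what an update would consult to decide whether an object stays in the node or follows `split_ptr` to the new one.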

State Diagram

[Figure: state diagram over the states New, Normal, Evacuating and Populating, from Creation (NR = 0) to Deletion. Edge labels: Insert(obj); Insert(obj) & NR = LU; Total Evacuation; LU+1 ≤ NR ≤ LO; NR ≤ PU; Delete(obj) & NR = 1; Delete(obj) & NR = PU+1; PU ≤ NR ≤ LU; NR ≤ LU; NR ≥ LO/2]

Find Node Heuristics

• Search parent node first
  – Sibling node in need of objects
• Top-down tree traversal based on:
  – MBR area enlargement
  – iNeed values
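Choosing a child during the top-down traversal might be sketched as below. MBR area enlargement is the usual R-tree criterion; how it is combined with iNeed is not given in the slides, so using iNeed as a tie-breaker (higher value = stronger need for objects) is an assumption, as are the function names and the `(mbr, i_need)` child layout.

```python
def area(mbr):
    xmin, ymin, xmax, ymax = mbr
    return (xmax - xmin) * (ymax - ymin)


def enlargement(mbr, point):
    """Extra area the MBR needs in order to cover point."""
    xmin, ymin, xmax, ymax = mbr
    x, y = point
    grown = (min(xmin, x), min(ymin, y), max(xmax, x), max(ymax, y))
    return area(grown) - area(mbr)


def choose_child(children, point):
    """children: list of (mbr, i_need) pairs. Returns the index of the chosen child."""
    return min(
        range(len(children)),
        # Least enlargement first; among ties, the child with the larger iNeed (assumed).
        key=lambda i: (enlargement(children[i][0], point), -children[i][1]),
    )
```

With two children whose MBRs already cover the point, the one that signals a need for objects is preferred.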

Local and Non-local

[Figure: tree with root entries R1, R2, R12 and entries R3 to R11; objects p1 to p5]

• p3: local update
• p5: non-local update to R3

Summary

• Advantages:
  – Algorithmic simplicity
  – No artificial updates
  – Novelty
• Disadvantages:
  – Setting heuristic parameters
  – Logical complexity

Related Work

• Logical and Physical Versioning in Main Memory Databases [Rastogi et al. 1997]
• Trees or Grids: Main-memory Indexing [Šidlauskas et al. 2009]
• High-Concurrency Locking in R-trees: R-link [Kornacker & Banks 1995]
• Existing concurrent approaches:
  – An Enhanced Concurrency Control Scheme for Multi-dimensional Index Structures [Song et al. 2004]
  – CGiST: Concurrency and Recovery in Generalized Search Trees [Kornacker et al. 1997]

Status of the Project

• Semantics of application domain
• Concurrent queries and updates
• An approach based on copying on demand:
  – Create minimal structure on the side
  – Integrate using atomic operations
• A new approach:
  – A tree structure with no splits or merges
  – Necessary heuristics to compensate

Conclusion

• Addresses concurrency issues when minimizing locking/latching

• Two approaches debated (one novel)

• Focus on concurrency while maintaining structure integrity

Future Work

• Next semester:
  – Implementation of second approach
  – Comparison with relevant existing approaches
• Additional work:
  – Implementation of the first approach
  – Comparison between the two


Feedback

• What parts of the presentation:
  – needed more focus?
  – were unnecessary?
  – were too detailed?
• Was the flow of the presentation natural?
• Any thoughts about our two presented methods?