High Concurrent R-tree Operations when Tracking Continuous Movement
Cezar Chitac, Robertas Kerpys, Raluca Marcuta
Motivation
• Need for tracking moving objects in real time:
  – concurrency
• Organize and access positional information
• Queries:
  – search query
  – range query
Overview
• Problem Formulation
• Queries vs. Updates
• First Approach: Split-Supporting Index
• Second Approach: Split-Free Index
• Related Work
• Project Status
• Conclusions
• Future Work
Problem Formulation
• Frequent updates
  – Efficient R-tree index structure
  – Concurrency between queries and updates
• Objectives
  – Query performance
  – Update performance
  – Data freshness
• Challenges
  – Structural modifications during concurrent tree operations: queries and updates
  – Avoid locks
Range Queries
• Objects send updates when moving δ units:
  – reported position
  – real position
[Figure: an object's last reported position and its real position at current time, with δ, the range, and the expanded range]
• Current time is in [ts, te), the time for which the results are returned
Semantics - may have been in range at time ts
• Updates used to construct the answer to a range query
  – freshest possible:
    – all that finish before ts
    – some that finish after ts
• Constructing the resulting set
  – roll time back or forward to ts
  – area where the object may have been at ts:
    • circle of radius min(vmax · |ts − tu|, δ), center (x, y)
    • intersect with the original unexpanded range
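The circle test above can be sketched in a few lines. This is only an illustration under assumptions: the range is taken to be an axis-aligned rectangle, and the function and parameter names are hypothetical, not from the slides.

```python
import math

def may_have_been_in_range(x, y, tu, ts, v_max, delta, rect):
    """Return True if an object last reported at (x, y) at time tu may have
    been inside rect = (xmin, ymin, xmax, ymax) at time ts.

    Assumption: the query range is an axis-aligned rectangle.
    """
    # Radius of the circle in which the object may have been at ts.
    radius = min(v_max * abs(ts - tu), delta)
    xmin, ymin, xmax, ymax = rect
    # Closest point of the (unexpanded) rectangle to the circle center.
    cx = min(max(x, xmin), xmax)
    cy = min(max(y, ymin), ymax)
    # The circle intersects the rectangle if that point is within the radius.
    return math.hypot(x - cx, y - cy) <= radius
```

An object reported 2 units outside the range qualifies once enough time has passed for it to cover that distance, but never if it is more than δ away.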
Background: Bottom-up Updates
• Efficient updates: no top-down traversal
• Secondary index on oid, with fields:
  – oid: object identifier
  – P: pointer to leaf node
  – idx: offset of entry in leaf node
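A minimal sketch of the secondary index, assuming a plain dictionary keyed by oid and list-based leaf nodes. The names `Leaf`, `insert` and `local_update` are hypothetical and only illustrate how an update reaches its entry without a top-down traversal.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    entries: list = field(default_factory=list)  # each entry: [oid, x, y]

# Secondary index: oid -> (P, idx) = (leaf node, offset of entry in leaf node)
secondary = {}

def insert(leaf, oid, x, y):
    """Append an entry to a leaf and record its location in the index."""
    leaf.entries.append([oid, x, y])
    secondary[oid] = (leaf, len(leaf.entries) - 1)

def local_update(oid, x, y):
    """Bottom-up update: jump straight to the entry and overwrite its position."""
    leaf, idx = secondary[oid]
    leaf.entries[idx][1:] = [x, y]
```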
Types of Update: Local & Non-local
[Figure: R-tree with nodes R1 to R7 over objects p1 to p13, a query region, and updates to p5 and p10]
• Local updates
  – modify position coordinates
  – no structural modifications
• Non-local updates
  – move object to another leaf node => delete + insert
  – problem: concurrent query
• Solution
  – insert + logical delete (negative tu)
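The "insert + logical delete" solution can be sketched as follows. It is a single-threaded illustration under assumptions: leaves are lists of dictionaries, timestamps are strictly positive so that negation is detectable, and all names are hypothetical. A concurrent implementation would need each write to be atomic.

```python
def non_local_update(old_leaf, old_idx, new_leaf, oid, x, y, tu):
    """Move an object to another leaf without a physical delete."""
    # 1. Insert the new entry first, so the object never disappears.
    new_leaf.append({"oid": oid, "x": x, "y": y, "tu": tu})
    # 2. Logically delete the old entry by making its timestamp negative.
    old_leaf[old_idx]["tu"] = -abs(old_leaf[old_idx]["tu"])
    return len(new_leaf) - 1  # offset of the new entry in its leaf

def is_logically_deleted(entry):
    """An entry with a negative tu is marked as deleted."""
    return entry["tu"] < 0
```

The old entry keeps its coordinates, so a query that reaches it can still read the old position.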
Query vs. Update
[Figure: the same R-tree during a query; p10 appears twice, at its old and its new position]
• Query retrieves:
  – the old position, if the new one is inserted into an already scanned leaf node
  – both positions => query chooses the freshest
• Pold in hash table: used by the next update
  – deletes the logically marked entry
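A sketch of how a query might pick the freshest entry when it sees an object more than once. It assumes that a logically deleted entry keeps the magnitude of its timestamp under a negative sign, so entries are compared by |tu|. The function name and entry layout are hypothetical.

```python
def freshest_per_object(scanned_entries):
    """Keep one entry per oid: the one with the largest |tu|."""
    best = {}
    for entry in scanned_entries:
        current = best.get(entry["oid"])
        if current is None or abs(entry["tu"]) > abs(current["tu"]):
            best[entry["oid"]] = entry
    return best
```

If only the old, logically deleted entry was scanned, it is the one returned, which matches the first case above.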
General Index Structure
• oid: object identifier
• Pnew: pointer to new leaf node
• idxnew: offset of entry in the new leaf node
• Pold: pointer to old leaf node
• idxold: offset of entry in the old leaf node
Overview of Update Process
Split-Supporting Index
• Algorithm is based on atomic operations and versioning of the items
• Latching is minimal
Node Split
[Figure, three steps: p14 is inserted into an R-tree with root entries R1 and R2 and leaf objects p1 to p13; new nodes R9 and R10 appear as the split proceeds]
• Exclusive latch between non-local updates
• Marks logically deleted items
Local Updates
• Are allowed during splits and merges
[Figure: secondary index (columns oid, Pnew, idxnew, Pold, idxold) with entries for p1, p2, p3, …, shown while leaf nodes N1, N2, N3 are being replaced by copies N1’, N2’, N3’]
Non-Local Updates
• Are not allowed to change items that are involved in a split
• Updates are put into a priority queue and retried later
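A sketch of the queue-and-retry behaviour. The slides do not say what the priority is, so the `priority` argument here is an assumption, as are all names and the representation of updates as dictionaries with a target `node`.

```python
import heapq
import itertools

retry_queue = []            # heap of (priority, seq, update)
_seq = itertools.count()    # tie-breaker: FIFO order within one priority

def apply_non_local(update, nodes_in_split, apply, priority=0):
    """Apply the update, or queue it if its target node is involved in a split."""
    if update["node"] in nodes_in_split:
        heapq.heappush(retry_queue, (priority, next(_seq), update))
        return False
    apply(update)
    return True

def retry_pending(nodes_in_split, apply):
    """Retry every queued update once; still-blocked ones are queued again."""
    pending = []
    while retry_queue:
        pending.append(heapq.heappop(retry_queue))
    for priority, _, update in pending:
        apply_non_local(update, nodes_in_split, apply, priority)
```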
Merge
Merge the underflowing node with one of its sibling nodes. Two cases:
1. The sibling node has space for all entries
2. The sibling node would overflow after insertion
Sibling node has space for all entries
1. Sibling and underflow nodes are latched
2. New empty node is created
3. Entries from sibling and underflow nodes are copied into the new node
4. New node is introduced into structure by atomic swap of the pointers
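The four steps can be sketched as below. This is an illustration under assumptions, not the authors' implementation: `Node` and `Parent` are hypothetical classes, and rebinding `parent.children` to a freshly built tuple stands in for the atomic pointer swap (a lock-free implementation would use a compare-and-swap instead).

```python
import threading

class Node:
    def __init__(self, entries=()):
        self.entries = list(entries)
        self.latch = threading.Lock()

class Parent:
    def __init__(self, children):
        # Readers always see one complete, immutable snapshot of the children.
        self.children = tuple(children)

def merge(parent, sibling, underflow):
    """Merge two leaves into a new node built on the side."""
    # 1. Latch the sibling and the underflow node.
    with sibling.latch, underflow.latch:
        # 2. Create a new empty node.
        merged = Node()
        # 3. Copy the entries of both nodes into the new node.
        merged.entries = sibling.entries + underflow.entries
        # 4. Introduce the new node with a single reference assignment.
        parent.children = tuple(
            c for c in parent.children if c is not sibling and c is not underflow
        ) + (merged,)
    return merged
```

The old nodes are left untouched, so a query that still holds the old snapshot reads consistent data.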
Sibling node would overflow
1. A split of the sibling node is performed
2. The split function accepts all entries from the underflow node instead of one entry
3. Entries are distributed between two new nodes
4. The two new nodes are introduced into the structure in two atomic operations
Summary
• Advantages:
  – Local updates are permitted during node splits and merges
  – Queries can execute concurrently
• Disadvantages:
  – High complexity due to avoidance of locks
  – Creation of artificial updates
Second Approach – Main Idea
• Splits and merges:
  – Time consuming
  – Increase complexity
  – Artificial updates
• Goals:
  – Objects update only when they move
  – No splits and no merges
Parameters
[Figure: nodes R1 and R2]
Logically Overfull Node
[Figure: node holding p1 to p7, divided at cut_val into a persistent part and an evacuating part]
• Node is logically overfull: LO = 6
• Create new node
• Algorithm: choose cut value
• Change nodes’ states
• Store pointer to new node
Node Structure
• split_ptr – pointer to newly created node
• state – represents a node’s state: Normal, Evacuating, Populating or New
• cut – stores the axis by which the node was “divided”
• cutval – stores the cut value on that axis
• ev_part – indicates the part that is evacuating
• iNeed – indicates a node’s desire to attract or repel objects
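The node fields could be represented roughly as follows. The field names come from the slide; the encodings (axis as 0 or 1, `ev_part` as "low" or "high", `iNeed` as a signed integer) and the helper `in_evacuating_part` are assumptions made only for this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class State(Enum):
    NEW = "New"
    NORMAL = "Normal"
    EVACUATING = "Evacuating"
    POPULATING = "Populating"

@dataclass
class Node:
    entries: list = field(default_factory=list)
    split_ptr: Optional["Node"] = None   # pointer to newly created node
    state: State = State.NEW
    cut: Optional[int] = None            # axis of the cut: 0 = x, 1 = y (assumed)
    cutval: Optional[float] = None       # cut value on that axis
    ev_part: Optional[str] = None        # "low" or "high" side evacuates (assumed)
    iNeed: int = 0                       # > 0 attract, < 0 repel (assumed)

def in_evacuating_part(node, point):
    """True if point lies in the part of an Evacuating node that is leaving."""
    if node.state is not State.EVACUATING:
        return False
    low = point[node.cut] < node.cutval
    return low if node.ev_part == "low" else not low
```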
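State Diagram
[Figure: node state diagram. States: New, Normal, Evacuating, Populating. A node is created with NR = 0 and ends with Deletion. Transition and state labels: Insert(obj); Insert(obj) & NR = LU; Total Evacuation; Delete(obj) & NR = 1; Delete(obj) & NR = PU+1; LU+1 ≤ NR ≤ LO; NR ≤ PU; PU ≤ NR ≤ LU; NR ≤ LU; NR ≥ LO/2]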
Find Node Heuristics
• Search parent node first
  – Sibling node in need of objects
• Top-down tree traversal based on:
  – MBR area enlargement
  – iNeed values
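A sketch of one traversal step. The area-enlargement measure is the usual R-tree one; how it is combined with iNeed is not stated on the slide, so using iNeed as a tie-breaker is an assumption, as are the function names and the tuple layout of a child.

```python
def area(mbr):
    xmin, ymin, xmax, ymax = mbr
    return (xmax - xmin) * (ymax - ymin)

def enlargement(mbr, x, y):
    """Extra area needed for mbr to cover the point (x, y)."""
    xmin, ymin, xmax, ymax = mbr
    grown = (min(xmin, x), min(ymin, y), max(xmax, x), max(ymax, y))
    return area(grown) - area(mbr)

def choose_child(children, x, y):
    """children: list of (name, mbr, iNeed).

    Prefer the least MBR area enlargement; among equals, prefer the
    child with the highest iNeed (assumed tie-breaking rule).
    """
    return min(children, key=lambda c: (enlargement(c[1], x, y), -c[2]))
```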
Local and Non-local
[Figure: tree with root entries R1, R2, R12, internal entries R3 to R11, and objects p1 to p5]
• p3: local update
• p5: non-local update to R3
Summary
• Advantages:
  – Algorithmic simplicity
  – No artificial updates
  – Novelty
• Disadvantages:
  – Setting heuristic parameters
  – Logical complexity
Related Work
• Logical and Physical Versioning in Main Memory Databases [Rastogi et al. 1997]
• Trees or Grids: Main-memory Indexing [Šidlauskas et al. 2009]
• High-Concurrency Locking in R-trees: R-link [Kornacker & Banks 1995]
• Existing concurrent approaches:
  – An Enhanced Concurrency Control Scheme for Multi-dimensional Index Structures [Song et al. 2004]
  – CGiST: Concurrency and Recovery in Generalized Search Trees [Kornacker et al. 1997]
Status of the Project
• Semantics of application domain
• Concurrent queries and updates
• An approach based on copying on demand:
  – Create minimal structure on the side
  – Integrate using atomic operations
• A new approach:
  – A tree structure with no splits or merges
  – Necessary heuristics to compensate
Conclusion
• Addresses concurrency issues while minimizing locking/latching
• Two approaches debated (one novel)
• Focus on concurrency while maintaining structure integrity
Future Work
• Next semester:
  – Implementation of second approach
  – Comparison with relevant existing approaches
• Additional work:
  – Implementation of the first approach
  – Comparison between the two
Feedback
• What parts of the presentation:
  – needed more focus?
  – were unnecessary?
  – were too detailed?
• Was the flow of the presentation natural?
• Any thoughts about our two presented methods?