Update-Pattern-Aware Modeling and Processing of Continuous Queries


Update-Pattern-Aware Modeling and Processing of Continuous Queries. Lukasz Golab, University of Waterloo, Canada. [email protected]. Joint work with M. Tamer Özsu.

Transcript of Update-Pattern-Aware Modeling and Processing of Continuous Queries

Page 1: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Update-Pattern-Aware Modeling and Processing of Continuous Queries

Lukasz Golab
University of Waterloo, Canada
[email protected]

Joint work with M. Tamer Özsu

Page 2: Update-Pattern-Aware Modeling and Processing of Continuous Queries

SIGMOD, June 2005 2 of 21 Lukasz Golab

Introduction

Relational algebra and queries
  Each operator consumes one or more relation instances and outputs a relation instance
Blocking computations
  Some operators have non-blocking variants: aggregation, join

Page 3: Update-Pattern-Aware Modeling and Processing of Continuous Queries

What is a continuous query?

Expression composed of non-blocking "relational" operators that operate on streams
  Streams may be bounded by sliding windows
Q(t) = answer of a continuous query Q at time t
     = output of the corresponding one-time relational query Q' whose inputs are the current states of the streams, windows, and tables referenced in Q
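
The definition of Q(t) can be illustrated with a minimal sketch. It assumes a time-based window covering the interval (t - window_size, t] and uses SUM as the one-time query Q'. The names `window_state`, `continuous_answer`, and `sum_query` are hypothetical and do not come from the talk.

```python
from typing import Callable, List, Tuple

# A stream tuple is (timestamp, value); this layout is an assumption.
StreamTuple = Tuple[int, float]


def window_state(stream: List[StreamTuple], t: int, window_size: int) -> List[StreamTuple]:
    """Tuples that arrived by time t and are still inside (t - window_size, t]."""
    return [(ts, v) for ts, v in stream if t - window_size < ts <= t]


def continuous_answer(one_time_query: Callable[[List[StreamTuple]], float],
                      stream: List[StreamTuple], t: int, window_size: int) -> float:
    """Q(t): run the one-time query Q' over the window's current state."""
    return one_time_query(window_state(stream, t, window_size))


def sum_query(state: List[StreamTuple]) -> float:
    """The one-time query Q': SUM over the value attribute."""
    return sum(v for _, v in state)


stream = [(1, 10.0), (2, 5.0), (4, 7.0), (6, 1.0)]
answers = {t: continuous_answer(sum_query, stream, t, window_size=3) for t in (2, 4, 6)}
```

Recomputing Q' from scratch at every t is only meant to pin down the semantics; it says nothing about how an engine would evaluate the query incrementally.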

Page 4: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Example of a continuous query

[Figure: a SUM operator with its inputs and output]

Page 5: Update-Pattern-Aware Modeling and Processing of Continuous Queries

What is an update pattern?

Update pattern does not refer to individual tuples
  stream = append-only
Update pattern refers to changes in the answer of a continuous query (insertions/deletions)
Deletions? Aren't streams append-only?
  Queries over an append-only database don't necessarily produce append-only output

Page 6: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Non-append-only output

Select stocks whose price this hour is greater than their price in the previous hour
  Company X  8am   $1.00
  Company X  9am   $1.50
  Company X  10am  $1.25
  Update pattern?
Select all stock prices reported in the last 5 minutes
  [Figure: a window sliding over tuples 1 to 12]
  FIFO update pattern
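
The hourly query can be traced on the Company X prices with a small sketch. The function name `rising_stocks` and the encoding of hours as integers are assumptions made for illustration.

```python
# Hourly prices from the slide; hours are written as integers (8 means 8am).
quotes = [("Company X", 8, 1.00), ("Company X", 9, 1.50), ("Company X", 10, 1.25)]


def rising_stocks(quotes, hour):
    """Q(hour): companies whose price at `hour` exceeds their price at `hour - 1`."""
    price = {(name, h): p for name, h, p in quotes}
    return {
        name
        for (name, h), p in price.items()
        if h == hour and (name, hour - 1) in price and p > price[(name, hour - 1)]
    }


answer_9am = rising_stocks(quotes, 9)    # Company X qualifies: 1.50 > 1.00
answer_10am = rising_stocks(quotes, 10)  # Company X drops out: 1.25 < 1.50
deleted = answer_9am - answer_10am       # results removed from the answer
```

The input only ever grows, yet Company X is deleted from the answer at 10am. That deletion is triggered by the arriving data, not by the age of the result.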

Page 7: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Monotonic queries

Query Q is monotonic (over an append-only database) if Q(t) ⊆ Q(t') for all t ≤ t'
Queries over sliding windows are non-monotonic because all of their results eventually expire as the windows slide forward
Some queries are non-monotonic over an append-only database (stream)
  Stock quotes whose price is higher than last hour
But others become non-monotonic due to windowing
  Select all stock quotes: monotonic
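
The subset condition can be checked mechanically on a sequence of answers observed at increasing times. This is a sketch under the assumption that each answer is represented as a Python set; `is_monotonic` and the sample data are hypothetical.

```python
def is_monotonic(answers):
    """True if Q(t) is a subset of Q(t') for every t <= t' in the observed sequence."""
    # Subset is transitive, so comparing consecutive answers is sufficient.
    return all(earlier <= later for earlier, later in zip(answers, answers[1:]))


# "Select all" over an infinite stream: the answer only grows.
select_all_infinite = [{"a"}, {"a", "b"}, {"a", "b", "c"}]

# "Select all" over a window spanning two time units, one arrival per time unit:
# "a" expires when "c" arrives.
select_all_windowed = [{"a"}, {"a", "b"}, {"b", "c"}]
```

A finite trace can only refute monotonicity; it cannot prove it for all future inputs.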

Page 8: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Problem definition

Motivation
  Two possible reasons for non-monotonic behaviour of continuous queries
Problem statement
  Divide non-monotonic queries into classes
  Analyze the update patterns of each class
  Use the knowledge of update patterns in query processing and optimization

Page 9: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Outline

Update patterns of sliding window queries
  Classification
Advantages of update-pattern awareness
  Modeling (query semantics)
  Processing (query execution)

Page 10: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Sliding window operators

When a tuple falls out of its window, it also expires from the output and from operator state

[Figure: a DISTINCT operator over a window of x, y, and z tuples with the oldest tuple marked, and a second operator over streams S1 and S2; labels: oldest, undo]

Page 11: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Calculating expiration times

Time-based windows: predictable expiration times
  Assign a timestamp, ts, upon arrival
  Expiration time = ts + window_size
    FIFO
  For joins: min(expiration times of the joined tuples)
    Predictable, but is it still FIFO?
Count-based windows, non-monotonic queries over infinite streams: unpredictable
  Expiration time depends on stream arrival rates or the data arriving on the stream
    need negative tuples
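
The two expiration rules for time-based windows can be worked through on a small case. The sketch below assumes both inputs use a window of 10 time units; the `Tup` record and the helper names `arrive` and `join` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tup:
    value: str
    ts: int   # time at which the tuple entered the system or was generated
    exp: int  # time at which it expires


def arrive(value, ts, window_size):
    """Base tuple: expiration time = ts + window_size."""
    return Tup(value, ts, ts + window_size)


def join(left, right, now):
    """Join result: it expires as soon as the first of its inputs expires."""
    return Tup(left.value + right.value, now, min(left.exp, right.exp))


a = arrive("a", ts=1, window_size=10)  # exp = 11
b = arrive("b", ts=5, window_size=10)  # exp = 15
c = arrive("c", ts=6, window_size=10)  # exp = 16
d = arrive("d", ts=7, window_size=10)  # exp = 17

r1 = join(b, c, now=6)  # exp = min(15, 16) = 15
r2 = join(a, d, now=7)  # exp = min(11, 17) = 11
```

Here r2 is generated after r1 but expires before it, so the join output has predictable expiration times without expiring in FIFO order.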

Page 12: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Classification of update patterns

Monotonic: answers never expire
  selection, join, duplicate elimination, over infinite streams
Weakest non-monotonic: answers expire in FIFO order, negative tuples are not necessary
  operators over time-based windows that don't reorder incoming tuples during processing
Weak non-monotonic: order is not FIFO, but negative tuples are not needed
  Time-based window join, duplicate elimination
Strict non-monotonic: unpredictable expiration order
  negation, queries over count-based windows
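
One way to make the four classes concrete is a small lookup that covers only the operator examples listed above. This is a sketch: the string labels, the `classify` function, and the choice of selection as the example of an operator that does not reorder tuples are all assumptions.

```python
from enum import IntEnum


class UpdatePattern(IntEnum):
    MONOTONIC = 0  # answers never expire
    WEAKEST = 1    # FIFO expiration, no negative tuples
    WEAK = 2       # non-FIFO expiration, no negative tuples
    STRICT = 3     # unpredictable expiration, negative tuples required


def classify(operator, input_kind):
    """Update pattern of one operator's output, for the cases named on the slide."""
    if operator == "negation" or input_kind == "count-based window":
        return UpdatePattern.STRICT
    if operator not in {"selection", "join", "duplicate elimination"}:
        raise ValueError(f"operator not covered by this sketch: {operator}")
    if input_kind == "infinite stream":
        return UpdatePattern.MONOTONIC
    if input_kind == "time-based window":
        # Selection is assumed here to be an operator that does not reorder tuples.
        return UpdatePattern.WEAKEST if operator == "selection" else UpdatePattern.WEAK
    raise ValueError(f"input kind not covered by this sketch: {input_kind}")


def needs_negative_tuples(pattern):
    return pattern is UpdatePattern.STRICT
```

For instance, `classify("join", "time-based window")` yields `UpdatePattern.WEAK`, and only the strict class reports that negative tuples are needed.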

Page 13: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Outline

Update patterns of sliding window queries
  Classification
Advantages of update-pattern awareness
  Modeling (query semantics)
  Processing (query execution)

Page 14: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Update-pattern-aware semantics of continuous queries

How are updates of relational tables different from insertions and deletions caused by the movement of the windows?
  Join of two infinite streams is monotonic
  Join of two windows is weak non-monotonic
  Join of a window and a table: weakest (easier), weak (same), or strict non-monotonic (harder)?

Page 15: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Update-pattern-aware modeling of continuous queries, cont.

Harder: allow arbitrary table updates
  Strict non-monotonic because we can't predict when and how the table will be changed
Easier: don't allow retroactive updates
  Non-retroactive relation (NRR): table updates don't affect previously arrived stream tuples
  Weakest non-monotonic

Page 16: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Example

Stream: stock quotes
Table: mapping between stock symbols and company names
Query: select company name and price over a (time-based) window
  Company goes bankrupt: delete its previous quotes (relation) or not (NRR)
  Company changes name: update the name in previous quotes (relation) or not (NRR)
  New company: no prior stock quotes
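
The name-change case can be traced with a short sketch that contrasts the two semantics. The symbol "XYZ", the company names, and the helpers `join_retroactive` and `NrrJoin` are made up for illustration, and window expiration is left out to keep the example small.

```python
def join_retroactive(window, table):
    """Ordinary relation: the answer reflects the table's current state for all quotes."""
    return [(table[sym], price) for sym, price in window if sym in table]


class NrrJoin:
    """Non-retroactive relation: each quote is joined once, on arrival, and never revised."""

    def __init__(self, table):
        self.table = table
        self.results = []

    def on_quote(self, sym, price):
        if sym in self.table:
            self.results.append((self.table[sym], price))


table = {"XYZ": "Old Name Inc."}
window = []
nrr = NrrJoin(table)

window.append(("XYZ", 1.00))
nrr.on_quote("XYZ", 1.00)

table["XYZ"] = "New Name Inc."  # the company changes its name

window.append(("XYZ", 1.50))
nrr.on_quote("XYZ", 1.50)

relation_answer = join_retroactive(window, table)
nrr_answer = nrr.results
```

Under relation semantics the earlier quote is rewritten to carry the new name, which is an update nobody could have scheduled in advance. Under NRR semantics the earlier result is left alone and simply ages out with the window.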

Page 17: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Update-pattern-aware query processing

Annotate query plan with update patterns of each sub-query
Use appropriate data structures for storing state
Use appropriate physical operators

[Figure: a state buffer partitioned by expiration time, with Insert and Delete marked; DISTINCT shown once with strict non-monotonic input and once with weakest or weak non-monotonic input]
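
The idea of matching the state structure to the update pattern can be sketched as follows. This is an illustration only, not the structures used in the authors' system: a plain queue for FIFO (weakest) input, and buckets keyed by expiration time for weak input. The class names and the `partition_width` parameter are assumptions.

```python
from collections import defaultdict, deque


class FifoState:
    """Weakest non-monotonic input: tuples expire in arrival order, so a queue suffices."""

    def __init__(self):
        self.queue = deque()

    def insert(self, value, exp):
        self.queue.append((exp, value))

    def expire(self, now):
        expired = []
        while self.queue and self.queue[0][0] <= now:
            expired.append(self.queue.popleft()[1])
        return expired


class PartitionedState:
    """Weak non-monotonic input: expiration times are known but not in arrival order."""

    def __init__(self, partition_width):
        self.width = partition_width
        self.partitions = defaultdict(list)  # bucket number -> [(exp, value), ...]

    def insert(self, value, exp):
        self.partitions[exp // self.width].append((exp, value))

    def expire(self, now):
        expired = []
        # Only buckets whose time range has started can hold expired tuples.
        due = sorted(k for k in self.partitions if k <= now // self.width)
        for key in due:
            bucket = self.partitions.pop(key)
            expired += [v for e, v in bucket if e <= now]
            remaining = [(e, v) for e, v in bucket if e > now]
            if remaining:
                self.partitions[key] = remaining
        return expired


state = PartitionedState(partition_width=10)
state.insert("r1", exp=15)
state.insert("r2", exp=11)
state.insert("r3", exp=27)
first = state.expire(now=12)
second = state.expire(now=20)
```

Neither structure needs negative tuples, because every tuple carries its own expiration time. Strict non-monotonic input would additionally require locating and deleting arbitrary tuples on demand.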

Page 18: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Update-pattern-aware query optimization

Cost model
  Per-unit-time cost of executing operators, maintaining state, and processing negative tuples
Update-pattern-aware heuristic
  Strict NM pull-up, weakest NM push-down
    operator and state implementations are simpler with weakest and weak NM

Page 19: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Update-pattern-aware query optimization, cont.

[Figure: two alternative query plans over Stream 1, Stream 2, and Stream 3, with edges labeled WKS, WK, and STR]

Page 20: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Summary

Monotonic vs. non-monotonic classification is not precise enough
  Fails to distinguish between predictable (due to windowing) and unpredictable update patterns
Our update-pattern classification
  Clarifies the semantics of continuous queries that reference tables alongside streams/windows
  Forms the basis of our update-pattern-aware query processor

Page 21: Update-Pattern-Aware Modeling and Processing of Continuous Queries

Future work

Extend update-pattern-aware query optimization
Investigate the update patterns of periodically re-executed queries
Sub-divide queries over count-based windows
  For now, strict non-monotonic