Continuous Queries

Scalable Continuous Query System for Internet Databases. Jianjun Chen et al., Computer Sciences Dept., University of Wisconsin-Madison. SIGMOD 2000. Presented by Mukund Agrawal.



Page 1: Continuous Queries

NiagaraCQ: A Scalable Continuous Query System for Internet Databases

Jianjun Chen et al.

Computer Sciences Dept. University of Wisconsin-Madison

SIGMOD 2000 Presented by

Mukund Agrawal

Page 2: Continuous Queries

Continuous Queries

A triple (Q, A, Stop)
Scope also includes future data
Example
  Inform me when there is a new publication related to multi-query optimization
A broad classification
  Change based
  Timer based

Page 3: Continuous Queries

NiagaraCQ

A CQ system for the Internet
Continuous queries on XML data sets
Scalable CQ processing
Incremental group optimization
Handles both change-based and timer-based queries in a uniform way

Page 4: Continuous Queries

Outline

General strategy of incremental group optimization
Query split with materialized intermediate files
Incremental grouping of selection and join operators
System architecture
Experimental results

Page 5: Continuous Queries

NiagaraCQ command language

Creating a CQ

  Create CQ_name
  XML-QL query
  Do action
  {START start_time} {EVERY time_interval} {EXPIRE expiration_time}

Deleting a CQ

Delete CQ_name
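The Create command above can be pictured as a small record. The following is a minimal sketch only: the class name, the field names, and the rule that a query without an EVERY interval is treated as change based are assumptions made for illustration, not NiagaraCQ's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterator, Optional


@dataclass
class ContinuousQuery:
    """Hypothetical in-memory form of a Create command."""
    name: str                          # CQ_name
    query: str                         # XML-QL query text
    action: str                        # what to do when the query fires
    start: Optional[datetime] = None   # START start_time
    every: Optional[timedelta] = None  # EVERY time_interval
    expire: Optional[datetime] = None  # EXPIRE expiration_time

    def is_timer_based(self) -> bool:
        # Assumption: no positive interval means the query is change based.
        return self.every is not None and self.every > timedelta(0)

    def firing_times(self) -> Iterator[datetime]:
        """Yield firing times from start to expire (both inclusive here)."""
        if not self.is_timer_based() or self.start is None or self.expire is None:
            return
        t = self.start
        while t <= self.expire:
            yield t
            t += self.every
```

A query created with START 09:00, EVERY one hour and EXPIRE 11:00 would, under these assumptions, fire at 09:00, 10:00 and 11:00.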

Page 6: Continuous Queries

Incremental group optimization

General Strategy

Why can’t we regroup all queries when a new query is added?

Use of expression signatures for grouping

  Same syntax structure
  Different constant values

Page 7: Continuous Queries

Expression Signature

Query examples

Where <Quotes><Quote><Symbol>INTC</></></> element_as $g in "http://www.stock.com/quotes.xml"
construct $g

Where <Quotes><Quote><Symbol>MSFT</></></> element_as $g in "http://www.stock.com/quotes.xml"
construct $g

Expression signatures

  Quotes.Quote.Symbol in quotes.xml = constant
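One way to read the signature idea: drop the constant from a selection predicate and keep everything else. The sketch below assumes a selection is described by a path, an operator, a constant and a source file; these tuple types are illustrative only and are not how NiagaraCQ represents query plans.

```python
from typing import NamedTuple


class Selection(NamedTuple):
    """Hypothetical description of one selection predicate."""
    path: str      # e.g. "Quotes.Quote.Symbol"
    op: str        # e.g. "="
    constant: str  # e.g. "INTC"
    source: str    # e.g. "quotes.xml"


class Signature(NamedTuple):
    """The same predicate with the constant abstracted away."""
    path: str
    op: str
    source: str


def signature_of(sel: Selection) -> Signature:
    # Two predicates share a signature when they differ only in the constant.
    return Signature(sel.path, sel.op, sel.source)


intc = Selection("Quotes.Quote.Symbol", "=", "INTC", "quotes.xml")
msft = Selection("Quotes.Quote.Symbol", "=", "MSFT", "quotes.xml")
same_group = signature_of(intc) == signature_of(msft)
```

Because the signature is a plain hashable tuple, it can serve directly as a dictionary key when looking up an existing group.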

Page 8: Continuous Queries

Query plans

Plan for the INTC query: File Scan (quotes.xml) -> Select Symbol = "INTC" -> Trigger Action I
Plan for the MSFT query: File Scan (quotes.xml) -> Select Symbol = "MSFT" -> Trigger Action J

Page 9: Continuous Queries

Group

Group signature
  Common signature of all queries in the group

Group constant table

  Constant_value   Dest_buffer
  INTC             Dest. I
  MSFT             Dest. J
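The constant table can be thought of as a lookup from constant value to destination buffers, so that one scan of the data serves every query in the group. This is a rough in-memory sketch; the class name, the dictionary-based rows and the method names are assumptions for illustration.

```python
from collections import defaultdict


class SelectionGroup:
    """Hypothetical group for predicates of the form <attribute> = constant."""

    def __init__(self, attribute):
        self.attribute = attribute
        # Group constant table: constant value -> destination buffer names.
        self.constant_table = defaultdict(list)

    def add_query(self, constant, dest_buffer):
        self.constant_table[constant].append(dest_buffer)

    def route(self, rows):
        """Scan the rows once and copy each row to every matching buffer."""
        buffers = defaultdict(list)
        for row in rows:
            # .get() does not create entries for constants nobody asked for.
            for dest in self.constant_table.get(row[self.attribute], ()):
                buffers[dest].append(row)
        return dict(buffers)


group = SelectionGroup("Symbol")
group.add_query("INTC", "Dest. I")
group.add_query("MSFT", "Dest. J")
```

Calling `group.route(...)` on a list of quote rows returns one list per destination buffer; rows whose symbol is not in the table are dropped.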

Page 10: Continuous Queries

The group plan

Page 11: Continuous Queries

Incremental Grouping Algo

When a new query is submitted
  If the expression signature of the new query matches that of existing groups
    Break the query plan into two parts
    Remove the lower part
    Add the upper part onto the group plan
  else create a new group
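The steps above can be sketched as a dictionary lookup keyed by signature. In this simplified version, which is an assumption and not the system's code, "removing the lower part" amounts to not creating a second scan and select, and "adding the upper part" amounts to registering the query's destination buffer in the group's constant table.

```python
def submit(groups, signature, constant, dest_buffer):
    """Add one query to `groups`; return True if a new group was created.

    groups:      dict mapping signature -> constant table
                 (constant table: dict mapping constant -> list of buffers)
    signature:   hashable signature of the query's lower part
    constant:    the constant that was abstracted out of the predicate
    dest_buffer: where the query's upper part reads its input from
    """
    created = signature not in groups
    constant_table = groups.setdefault(signature, {})
    # Existing groups are never rebuilt; the new query is only appended.
    constant_table.setdefault(constant, []).append(dest_buffer)
    return created


groups = {}
symbol_eq = ("Quotes.Quote.Symbol", "=", "quotes.xml")
first = submit(groups, symbol_eq, "INTC", "Dest. I")   # creates the group
second = submit(groups, symbol_eq, "MSFT", "Dest. J")  # joins the group
```

The cost of adding a query is one lookup plus one append, which is why no regrouping of existing queries is needed.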

Page 12: Continuous Queries

Query split with materialized intermediate files

Why not use a pipeline scheme?
  Split operator may block simple queries
  Gives a single complicated execution plan
  A large portion of query plan may not need to be executed at each invocation
  Does not work for grouping timer-based queries

Using intermediate files
  Cut query plan into 2 parts at split operator
  Add a file scan operator to upper part to read intermediate file
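A rough sketch of the split, assuming JSON-lines files stand in for intermediate files and a Symbol equality group is the shared lower part. The file layout and the function names are hypothetical; the point is only that the upper part of each query becomes an ordinary file scan over its own intermediate file.

```python
import json
import os
import tempfile


def run_lower_part(rows, constant_table, out_dir):
    """Shared lower part: scan once, materialize one file per destination."""
    written = {}
    for row in rows:
        for dest in constant_table.get(row["Symbol"], []):
            path = os.path.join(out_dir, dest + ".jsonl")
            with open(path, "a", encoding="utf-8") as f:
                f.write(json.dumps(row) + "\n")
            written[dest] = path
    return written


def file_scan(path):
    """Upper part of a query: read its intermediate file like any data file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


def demo():
    quotes = [
        {"Symbol": "INTC", "Price": 98},
        {"Symbol": "MSFT", "Price": 105},
        {"Symbol": "IBM", "Price": 120},
    ]
    table = {"INTC": ["dest_i"], "MSFT": ["dest_j"]}
    with tempfile.TemporaryDirectory() as out_dir:
        files = run_lower_part(quotes, table, out_dir)
        # Only queries whose intermediate file was written need to run.
        return {dest: file_scan(path) for dest, path in files.items()}
```

In this sketch a query whose constant matched nothing has no file written, so its upper part is simply not invoked.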

Page 13: Continuous Queries

The query split scheme

Page 14: Continuous Queries

Trade-offs

Other advantages of materialized intermediate files
  Only the necessary queries are executed
  Uniform handling of intermediate files and original data source files

Disadvantages
  Split operator becomes a blocking operator
  Extra disk I/Os

Page 15: Continuous Queries

Incremental grouping of selection predicates

Multiple selection predicates in a query
  CNF for predicates on same data source

Incremental grouping
  Choose the most selective conjunct

Evaluation of other predicates
  Upper levels of continuous query

Example query

Where <Quotes><Quote><Symbol>"INTC"</><Current_Price>$p</></> element_as $g </> in "quotes.xml", $p < 100
Construct $g
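For the example query, the choice of grouping conjunct might look like the sketch below. The selectivity numbers are invented for illustration, and the `Conjunct` type and function names are assumptions; the source does not say how selectivity is estimated.

```python
from typing import Callable, NamedTuple


class Conjunct(NamedTuple):
    text: str
    test: Callable[[dict], bool]
    selectivity: float  # estimated fraction of rows that pass (made-up values)


def pick_grouping_conjunct(conjuncts):
    """Return (most selective conjunct, remaining conjuncts)."""
    best = min(conjuncts, key=lambda c: c.selectivity)
    rest = [c for c in conjuncts if c is not best]
    return best, rest


def evaluate(rows, best, rest):
    # The grouped conjunct runs in the shared group plan ...
    grouped = [r for r in rows if best.test(r)]
    # ... and the other conjuncts run in the upper levels of the query.
    return [r for r in grouped if all(c.test(r) for c in rest)]


conjuncts = [
    Conjunct("Current_Price < 100", lambda r: r["Current_Price"] < 100, 0.50),
    Conjunct('Symbol = "INTC"', lambda r: r["Symbol"] == "INTC", 0.01),
]
best, rest = pick_grouping_conjunct(conjuncts)
```

With these made-up estimates the Symbol predicate is used for grouping and the price comparison is left to the upper part of the query.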

Page 16: Continuous Queries

Range-query groups

Problem
  Intermediate files may contain duplicate tuples

Solution: Virtual intermediate files
  Virtual intermediate file stores value ranges
  One real intermediate file has a clustered index
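A small sketch of the idea, assuming a sorted in-memory list plays the role of the clustered index and that ranges are open intervals. The class names and the Change_Ratio attribute used in the example are illustrative assumptions.

```python
import bisect


class RealIntermediateFile:
    """Stores each tuple once, ordered on the range attribute."""

    def __init__(self, rows, attr):
        self.rows = sorted(rows, key=lambda r: r[attr])
        self.keys = [r[attr] for r in self.rows]  # stand-in for the index

    def scan_range(self, low, high):
        """Return rows with low < value < high (open interval, an assumption)."""
        lo = bisect.bisect_right(self.keys, low)
        hi = bisect.bisect_left(self.keys, high)
        return self.rows[lo:hi]


class VirtualIntermediateFile:
    """Holds only a value range; the tuples stay in the real file."""

    def __init__(self, real, low, high):
        self.real, self.low, self.high = real, low, high

    def scan(self):
        return self.real.scan_range(self.low, self.high)


real = RealIntermediateFile(
    [{"Change_Ratio": r} for r in (0.08, 0.01, 0.05, 0.03)], "Change_Ratio"
)
query_a = VirtualIntermediateFile(real, 0.02, 0.06)
query_b = VirtualIntermediateFile(real, 0.04, float("inf"))
```

Here the tuple with Change_Ratio 0.05 satisfies both ranges but is stored only once, which is the duplication the virtual files are meant to avoid.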

Page 17: Continuous Queries

Incremental grouping of join operators

Quotes.Quote.Change_Ratio constant in "quotes.xml"

A join query

Where <Quotes><Quote><Symbol>$s</></> element_as $g </> in "quotes.xml",
<Companies><Company><Symbol>$s</></> element_as $t </> in "companies.xml"
construct $g, $t

Page 18: Continuous Queries

Queries that contain both join and selection

Example query:

Where <Quotes><Quote><Symbol>$s</><Industry>"Computer Service"</></> element_as $g </> in "quotes.xml",
<Companies><Company><Symbol>$s</></> element_as $t </> in "companies.xml"
construct $g, $t

Where to place the selection operator?
  Below the join
  Above the join

Page 19: Continuous Queries

Grouping timer-based queries

Challenge
  Sharing common computation

Event List
  Stores time events sorted in time order
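An event list of this kind could be kept as a min-heap of firing times, with queries that fire at the same time sharing one event. This is a sketch under that assumption; the class and method names are hypothetical and the source does not describe the actual data structure.

```python
import heapq


class EventList:
    """Time events in time order; each event lists the queries due then."""

    def __init__(self):
        self._heap = []        # firing times, earliest first
        self._queries_at = {}  # firing time -> list of query ids

    def schedule(self, time, query_id):
        if time not in self._queries_at:
            self._queries_at[time] = []
            heapq.heappush(self._heap, time)
        self._queries_at[time].append(query_id)

    def pop_due(self, now):
        """Remove and return all events with time <= now, earliest first."""
        due = []
        while self._heap and self._heap[0] <= now:
            t = heapq.heappop(self._heap)
            due.append((t, self._queries_at.pop(t)))
        return due


events = EventList()
events.schedule(10, "q1")
events.schedule(5, "q2")
events.schedule(10, "q3")
```

Queries q1 and q3 share the event at time 10, so they can be invoked together and reuse common computation.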

Page 20: Continuous Queries

Incremental evaluation

Invoke queries only on changed data

For each file, NiagaraCQ keeps a delta file

Incremental evaluation of join operators requires complete data files
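Why a join needs the complete files can be seen from the insert-only case: the new results are (delta R joined with all of S, including its new rows) plus (old R joined with delta S). The sketch below assumes insert-only deltas and list-of-dict relations; it illustrates the general technique and is not taken from the NiagaraCQ code.

```python
def join(left, right, key):
    """Simple hash join on `key`; returns (left_row, right_row) pairs."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [(l, r) for l in left for r in index.get(l[key], [])]


def incremental_join(r_old, delta_r, s_old, delta_s, key):
    """New join results caused by inserting delta_r and delta_s."""
    s_new = s_old + delta_s
    # delta_r needs the complete S; delta_s needs the complete old R.
    return join(delta_r, s_new, key) + join(r_old, delta_s, key)


quotes_old = [{"Symbol": "INTC", "Price": 98}]
quotes_delta = [{"Symbol": "MSFT", "Price": 105}]
companies_old = [
    {"Symbol": "INTC", "Name": "Intel"},
    {"Symbol": "MSFT", "Name": "Microsoft"},
]
new_results = incremental_join(
    quotes_old, quotes_delta, companies_old, [], "Symbol"
)
```

Only the delta of quotes.xml is scanned on the left side, but the full companies data must still be available to find its join partners.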

Page 21: Continuous Queries

Memory Caching

Thousands of continuous queries can’t fit in memory
What should we cache?
  Grouped query plans
  What about non-grouped queries?
  Favor small delta files
  Front part of the event list

Page 22: Continuous Queries

System Architecture

Page 23: Continuous Queries

CQ processing

Page 24: Continuous Queries

Experimental Results

Example query:

Where <Quotes><Quote><Symbol>"INTC"</></> element_as $g </> in "quotes.xml"
construct $g

Page 25: Continuous Queries

Thank You