Query Processing in Query Processing in the AquaLogic Data the AquaLogic Data Services PlatformServices Platform
Vinayak Borkar, Michael Carey, Dmitry Lychagin, Till Westmann, Daniel Engovatov, Nicola Onose
BEA Systemswww.bea.com
(C) Copyright 2006, BEA Systems, Inc | 2
Data Is Everywhere Today
Relational databases made things too easy
Departmental vs. inter-galactic centralized databases
Databases come in many flavors
Relational: Oracle, DB2(s), SQL Server, MySQL, …
Hangers-on: IMS, IDMS, VSAM, …
Not all data is SQL-accessible
Packaged applications: SAP, PeopleSoft, Siebel, Oracle, SalesForce, …
Custom “homegrown” applications
Files of various shapes and sizes (XML, non-XML)
And the list goes on…
(C) Copyright 2006, BEA Systems, Inc | 3
Painful to Develop Applications
No one “single view of X” for any XWhat data do I have about X?
How do I stitch together the info I have?
What else is X related to?
No uniformity in source model or languageData about X is stored in many different formats
Accessing or updating X involves many different APIs
Manual coding of “distributed query plans”
No reuse of artifactsDifferent access criteria or returned data different access plans
No model to help organize or find the artifacts anyway
(C) Copyright 2006, BEA Systems, Inc | 4
Agenda
Why data services?
Overview and ExampleOverview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 5
OverviewJAVA/J2EE
WebServices
Business partners, Adapter,
.Net
LDAP J2EE Excel
Java FunctionsFiles
XML, Flat Files
Tables, views stored
procedures,SQL
Relational
JAVA API JDBC/SQL
Developer Tooling
REPORTING
Connectivity
Client API
Data Processing Engine
WEB SERVICE
WSDL
JCA JMSCustom Access
(C) Copyright 2006, BEA Systems, Inc | 6
Example: Customer Profile Data Service
CUSTOMER,
ORDERCREDIT_CARD getRating(…)
Order Info
Credit Card Info
Rating Info
Customer Info
(C) Copyright 2006, BEA Systems, Inc | 7
Data Service – Design View
(C) Copyright 2006, BEA Systems, Inc | 8
Data Service – “Get All” Read Method
(::pragma function ... kind="read"kind="read" ...::)
declare function tns:getProfile() as element(ns0:PROFILE)*{ for $CUSTOMER in db1:CUSTOMER() return <tns:PROFILE> <CID>{ fn:data($CUSTOMER/CID) }</CID> <LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME> <ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS> <CREDIT_CARDS>{ db2:CREDIT_CARD()[CID eq $CUSTOMER/CID] }</CREDIT_CARDS> <RATING>{ fn:data(ws1:getRating( <ns5:getRating> <ns5:lName>{ data($CUSTOMER/LAST_NAME) }</ns5:lName> <ns5:ssn>{ data($CUSTOMER/SSN) }</ns5:ssn> </ns5:getRating>
) }</RATING> </tns:PROFILE>};
(C) Copyright 2006, BEA Systems, Inc | 9
Data Service – Read & Navigate Methods
(::pragma function ... kind="read"kind="read" ...::)
declare function tns:getProfileByID($id as xs:string) as element(ns0:PROFILE)*{ tns:getProfile()[CID eq $id]};
...
(::pragma function ... kind="navigate"kind="navigate" ...::)
declare function tns:getCOMPLAINTs($arg as element(ns0:PROFILE)) as element(ns8:COMPLAINT)*{ db3:COMPLAINT()[CID eq $arg/CID]};
...
(C) Copyright 2006, BEA Systems, Inc | 10
Agenda
Why data services?
Overview and ExampleOverview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 11
Query Processor Overview
(C) Copyright 2006, BEA Systems, Inc | 12
Efficient processing
Avoid unnecessary workFunction inlining (view unfolding)
Push work to the sourcesQueryable sources can do some of our work
(C) Copyright 2006, BEA Systems, Inc | 13
Optimization: Function Inlining
Example: This fragment
let $x := <CUSTOMER> <LAST_NAME>{$name}</LAST_NAME> <ORDERS>…</ORDERS> </CUSTOMER> return fn:data($x/LAST_NAME)
can be replaced by
$name
But we need to:maintain structural type information for compilation
extend “preserve” mode for runtime
(C) Copyright 2006, BEA Systems, Inc | 14
Pushdown: Overview
SQL Translator tries to “swallow” as much as possible.Make translation easy
Remove unnecessary functions (by inlining)
Translation to joins and grouping
Split sorting and grouping
Make sources do as much as possible
DBMS specific code
Maximize Pushdown
Inverse functions
(C) Copyright 2006, BEA Systems, Inc | 15
Pushdown: Preparation
Translation to joins and grouping
Split sorting and grouping
for $CUSTOMER in db1:CUSTOMER() return <tns:PROFILE> <CID>{ fn:data($CUSTOMER/CID) }</CID> … <ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS> <CREDIT_CARDS>{ db2:CREDIT_CARD()[CID eq $CUSTOMER/CID] }</CREDIT_CARDS> … </tns:PROFILE>
(C) Copyright 2006, BEA Systems, Inc | 16
Pushdown: Inverse Functions I
Example fragment of tns:getProfile:
<tns:PROFILE> <CID>{fn:data($CUSTOMER/CID)}</CID> <LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME> <SINCE>{int2date($CUSTOMER/SINCE)}</SINCE> ... <tns:PROFILE>
used in this query
for $c in tns:getProfile() where $c/SINCE gt $start return $c
yields (after inlining)
for $c1 in ns3:CUSTOMER() where int2date($c1/SINCE) gt $start return <tns:PROFILE> ... </tns:PROFILE>
(C) Copyright 2006, BEA Systems, Inc | 17
Pushdown: Inverse Functions II
RegisterInverse function date2int for int2date
Transformation rule (gt, int2date) gt-intfromdate with
declare function gt-intfromdate($x1 as xs:dateTime, $x2 as xs:dateTime) as xs:boolean?{ date2int($x1) gt date2int($x2) };
Now we can rewrite the query into
for $c1 in ns3:CUSTOMER() where $c1/SINCE gt ns1:date2int($start) return <tns:PROFILE> ... </tns:PROFILE>
can be pushed as
SELECT * FROM "CUSTOMER" t1 WHERE (t1."SINCE" > ?)
(C) Copyright 2006, BEA Systems, Inc | 18
Optimization: PP-k JoinParameter Passing in chunks of k
PrerequisitesDistributed join
Right side is a relational source
Idea: relational source can partition its content
StepsRead k items from the left into L
Select all items from the right that match any of the items in L into R
Join L and R in the middleware
Repeat until the left is exhausted
Benefit: excellent trade-off betweenMemory footprint in the middleware
Roundtrip overhead imposed by the data source
k = 20
(C) Copyright 2006, BEA Systems, Inc | 19
Example: “Get All” Read Method Revisited
(::pragma function ... kind="read" ...::)
declare function tns:getProfile() as element(ns0:PROFILE)*{ for $CUSTOMER in db1:CUSTOMER() return <tns:PROFILE> <CID>{ fn:data($CUSTOMER/CID) }</CID> <LAST_NAME>{ fn:data($CUSTOMER/LAST_NAME) }</LAST_NAME> <ORDERS>{ db1:getORDER($CUSTOMER) }</ORDERS> <CREDIT_CARDS>{ db2:CREDIT_CARD()[CID eq $CUSTOMER/CID] }</CREDIT_CARDS> <RATING>{ fn:data(ws1:getRating( <ns5:getRating> <ns5:lName>{ data($CUSTOMER/LAST_NAME) }</ns5:lName> <ns5:ssn>{ data($CUSTOMER/SSN) }</ns5:ssn> </ns5:getRating>
) }</RATING> </tns:PROFILE>};
(C) Copyright 2006, BEA Systems, Inc | 20
Evaluation Plan, Example 1 (getProfile)
(C) Copyright 2006, BEA Systems, Inc | 21
Evaluation Plan, Example 2 (query getProfile)
(C) Copyright 2006, BEA Systems, Inc | 22
Agenda
Why data services?
Overview and ExampleOverview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 23
Some ALDSP Work in Progress
Native JDBC/SQL92 support (available on Sep 15)
Bilingual engine for efficient reporting/BI tool access
Support for compensating transactions
Extend update facility to support safe non-XA updates (sagas)
XQuery Update support (as well as XQueryP)
Goal: no Java coding for many Web service use cases
(C) Copyright 2006, BEA Systems, Inc | 24
Agenda
Why data services?
Overview and ExampleOverview and Example
The Query Processor
Work in progress at BEA
Summary and Q&A
(C) Copyright 2006, BEA Systems, Inc | 25
Summary
Covered hereWhy Data Services?
How are Data Services used?
Some techniques for efficient evaluation
In the paperXQuery extensions
Runtime architecture
Caches
Updates
Security
Top Related