Lecture 24: Server-side Logic Execution
Adaptive Query Optimization
Recap
Adaptive Query Optimization
• The "plan-first execute-second" approach to query planning is notoriously error prone.
• Optimizers should work with the execution engine to provide alternative plan strategies and receive feedback.
• Adaptive techniques now appear in many of the major commercial DBMSs
  ▶ DB2, Oracle, MSSQL, TeraData
• Approaches
  ▶ Approach 1: Modify Future Invocations
  ▶ Approach 2: Replan Current Invocation
  ▶ Approach 3: Plan Pivot Points
Cost Models
• Using number of tuples processed is a reasonable cost model for in-memory DBMSs.
  ▶ But computing this is non-trivial.
  ▶ A combination of sampling + sketches allows the DBMS to achieve accurate estimations.
Observation
• Until now, we have assumed that all of the logic for an application is located in the application itself.
• The application has a "conversation" with the DBMS to store/retrieve data.
  ▶ Protocols: JDBC, ODBC
Today’s Agenda
• Background
• UDF In-lining
• UDF to CTE Conversion
Background
Conversational Database API
• The application has a "conversation" with the DBMS to store/retrieve data.
• Locks are held for the duration of the transaction
• Multiple network round-trips
Embedded Database Logic
• Move application logic into the DBMS to avoid multiple network round-trips and to extend the functionality of the DBMS.
• Potential Benefits
  ▶ Efficiency
  ▶ Reuse logic across web and mobile applications
Embedded Database Logic: Stored Procedures
Embedded Database Logic
• Stored Procedures (may contain DML statements, call UDFs, etc.)
• User-Defined Functions (UDFs)
• Triggers
• User-Defined Types (UDTs)
• User-Defined Aggregates (UDAs)
User-Defined Functions
• A user-defined function (UDF) is a function written by the application developer that extends the system’s functionality beyond its built-in operations.
  ▶ It takes in input arguments (scalars)
  ▶ Performs some computation
  ▶ Returns a result (scalars, tables)
• Examples: PL/SQL, PL/pgSQL
UDF Example
• Get all the customer ids and compute their customer service level based on the amount of money they have spent.

SELECT c_custkey, cust_level(c_custkey) FROM customer

CREATE FUNCTION cust_level(@ckey int) RETURNS char(10) AS
BEGIN
  DECLARE @total float;
  DECLARE @level char(10);
  SELECT @total = SUM(o_totalprice) FROM orders WHERE o_custkey = @ckey;
  IF (@total > 1000000) SET @level = 'Platinum';
  ELSE SET @level = 'Regular';
  RETURN @level;
END
UDF Advantages
• They encourage modularity and code reuse
  ▶ Different queries can reuse the same application logic without having to reimplement it each time.
• Fewer network round-trips between application server and DBMS for complex operations.
• Some types of application logic are easier to express and read as UDFs than SQL.
UDF Disadvantages (1)
• Query optimizers treat UDFs as black boxes.
  ▶ Unable to estimate cost if you don’t know what a UDF is going to do when you run it.
• It is difficult to parallelize UDFs due to correlated queries inside of them.
  ▶ Some DBMSs will only execute queries with a single thread if they contain a UDF.
  ▶ Some UDFs incrementally construct queries.
UDF Disadvantages (2)
• Complex UDFs in SELECT / WHERE clauses force the DBMS to execute iteratively.
  ▶ RBAR = "Row By Agonizing Row"
  ▶ Things get even worse if UDF invokes queries due to implicit joins that the optimizer cannot "see".
• Since the DBMS executes the commands in the UDF one-by-one, it is unable to perform cross-statement optimizations.
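A small sketch of the row-by-row behavior, using Python's sqlite3 module and its `create_function` hook. This is SQLite rather than SQL Server, and the `cust_name` body, the `CUSTOMER_NAMES` lookup, and the sample rows are hypothetical. The point is only that the engine sees an opaque call and invokes it once for every row it scans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (o_orderkey INTEGER PRIMARY KEY, o_custkey INTEGER);
    INSERT INTO orders (o_custkey) VALUES (1), (2), (1), (3), (2);
    """
)

# Application-side lookup standing in for the customer relation (assumption).
CUSTOMER_NAMES = {1: "Alice", 2: "Bob"}  # custkey 3 has no name
calls = 0


def cust_name(ckey):
    """Opaque scalar UDF: the DBMS cannot look inside or estimate its cost."""
    global calls
    calls += 1
    return CUSTOMER_NAMES.get(ckey)


# Register the Python function as a one-argument SQL scalar function.
conn.create_function("cust_name", 1, cust_name)

matching = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE cust_name(o_custkey) IS NOT NULL"
).fetchone()[0]
print(f"matching rows: {matching}, UDF invocations: {calls}")
```

The predicate is effectively a join against the customer data, but because it is hidden inside the function the optimizer has no way to pick a join algorithm for it.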
UDF Performance
SELECT l_shipmode,
       SUM(CASE
             WHEN o_orderpriority <> '1-URGENT' THEN 1
             ELSE 0
           END) AS low_line_count
FROM orders, lineitem
WHERE o_orderkey = l_orderkey
  AND l_shipmode IN ('MAIL','SHIP')
  AND l_commitdate < l_receiptdate
  AND l_shipdate < l_commitdate
  AND l_receiptdate >= '1994-01-01'
  AND dbo.cust_name(o_custkey) IS NOT NULL -- User Defined Function
GROUP BY l_shipmode
ORDER BY l_shipmode
CREATE FUNCTION cust_name(@ckey int)
RETURNS char(25) AS
BEGIN
  DECLARE @n char(25);
  SELECT @n = c_name
  FROM customer WHERE c_custkey = @ckey;
  RETURN @n;
END
• Microsoft SQL Server
• TPC-H Q12 using a UDF (Scale Factor=1).
  ▶ Original Query: 0.8 sec
  ▶ Query + UDF: 13 hr 30 min
• Reference
Microsoft SQL Server: UDF History
• 2001 – Microsoft adds TSQL Scalar UDFs.
• 2008 – People realize that UDFs are "evil".
• 2010 – Microsoft acknowledges that UDFs are evil.
• 2014 – UDF decorrelation research @ IIT-B.
• 2015 – Froid project begins @ MSFT Gray Lab.
• 2018 – Froid added to SQL Server 2019.
UDF In-lining
Froid: UDF In-lining
• Automatically convert UDFs into relational expressions that are inlined as sub-queries.
  ▶ Does not require the app developer to change UDF code.
• Perform conversion during the rewrite phase to avoid having to change the cost-based optimizer.
  ▶ Commercial DBMSs already have powerful transformation rules for executing sub-queries efficiently.
• Reference
Sub-Queries
• The DBMS treats nested sub-queries in the WHERE clause as functions that take parameters and return a single value or set of values.
• Two Approaches:
  ▶ Rewrite to de-correlate and/or flatten them
  ▶ Decompose the nested query and store its result in a temporary table. The outer query then joins with the temporary table.
Sub-Queries – De-correlate
SELECT name FROM sailors AS S
WHERE EXISTS (
  SELECT * FROM reserves AS R
  WHERE S.sid = R.sid
    AND R.day = '2020-04-22'
)

SELECT name
FROM sailors AS S, reserves AS R
WHERE S.sid = R.sid
  AND R.day = '2020-04-22'
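One caveat about the rewrite above: the join form returns the same rows as the EXISTS form only when a sailor has at most one matching reservation, or when the DBMS executes it as a semi-join. The sketch below checks this on SQLite via Python's sqlite3 module. The `bid` column and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sailors (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO sailors VALUES (1, 'Ann'), (2, 'Bob');
    -- Ann reserves two boats on the same day; Bob reserves none.
    INSERT INTO reserves VALUES (1, 101, '2020-04-22'), (1, 102, '2020-04-22');
    """
)


def names(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# Original correlated form.
correlated = names(
    """
    SELECT name FROM sailors AS S
    WHERE EXISTS (SELECT * FROM reserves AS R
                  WHERE S.sid = R.sid AND R.day = '2020-04-22')
    """
)

# Flattened into a plain inner join: one output row per matching reservation.
plain_join = names(
    """
    SELECT name FROM sailors AS S, reserves AS R
    WHERE S.sid = R.sid AND R.day = '2020-04-22'
    """
)

# Uncorrelated semi-join form: each sailor appears at most once.
semi_join = names(
    """
    SELECT name FROM sailors AS S
    WHERE S.sid IN (SELECT R.sid FROM reserves AS R
                    WHERE R.day = '2020-04-22')
    """
)

print(correlated, plain_join, semi_join)
```

On this data the plain join lists Ann twice, while the EXISTS and IN forms list her once.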
Lateral Join
• Subqueries appearing in FROM can be preceded by the key word LATERAL.
• This allows them to reference columns provided by preceding FROM items.
• Without LATERAL, each subquery is evaluated independently and so cannot cross-reference any other FROM item.
• LATERAL is primarily useful when the cross-referenced column is necessary for computing the row(s) to be joined.
CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  user_id INT,
  created TIMESTAMP
);

-- Query
SELECT user_id, first_order, next_order, id
FROM (SELECT user_id, min(created) AS first_order
      FROM orders
      GROUP BY user_id) o1
INNER JOIN LATERAL
     (SELECT id, created AS next_order
      FROM orders
      WHERE user_id = o1.user_id AND created > o1.first_order
      ORDER BY created ASC
      LIMIT 1) o2
ON true
LIMIT 1;
FROID Overview
• Step 1 – Transform Statements
• Step 2 – Break UDF into Regions
• Step 3 – Merge Expressions
• Step 4 – Inline UDF Expression into Query
• Step 5 – Run Through Query Optimizer
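The end result of these steps can be approximated by hand for the cust_level UDF shown earlier. The sketch below is not Froid's actual output (SQL Server builds the expression with its own operators); it is a hand-written equivalent run on SQLite through Python's sqlite3 module, with made-up sample rows. `INLINED` shows the IF/ELSE folded into a CASE over a correlated sub-query (roughly steps 1 to 4), and `DECORRELATED` shows the kind of set-oriented plan an optimizer can then reach (step 5).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY);
    CREATE TABLE orders (o_custkey INTEGER, o_totalprice REAL);
    INSERT INTO customer VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1, 900000), (1, 200000), (2, 500);
    """
)

# UDF body expressed as one relational expression inlined into the query:
# the SELECT assignment becomes a scalar sub-query, the IF/ELSE a CASE.
INLINED = """
SELECT c.c_custkey,
       CASE WHEN (SELECT SUM(o.o_totalprice)
                  FROM orders AS o
                  WHERE o.o_custkey = c.c_custkey) > 1000000
            THEN 'Platinum' ELSE 'Regular' END AS level
FROM customer AS c
ORDER BY c.c_custkey
"""

# The same query after de-correlation: aggregate once, then outer join.
DECORRELATED = """
SELECT c.c_custkey,
       CASE WHEN t.total > 1000000 THEN 'Platinum' ELSE 'Regular' END AS level
FROM customer AS c
LEFT JOIN (SELECT o_custkey, SUM(o_totalprice) AS total
           FROM orders
           GROUP BY o_custkey) AS t
  ON t.o_custkey = c.c_custkey
ORDER BY c.c_custkey
"""

inlined_rows = conn.execute(INLINED).fetchall()
decorrelated_rows = conn.execute(DECORRELATED).fetchall()
print(inlined_rows)
print(decorrelated_rows)
```

A customer with no orders gets a NULL total, so the comparison is unknown and the CASE falls through to 'Regular', which matches the ELSE branch of the original UDF.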
Bonus Optimizations
Supported Operations (2019)
• T-SQL Syntax:
  ▶ DECLARE, SET (variable declaration, assignment)
  ▶ SELECT (SQL query, assignment)
  ▶ IF / ELSE / ELSE IF (arbitrary nesting)
  ▶ RETURN (multiple occurrences)
  ▶ EXISTS, NOT EXISTS, ISNULL, IN, ... (other relational algebra operations)
• UDF invocation (nested/recursive with configurable depth)
• All SQL datatypes.
• Limitations: Loops, Dynamic Queries, Exceptions
Applicability / Coverage
| Workload | Number of Scalar UDFs | Froid Compatible |
|---|---|---|
| Workload 1 | 178 | 150 |
| Workload 2 | 90 | 82 |
| Workload 3 | 22 | 21 |
UDF Improvement Study
UDFs to CTEs Conversion
UDFs-to-CTEs
• Rewrite UDFs into plain SQL commands.
• Use recursive common table expressions (CTEs) to support iterations and other control flow concepts not supported in Froid.
• DBMS Agnostic
  ▶ Can be implemented as a rewrite middleware layer on top of any DBMS that supports CTEs.
• Reference
UDFs-to-CTEs Overview
• Step 1 – Static Single Assignment Form
• Step 2 – Administrative Normal Form
• Step 3 – Mutual to Direct Recursion
• Step 4 – Tail Recursion to WITH RECURSIVE
• Step 5 – Run Through Query Optimizer
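The core idea behind these steps can be sketched on a toy loop. The power function below is a hypothetical example, not one from the lecture, and the translation is done by hand with the SSA and ANF steps collapsed: the loop variables become the arguments of a tail-recursive function, and those arguments become the columns of a recursive CTE. It runs on SQLite through Python's sqlite3 module.

```python
import sqlite3


def power_loop(x, n):
    """Imperative version: the kind of loop a UDF body might contain."""
    acc, i = 1, 0
    while i < n:
        acc, i = acc * x, i + 1
    return acc


def power_tail(x, n, i=0, acc=1):
    """Same loop as tail recursion: the loop state becomes the arguments."""
    if i >= n:
        return acc
    return power_tail(x, n, i + 1, acc * x)


# Tail recursion mapped to WITH RECURSIVE: one row per loop iteration.
POWER_CTE = """
WITH RECURSIVE run(i, acc) AS (
    SELECT 0, 1                -- initial loop state
    UNION ALL
    SELECT i + 1, acc * :x     -- one iteration per recursive step
    FROM run
    WHERE i < :n               -- loop condition
)
SELECT acc FROM run WHERE i = :n
"""


def power_cte(x, n):
    """Evaluate the loop entirely inside the DBMS as a single query."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(POWER_CTE, {"x": x, "n": n}).fetchone()[0]
    finally:
        conn.close()


print(power_loop(2, 10), power_tail(2, 10), power_cte(2, 10))
```

Because the result is an ordinary query, it can be placed inside a larger statement and handed to the optimizer like any other sub-query, which is what step 5 relies on.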
UDFs-to-CTEs Evaluation
Conclusion
Parting Thoughts
• This is huge. You rarely get 500× speed up without either switching to a new DBMS or rewriting your application.
• Another optimization approach is to compile the UDF into machine code.
  ▶ This does not solve the optimizer’s cost model problem.
Next Class
• Course Retrospective