DEBUGGING YOUR DATABASE SYSTEM USING APOLLO

JOY ARULRAJ
GEORGIA TECH

Slides: jarulraj/talks/2019.apollo.hpts.pdf

APOLLO

• Holistic toolchain for debugging database systems
  ▫ Inspired by Jepsen

1. Automatically find SQL queries exhibiting performance regressions
2. Automatically diagnose the root cause of performance regressions

JOY ARULRAJ ([email protected])


APOLLO (VLDB 2020)

JINHO JUNG TAESOO KIM

HONG HU WOONHAK KANG


MOTIVATION: DBMS COMPLEXITY

[Figure: code size (MB) of SQLite and PostgreSQL by release year (2000, 2010, present); bar labels 6.1, 26.4, 47.7 and 1.4, 4.4, 8.7; annotated "7x increase"; lower is better.]

MOTIVATION: PERFORMANCE REGRESSIONS

• Challenging to build systems with predictable performance
  ▫ Due to complex interactions between different components

• Scenario: User upgrades a DBMS installation
  ▫ Query suddenly takes ten times longer to execute
  ▫ Due to unexpected interactions between different components
  ▫ Refer to this behavior as a performance regression

• Performance regressions can hurt user productivity
  ▫ Can easily convert an interactive query to an overnight one

MOTIVATION: PERFORMANCE REGRESSIONS

SELECT R0.S_DIST_06
FROM PUBLIC.STOCK AS R0
WHERE (R0.S_W_ID < CAST(LEAST(0, 1) AS INT8))

>10,000x slowdown

LATEST VERSION OF POSTGRESQL

• Due to a recent optimizer update
  ▫ New policy for choosing the scan algorithm
  ▫ Resulted in over-estimating the number of rows in the table
  ▫ Earlier version: Fast bitmap scan
  ▫ Latest version: Slow sequential scan

MOTIVATION: DETECTING REGRESSIONS

1. How to discover queries exhibiting regressions?

SELECT NO FROM ORDER AS R0
WHERE EXISTS (
  SELECT CNT FROM SALES AS R1
  WHERE EXISTS (
    SELECT ID FROM HISTORY AS R2
    WHERE (R0.INFO IS NOT NULL)));

Query runs slower on latest version

MOTIVATION: REPORTING REGRESSIONS

2. How to simplify queries for reporting regressions?

MOTIVATION: DIAGNOSING REGRESSIONS

3. How to diagnose the root cause of the regression?

APOLLO TOOLCHAIN

[Diagram: the Apollo toolchain takes an old version and a new version of the DBMS, passes through SQLFuzz, SQLMin, and SQLDebug, and produces bug reports (query, commit, file, function).]

1. How to discover queries exhibiting regressions?
   SQLFUZZ: FEEDBACK-DRIVEN FUZZING

2. How to simplify queries for reporting regressions?
   SQLMIN: BI-DIRECTIONAL QUERY REDUCTION ALGORITHMS

3. How to diagnose the root cause of the regression?
   SQLDEBUG: STATISTICAL DEBUGGING + COMMIT BISECTION

TALK OVERVIEW

[Diagram: Apollo toolchain stages: #1 SQLFuzz, #2 SQLMin, #3 SQLDebug]

#1: SQLFUZZ — DETECTING REGRESSIONS

[Diagram: SQLFuzz runs against the old version and the new version in three stages: (1) the Query Generator produces random queries, (2) the Query Executor produces candidate queries, (3) the Regression Validator outputs queries exhibiting a performance regression. Feedback loop: update the SQL grammar probability table.]

1. QUERY GENERATOR: RANDOM QUERY GENERATION

[Diagram: the Query Generator retrieves the schema, consults the SQL grammar probability table to produce valid queries, then checks complexity to select queries for fuzzing.]

SQL grammar probability table:

CASE       0.3
LEFT JOIN  0.3
LIMIT      0.2
CAST       0.2
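A minimal sketch of how a generator might draw clause kinds from such a probability table. The weights come from the slide; the sampling scheme and the name `pick_clauses` are assumptions for illustration, not SQLFuzz's actual generator.

```python
import random

# Weights copied from the slide's example table.
PROBABILITY_TABLE = {"CASE": 0.3, "LEFT JOIN": 0.3, "LIMIT": 0.2, "CAST": 0.2}


def pick_clauses(table, count, rng):
    """Draw `count` SQL clause kinds, weighted by the probability table."""
    clauses = list(table)
    weights = [table[clause] for clause in clauses]
    return rng.choices(clauses, weights=weights, k=count)


# A seeded generator keeps fuzzing runs reproducible.
rng = random.Random(0)
sample = pick_clauses(PROBABILITY_TABLE, 5, rng)
print(sample)
```

A real generator would expand each chosen clause into valid SQL against the retrieved schema; this sketch only shows the weighted choice.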

2. QUERY EXECUTOR: FEEDBACK-DRIVEN FUZZING

[Diagram: the Query Executor runs each query on the old version and the new version and checks whether a regression was found. If so, the SQL grammar probability table is updated.]

Query exhibiting a regression:

SELECT R0.S_DIST_06
FROM PUBLIC.STOCK AS R0
WHERE (R0.S_W_ID < CAST(LEAST(0, 1) AS INT8))

Update table (entries: CASE, LEFT JOIN, LIMIT, CAST): CAST +0.1
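The feedback step can be sketched as follows. The +0.1 bonus is from the slide; detecting clauses by substring match and renormalizing the table afterwards are assumptions made to keep the example small, and `reward_clauses` is a hypothetical name.

```python
def reward_clauses(table, regression_query, bonus=0.1):
    """Raise the weight of each clause kind found in the query, then renormalize."""
    query = regression_query.upper()
    updated = {
        clause: weight + (bonus if clause in query else 0.0)
        for clause, weight in table.items()
    }
    total = sum(updated.values())
    # Renormalizing keeps the table a probability distribution (assumption).
    return {clause: weight / total for clause, weight in updated.items()}


table = {"CASE": 0.3, "LEFT JOIN": 0.3, "LIMIT": 0.2, "CAST": 0.2}
query = (
    "SELECT R0.S_DIST_06 FROM PUBLIC.STOCK AS R0 "
    "WHERE (R0.S_W_ID < CAST(LEAST(0, 1) AS INT8))"
)
new_table = reward_clauses(table, query)
print(new_table)
```

Only CAST appears in this query, so only its weight grows relative to the others. A parser-based clause detector would be more robust than substring matching.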

3. REGRESSION VALIDATOR: REDUCING FALSE POSITIVES

Filtering rules:

1. Non-deterministic behavior?
2. Non-executed query plan?
3. Usage of catalog statistics?
4. Enough memory?
5. LIMIT statement?
6. Query is too complex?
7. …

[Diagram: queries exhibiting a performance regression pass through the filtering rules and are reported to DBMS developers, who in turn update the filtering rules.]
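As an illustration of one filtering rule, the sketch below rejects candidates whose timings are too noisy to trust (a stand-in for the non-deterministic behavior rule). The function name `is_regression` and both thresholds are assumptions, not values from the talk.

```python
from statistics import median


def is_regression(old_times, new_times, min_slowdown=2.0, max_spread=0.5):
    """Keep a candidate only if repeated runs agree the new version is slower."""

    def spread(times):
        # Relative spread of repeated measurements around the median.
        return (max(times) - min(times)) / median(times)

    # Assumed form of the "non-deterministic behavior" rule: drop noisy timings.
    if spread(old_times) > max_spread or spread(new_times) > max_spread:
        return False
    return median(new_times) / median(old_times) >= min_slowdown


stable = is_regression([1.0, 1.1, 0.9], [10.0, 11.0, 9.5])
noisy = is_regression([1.0, 1.0, 1.0], [1.0, 30.0, 2.0])
small = is_regression([1.0, 1.0, 1.0], [1.2, 1.1, 1.3])
print(stable, noisy, small)
```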


#2: SQLMIN — REPORTING REGRESSIONS

• Top-Down Query Reduction
  ▫ Iteratively remove unnecessary query elements

• Bottom-Up Query Reduction
  ▫ Extract valid sub-queries

SELECT S1.C2
FROM (
  SELECT
    CASE WHEN EXISTS (
      SELECT S0.C0
      FROM ORDER AS R1
      WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL))
    ) THEN S0.C0 END AS C2,
  FROM (
    SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1,
           (SELECT ID FROM ITEM) AS C2
    FROM ITEM AS R0
    WHERE R0.PRICE IS NOT NULL
       OR (R0.PRICE IS NOT S1.C2)
    LIMIT 1000) AS S0) AS S1;

BOTTOM-UP REDUCTION: EXTRACT SUB-QUERY

▫ Remove dependencies

TOP-DOWN REDUCTION: REMOVE ELEMENTS

▫ Remove conditions
▫ Remove columns
▫ Remove sub-queries
▫ Remove clauses

Reduced query:

SELECT S1.C2
FROM (
  SELECT
    CASE WHEN EXISTS (
      SELECT S0.C0
      FROM ORDER AS R1
      WHERE ((S0.C0 = 10))
    ) THEN S0.C0 END AS C2,
  FROM (
    SELECT R0.I_PRICE AS C0,
    FROM ITEM AS R0
    WHERE R0.PRICE IS NOT NULL) AS S0)
AS S1;
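Top-down reduction can be sketched as a greedy loop that drops one element at a time and keeps the smaller query whenever the regression still reproduces. The flat list of conditions and the `still_regresses` oracle are hypothetical simplifications; SQLMin itself works on full queries and re-runs them on both DBMS versions.

```python
def top_down_reduce(elements, still_regresses):
    """Greedily drop elements while the reduced query keeps the regression."""
    current = list(elements)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            # Never reduce to an empty query; keep the removal only if the
            # oracle still observes the regression.
            if candidate and still_regresses(candidate):
                current = candidate
                changed = True
                break
    return current


conditions = ["S0.C0 = 10", "S0.C1 IS NULL", "R0.PRICE IS NOT NULL"]
# Hypothetical oracle: pretend the regression needs only the first condition.
reduced = top_down_reduce(conditions, lambda parts: "S0.C0 = 10" in parts)
print(reduced)  # ['S0.C0 = 10']
```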


#3: SQLDEBUG — DIAGNOSING REGRESSIONS

[Diagram: SQLDebug takes the regression query and the fast and slow DBMS versions. Commit bisection finds the first commit exhibiting the regression, SQLMin supplies partially reduced queries, and control-flow graphs (traces) feed the statistical debugger. Output: bug report (query, commit, file, function).]

1. COMMIT BISECTION: FIND EARLIEST PROBLEMATIC COMMIT

[Diagram: commit history from the old version (fast query execution) through Commit 1, Commit 2, Commit 3, and Commit 5 to the new version (slow query execution), with the marker "Problem begins here!" on the earliest slow commit.]
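Commit bisection is a binary search over the commit history. The sketch below assumes the history is ordered oldest to newest, that the last commit is known to be slow, and that every commit after the first slow one is also slow; `is_slow` is a hypothetical oracle that would build the commit and time the regression query.

```python
def first_slow_commit(commits, is_slow):
    """Return the earliest commit for which the query runs slowly."""
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is slow
    while lo < hi:
        mid = (lo + hi) // 2
        if is_slow(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid + 1  # the culprit is after mid
    return commits[lo]


commits = ["commit1", "commit2", "commit3", "commit4", "commit5"]
slow_commits = {"commit4", "commit5"}  # hypothetical ground truth
tested = []


def is_slow(commit):
    tested.append(commit)
    return commit in slow_commits


culprit = first_slow_commit(commits, is_slow)
print(culprit, tested)
```

With five commits, only two builds are timed here, which is the point of bisecting rather than scanning linearly.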

2. QUERY REDUCTION: PARTIALLY REDUCED QUERIES

Original query:

SELECT NO FROM ORDER AS R0 WHERE EXISTS (SELECT CNT FROM SALES AS R1 WHERE EXISTS (SELECT ID FROM …

Minimized query:

SELECT CNT FROM SALES WHERE CNT > ID

[Diagram: the partially reduced queries lie between the original query and the minimized query.]

▫ Collect the set of partially reduced queries between the original query and the minimized query

3. CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

Functions (old version and new version):

int func() {
  if (cond1)
    work;
}

Traces:

Old version: 0x400, 0x420 → true
New version: 0x500, 0x520 → false

Trace alignment:

Old version: func + 0x0, func + 0x20 → true
New version: func + 0x0, func + 0x20 → false
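Trace alignment can be sketched as rewriting absolute addresses into function-relative offsets, so that the same branch lines up across binaries loaded at different addresses. The trace format (address, branch outcome) and the function base tables are assumptions for illustration, using the addresses from the slide.

```python
def align_trace(trace, function_bases):
    """Rewrite absolute addresses as (function, offset, outcome) triples."""
    aligned = []
    for address, outcome in trace:
        # Pick the function with the largest base address not above `address`.
        name, base = max(
            ((n, b) for n, b in function_bases.items() if b <= address),
            key=lambda item: item[1],
        )
        aligned.append((name, address - base, outcome))
    return aligned


# Each entry is (address, branch outcome); None marks a non-branch entry.
old_trace = [(0x400, None), (0x420, True)]
new_trace = [(0x500, None), (0x520, False)]
old_aligned = align_trace(old_trace, {"func": 0x400})
new_aligned = align_trace(new_trace, {"func": 0x500})

# After alignment, the same location can be compared across versions.
diverging = [
    (name, offset)
    for (name, offset, old_out), (_, _, new_out) in zip(old_aligned, new_aligned)
    if old_out != new_out
]
print(diverging)  # [('func', 32)]
```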

4. STATISTICAL DEBUGGING: FAST AND SLOW QUERY TRACES

Fast query execution traces:

BRANCH  TRACE
1       TAKEN
2       TAKEN

Slow query execution traces:

BRANCH  TRACE
1       TAKEN
2       NOT TAKEN

Both sets of traces feed the statistical debugging model.

Bug report:

RANK  FILE   FUNCTION  LINE
1     foo.c  bar()     2
…     …      …         …
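A minimal sketch of the ranking idea: score each branch outcome by how much more often it appears in slow traces than in fast traces. This difference-of-frequencies score is a simplified stand-in, not necessarily the statistical model SQLDebug uses, and the dictionary trace format is an assumption.

```python
from collections import Counter


def rank_branches(fast_traces, slow_traces):
    """Rank (branch, outcome) pairs by slow-trace frequency minus fast-trace frequency."""
    fast = Counter(item for trace in fast_traces for item in trace.items())
    slow = Counter(item for trace in slow_traces for item in trace.items())
    scores = {
        pred: slow[pred] / len(slow_traces) - fast[pred] / len(fast_traces)
        for pred in set(fast) | set(slow)
    }
    # Highest score first; ties are broken by the predicate itself.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Traces from the slide: branch id mapped to its observed outcome.
fast_traces = [{1: "TAKEN", 2: "TAKEN"}]
slow_traces = [{1: "TAKEN", 2: "NOT TAKEN"}]
ranking = rank_branches(fast_traces, slow_traces)
print(ranking[0])  # ((2, 'NOT TAKEN'), 1.0)
```

Branch 1 behaves the same in both sets and scores zero, so it drops out of contention; branch 2 not being taken is the top-ranked suspect.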

RECAP

[Diagram: the Apollo toolchain takes an old version and a new version of the DBMS, passes through SQLFuzz, SQLMin, and SQLDebug, and produces bug reports (query, commit, file, function).]

EVALUATION

• Tested database systems
  ▫ PostgreSQL, SQLite

• Binary instrumentation to get control-flow graphs
  ▫ DynamoRIO instrumentation tool

• Evaluation
  ▫ Efficacy of SQLFuzz in detecting regressions?
  ▫ Efficacy of SQLMin in reducing queries?
  ▫ Accuracy of SQLDebug in diagnosing regressions?

#1: SQLFUZZ — DETECTING REGRESSIONS

[Figure: mean performance drop (ratio): PostgreSQL 218, SQLite 201; annotated "200x performance drop"; lower is better.]

Discovered 10 previously unknown, unique performance regressions (7 acknowledged, 2 fixed).

#1: SQLFUZZ — FALSE POSITIVES

[Figure: false positive queries (percent, log scale): Discovered Queries 99, SQLFuzz 0.0044; lower is better.]

Filtering rules remove almost all false positives

#2: SQLMIN — REPORTING REGRESSIONS

[Figure: query size (bytes): Discovered Queries 1602, SQLMin 380; annotated "77% reduction"; lower is better.]

Significant reduction in query size

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

[Figure: pie chart over 10 regressions with slices 5, 2, and 3, labeled by the rank of the root-cause branch: first-ranked branch (correctly identified), second-ranked branch (within two candidates), third-ranked branch (within three candidates).]

Branch related to root cause correctly identified in all cases (within top-3 ranked branches)

CASE STUDY #1: OPTIMIZER UPDATE

SELECT COUNT(*)
FROM (
  SELECT R0.ID
  FROM CUSTOMER AS R0 LEFT JOIN STOCK AS R1
    ON (R0.STREET = R1.DIST)
  WHERE R1.DIST IS NOT NULL) AS S0
WHERE EXISTS (SELECT ID FROM CUSTOMER);

>1000x slowdown

LATEST VERSION OF SQLITE

• Due to a bug fix (for a correctness bug)
  ▫ Breaks query optimization
  ▫ Optimizer no longer transforms the LEFT JOIN operator

• Regression status: Not Yet Fixed
  ▫ Searching for a fix that resolves both correctness and performance issues

CASE STUDY #2: EXECUTION ENGINE UPDATE

SELECT R0.ID FROM ORDER AS R0
WHERE EXISTS (
  SELECT COUNT(*)
  FROM (
    SELECT DISTINCT R0.ENTRY
    FROM CUSTOMER AS R1
    WHERE (FALSE)) AS S1);

3x slowdown

LATEST VERSION OF POSTGRESQL

• Hashed aggregation executor update
  ▫ Resulted in redundantly building hash tables

• Regression status: Fixed
  ▫ If hash table already exists, then reuse the table

CONCLUSION

• APOLLO (v1.0)
  ▫ Toolchain for detecting & diagnosing regressions
  ▫ Going to be open-sourced in 2020

• Adding support for other types of bugs (v2.0)
  ▫ Correctness bugs
  ▫ System crashes
  ▫ Database corruption

• Interested in integrating APOLLO with more database systems
  ▫ Improve the toolchain based on developer feedback

• Automation will help reduce labor cost of developing DBMSs
  ▫ Developers get to focus on more important problems

END

@joy_arulraj