Cost Based Optimizer - Part 2 of 2

description

This is a presentation that describes how Oracle uses histograms to make decisions on SQL query execution. To see the actual webinar and demo, go to https://portal.hotsos.com/events/webinars/

Transcript of Cost Based Optimizer - Part 2 of 2

Page 1: Cost Based Optimizer - Part 2 of 2

www.hotsos.com Slide 1 Copyright © 1999–2007 by Hotsos Enterprises, Ltd.

Cost Based Optimizer – 2 of 2

Hotsos Enterprises, Ltd.

Grapevine, Texas

Oracle. Performance. Now.

Page 2: Cost Based Optimizer - Part 2 of 2

Agenda

• Cost Based Optimizer and its impact on performance
• Skewed Data
• Histograms
• Impact
  – Performance (Logical I/O Impact)
  – Performance (Join Strategy)
  – Bind Variables
  – Cardinality and Cost
• Conclusion

Page 3: Cost Based Optimizer - Part 2 of 2

Cost Based Optimizer

Page 4: Cost Based Optimizer - Part 2 of 2

Cost Based Optimizer (CBO)

• The CBO is, in reality, a complex piece of decision-making software
  – Uses several database initialization parameters
    • These are listed in the 10053 trace file
  – Uses several session-level initialization parameters
    • These are parameters set at the session level that override the database initialization parameters
  – Uses statistics about the objects (tables, indexes)
  – Uses hints to the optimizer
  – Uses statistics about the system (CPU, disk, etc.)
  – Uses all of this information to decide on the “best way” to generate an execution plan
  – Uses information about the skew of a column, if that information has been gathered
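
As a sketch of how these inputs can be inspected, the SQL*Plus session below overrides one optimizer parameter at the session level and then switches on the 10053 trace for a single statement. The table ORDERS, the trace file identifier, and the value 50 are hypothetical and chosen only for illustration; the trace should appear in the server's trace directory when the statement is actually optimized.

```sql
-- Session-level override of a database initialization parameter.
-- The value 50 is an illustration only, not a recommendation.
ALTER SESSION SET optimizer_index_cost_adj = 50;

-- Tag the trace file so it is easy to find (hypothetical identifier).
ALTER SESSION SET tracefile_identifier = 'cbo_demo';

-- Switch on the optimizer (10053) trace for this session.
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

-- EXPLAIN PLAN sends the statement through the optimizer.
-- ORDERS is a hypothetical table with a STATUS column.
EXPLAIN PLAN FOR
  SELECT * FROM orders WHERE status = 1;

-- Switch the trace off again.
ALTER SESSION SET EVENTS '10053 trace name context off';
```

The trace file lists the parameter values the optimizer worked with, so the session-level override should be visible there alongside the database-level settings.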

Page 5: Cost Based Optimizer - Part 2 of 2

CBO will be part of your life if you keep working with Oracle.

• The cost-based query optimizer (CBO)…
  – Uses data from a variety of sources
  – Estimates the costs of several execution plans
  – Chooses the plan it estimates to be the least expensive
• Characteristics
  – Adapts to changing circumstances
  – Frustrating if you don’t know what it considers as input
    • Works great if you know how to use it
    • But produces very poor results if you lie to it
  – The only query optimizer supported by Oracle Corporation from release 10 onward

Page 6: Cost Based Optimizer - Part 2 of 2

The cost-based query optimizer chooses the plan that it computes as having the lowest estimated cost.

• Don’t assume the following are identical
  – CBO’s estimated cost of an execution plan
  – The actual cost of an execution plan
• CBO’s cost estimate can be imperfect
  – Are your CBO inputs perfect?
  – CBO isn’t perfect, but by 9.2 it’s almost always good enough
• Without properly collected statistics, the CBO will
  – use RBO if no statistics exist on any object in the statement
  – use default statistics if statistics exist for a single object in the statement but not others
  – use dynamic sampling to generate statistics (based on parameter setting and Oracle version)
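
To check whether a schema is exposed to these fallbacks, a minimal sketch along the following lines can help. The schema name OP is a placeholder borrowed from the later DBMS_STATS example; the dictionary views and the parameter name are standard, but the interpretation of the dynamic sampling level depends on the Oracle version.

```sql
-- Tables in one schema that have never had statistics gathered.
-- 'OP' is a placeholder schema name.
SELECT table_name, num_rows, last_analyzed
FROM   dba_tables
WHERE  owner = 'OP'
AND    last_analyzed IS NULL
ORDER  BY table_name;

-- Dynamic sampling level in effect for the current session.
SELECT name, value
FROM   v$parameter
WHERE  name = 'optimizer_dynamic_sampling';
```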

Page 7: Cost Based Optimizer - Part 2 of 2

Cost Based Optimizer

[Diagram: the inputs to the Oracle cost-based optimizer and the activities that change them. The optimizer produces the execution plan.]

• dbms_stats.gather_database_stats execution: database table and index statistics
• dbms_stats.gather_system_stats execution: system CPU and I/O statistics
• index, partition manipulation, etc.: database schema config
• application SQL manipulation: SQL text
• parameter edits: Oracle instance parameters
• stored outline manipulation: stored outlines
• Oracle DBMS authorship: Oracle query cost model
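
The two DBMS_STATS calls named in the diagram can be sketched as follows. This is only an outline: the no-workload gathering mode is one of several available, and a database-wide gather can run for a long time on a large system, so a schema-level or table-level gather is often the more practical choice.

```sql
BEGIN
  -- System CPU and I/O statistics, gathered in no-workload mode.
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'NOWORKLOAD');

  -- Table and index statistics for the whole database
  -- (potentially long-running; shown here only because the slide names it).
  DBMS_STATS.GATHER_DATABASE_STATS;
END;
/

-- Inspect the stored system statistics.
SELECT sname, pname, pval1
FROM   sys.aux_stats$
ORDER  BY sname, pname;
```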

Page 8: Cost Based Optimizer - Part 2 of 2

Execution plan changes can result in profoundly different application performance.

A. Table size change

B. Device latency change

C. Execution plan change

• Type C performance changes are the most profound

[Figure: three plots of LIOs (vertical axis) against rows (horizontal axis), one each for cases A, B, and C, annotated with “size change” and “performance change”.]

Page 9: Cost Based Optimizer - Part 2 of 2

Recap

• The CBO is a complex piece of software
• It uses several data points to calculate the cost of the execution plan and will choose the plan with the lowest cost
• It is dynamic and will adapt to changing data better than the Rule Based Optimizer
• A good understanding of the Cost Based Optimizer is imperative in understanding the rationale behind some of the choices

Page 10: Cost Based Optimizer - Part 2 of 2

Skewed Data

Page 11: Cost Based Optimizer - Part 2 of 2

Skewed Data

• Skewed data is data whose distribution is not uniform
• A good example is the owner column of dba_objects
• The column is highly skewed

  Select owner, count(*) from dba_objects
  Group by owner;
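
A slightly extended version of that query makes the skew easier to read by adding each owner's share of the total and sorting the largest owners first. The column aliases are arbitrary; the actual numbers depend entirely on the database it is run against.

```sql
-- Object count per owner, with each owner's percentage of all objects.
SELECT owner,
       COUNT(*) AS object_count,
       ROUND(100 * RATIO_TO_REPORT(COUNT(*)) OVER (), 2) AS pct_of_total
FROM   dba_objects
GROUP  BY owner
ORDER  BY object_count DESC;
```

On a typical database a handful of owners account for most of the rows, while many owners have only a few objects each.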

Page 12: Cost Based Optimizer - Part 2 of 2

Some kinds of data skew naturally; some don’t.

• Guaranteed to be skewed
  – E.g., status attribute (open | closed) of a sales order table
• Possibly not skewed
  – E.g., sale date attribute of a sales order table

Page 13: Cost Based Optimizer - Part 2 of 2

Histograms

Page 14: Cost Based Optimizer - Part 2 of 2

What are the costs and benefits of histograms?

• Benefits of histograms
  – CBO sometimes needs the information to make good decisions
• Costs of histograms
  – Computing histograms will consume extra computing capacity during the statistics collection
  – Some CPU time and extra latching is required during plan determination for the optimizer to consider histograms

Page 15: Cost Based Optimizer - Part 2 of 2

Histograms provide the optimizer with better information from which to derive an execution plan for a query.

• A histogram is a graphic representation of a frequency distribution by means of rectangles whose widths represent class intervals and whose heights represent corresponding frequencies
• Oracle implements histograms in two ways
  – Height-balanced – created if column NDV > SIZE
  – Frequency – created if column NDV <= SIZE
  (NDV is the number of distinct values in the column; SIZE is the number of buckets requested)

Page 16: Cost Based Optimizer - Part 2 of 2

Types of Histograms

• Frequency
  – Every distinct value in the column has its own entry, with a count of how many times that value occurs
• Height Balanced
  – Every histogram bucket holds the same number of rows, and each bucket covers a range of column values
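
The difference between the two types can be approximated with plain SQL against dba_objects. This is an emulation for illustration, not the way Oracle computes or stores histograms internally, and the bucket count of 4 is an arbitrary choice.

```sql
-- Frequency style: one entry per distinct value, with its row count.
SELECT owner, COUNT(*) AS occurrences
FROM   dba_objects
GROUP  BY owner
ORDER  BY owner;

-- Height-balanced style: sort the rows, cut them into 4 buckets of
-- (nearly) equal row count, and report the value range of each bucket.
SELECT bucket,
       COUNT(*)   AS rows_in_bucket,
       MIN(owner) AS low_value,
       MAX(owner) AS endpoint_value
FROM  (SELECT owner,
              NTILE(4) OVER (ORDER BY owner) AS bucket
       FROM   dba_objects)
GROUP  BY bucket
ORDER  BY bucket;
```

In the second query a heavily populated owner can end up as the endpoint of more than one bucket, which is how a height-balanced histogram signals a popular value.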

Page 17: Cost Based Optimizer - Part 2 of 2

Frequency Histogram

[Chart: frequency histogram with one bar for each owner value (SCOTT, SYS, SYSTEM, SYSMAN); vertical axis from 0 to 200.]

Page 18: Cost Based Optimizer - Part 2 of 2

Height Balanced Histogram

[Chart: height-balanced histogram with one bar for each value range (AB-SC, SC-SYS, SYS-SYSM, SYSM-SYST); vertical axis from 0 to 100.]

Page 19: Cost Based Optimizer - Part 2 of 2

Histograms can be gathered by setting the parameter for METHOD_OPT.

For a specific column:

FOR COLUMNS column_x SIZE <n|REPEAT|AUTO|SKEWONLY>

For all the columns in a table:

FOR ALL COLUMNS

For only the columns that have an index:

FOR ALL INDEXED COLUMNS

-- Gather a histogram of up to 10 buckets on column_x.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'OP',
    tabname    => 'my_table',
    method_opt => 'FOR COLUMNS column_x SIZE 10');
END;
/
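
Two further METHOD_OPT variants are sketched below, reusing the same placeholder schema, table, and column names as the example above. SKEWONLY leaves the decision to Oracle based on the data distribution, and SIZE 1 requests a single bucket, which in effect means no histogram on the column.

```sql
BEGIN
  -- Let Oracle decide from the data distribution whether column_x
  -- should get a histogram.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'OP',
    tabname    => 'my_table',
    method_opt => 'FOR COLUMNS column_x SIZE SKEWONLY');
END;
/

BEGIN
  -- SIZE 1 asks for a single bucket: column statistics are still
  -- gathered, but without a histogram on column_x.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'OP',
    tabname    => 'my_table',
    method_opt => 'FOR COLUMNS column_x SIZE 1');
END;
/
```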

Page 20: Cost Based Optimizer - Part 2 of 2

Histograms are not useful in all cases.

• Histograms are not useful for columns with the following characteristics:
  – All (or most) predicates on the column use bind variables
  – The column data is uniformly distributed
  – The column is unique and is used only with equality predicates
  – Data distribution changes frequently and statistics aren't collected to match

Page 21: Cost Based Optimizer - Part 2 of 2

Even in the most recent Oracle versions, histogram optimization doesn’t completely work with bind variables.

• Oracle version 8
  – Use of bind variables prohibits histogram optimization
• Oracle version 9 and above
  – Oracle query optimizer “peeks” at bind value to use histogram optimization
  – But only on initial hard parse of a query
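
The peeking behaviour can be observed with a sketch like the following SQL*Plus session. It assumes a hypothetical ORDERS table with a skewed, indexed STATUS column and an ORDER_DATE column, and a release where DBMS_XPLAN.DISPLAY_CURSOR is available (10g onward). Later releases may behave differently from what the slide describes.

```sql
-- DISPLAY_CURSOR with NULL arguments reports the last statement run in
-- this session, so server output must be off in SQL*Plus.
SET SERVEROUTPUT OFF
VARIABLE status NUMBER

-- First execution: hard parse, the optimizer peeks at :status = 1.
EXEC :status := 1
SELECT MAX(order_date) FROM orders WHERE status = :status;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'TYPICAL +PEEKED_BINDS'));

-- Second execution: same SQL text, different bind value. The cursor is
-- shared, so the plan built for the first value is normally reused.
EXEC :status := 0
SELECT MAX(order_date) FROM orders WHERE status = :status;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'TYPICAL +PEEKED_BINDS'));
```

If both plan displays show the same plan and the same peeked value, the second execution did not benefit from the histogram.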

Page 22: Cost Based Optimizer - Part 2 of 2

Be prepared for how application developers might have worked around skew problems.

• The old-fashioned RBO technique
  1. Create the index
  2. Hard-code the selective query with “status=1”
  3. Hard-code the un-selective query with “status+0=1”
• A CBO technique
  1. Create the index
  2. Hard-code the selective query with /*+ index(t) */
  3. Hard-code the un-selective query with /*+ full(t) */

Don’t resort to either of these!
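
To help recognize these workarounds in existing code, this is roughly what they look like. ORDERS is a hypothetical table with an ordinary index on STATUS, and the literal values are taken from the slide.

```sql
-- RBO-era workaround: wrapping the column in an expression keeps an
-- ordinary index on STATUS from being used.
SELECT * FROM orders t WHERE status = 1;      -- index on STATUS is usable
SELECT * FROM orders t WHERE status + 0 = 1;  -- index on STATUS is not usable

-- CBO-era workaround: hints pin the access path, whatever the
-- statistics say.
SELECT /*+ index(t) */ * FROM orders t WHERE status = 1;
SELECT /*+ full(t) */  * FROM orders t WHERE status = 1;
```

Both approaches freeze a decision in the SQL text, so the plan can no longer follow the data as its distribution changes.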

Page 23: Cost Based Optimizer - Part 2 of 2

Where Histogram Information is Stored

• DBA_TAB_HISTOGRAMS
• DBA_TAB_COL_STATISTICS
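
A minimal sketch of querying the two views follows. The owner, table, and column names are the placeholders from the earlier DBMS_STATS example, written in upper case because that is how the dictionary stores unquoted names. The HISTOGRAM column of DBA_TAB_COL_STATISTICS is available from 10g onward.

```sql
-- Column-level summary: is there a histogram, of which type, and with
-- how many buckets?
SELECT column_name, num_distinct, density, num_buckets, histogram
FROM   dba_tab_col_statistics
WHERE  owner      = 'OP'
AND    table_name = 'MY_TABLE'
ORDER  BY column_name;

-- The histogram endpoints for one column.
SELECT endpoint_number, endpoint_value, endpoint_actual_value
FROM   dba_tab_histograms
WHERE  owner       = 'OP'
AND    table_name  = 'MY_TABLE'
AND    column_name = 'COLUMN_X'
ORDER  BY endpoint_number;
```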

Page 24: Cost Based Optimizer - Part 2 of 2

Demo

Histogram Data Dictionary Tables

Page 25: Cost Based Optimizer - Part 2 of 2

Impact

Performance in terms of Logical I/Os

Page 26: Cost Based Optimizer - Part 2 of 2

Demo

Cardinality
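
One way to put the optimizer's cardinality estimate next to the actual row count is sketched below. This is not the webinar's demo script; it assumes a hypothetical ORDERS table with a skewed STATUS column and a release that supports DBMS_XPLAN.DISPLAY_CURSOR (10g onward).

```sql
-- DISPLAY_CURSOR with NULL arguments reports the last statement run in
-- this session, so server output must be off in SQL*Plus.
SET SERVEROUTPUT OFF

-- The hint makes Oracle record actual row counts and buffer gets for
-- each step of the plan.
SELECT /*+ gather_plan_statistics */ COUNT(*)
FROM   orders
WHERE  status = 1;

-- E-Rows is the optimizer's estimate, A-Rows the actual row count, and
-- Buffers the logical I/Os performed by each step.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Running this before and after gathering a histogram on the skewed column shows whether the estimate moves closer to the actual count.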

Page 27: Cost Based Optimizer - Part 2 of 2

Demo

Join Cardinality

Page 28: Cost Based Optimizer - Part 2 of 2

Recap

• Histograms can be really useful when gathered on skewed columns
• Histograms are specific to your data and version
• Test it out and prove that gathering histograms is beneficial
• Be careful of bind variable substitutions as histograms may not be used