Oracle statistics by example
Oracle Statistics by Example
Mauro Pagano
Background
• Optimizer generates execution plans
• Many execution plans for each SQL
• Optimal execution plan has lower cost (*)
• Cost is computed based on
  – Statistical formulas (Oracle IP)
  – Many statistics around the SQL (seeded by us)
1/29/17 2
Some terminology
• Cost
  – Unit of measure to compare estimated plan performance
  – Equivalent to expected # of single block reads
• Cardinality
  – Number of rows handled, produced/consumed
• Selectivity
  – % of filtering caused by predicates, range is [0,1]
  – Output card = input card * selectivity
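The selectivity relation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name and the sample numbers are hypothetical and do not come from the slides.

```python
def output_cardinality(input_card: float, selectivity: float) -> float:
    """Rows produced by a step = rows consumed * fraction kept by the predicates."""
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be within [0, 1]")
    return input_card * selectivity


# Hypothetical numbers: 1,000 input rows, predicates that keep 25% of them.
print(output_cardinality(1000, 0.25))  # 250.0
```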
Why so much emphasis?
• Statistics are a "picture" of the entities
• Quality of the picture affects quality of the plan
  – Poor stats generally lead to poor plans (*)
  – Better stats generally lead to better plans (*)
• Our best bet is to provide good quality stats
  – Not always as trivial as it sounds
Many types of statistics
• Oracle Optimizer uses statistics about
  – Objects: tables, indexes, columns, etc.
  – System: CPU speed and many IO metrics
  – Dictionary: Oracle internal physical objects
  – Fixed objects: memory structures (X$)
• First two affect application SQLs
  – Focus of this presentation is object statistics
What should I do about statistics?
• Collect them :)
  – Object stats when there are "enough" changes
  – System stats once, if any (*)
• Oracle-seeded package DBMS_STATS
• Used to collect all types of statistics
  – Plus drop, exp/imp, set prefs, etc.
• Many params to affect how/what to collect
  – Can have a large impact on quality
When should I gather stats?
• No specific threshold in terms of time
• Balance between frequency and quality
  – Gathering high-quality stats is expensive, thus slow to execute
  – Gathering frequently requires fast execution
• Optimal plans tend not to change over time
  – Favor quality over frequency
How?
DBMS_STATS.GATHER_TABLE_STATS (
   ownname          VARCHAR2,
   tabname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT to_estimate_percent_type(get_param('ESTIMATE_PERCENT')),
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),
   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),
   granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),
   cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL,
   no_invalidate    BOOLEAN  DEFAULT to_no_invalidate_type(get_param('NO_INVALIDATE')),
   stattype         VARCHAR2 DEFAULT 'DATA',
   force            BOOLEAN  DEFAULT FALSE,
   context          DBMS_STATS.CCONTEXT DEFAULT NULL, -- non operative
   options          VARCHAR2 DEFAULT 'GATHER');
That looks really complex!
• Easiest thing is to let Oracle use defaults
  – Just pass owner and object name
  – This is also the recommended way starting 11g
  – Many features depend on default values
    • 12c histograms, Incremental, Concurrent
• As simple as
  – exec dbms_stats.gather_table_stats(user,'T1')
What did we just do?
• Gathered:
  – table statistics on table T1
  – column statistics for every column
  – index statistics on every index defined on T1
  – (sub)partition statistics
  – histograms on a subset of columns (*)
• Next we'll cover the stats that matter to the CBO
Table statistics
• Optimizer only uses two statistics
  – Number of blocks below HWM
    • [ALL|DBA|USER]_TABLES.BLOCKS
    • Used to cost Full Table Scan operations
  – Number of rows in the table
    • [ALL|DBA|USER]_TABLES.NUM_ROWS
    • Used to estimate how many rows we are dealing with
Table statistics – FTS cost
select table_name, num_rows, blocks from user_tables where table_name='T1';

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
T1                                 920560      16378

explain plan for select * from t1;
select * from table(dbms_xplan.display);

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |   920K|   100M|  4463   (1)| 00:00:01 |
|   1 |  TABLE ACCESS STORAGE FULL| T1   |   920K|   100M|  4463   (1)| 00:00:01 |
----------------------------------------------------------------------------------
Table statistics – FTS cost
select table_name, num_rows, blocks from user_tables where table_name='T1';

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
T1                                 920560      30000

explain plan for select * from t1;
select * from table(dbms_xplan.display);

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |   920K|   100M|  8156   (1)| 00:00:01 |
|   1 |  TABLE ACCESS STORAGE FULL| T1   |   920K|   100M|  8156   (1)| 00:00:01 |
----------------------------------------------------------------------------------
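A quick way to read the two FTS plans side by side is to compare cost per block. The sketch below only restates the numbers shown in the plans (BLOCKS 16378 with cost 4463, BLOCKS 30000 with cost 8156); the helper name is hypothetical, and the observation that cost grows roughly in proportion to BLOCKS holds for this example only, since the real figure also depends on system statistics and multiblock read settings.

```python
# (BLOCKS, full table scan cost) pairs copied from the two plans.
observations = [(16378, 4463), (30000, 8156)]


def cost_per_block(blocks: int, cost: int) -> float:
    """Cost units charged per block below the high water mark."""
    return cost / blocks


ratios = [cost_per_block(blocks, cost) for blocks, cost in observations]
for (blocks, cost), ratio in zip(observations, ratios):
    print(f"blocks={blocks:>6} cost={cost:>5} cost/block={ratio:.4f}")
```

Both ratios land close to 0.27, which is why raising BLOCKS from 16378 to 30000 nearly doubles the cost while NUM_ROWS stays the same.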
Table statistics – Cardinality
select table_name, num_rows, blocks from user_tables where table_name='T1';

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
T1                                 920560      16378

explain plan for select * from t1;
select * from table(dbms_xplan.display);

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |   920K|   100M|  4463   (1)| 00:00:01 |
|   1 |  TABLE ACCESS STORAGE FULL| T1   |   920K|   100M|  4463   (1)| 00:00:01 |
----------------------------------------------------------------------------------
Table statistics – Cardinality
select table_name, num_rows, blocks from user_tables where table_name='T1';

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
T1                                      1      16378

explain plan for select * from t1;
select * from table(dbms_xplan.display);

Plan hash value: 3617692013
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |     1 |   115 |  4442   (1)| 00:00:01 |
|   1 |  TABLE ACCESS STORAGE FULL| T1   |     1 |   115 |  4442   (1)| 00:00:01 |
----------------------------------------------------------------------------------
Column statistics
• Optimizer uses
  – Number of distinct values (NDV)
    • [ALL|DBA|USER]_TAB_COLS.NUM_DISTINCT
    • Used to determine selectivity (no histogram present)
  – Number of NULLs
    • [ALL|DBA|USER]_TAB_COLS.NUM_NULLS
    • Used to estimate how many rows we are dealing with
  – Min/Max value
    • [ALL|DBA|USER]_TAB_COLS.[LOW|HIGH]_VALUE
    • Used to determine in-range / out-of-range
Column statistics – No histogram
select column_name, num_distinct, num_nulls, histogram
from user_tab_cols
where table_name = 'T1' and column_name like '%OBJECT_ID';

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS HISTOGRAM
------------------------------ ------------ ---------- ---------------
OBJECT_ID                             93192          0 NONE
DATA_OBJECT_ID                         8426     835930 NONE

explain plan for select * from t1 where object_id = 1234;
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |    10 |  1150 |  4453   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| T1   |    10 |  1150 |  4453   (1)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - storage("OBJECT_ID"=1234)
       filter("OBJECT_ID"=1234)

Let's do the math!
Total rows: 920560
NDV: 93192
920560 * 1/93192 ~= 10
Column statistics – No histogram
select column_name, num_distinct, num_nulls, histogram
from user_tab_cols
where table_name = 'T1' and column_name like '%OBJECT_ID';

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS HISTOGRAM
------------------------------ ------------ ---------- ---------------
OBJECT_ID                             93192          0 NONE
DATA_OBJECT_ID                         8426     835930 NONE

explain plan for select * from t1 where data_object_id = 1234;
----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |    10 |  1150 |  4454   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| T1   |    10 |  1150 |  4454   (1)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - storage("DATA_OBJECT_ID"=1234)
       filter("DATA_OBJECT_ID"=1234)

Let's do the math!
Total rows: 920560
Total NULLs: 835930
NDV: 8426
(920560 – 835930) / 8426 ~= 10
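The two "let's do the math" calculations follow the same pattern, so they can be folded into one small function. This is a sketch of the arithmetic shown on the slides, assuming uniformly distributed values and no histogram; the function name is hypothetical.

```python
def equality_cardinality(num_rows: int, num_distinct: int, num_nulls: int = 0) -> float:
    """Estimated rows for `col = value` without a histogram.

    NULLs never match an equality predicate, so they are removed first;
    the remaining rows are assumed to be spread evenly over the distinct values.
    """
    return (num_rows - num_nulls) / num_distinct


# Figures taken from the USER_TAB_COLS output for T1.
object_id_rows = equality_cardinality(920560, 93192)
data_object_id_rows = equality_cardinality(920560, 8426, num_nulls=835930)

print(round(object_id_rows), round(data_object_id_rows))  # 10 10
```

Both results round to the 10 rows reported in the Rows column of the plans.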
Column statistics – Min/Max
cook_raw(low_value,'NUMBER') low_v, cook_raw(high_value,'NUMBER') high_v

COLUMN_NAME                    NUM_DISTINCT LOW_VALU HIGH_VAL
------------------------------ ------------ -------- --------
OBJECT_ID                             93192 2        99953
DATA_OBJECT_ID                         8426 0        99953

----------------------------------------------------------------------------------
| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |

explain plan for select * from t1 where object_id = 99953;
|*  1 |  TABLE ACCESS STORAGE FULL| T1   |    10 |  1150 |  4453   (1)| 00:00:01 |

explain plan for select * from t1 where object_id = 150000;
|*  1 |  TABLE ACCESS STORAGE FULL| T1   |     5 |   575 |  4453   (1)| 00:00:01 |

The further we move away from the range, the lower the estimate
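One common way to describe the out-of-range behaviour is a linear decay: the in-range estimate is scaled down in proportion to how far the value sits outside [low, high], measured against the width of that range. The slides do not state the formula, so treat the sketch below as an assumption that happens to reproduce the two estimates above (10 and 5); the function name is hypothetical.

```python
def out_of_range_cardinality(num_rows, num_distinct, low, high, value):
    """Equality estimate with an assumed linear decay outside [low, high]."""
    base = num_rows / num_distinct          # in-range estimate, no histogram
    if low <= value <= high:
        return base
    distance = (low - value) if value < low else (value - high)
    # Assumption: the estimate shrinks linearly and bottoms out at zero
    # once the value is a full range-width away from the nearest boundary.
    decay = max(0.0, 1.0 - distance / (high - low))
    return base * decay


# OBJECT_ID statistics from the slide: NDV 93192, low 2, high 99953.
in_range = out_of_range_cardinality(920560, 93192, 2, 99953, 99953)
out_of_range = out_of_range_cardinality(920560, 93192, 2, 99953, 150000)

print(round(in_range), round(out_of_range))  # 10 5
```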
Column statistics
• Optimizer also uses
  – Density
    • Not stored in the dictionary (the old one was, the new one is not)
    • Used for unpopular value selectivity
  – Histogram
    • [ALL|DBA|USER]_TAB_COLS.LOW_VALUE
    • [ALL|DBA|USER]_TAB_COLS.HIGH_VALUE
    • [ALL|DBA|USER]_TAB_HISTOGRAMS
    • Used for popular value selectivity
What is a histogram?
• Describes data distribution skewness
  – Helps the CBO get more accurate estimates
• Many types available
  – Frequency – 1 bucket per distinct value
  – Top-frequency – 1 bucket per top distinct value
  – Hybrid – 1 bucket per popular value, others split
• Creation influenced by the method_opt param
What does it look like?
Column statistics – Histogram
explain plan for select count(*) from t1 where object_type = 'INDEX';
-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     9 |  4455   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| T1   | 44990 |   395K|  4455   (1)| 00:00:01 |
-----------------------------------------------------------------------------------
   2 - storage("OBJECT_TYPE"='INDEX')
       filter("OBJECT_TYPE"='INDEX')

explain plan for select count(*) from t1 where object_type = 'TABLE';
-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     9 |  4455   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| T1   | 24980 |   219K|  4455   (1)| 00:00:01 |
-----------------------------------------------------------------------------------
   2 - storage("OBJECT_TYPE"='TABLE')
       filter("OBJECT_TYPE"='TABLE')

Different values get different estimates thanks to the histogram
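To make the idea of "one bucket per distinct value" concrete, here is a toy frequency histogram in Python. The sample data and row count are made up for illustration and are unrelated to T1; the function names are hypothetical, and values missing from the histogram simply return 0 here, whereas the optimizer falls back to density for unpopular values.

```python
from collections import Counter


def frequency_histogram(sample):
    """One bucket per distinct value, holding how many sampled rows carry it."""
    return Counter(sample)


def estimate_rows(histogram, value, num_rows):
    """Scale the value's share of the sample up to the table's row count."""
    sampled = sum(histogram.values())
    return num_rows * histogram.get(value, 0) / sampled


# Hypothetical skewed sample: INDEX is more common than TABLE or VIEW.
sample = ["INDEX"] * 3 + ["TABLE"] * 2 + ["VIEW"]
hist = frequency_histogram(sample)

print(estimate_rows(hist, "INDEX", 600))  # 300.0
print(estimate_rows(hist, "TABLE", 600))  # 200.0
```

Without the histogram, both values would receive the same uniform estimate of num_rows / NDV.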
What is an index?
• Structure that stores key(s)-location pairs
  – Key(s) are stored in sorted order
• Used to identify rows of interest without FTS
  – Navigating the index and extracting location(s)
• Depending on filters, faster than FTS (or not)
  – No fixed threshold, cheaper option wins
Index statistics
• Optimizer uses
  – Blevel
    • [ALL|DBA|USER]_INDEXES.BLEVEL
    • Used to estimate how expensive it is to locate the first leaf
  – Number of leaf blocks (LB)
    • [ALL|DBA|USER]_INDEXES.LEAF_BLOCKS
    • Used to estimate how many index leaf blocks to read
  – Clustering Factor (CLUF)
    • [ALL|DBA|USER]_INDEXES.CLUSTERING_FACTOR
    • Used to estimate how many table blocks to read
  – Distinct Keys (DK)
    • [ALL|DBA|USER]_INDEXES.DISTINCT_KEYS
    • Used to help with data correlation
What does it look like?
[Figure: index tree with Root, Branches and Leaves levels, above a row of blocks labeled B]
• Leaves are chained back and forth for asc/desc scans
• Number of jumps is CLUF
Index statistics
select index_name, blevel, leaf_blocks, distinct_keys, clustering_factor
from user_indexes where index_name = 'T1_IDX';

INDEX_NAME      BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
----------- ---------- ----------- ------------- -----------------
T1_IDX               2        2039         92056            920530

explain plan for select * from t1 where object_id = 1234;
-----------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |    10 |  1150 |    13   (0)|
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |    10 |  1150 |    13   (0)|
|*  2 |   INDEX RANGE SCAN                  | T1_IDX |    10 |       |     3   (0)|
-----------------------------------------------------------------------------------
   2 - access("OBJECT_ID"=1234)

• Distinct keys is 100% accurate, NUM_DISTINCT is approximated
• If CLUF ~= number of rows in the table, the index is inefficient
• Cost jumps by 10 for 10 rows (from 3 to 13) as a consequence of bad CLUF
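The jump from 3 to 13 can be reproduced with a widely quoted approximation of index range scan costing: blevel plus the share of leaf blocks visited, plus the share of the clustering factor for the table access. The slides do not spell this formula out, so it is an assumption here, and the function name is hypothetical; the inputs are the T1_IDX statistics and the OBJECT_ID selectivity (1/93192) from the earlier slide.

```python
import math


def index_range_scan_cost(blevel, leaf_blocks, clustering_factor, selectivity):
    """Return (index access cost, total cost including table access).

    Assumed approximation:
      index cost = blevel + ceil(leaf_blocks * selectivity)
      table cost = ceil(clustering_factor * selectivity)
    """
    index_cost = blevel + math.ceil(leaf_blocks * selectivity)
    table_cost = math.ceil(clustering_factor * selectivity)
    return index_cost, index_cost + table_cost


selectivity = 1 / 93192  # 1 / NUM_DISTINCT of OBJECT_ID
index_cost, total_cost = index_range_scan_cost(2, 2039, 920530, selectivity)

print(index_cost, total_cost)  # 3 13
```

With a clustering factor close to NUM_ROWS, each of the 10 expected rows is charged a separate table block visit, which is where the extra 10 comes from.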
Extended statistics
• Provide additional info to the CBO about
  – Data correlation (functional dependencies)
  – Expressions applied to column(s)
• Need to be manually implemented
  – Automatic in 12c, not bulletproof yet
• Lack of them usually translates into estimation mistakes
Extended statistics – Expression
explain plan for select count(*) from t1 where lower(object_type) = 'index';
-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     9 |  4459   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| T1   |  9206 | 82854 |  4459   (1)| 00:00:01 |
-----------------------------------------------------------------------------------
   2 - storage(LOWER("OBJECT_TYPE")='index')
       filter(LOWER("OBJECT_TYPE")='index')

Incorrect estimate, we know the right one is ~45k

dbms_stats.gather_table_stats(user,'T1',method_opt=>'FOR COLUMNS (lower(object_type)) SIZE 254');
-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |     9 |  4251   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE            |      |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS STORAGE FULL| T1   | 44990 |   395K|  4251   (1)| 00:00:01 |
-----------------------------------------------------------------------------------
   2 - storage(LOWER("OBJECT_TYPE")='index')
       filter(LOWER("OBJECT_TYPE")='index')

Correct estimate :)
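The 9206 in the first plan is consistent with the optimizer falling back to a fixed 1% guess when it has no statistics on the expression. The slides do not say where the number comes from, so the 1% figure is an assumption in this sketch, as are the names used.

```python
# Assumption: with no statistics on lower(object_type), an equality predicate
# on the expression is estimated with a flat 1% selectivity.
DEFAULT_EXPRESSION_SELECTIVITY = 0.01


def guessed_cardinality(num_rows, selectivity=DEFAULT_EXPRESSION_SELECTIVITY):
    """Row estimate when only the table's NUM_ROWS is known."""
    return round(num_rows * selectivity)


print(guessed_cardinality(920560))  # 9206
```

Once extended statistics with a histogram exist on the expression, the estimate moves to 44990, the same figure the plain object_type = 'INDEX' predicate received.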
estimate_percent
• Amount of data to sample for gathering stats
• Has an impact on time to gather and on quality
• Recommended (default) AUTO_SAMPLE_SIZE
  – Not recommended in 10g, yes in 11g onwards
  – Required for many features
  – Uses HyperLogLog algorithm internally (*)
method_opt
• On which columns to gather stats
• On which columns to gather histograms (# of buckets)
• Recommended (default) FOR ALL COLUMNS SIZE AUTO
  – Not recommended in 10g, yes in 11g onwards
  – Oracle determines hist/no-hist based on column usage
  – If the app knows better, follow the app recommendations
Can't Oracle do it for me?
• Oracle provides a nightly job to gather stats
  – Does a decent job starting 11g (so-so in 10g)
  – Prioritizes tables depending on # of changes
  – Only allowed to run for a fixed number of hours
    • Might not touch all needed objects
  – Collects object and dictionary stats only
• Apps might have specific requirements, follow them
Contact information
• http://mauro-pagano.com
  – Tools
    • SQLd360, TUNAs360, Pathfinder
• Email
  – [email protected]