Post on 04-Jun-2020
Expert Oracle SQL
Optimization, Deployment, and Statistics
Tony Hasler
Apress
Contents
About the Author xxii
About the Technical Reviewers xxiii
Acknowledgments xxv
Foreword xxvii
Introduction xxix
Part 1: Basic Concepts 1
Chapter 1: SQL Features 3
SQL and Declarative Programming Languages 3
Statements and SQL_IDs 4
Cross-Referencing Statement and SQL_ID 5
Array Interface 7
Subquery Factoring 9
The Concept of Subquery Factoring 9
Joins 16
Inner Joins and Traditional Join Syntax 16
Outer Joins and ANSI Join Syntax 17
Summary 24
Chapter 2: The Cost-Based Optimizer 25
The Optimal Execution Plan 26
The Definition of Cost 26
The CBO's Cost-Estimating Algorithm 27
Calculating Cost 27
The Quality of the CBO's Plan Selection 27
The Optimization Process 29
Parallelism 29
Query Transformation 32
Final State Query Optimization 34
Summary 35
Chapter 3: Basic Execution Plan Concepts 37
Displaying Execution Plans 37
Displaying the Results of EXPLAIN PLAN 37
EXPLAIN PLAN May Be Misleading 40
Displaying Output from the Cursor Cache 41
Displaying Execution Plans from the AWR 42
Understanding Operations 43
What an Operation Does 43
How Operations Interact 44
How Long Do Operations Take? 45
Summary 46
Chapter 4: The Runtime Engine 47
Collecting Operation Level Runtime Data 47
The GATHER_PLAN_STATISTICS Hint 48
Setting STATISTICS_LEVEL=ALL 48
Enabling SQL Tracing 49
Displaying Operational Level Data 49
Displaying Runtime Engine Statistics with DBMS_XPLAN.DISPLAY_CURSOR 50
Displaying Runtime Engine Statistics with V$SQL_PLAN_STATISTICS_ALL 51
Displaying Session Level Statistics with Snapper 52
The SQL Performance Monitor 53
Workareas 55
Operations Needing a Workarea 55
Allocating Memory to a Workarea 56
Optimal, One-Pass, and Multipass Operations 57
Shortcuts 57
Scalar Subquery Caching 58
Join Shortcuts 59
Result and OCI Caches 60
Function Result Cache 61
Summary 63
Chapter 5: Introduction to Tuning 65
Understanding the Problem 65
Understanding the Business Problem 65
Understanding the Technical Problem 66
Understanding the SQL Statement 66
Understanding the Data 67
Understanding the Problem Wrap Up 67
Analysis 67
Running the Statement to Completion 67
Analyzing Elapsed Time 68
When the Elapsed Time Doesn't Add Up 69
When the Time Does Add Up 69
Fixing the Problem 72
Check the Statistics 72
Changing the Code 73
Adding Hints 73
Making Physical Changes to the Database 74
Making Changes to the Environment 75
Running the SQL Tuning Advisor 75
Rethink the Requirement 77
Summary 77
Chapter 6: Object Statistics and Deployment 79
The Principle of Performance Management 79
The Royal Mail Example 79
The Airport Example 80
Service Level Agreements in IT 80
Non-database Deployment Strategies 80
The Strategic Direction for the CBO 81
The History of Strategic Features 81
Implications of the CBO Strategy 81
Why We Need to Gather Statistics 82
How Often Do We Need to Change Execution Plans? 87
Wolfgang Breitling's Tuning by Cardinality Feedback 87
The TCF Corollary 88
Concurrent Execution Plans 88
Skewed Data and Histograms 88
Workload Variations 89
Concurrent Execution Plans Wrap Up 91
Oracle's Plan Stability Features 91
Stored Outlines 91
SQL Profiles 91
SQL Plan Baselines 91
Introducing TSTATS 92
Acknowledgements 92
Adjusting Column Statistics 92
TSTATS in a Nutshell 93
An Alternative to TSTATS 94
Deployment Options for Tuned SQL 94
When Just One SQL Statement Needs to Change 94
When Multiple SQL Statements Need to Change 95
Summary 98
Part 2: Advanced Concepts 99
Chapter 7: Advanced SQL Concepts 101
Query Blocks and Subqueries 101
Terminology 101
How Query Blocks are Processed 102
Functions 103
Aggregate Functions 103
Analytic Functions 108
Combining Aggregate and Analytic Functions 119
Single-row Functions 121
The MODEL Clause 122
Spreadsheet Concepts 122
A Moving Median with the MODEL Clause 123
Why Not Use PL/SQL? 125
Summary 126
Chapter 8: Advanced Execution Plan Concepts 127
Displaying Additional Execution Plan Sections 127
DBMS_XPLAN Formatting Options 127
Running EXPLAIN PLAN for Analysis 129
Query Blocks and Object Alias 129
Outline Data 135
Peeked Binds 139
Predicate Information 141
Column Projection 142
Remote SQL 143
Adaptive Plans 143
Result Cache Information 147
Notes 148
Understanding Parallel Execution Plans 148
Operations That Can Be Run in Parallel 149
Controlling Parallel Execution 149
Granules of Parallelism 153
Data Flow Operators 155
Parallel Query Server Sets and DFO Trees 156
Table Queues and DFO Ordering 158
Multiple DFO Trees 160
Parallel Query Distribution Mechanisms 162
Why Forcing Parallel Query Doesn't Force Parallel Query 166
Further Reading 167
Understanding Global Hints 168
Hinting Data Dictionary Views 168
Applying Hints to Transformed Queries 171
The NO_MERGE Hint 171
Summary 174
Chapter 9: Object Statistics 175
The Purpose of Object Statistics 176
Creating Object Statistics 176
Gathering Object Statistics 176
Exporting and Importing Statistics 177
Transferring Statistics 180
Setting Object Statistics 181
Creating or Rebuilding Indexes and Tables 181
Creating Object Statistics Wrap Up 182
Examining Object Statistics 182
Examining Object Statistics in the Data Dictionary 182
Examining Exported Object Statistics 183
Statistic Descriptions 184
Table Statistics 184
Index Statistics 186
Column Statistics 191
Statistics Descriptions Wrap-up 207
Statistics and Partitions 207
Gathering Statistics on Partitioned Tables 207
How the CBO Uses Partition-level Statistics 212
Why We Need Partition-level Statistics 215
Statistics and Partitions Wrap-up 218
Restoring Statistics 218
Locking Statistics 220
Pending Statistics 220
A Footnote on Other Inputs to the CBO 225
Initialization Parameters 225
System Statistics 226
Other Data Dictionary Information 227
Summary 227
Part 3: The Cost-Based Optimizer 229
Chapter 10: Access Methods 231
Access by ROWID 231
ROWID Concepts 231
Access by ROWID 232
B-tree Index Access 237
INDEX FULL SCAN 237
INDEX RANGE SCAN 240
INDEX SKIP SCAN 242
INDEX UNIQUE SCAN 243
INDEX FAST FULL SCAN 243
INDEX SAMPLE FAST FULL SCAN 244
INDEX JOIN 245
AND_EQUAL 247
Bitmap Index Access 247
Full Table Scans 252
TABLE and XMLTABLE 254
Cluster Access 258
Summary 260
Chapter 11: Joins 261
Join Methods 261
Nested loops 261
Hash joins 267
Merge joins 269
Cartesian joins 272
Join Orders 272
Join orders without hash join input swapping 272
Join orders with hash join input swapping 274
Semi-joins 278
Standard semi-joins 278
Null-accepting semi-joins 279
Anti-joins 280
Standard anti-joins 280
Null-aware anti-joins 280
Distribution Mechanisms for Parallel Joins 281
The PQ_DISTRIBUTE hint and parallel joins 282
Full partition-wise joins 282
Partial partition-wise joins 283
Broadcast distribution 285
Row source replication 286
Hash distribution 287
Adaptive parallel joins 289
Data buffering 290
Bloom filtering 291
Summary 293
Chapter 12: Final State Optimization 295
Join Order 295
Join Method 298
Access Method 300
IN List Iteration 302
Summary 303
Chapter 13: Optimizer Transformations 305
No-brainer Transformations 306
Count Transformation 306
Predicate Move-around 307
Set and Join Transformations 309
Join Elimination 309
Outer Join to Inner Join 313
Full Outer Join to Outer Join 315
Semi-Join to Inner Join 317
Subquery Unnesting 319
Partial Joins 323
Join Factorization 327
Set to Join 330
Aggregation Transformations 332
Distinct Aggregation 332
Distinct Placement 333
Group by Placement 335
Group by Pushdown 340
Subquery Transformations 343
Simple View Merging 343
Complex View Merging 345
Factored Subquery Materialization 348
Subquery Pushdown 351
Join Predicate Pushdown 356
Subquery Decorrelation 358
Subquery Coalescing 361
Miscellaneous Transformations 364
Or Expansion 364
Materialized View Rewrite 366
Grouping Sets to Union Expansion 367
Order by Elimination 369
Table Expansion 371
Star Transformation 374
The Distributed Join Filter Problem 374
Solving the Distributed Join Filter Problem 375
In the Future 381
Summary 384
Part 4: Optimization 385
Chapter 14: Why Do Things Go Wrong? 387
Cardinality Errors 387
Correlation of Columns 388
Statistics Feedback and DBMS_STATS.SEED_COL_USAGE Features 389
Functions 390
Stale Statistics 390
Daft Data Types 390
Caching Effects 391
Transitive Closure 395
Unsupported Transformations 398
Missing Information 399
Bad Physical Design 400
Contention 401
Summary 401
Chapter 15: Physical Database Design 403
Adding and Removing Indexes 403
Removing Indexes 404
Identifying Required Indexes 405
Managing Contention 411
Sequence Contention 412
The Hot-block Problem 412
Partitioning 415
Full Table Scans on Partitions or Subpartitions 415
Partition-wise Joins 416
Parallelization and Partitioning 418
Denormalization 419
Materialized Views 419
Manual Aggregation and Join Tables 419
Bitmap Join Indexes 420
Compression 422
Index Compression 422
Table Compression 423
LOBs 424
Summary 424
Chapter 16: Rewriting Queries 425
Use of Expressions in Predicates 425
Equality Versus Inequality Predicates 428
Implicit Data-Type Conversions 430
Bind Variables 432
UNION, UNION ALL, and OR 432
Issues with General Purpose Views 437
How to Use Temporary Tables 438
Avoiding Multiple Similar Subqueries 441
Summary 443
Chapter 17: Optimizing Sorts 445
The Mechanics of Sorting 445
Memory Limits for Sorts 445
Disk-based Sorts 446
Avoiding Sorts 448
Non-sorting Aggregate Functions 448
Index Range Scans and Index Full Scans 451
Avoiding Duplicate Sorts 453
Sorting Fewer Columns 454
Taking Advantage of ROWIDs 454
Solving the Pagination Problem 459
Sorting Fewer Rows 461
Additional Predicates with Analytic Functions 462
Views with Lateral Joins 464
Avoiding Data Densification 469
Parallel Sorts 473
Summary 475
Chapter 18: Using Hints 477
Are Hints Supportable? 478
The PUSH_SUBQ story 478
The DML error logging story 478
Documented versus undocumented hints 479
The MODEL clause corollary 479
Supportability conclusion 480
Types of Hints 481
Edition-based redefinition hints 481
Hints that cause errors 481
Runtime engine hints 483
Optimizer hints that are hints 486
Production-hinting case studies 496
The bushy join 496
Materialization of factored subqueries 499
Suppressing order by elimination and subquery unnesting 502
The v$database_block_corruption view 506
Summary 506
Chapter 19: Advanced Tuning Techniques 507
Leveraging an INDEX FAST FULL SCAN 507
Simulating a Star Transformation 508
Simulating an INDEX JOIN 509
Joining Multi-Column Indexes 511
Using ROWID Ranges for Application-Coded Parallel Execution 513
Converting an Inner Join to an Outer Join 516
Summary 521
Part 5: Managing Statistics with TSTATS 523
Chapter 20: Managing Statistics with TSTATS 525
Managing Column Statistics 527
Time-based columns 527
Columns with NUM_DISTINCT=1 529
Skewed column values and range predicates 535
Correlated columns and expressions 536
Use of sample data for complex statistical issues 536
Managing column statistics wrap up 546
Statistics and Partitions 546
The DBMS_STATS.COPY_TABLE_STATS myth 547
Cardinality estimates with global statistics 554
Costing full table scans of table partitions 556
Temporary Tables 564
The pros and cons of dynamic sampling 565
Fabricating statistics for temporary tables 565
How to Deploy TSTATS 573
Summary 575
Index 577