Explain Class

Explains Explained


Page 1: Explain Class

Explains Explained

Page 2: Explain Class

What is an Explain???

DB2 EXPLAIN is a monitoring tool that produces information about:

– A plan, package, or SQL statement when it is bound. The output appears in a table that you create, called a plan table.

– The estimated cost of executing a SELECT, INSERT, UPDATE, or DELETE statement. The output appears in a table that you create, called a statement table.

Page 3: Explain Class

What does this mean to me???

• Information that EXPLAIN provides:

– The primary use of EXPLAIN is to observe the access paths for the SELECT parts of your statements.

– The information in the plan table can help you:

Determine the access path chosen for a query

Design databases, indexes, and application programs

Determine when to rebind an application

• For each access to a single table, EXPLAIN tells you if DB2 uses index access or a table space scan.

• For indexes, EXPLAIN tells you how many indexes and index columns are used and what I/O methods are used to read the pages.

• For joins of tables, EXPLAIN tells you the join method and type, the order in which DB2 joins the tables, and when and why it sorts any rows.

Page 4: Explain Class

How do I explain my query/program?

• Bind the program with the parameter EXPLAIN(YES)

– Explains all explainable SQL statements in the program

• Via an SQL statement, for a single query:

– EXPLAIN PLAN SET QUERYNO=nnnn FOR, followed by the statement to explain

• Princeton Softech CLIST

– On the command line, enter ADB2EXPL

Page 5: Explain Class

Where does explain info reside?

• Create your own PLAN_TABLE (it will only be on the system for 30 - 90 days)

• Use the application PLAN_TABLE

• It follows the naming convention xxxrUPD.PLAN_TABLE

– xxx = Application name (e.g. VAM, NEV, ALK)

– r = Region (e.g. db2U85, db2Q36, db2S49)

Page 6: Explain Class

Reading Your EXPLAIN

• SQL Query

• SELECT * FROM X.PLAN_TABLE

• In QMF, a user-friendly query is available: A9139.EXP_QUERY/A9139.EXP_PROG

• Other tools: The following tools can help you tune SQL queries:

DB2 Visual Explain

– Visual Explain is a graphical workstation feature of DB2 that provides:

An easy-to-understand display of a selected access path

Suggestions for changing an SQL statement

An ability to invoke EXPLAIN for dynamic SQL statements

An ability to provide DB2 catalog statistics for referenced objects of an access path

A subsystem parameter browser with keyword 'Find' capabilities

Princeton Softech has a CLIST you can run, ADB2EXE

Page 7: Explain Class

What questions to ask???

• "Is access through an index? (ACCESSTYPE is I, I1, N or MX)" in topic 6.4.2.1

• "Is access through more than one index? (ACCESSTYPE=M)" in topic 6.4.2.2

• "How many columns of the index are used in matching? (MATCHCOLS=n)" in topic 6.4.2.3

• "Is the query satisfied using only the index? (INDEXONLY=Y)" in topic 6.4.2.4

• "Is direct row access possible? (PRIMARY_ACCESSTYPE = D)" in topic 6.4.2.5

• "Is a view or nested table expression materialized?" in topic 6.4.2.6

• "Was a scan limited to certain partitions? (PAGE_RANGE=Y)" in topic 6.4.2.7

• "What kind of prefetching is done? (PREFETCH = L, S, or blank)" in topic 6.4.2.8

• "Is data accessed or processed in parallel? (PARALLELISM_MODE is I, C, or X)" in topic 6.4.2.9

• "Are sorts performed?" in topic 6.4.2.10

• "Is a subquery transformed into a join?" in topic 6.4.2.11

• "When are column functions evaluated? (COLUMN_FN_EVAL)" in topic 6.4.2.12
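The questions above can be answered mechanically from one plan table row. The following is a minimal sketch in Python, assuming the row has already been fetched into a dictionary keyed by column name; the function name describe_access and the wording of the notes are illustrative, and only the column values listed on this page are checked.

```python
def describe_access(row):
    """Summarize one plan-table row (a dict keyed by column name)."""
    def value(column):
        # Treat missing or NULL columns as blank.
        return str(row.get(column) or "").strip()

    notes = []
    if value("ACCESSTYPE") in ("I", "I1", "N", "MX"):
        notes.append(f"index access, MATCHCOLS={row.get('MATCHCOLS', 0)}")
    if value("ACCESSTYPE") == "M":
        notes.append("multiple index access")
    if value("INDEXONLY") == "Y":
        notes.append("index-only")
    if value("PRIMARY_ACCESSTYPE") == "D":
        notes.append("direct row access possible")
    if value("PAGE_RANGE") == "Y":
        notes.append("scan limited to certain partitions")
    prefetch = {"S": "sequential prefetch", "L": "list prefetch"}.get(value("PREFETCH"))
    if prefetch:
        notes.append(prefetch)
    if value("PARALLELISM_MODE") in ("I", "C", "X"):
        notes.append("parallel processing")
    return notes


# Hypothetical row, for illustration only.
sample = {"ACCESSTYPE": "I", "MATCHCOLS": 2, "INDEXONLY": "Y", "PREFETCH": "L"}
print(describe_access(sample))
```

A row with none of these values set produces an empty list, which is a cue to look at the remaining questions (materialization, sorts, subquery transformation) by hand.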

Page 8: Explain Class

Access Methods

• The METHOD column of the plan table holds a number (0, 1, 2, 3, or 4) that indicates the join method used for the step:

– 0 First table accessed, continuation of previous table accessed, or not used.

– 1 Nested loop join. For each row of the present composite table, matching rows of a new table are found and joined.

– 2 Merge scan join. The present composite table and the new table are scanned in the order of the join columns, and matching rows are joined.

– 3 Sorts needed by ORDER BY, GROUP BY, SELECT DISTINCT, UNION, a quantified predicate, or an IN predicate. This step does not access a new table.

– 4 Hybrid join. The current composite table is scanned in the order of the join-column rows of the new table. The new table is accessed using list prefetch.

Page 9: Explain Class

Nested Loop Join

• DB2 scans the composite (outer) table. For each row in that table that qualifies (by satisfying the predicates on that table), DB2 searches for matching rows of the new (inner) table. It concatenates any it finds with the current row of the composite table. If no rows match the current row, then:

For an inner join, DB2 discards the current row.

For an outer join, DB2 concatenates a row of null values.

• The nested loop join repetitively scans the inner table. That is, DB2 scans the outer table once, and scans the inner table as many times as the number of qualifying rows in the outer table. Hence, the nested loop join is usually the most efficient join method when the values of the join column passed to the inner table are in sequence and the index on the join column of the inner table is clustered, or the number of rows retrieved in the inner table through the index is small.
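The loop structure described above can be sketched in a few lines of Python. This is an illustration of the logic only, not how DB2 implements it: rows are plain tuples, the inner table is searched with a simple scan rather than an index, and the names nested_loop_join, outer_pred, and match are made up for the example.

```python
def nested_loop_join(outer, inner, outer_pred, match, inner_width, outer_join=False):
    """Join two lists of tuples the way the nested loop join is described."""
    result = []
    for outer_row in outer:                      # the outer table is scanned once
        if not outer_pred(outer_row):            # only qualifying outer rows drive a search
            continue
        # The inner table is searched once per qualifying outer row.
        matches = [inner_row for inner_row in inner if match(outer_row, inner_row)]
        if matches:
            result.extend(outer_row + inner_row for inner_row in matches)
        elif outer_join:
            # Outer join: concatenate a row of null values.
            result.append(outer_row + (None,) * inner_width)
        # Inner join with no match: the outer row is discarded.
    return result


outer = [(1, "a"), (2, "b"), (3, "c")]
inner = [(1, "x"), (1, "y"), (3, "z")]
same_key = lambda o, i: o[0] == i[0]
print(nested_loop_join(outer, inner, lambda o: True, same_key, 2))
print(nested_loop_join(outer, inner, lambda o: True, same_key, 2, outer_join=True))
```

The inner search runs once for every qualifying outer row, which is why the method pays off when few outer rows qualify or when each inner search is cheap.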

Page 11: Explain Class

Merge Scan Join

• DB2 scans both tables in the order of the join columns. If no efficient indexes on the join columns provide the order, DB2 might sort the outer table, the inner table, or both. The inner table is put into a work file; the outer table is put into a work file only if it must be sorted. When a row of the outer table matches a row of the inner table, DB2 returns the combined rows.

• DB2 then reads another row of the inner table that might match the same row of the outer table and continues reading rows of the inner table as long as there is a match. When there is no longer a match, DB2 reads another row of the outer table.

If that row has the same value in the join column, DB2 reads again the matching group of records from the inner table. Thus, a group of duplicate records in the inner table is scanned as many times as there are matching records in the outer table.
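The scan order described above, including the re-read of a duplicate group, can be sketched as follows. This is a simplified Python illustration, not DB2's implementation: both inputs are sorted up front (standing in for an index or a sort into a work file), rows are tuples, and merge_scan_join and its parameters are names invented for the example.

```python
def merge_scan_join(outer, inner, key_outer, key_inner):
    """Inner-join two lists of tuples by scanning both in join-column order."""
    outer = sorted(outer, key=key_outer)   # order supplied by an index or a sort
    inner = sorted(inner, key=key_inner)
    result = []
    group_start = 0                        # first inner row of the current matching group
    for outer_row in outer:
        key = key_outer(outer_row)
        # Skip inner rows whose join value is lower than the current outer value.
        while group_start < len(inner) and key_inner(inner[group_start]) < key:
            group_start += 1
        # Read inner rows for as long as they match the current outer row.
        position = group_start
        while position < len(inner) and key_inner(inner[position]) == key:
            result.append(outer_row + inner[position])
            position += 1
        # group_start is left in place, so an outer row with the same join
        # value reads the same group of inner rows again.
    return result


outer = [(2, "b"), (1, "a"), (2, "c")]
inner = [(2, "x"), (3, "y"), (2, "z")]
print(merge_scan_join(outer, inner, lambda r: r[0], lambda r: r[0]))
```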

Page 13: Explain Class

Hybrid Join

• The method requires obtaining RIDs in the order needed to use list prefetch. The steps are shown in Figure 212 in topic 6.4.4.4. In that example, both the outer table (OUTER) and the inner table (INNER) have indexes on the join columns.

1 Scans the outer table (OUTER).

2 Joins the outer table with RIDs from the index on the inner table. The result is the phase 1 intermediate table. The index of the inner table is scanned for every row of the outer table.

3 Sorts the data in the outer table and the RIDs, creating a sorted RID list and the phase 2 intermediate table. The sort is indicated by a value of Y in column SORTN_JOIN of the plan table. If the index on the inner table is a clustering index, DB2 can skip this sort; the value in SORTN_JOIN is then N.

4 Retrieves the data from the inner table, using list prefetch.

5 Concatenates the data from the inner table and the phase 2 intermediate table to create the final composite table.
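The five steps above can be traced with a small Python sketch. The data structures are assumptions made for illustration: the inner index is a dictionary from join value to a list of RIDs, the inner table is a dictionary from RID to row, and fetching rows in RID order stands in for list prefetch. The name hybrid_join is hypothetical.

```python
def hybrid_join(outer, inner_index, inner_table, key_outer, index_clustered=False):
    """Trace the hybrid join steps on in-memory stand-ins for table and index."""
    # Steps 1 and 2: scan the outer table and probe the inner index for every
    # outer row, giving the phase 1 intermediate table of (outer_row, rid) pairs.
    phase1 = [(row, rid) for row in outer for rid in inner_index.get(key_outer(row), [])]

    # Step 3: sort by RID (SORTN_JOIN = Y); skipped when the index on the
    # inner table is a clustering index (SORTN_JOIN = N).
    phase2 = phase1 if index_clustered else sorted(phase1, key=lambda pair: pair[1])

    # Step 4: read each needed inner row once, in the order of the RID list.
    fetched = {}
    for _, rid in phase2:
        if rid not in fetched:
            fetched[rid] = inner_table[rid]

    # Step 5: concatenate inner data with the phase 2 intermediate table.
    return [row + fetched[rid] for row, rid in phase2]


outer = [(10, "a"), (20, "b")]
inner_index = {10: [7, 3], 20: [5]}                       # join value -> RIDs
inner_table = {3: (10, "x"), 5: (20, "y"), 7: (10, "z")}  # RID -> row
print(hybrid_join(outer, inner_index, inner_table, lambda r: r[0]))
```

Note that the result comes back in RID order of the inner table, not in the order of the outer table.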

Page 14: Explain Class

Hybrid Join Explain Values

• METHOD='4' – A hybrid join was used.

• SORTC_JOIN='Y' – The composite table was sorted.

• SORTN_JOIN='Y' – The intermediate table was sorted in the order of inner table RIDs. A non-clustered index accessed the inner table RIDs.

• SORTN_JOIN='N' – The intermediate table RIDs were not sorted. A clustered index retrieved the inner table RIDs, and the RIDs were already well ordered.

• PREFETCH='L' – Pages were read using list prefetch.

Page 16: Explain Class

Index Access

• Is access through an index?

• If the column ACCESSTYPE in the plan table has one of the values (I, I1, N or MX), then DB2 uses an index to access the table named in column TNAME. The columns ACCESSCREATOR and ACCESSNAME identify the index.

– Matching index scan (MATCHCOLS>0)

– Index screening

– Nonmatching index scan (ACCESSTYPE=I and MATCHCOLS=0)

– IN-list index scan (ACCESSTYPE=N)

– Multiple index access (ACCESSTYPE is M, MX, MI, or MU)

– One-fetch access (ACCESSTYPE=I1)

– Index-only access (INDEXONLY=Y)

– Equal unique index (MATCHCOLS=number of index columns)

Page 17: Explain Class

Matching Index Scan

• In a matching index scan, predicates are specified on either the leading or all of the index key columns. These predicates provide filtering; only specific index pages and data pages need to be accessed. If the degree of filtering is high, the matching index scan is efficient.

• In the general case, the rules for determining the number of matching columns are simple, although there are a few exceptions.

Look at the index columns from leading to trailing. For each index column, search for an indexable boolean term predicate on that column. If such a predicate is found, then it can be used as a matching predicate.

– Column MATCHCOLS in a plan table shows how many of the index columns are matched by predicates.

If no more matching predicates are found, the search for matching predicates stops.

If a matching predicate is a range predicate, then there can be no more matching columns.
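The general-case rules above translate into a short loop. The following Python sketch is illustrative only and ignores the exceptions mentioned: predicates are given as a dictionary from column name to operator, every predicate is assumed to be an indexable Boolean term predicate, and an IN predicate is counted as an equal predicate, as in the IN-list index scan case. The name match_cols is made up for the example.

```python
EQUAL_OPERATORS = {"=", "IN"}   # assumption: anything else is treated as a range predicate


def match_cols(index_columns, predicates):
    """Count matching columns for an index, given {column: operator}."""
    matched = 0
    for column in index_columns:            # leading to trailing
        operator = predicates.get(column)
        if operator is None:
            break                           # no predicate on this column: search stops
        matched += 1
        if operator not in EQUAL_OPERATORS:
            break                           # range predicate: no more matching columns
    return matched


index = ("C1", "C2", "C3", "C4")
print(match_cols(index, {"C1": "=", "C3": ">", "C4": "="}))
print(match_cols(index, {"C1": "=", "C2": "IN", "C3": ">", "C4": "<"}))
```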

Page 18: Explain Class

Index Screening

• In index screening, predicates are specified on index key columns but are not part of the matching columns. Those predicates improve the index access by reducing the number of rows that qualify while searching the index. For example, with an index on T(C1,C2,C3,C4) in the following SQL statement, C3>0 and C4=2 are index screening predicates.

SELECT * FROM T WHERE C1 = 1 AND C3 > 0 AND C4 = 2 AND C5 = 8;

• The screening predicates can be applied on the index, but they are not matching predicates. C5 is not a column in the index, so C5=8 must be evaluated when the data page is retrieved. The value of MATCHCOLS in the plan table in this case would be 1.
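The split between matching, screening, and data-page predicates in this example can be sketched in Python. This is a simplified illustration under stated assumptions: each predicate is a (column, operator) pair, there is at most one predicate per column, and any operator other than "=" is treated as a range predicate. The name classify_predicates is hypothetical.

```python
def classify_predicates(index_columns, predicates):
    """Return (matching, screening, data_page) column lists."""
    by_column = {column: operator for column, operator in predicates}
    matching, screening = [], []
    can_match = True
    for column in index_columns:                # leading to trailing
        operator = by_column.get(column)
        if operator is None:
            can_match = False                   # a gap in the leading columns ends matching
            continue
        if can_match:
            matching.append(column)
            if operator != "=":
                can_match = False               # a range predicate ends matching
        else:
            screening.append(column)            # applied on the index, but not matching
    # Predicates on columns outside the index wait for the data page.
    data_page = [column for column, _ in predicates if column not in index_columns]
    return matching, screening, data_page


index = ("C1", "C2", "C3", "C4")
where = [("C1", "="), ("C3", ">"), ("C4", "="), ("C5", "=")]
print(classify_predicates(index, where))
```

For the statement on this page, the sketch reports C1 as the only matching column, C3 and C4 as screening columns, and C5 as evaluated on the data page.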

Page 19: Explain Class

Nonmatching Index Scan

• In a nonmatching index scan no matching columns are in the index. Hence, all the index keys must be examined.

• Because a nonmatching index usually provides little or no filtering, only a few cases provide an efficient access path. The following situations are examples:

– When index screening predicates exist

– When the clause OPTIMIZE FOR n ROWS is used

– When more than one table exists in a nonsegmented table space

Page 20: Explain Class

IN-list Index Scan

• An IN-list index scan is a special case of the matching index scan, in which a single indexable IN predicate is used as a matching equal predicate.

• You can regard the IN-list index scan as a series of matching index scans with the values in the IN predicate being used for each matching index scan. The following example has an index on (C1,C2,C3,C4) and might use an IN-list index scan:

SELECT * FROM T WHERE C1=1 AND C2 IN (1,2,3) AND C3>0 AND C4<100;

• The plan table shows MATCHCOLS = 3 and ACCESSTYPE = N. The IN-list scan is performed as the following three matching index scans:

• (C1=1,C2=1,C3>0), (C1=1,C2=2,C3>0), (C1=1,C2=3,C3>0)
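The expansion into a series of matching index scans can be sketched in Python. This only generates the scan descriptions shown above as text; it does not touch any index, and the function name in_list_scans and its parameters are invented for the illustration.

```python
def in_list_scans(equal_predicates, in_column, in_values, range_predicate):
    """Expand one IN predicate into one matching index scan per value."""
    scans = []
    for value in in_values:
        parts = [f"{column}={constant}" for column, constant in equal_predicates]
        parts.append(f"{in_column}={value}")   # the IN value acts as an equal predicate
        parts.append(range_predicate)          # last matching column
        scans.append("(" + ",".join(parts) + ")")
    return scans


for scan in in_list_scans([("C1", 1)], "C2", [1, 2, 3], "C3>0"):
    print(scan)
```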

Page 21: Explain Class

Multiple Index Access

• Multiple index access uses more than one index to access a table. It is a good access path when:

No single index provides efficient access.

A combination of index accesses provides efficient access.

• RID lists are constructed for each of the indexes involved. The unions or intersections of the RID lists produce a final list of qualified RIDs that is used to retrieve the result rows, using list prefetch. You can consider multiple index access as an extension to list prefetch with more complex RID retrieval operations in its first phase. The complex operators are union and intersection.

• This is not typically a recommended access path, and preferably it should not appear in any online transaction.
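The union and intersection of RID lists can be illustrated with Python sets. This is a sketch of the idea only: RIDs are plain integers, the final list is sorted as a stand-in for the ordering that list prefetch usually works from, and the name qualified_rids is hypothetical.

```python
def qualified_rids(rid_lists, operator):
    """Combine per-index RID lists into the final list of qualified RIDs."""
    rid_sets = [set(rids) for rids in rid_lists]
    if operator == "union":            # predicates joined by OR
        combined = set.union(*rid_sets)
    elif operator == "intersection":   # predicates joined by AND
        combined = set.intersection(*rid_sets)
    else:
        raise ValueError("operator must be 'union' or 'intersection'")
    return sorted(combined)


from_index_1 = [4, 1, 9]   # RIDs qualified through the first index
from_index_2 = [9, 4, 7]   # RIDs qualified through the second index
print(qualified_rids([from_index_1, from_index_2], "intersection"))
print(qualified_rids([from_index_1, from_index_2], "union"))
```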

Page 22: Explain Class

One-Fetch Access

• One-fetch index access requires retrieving only one row. It is the best possible access path and is chosen whenever it is available. It applies to a statement with a MIN or MAX column function: the order of the index allows a single row to give the result of the function.

• One-fetch index access is a possible access path when:

There is only one table in the query.

There is only one column function (either MIN or MAX).

Either no predicate or all predicates are matching predicates for the index.

There is no GROUP BY.

Column functions are on:

The first index column if there are no predicates

The last matching column of the index if the last matching predicate is a range type

The next index column (after the last matching column) if all matching predicates are equal type

Page 23: Explain Class

Index-Only Access

• With index-only access, the access path does not require any data pages because the access information is available in the index. Conversely, when an SQL statement requests a column that is not in the index, updates any column in the table, or deletes a row, DB2 has to access the associated data pages. Because the index is almost always smaller than the table itself, an index-only access path usually processes the data efficiently.

Page 24: Explain Class

Equal Unique Index

• An index that is fully matched and unique, and in which all matching predicates are equal-predicates, is called an equal unique index case. This case guarantees that only one row is retrieved. If there is no one-fetch index access available, this is considered the most efficient access over all other indexes that are not equal unique. (The uniqueness of an index is determined by whether or not it was defined as unique.)

• Sometimes DB2 can determine that an index that is not fully matching is actually an equal unique index case.

Page 25: Explain Class

Am I using Prefetch???

• Prefetching is a method of determining in advance that a set of data pages is about to be used and then reading the entire set into a buffer with a single asynchronous I/O operation.

Page 26: Explain Class

If the value of PREFETCH is:

S, the method is called sequential prefetch. The data pages that are read in advance are sequential. A table space scan always uses sequential prefetch. An index scan might not use it.

L, the method is called list prefetch. One or more indexes are used to select the RIDs for a list of data pages to be read in advance; the pages need not be sequential. Usually, the RIDs are sorted. The exception is the case of a hybrid join when the value of column SORTN_JOIN is N.

Blank, prefetching is not chosen as an access method. However, depending on the pattern of the page access, data can be prefetched at execution time through a process called sequential detection.

Page 27: Explain Class

List Prefetch Warning

• During execution, DB2 ends list prefetching if more than 25% of the rows in the table (with a minimum of 4075) must be accessed. Record IFCID 0125 in the performance trace, mapped by macro DSNDQW01, indicates whether list prefetch ended.

• When list prefetch ends, the query continues processing by a method that depends on the current access path.

For access through a single index or through the union of RID lists from two indexes, processing continues by a table space scan.

For index access before forming an intersection of RID lists, processing continues with the next step of multiple index access. If no step remains and no RID list has been accumulated, processing continues by a table space scan.
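The cutoff described above can be written as a small check. This Python sketch rests on one reading of the rule that should be treated as an assumption: "with a minimum of 4075" is taken to mean that the threshold is the larger of 25% of the table's rows and 4075. The function name list_prefetch_ends is made up for the example.

```python
def list_prefetch_ends(rows_to_access, table_rows, minimum=4075):
    """Return True if list prefetch would be abandoned under the assumed rule."""
    # Assumption: the threshold is 25% of the table, but never below the minimum.
    threshold = max(0.25 * table_rows, minimum)
    return rows_to_access > threshold


print(list_prefetch_ends(5000, 10000))     # small table: the 4075 minimum applies
print(list_prefetch_ends(20000, 100000))   # large table: the 25% limit applies
```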

Page 28: Explain Class

Is part of query being Materialized???

• Sometimes DB2 has to materialize a result table in conjunction with other joins, views, or nested table expressions. You can tell when this happens by looking at the TABLE_TYPE and TNAME columns of the plan table.

• When DB2 chooses materialization, TABLE_TYPE contains a ‘W’ or a ‘Q’

• W = Actual materialization

• Q = Virtual materialization

Page 29: Explain Class

Sorts

• SORTN_JOIN and SORTC_JOIN: SORTN_JOIN indicates that the new table of a join is sorted before the join. (For hybrid join, this is a sort of the RID list.) When SORTN_JOIN and SORTC_JOIN are both 'Y', two sorts are performed for the join. The sorts for joins are indicated on the same row as the new table access.

• METHOD 3 sorts: These are used for ORDER BY, GROUP BY, SELECT DISTINCT, UNION, or a quantified predicate. A single row of the plan table can indicate two sorts of a composite table, but only one sort is actually done.

• SORTC_UNIQ and SORTC_ORDERBY: SORTC_UNIQ indicates a sort to remove duplicates, as might be needed by a SELECT statement with DISTINCT or UNION. SORTC_ORDERBY usually indicates a sort for an ORDER BY clause. But SORTC_UNIQ and SORTC_ORDERBY also indicate when the results of a noncorrelated subquery are sorted, both to remove duplicates and to order the results. One sort does both the removal and the ordering.

Page 30: Explain Class

• Most information for this class can be found in the Application Programming and SQL Guide (SC26-9933-03)

• The easiest way to find explain information documentation is via the web page:

– http://www-3.ibm.com/software/data/db2/os390/v7books.html