Index Internals
Julian Dyke
Independent Consultant
Web Version
juliandyke.com
© 2005 Julian Dyke
Agenda
1. Introduction
2. Block Structure
3. Block Compression
4. Insertion
5. Deletion
6. Coalesce / Rebuild
7. Freelists
8. Virtual Indexes
B*Tree Indexes
Based on a modified B*Tree algorithm
Contain branch blocks and leaf blocks
Blocks contain keys and data
Keys are maintained in sorted order within blocks
All leaf blocks are at the same depth
All blocks are on average 75% full
Index Types
There are several recent variants of B*tree indexes, including:
Type                      Introduced
Bitmap Indexes            7.3.2
Index Organised Table     8.0
Partitioned Indexes       8.0
Reverse Key               8.0
LOB Index                 8.0
Compressed                8.1.5
Function-Based Indexes    8.1.5
Descending                8.1.5
Virtual Indexes           8.1.5
Bitmap Join Indexes       9.0.1
Limits
Maximum number of B*tree levels is 24
Maximum number of columns is 16 in 7.3 and below; 32 in 8.0 and above
Maximum key lengths (in bytes) vary with release and block size
Block Size    8.1.7    9.0.1    9.2.0
2048          758      1526     1478
4096          1578     3166     3118
8192          3218     6446     6398
16384         6498     13006    12958
Leaf Blocks
Every index has at least one leaf block
Each leaf block contains 0 or more rows
Each row contains a key and data
Indexes can be unique or non-unique
Leaf row formats differ for unique and non-unique indexes
Leaf Block Structure
Component              Size
Block Common Header    20 bytes
Transaction Header     72 bytes
Index Header           16 bytes
Index Leaf Header      16 bytes
Slot Array             2 bytes per row
Free Space             variable
Index Leaf Rows        variable
Tail                   4 bytes
Block Size 2 bytes
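As a rough illustration of the layout above, the fixed header and tail sizes can be summed to estimate how much of a leaf block is left for the slot array, the leaf rows and free space. This is only a sketch that takes the slide's component sizes at face value; the 8192-byte block size and the 16-byte row length are assumptions chosen for the example, and a real block may reserve more space than this.

```python
# Fixed components of a leaf block, using the sizes quoted on the slide.
LEAF_FIXED_BYTES = {
    "block common header": 20,
    "transaction header": 72,
    "index header": 16,
    "index leaf header": 16,
    "tail": 4,
}
SLOT_BYTES_PER_ROW = 2  # one slot array entry per leaf row


def leaf_space_for_rows(block_size):
    """Bytes left for the slot array, leaf rows and free space."""
    return block_size - sum(LEAF_FIXED_BYTES.values())


def max_rows(block_size, row_length):
    """Upper bound on rows per block; each row also needs a slot entry."""
    return leaf_space_for_rows(block_size) // (row_length + SLOT_BYTES_PER_ROW)


# Assumed example values (not from the slide): 8192-byte block, 16-byte rows.
example_space = leaf_space_for_rows(8192)
example_rows = max_rows(8192, 16)
```

The bound ignores PCTFREE and any deleted rows, so it is best read as a ceiling rather than a prediction.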
Branch Blocks
Indexes may contain branch blocks
Branch blocks point to other branch blocks or leaf blocks
Branch blocks contain 0 or more rows
Each row has a suffix compressed key and a pointer to the next block
Compressed rows are terminated with 0xFE byte
Branch Block Structure
Component              Size
Block Common Header    20 bytes
Transaction Header     48 bytes
Index Header           16 bytes
Index Branch Header    24 bytes
Slot Array             2 bytes per row
Free Space             variable
Index Branch Rows      variable
Tail                   4 bytes
Block Size 2 bytes
Branch Blocks
Each block has a pointer to the left-hand side of the tree. This is part of the header
A branch block containing N rows points to N+1 blocks
Level 2 (root block):     [S]
Level 1 (branch blocks):  [D E] [U]
Level 0 (leaf blocks):    [AUS BEL CAN] [DEN] [ENG] [SCO SPA] [USA]
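The N rows / N+1 pointers rule can be sketched as a lookup over the example tree on this slide. The block names and the dictionary layout below are hypothetical, chosen only for illustration; the first entry of each `children` list stands in for the left-most pointer held in the branch header.

```python
from bisect import bisect_right

# Example tree from the slide. A branch block with N keys has N+1 children:
# children[0] plays the role of the left-most pointer in the block header.
BLOCKS = {
    "root": {"keys": ["S"], "children": ["branch1", "branch2"]},
    "branch1": {"keys": ["D", "E"], "children": ["leaf1", "leaf2", "leaf3"]},
    "branch2": {"keys": ["U"], "children": ["leaf4", "leaf5"]},
    "leaf1": {"rows": ["AUS", "BEL", "CAN"]},
    "leaf2": {"rows": ["DEN"]},
    "leaf3": {"rows": ["ENG"]},
    "leaf4": {"rows": ["SCO", "SPA"]},
    "leaf5": {"rows": ["USA"]},
}


def find_leaf(key, block="root"):
    """Return the list of blocks visited while descending to the leaf for key."""
    path = [block]
    while "children" in BLOCKS[block]:
        node = BLOCKS[block]
        # The number of separator keys <= key selects the child to follow.
        block = node["children"][bisect_right(node["keys"], key)]
        path.append(block)
    return path
```

For example, `find_leaf("SPA")` visits the root, the right-hand branch block and then the leaf holding SCO and SPA, so every lookup in this tree reads three blocks.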
Root Block
Every index has one root block
May be a leaf block or a branch block
Can be an empty leaf block
Always the next block after the segment header in the first extent
[Diagram: first extent, with the segment header followed by the root block]
BLEVEL versus Height
BLEVEL is the number of branch block levels in the B*tree
ANALYZE INDEX i1 COMPUTE STATISTICS;
SELECT blevel FROM dba_indexes WHERE index_name = 'I1';
Height is the total number of levels in the B*tree
ANALYZE INDEX i1 VALIDATE STRUCTURE;
SELECT height FROM index_stats;
Height = BLEVEL + 1
Internal versus External ROWIDs
Internal ROWIDs are used in branch blocks. Always 4 bytes
External ROWIDs are used in leaf blocks
Bytes    Description
6        Local Indexes
8        Cluster Indexes
10       Global Indexes
32       LOB Indexes
In 7.3.4 and below all ROWIDs are 4 bytes
IOTs do not use external ROWIDs
Leaf Rows
Unique Index
Flag Byte | Lock Byte | ROWID | Length Byte(s) | Column 1 | Length Byte(s) | Column 2
Non-Unique Index
Flag Byte | Lock Byte | Length Byte(s) | Column 1 | Length Byte(s) | Column 2 | Length Byte | ROWID
Unique indexes use one byte per row less than non-unique indexes
Each column has one length byte if its data is shorter than 128 bytes; two length bytes otherwise
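The row formats above can be turned into a small length calculation. This is a sketch under stated assumptions rather than a definitive formula: it assumes a 6-byte ROWID (the size listed for local indexes on the ROWID slide) and counts only the fields shown on this slide, not the slot array entry.

```python
ROWID_BYTES = 6  # assumption: 6-byte external ROWID


def length_bytes(data_len):
    """One length byte for data shorter than 128 bytes, two otherwise."""
    return 1 if data_len < 128 else 2


def leaf_row_length(column_lengths, unique):
    """Bytes used by one leaf row with the given column data lengths."""
    size = 1 + 1              # flag byte + lock byte
    size += ROWID_BYTES
    if not unique:
        size += 1             # non-unique: the ROWID carries its own length byte
    for data_len in column_lengths:
        size += length_bytes(data_len) + data_len
    return size
```

With two columns of 3 and 10 bytes, `leaf_row_length([3, 10], unique=True)` gives 23 and the non-unique form gives 24, which is the one-byte difference the slide describes.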
Non Unique versus Unique Indexes
SELECT c01, c02 FROM t1 WHERE c01 IN (SELECT c01 FROM t2);
Non-Unique Index
CREATE INDEX i2 ON t2 (c01);
SELECT STATEMENT
  MERGE JOIN
    SORT JOIN
      TABLE ACCESS (FULL) 'T1'
    SORT JOIN
      VIEW OF 'VW_NSO_1'
        SORT (UNIQUE)
          TABLE ACCESS (FULL) 'T2'
Unique Index
CREATE UNIQUE INDEX i2 ON t2 (c01);
SELECT STATEMENT
  NESTED LOOPS
    TABLE ACCESS (FULL) 'T1'
    INDEX (UNIQUE SCAN) 'I2'
Non Unique Leaf Rows
All leaf rows are stored in sorted order
For non-unique indexes the ROWID is appended to the key to create a unique key
Keys must be effectively unique so that updates can traverse the B*tree directly to the affected leaf block without requiring a scan
For concatenated indexes, the sorted order also allows range scans on prefix columns
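A minimal sketch of why appending the ROWID helps: once every entry is a unique (key, ROWID) pair kept in sorted order, a single binary search lands on the exact entry, and a range scan for a key can start at its lowest possible ROWID. The tuples and ROWID byte values below are illustrative only.

```python
from bisect import bisect_left

# Leaf rows of a non-unique index modelled as sorted (key, rowid) tuples.
rows = sorted([
    ("Y", bytes.fromhex("0141E9A50002")),
    ("N", bytes.fromhex("0141E9A50001")),
    ("Y", bytes.fromhex("0141E9A50001")),
    ("Y", bytes.fromhex("0141E9A60000")),
])


def locate(rows, key, rowid):
    """Return the position of one exact entry, or None if it is absent."""
    i = bisect_left(rows, (key, rowid))
    if i < len(rows) and rows[i] == (key, rowid):
        return i
    return None


def range_scan(rows, key):
    """Return all ROWIDs for key, starting from (key, lowest possible ROWID)."""
    i = bisect_left(rows, (key, b""))
    found = []
    while i < len(rows) and rows[i][0] == key:
        found.append(rows[i][1])
        i += 1
    return found
```

Without the ROWID in the comparison, finding one particular 'Y' entry would mean scanning every 'Y' row.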
Non-Unique Leaf Rows
Branch block rows (key, ROWID)
Y    01 41 E9 A7
Y    01 41 E9 A8 00 02
Leaf block 1 (key, ROWID)
Y    01 41 E9 A5 00 01
Y    01 41 E9 A5 00 02
Y    01 41 E9 A6 00 00
Y    01 41 E9 A6 00 01
Y    01 41 E9 A6 00 02
Leaf block 2 (key, ROWID)
Y    01 41 E9 A7 00 00
Y    01 41 E9 A7 00 01
Y    01 41 E9 A7 00 02
Y    01 41 E9 A8 00 00
Y    01 41 E9 A8 00 01
Leaf block 3 (key, ROWID)
Y    01 41 E9 A8 00 02
Y    01 41 E9 A9 00 00
Y    01 41 E9 A9 00 01
Y    01 41 E9 A9 00 02
Y    01 41 E9 AA 00 00
For non-unique indexes ROWIDs may be stored in branch blocks
ROWIDs are suffix compressed where possible
Branch Block Compression
Branch block rows are suffix compressed
Number of branch blocks in an index is determined by:
  Length of index key and data
  Number of leaf blocks
  Uniqueness of the leading edge of the key
Number of branch blocks affects index height
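One way to picture suffix compression is as choosing the shortest separator that still divides two neighbouring leaf blocks. The function below is an illustrative sketch of that idea, not Oracle's actual routine; it assumes string keys and that the last key of the left block sorts before the first key of the right block.

```python
def separator(last_left, first_right):
    """Shortest prefix of first_right that still sorts after last_left."""
    for n in range(1, len(first_right) + 1):
        candidate = first_right[:n]
        if candidate > last_left:
            return candidate
    return first_right
```

With the country codes used later in the insertion slides, `separator("SCO", "USA")` is `"U"` and `separator("SCO", "SPA")` is `"SP"`. When keys share a long constant prefix, as in `separator("000123", "000124")`, the whole key is needed, which is why the uniqueness of the leading edge drives branch row length.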
Branch Block Compression
A single column 1000000 row CHAR(N) index
Key Format      Prefix Length    Key Length    Index Height    Branch Blocks    Leaf Blocks
999999          0                6             3               82               10990
0999999         1                7             3               92               11495
00999999        2                8             3               103              12196
000999999       3                9             4               117              12821
0000999999      4                10            4               125              13334
00000999999     5                11            4               144              14085
000000999999    6                12            4               157              14706
Key format: 0 = constant prefix; 999999 = monotonic values 000000 to 999999
Branch Block Compression
A single column 1000000 row CHAR(12) index
Key Format      Prefix Length    Key Length    Index Height    Branch Blocks    Leaf Blocks
999999XXXXXX    0                12            3               108              14706
0999999XXXXX    1                12            3               116              14706
00999999XXXX    2                12            3               124              14706
000999999XXX    3                12            4               134              14706
0000999999XX    4                12            4               141              14706
00000999999X    5                12            4               149              14706
000000999999    6                12            4               157              14706
Key format: 0 = constant prefix; 999999 = monotonic values 000000 to 999999; X = filler
Leaf Block Compression
In Oracle 8.1.5 and above, index leaf blocks can be compressed, e.g.
CREATE INDEX i1 ON t1 (c01,c02,c03) COMPRESS 2;
Each leaf row is split into a prefix and a suffix
The number of columns in the prefix is specified using the COMPRESS clause
Repeating prefix columns within the block are held once in a prefix row
Suffix row has pointer to prefix row
Prefix row has implicit pointers to suffix rows
Compressed Leaf Block Structure
Component                      Size
Block Common Header            20 bytes
Transaction Header             72 bytes
Index Header                   16 bytes
Index Leaf Header              16 bytes
Compression Header             4 bytes
Prefix Slot Array
Suffix Slot Array
Free Space                     variable
Prefix rows and Suffix rows    variable
Tail                           4 bytes
Slot arrays: 2 bytes per row and 4 bytes per row
Block Size 2 bytes
Leaf Block Compression
Country    City
France     Paris
Germany    Berlin
Germany    Frankfurt
Germany    Munich
Italy      Milan
Italy      Rome
Prefix Slot Array: 1700 1500 1100 (with associated values 1 4 6)
Suffix Slot Array: 1600 1400 1300 1200 1000 900
Offset    Row Type      Value
1700      Prefix Row    France
1600      Suffix Row    Paris
1500      Prefix Row    Germany
1400      Suffix Row    Berlin
1300      Suffix Row    Frankfurt
1200      Suffix Row    Munich
1100      Prefix Row    Italy
1000      Suffix Row    Milan
900       Suffix Row    Rome
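The prefix/suffix split for this example can be sketched in a few lines. This models only the logical grouping (each repeated prefix stored once, each suffix pointing at its prefix); the tuple layout and function names are hypothetical and say nothing about the physical offsets in the block.

```python
def compress_leaf(rows, prefix_columns):
    """Split sorted rows into prefix rows and suffix rows.

    Returns (prefixes, suffixes); each suffix is (prefix_index, suffix_columns).
    """
    prefixes = []
    suffixes = []
    for row in rows:
        prefix, suffix = row[:prefix_columns], row[prefix_columns:]
        if not prefixes or prefixes[-1] != prefix:
            prefixes.append(prefix)          # a repeated prefix is stored once
        suffixes.append((len(prefixes) - 1, suffix))
    return prefixes, suffixes


def decompress_leaf(prefixes, suffixes):
    """Rebuild the full rows by following each suffix's prefix pointer."""
    return [prefixes[i] + suffix for i, suffix in suffixes]


rows = [
    ("France", "Paris"),
    ("Germany", "Berlin"),
    ("Germany", "Frankfurt"),
    ("Germany", "Munich"),
    ("Italy", "Milan"),
    ("Italy", "Rome"),
]
prefixes, suffixes = compress_leaf(rows, prefix_columns=1)
```

For the six rows on the slide this yields three prefix rows and six suffix rows, matching the nine rows in the diagram.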
Leaf Block Compression
Example: single column index
CREATE TABLE t1 (c01 CHAR(5));
CREATE INDEX i1 ON t1 (c01) COMPRESS 1;
Number of Rows    Number of Distinct Keys    Leaf Blocks (Compressed)    Leaf Blocks (Uncompressed)
10000             1                          68                          104
10000             10                         68                          104
10000             100                        69                          104
10000             1000                       75                          104
10000             2000                       83                          104
10000             2500                       86                          104
10000             5000                       105                         104
10000             10000                      141                         104
Insertion
Inserting 'ENG', 'SCO' and 'USA'
Index initially has a single empty leaf block at root. Rows are inserted into leaf block
Before: root leaf block [ ]
After:  root leaf block [ENG SCO USA]
Insertion
Inserting 'BEL'
Root node is now a branch block. Two new leaf nodes created. Leaf rows split between the two new blocks
Before: root leaf block [ENG SCO USA]
After:  root branch block [U]; leaf blocks [BEL ENG SCO] [USA]
Insertion
Inserting 'SPA'
New leaf node created. Leaf rows split between the two leaf nodes. New leaf block pointer added to branch block
Before: branch block [U]; leaf blocks [BEL ENG SCO] [USA]
After:  branch block [S U]; leaf blocks [BEL ENG] [SCO SPA] [USA]
Insertion
Inserting 'AUS'
New row added to leaf block. No other blocks affected
Before: branch block [S U]; leaf blocks [BEL ENG] [SCO SPA] [USA]
After:  branch block [S U]; leaf blocks [AUS BEL ENG] [SCO SPA] [USA]
Insertion
Inserting 'CAN'
New leaf node created. Leaf rows split between the two leaf nodes. New leaf block pointer added to branch block
Before: branch block [S U]; leaf blocks [AUS BEL ENG] [SCO SPA] [USA]
After:  branch block [E S U]; leaf blocks [AUS BEL CAN] [ENG] [SCO SPA] [USA]
Insertion
Inserting 'DEN'
New leaf node created. Branch block now full so new branch blocks created. Index now has three levels
Before: branch block [E S U]; leaf blocks [AUS BEL CAN] [ENG] [SCO SPA] [USA]
After:  root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG] [SCO SPA] [USA]
PCTFREE
Only applies to CREATE INDEX statement. Not used for subsequent inserts
Case 1: index created after the rows are inserted
CREATE TABLE t1 (c01 NUMBER);
Insert 100000 rows
CREATE INDEX i1 ON t1 (c01) PCTFREE 50;
Case 2: index created before the rows are inserted
CREATE TABLE t1 (c01 NUMBER);
CREATE INDEX i1 ON t1 (c01) PCTFREE 50;
Insert 100000 rows
Statistic             Case 1     Case 2
Leaf rows             100000     100000
Leaf blocks           1910       858
Leaf rows length      1588891    1588891
Branch rows           1909       857
Branch blocks         13         7
Branch rows length    22738      10172
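The figures above can be explored with a little arithmetic. The sketch below first compares how densely the leaf blocks are packed in each case, then tries to reproduce the leaf rows length from the row format. The second part rests on assumptions that the slide does not state: that the keys are the integers 0 to 99999, and that the reported length counts 12 bytes of overhead per row (a 2-byte slot entry, flag byte, lock byte, column length byte, ROWID length byte and a 6-byte ROWID). The function names are hypothetical.

```python
LEAF_ROWS = 100_000
LEAF_ROWS_LENGTH = 1_588_891   # identical in both cases on the slide


def average_bytes_per_block(leaf_blocks):
    """Average leaf row bytes held in each leaf block."""
    return LEAF_ROWS_LENGTH / leaf_blocks


def number_length(n):
    """Stored length of a non-negative integer NUMBER: one exponent byte
    plus one byte per base-100 digit, trailing zero digits dropped;
    zero is stored as a single byte."""
    if n == 0:
        return 1
    while n % 100 == 0:
        n //= 100
    digits = 0
    while n:
        digits += 1
        n //= 100
    return 1 + digits


# ASSUMPTION: per-row overhead of 12 bytes, as described in the lead-in.
ROW_OVERHEAD = 2 + 1 + 1 + 1 + 1 + 6


def estimated_leaf_rows_length():
    """Estimate for keys 0..99999 (assumed key values)."""
    return sum(ROW_OVERHEAD + number_length(n) for n in range(LEAF_ROWS))
```

Under these assumptions the estimate comes to 1588891, the same as the slide's figure, which suggests (but does not prove) that the assumptions match the test that was run. The averages work out to roughly 832 bytes per block when the index is built with PCTFREE 50 and roughly 1852 bytes per block when the rows are inserted afterwards.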
Deletion
Deleting 'CAN'
Deleted flag is set for leaf row. Leaf block free space count is adjusted. No effect on other blocks
Before: root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG] [SCO SPA] [USA]
After:  root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN (deleted)] [DEN] [ENG] [SCO SPA] [USA]
Deletion
Deleting 'ENG'
Deleted flag is set for leaf row. Leaf block free space count is adjusted. Leaf block pointer is not deleted from branch
Before: root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG] [SCO SPA] [USA]
After:  root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG (deleted)] [SCO SPA] [USA]
Deletion
Inserting 'FRA'
Leaf block is reused. Leaf row added to leaf block. No other blocks affected
Before: root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG (deleted)] [SCO SPA] [USA]
After:  root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [FRA] [SCO SPA] [USA]
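The delete-then-reuse behaviour in the last three slides can be modelled with a toy leaf block in which a delete only sets a flag. This is a simplified sketch, not a description of the real block format: the class and its capacity limit are hypothetical, and the point at which flagged rows are physically cleaned out (here, when an insert needs the space) is an assumption.

```python
class LeafBlock:
    """Toy leaf block: rows carry a deleted flag instead of being removed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = {}                      # key -> deleted flag

    def insert(self, key):
        if len(self.rows) >= self.capacity:
            # Assumed behaviour: reclaim space held by rows flagged as deleted.
            self.rows = {k: d for k, d in self.rows.items() if not d}
        if len(self.rows) >= self.capacity:
            raise ValueError("block full: a split would be needed")
        self.rows[key] = False

    def delete(self, key):
        if key not in self.rows:
            raise KeyError(key)
        self.rows[key] = True               # flag only; the row stays in place

    def live_keys(self):
        return sorted(k for k, deleted in self.rows.items() if not deleted)


# The single-row leaf block from the slides: ENG is deleted, then FRA arrives.
block = LeafBlock(capacity=1)
block.insert("ENG")
block.delete("ENG")
block.insert("FRA")
```

After these three steps the block holds only FRA, and no branch block had to change, which mirrors the slide's "Leaf block is reused".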
Coalesce
ALTER INDEX index1 COALESCE;
Leaf rows are merged in leaf blocks. Branch blocks are updated where necessary
Before: root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG] [SCO SPA] [USA]
After:  root block [S]; branch blocks [D] [no rows]; leaf blocks [AUS BEL CAN] [DEN ENG] [SCO SPA USA]
Rebuild
ALTER INDEX index1 REBUILD;
Index is rebuilt in a new segment. All blocks are modified
Before: root block [S]; branch blocks [D E] [U]; leaf blocks [AUS BEL CAN] [DEN] [ENG] [SCO SPA] [USA]
After:  root branch block [D SP]; leaf blocks [AUS BEL CAN] [DEN ENG SCO] [SPA USA]
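The difference between the Coalesce and Rebuild slides can be sketched on the leaf level alone. The two functions below are simplified models, not the actual algorithms: they assume a capacity of three rows per leaf block (the most any block holds in the diagrams), ignore PCTFREE, and do not maintain the branch blocks.

```python
def coalesce(leaves, capacity):
    """Merge each leaf into its left neighbour when the combined rows fit."""
    merged = []
    for leaf in leaves:
        if merged and len(merged[-1]) + len(leaf) <= capacity:
            merged[-1] = merged[-1] + list(leaf)
        else:
            merged.append(list(leaf))
    return merged


def rebuild(leaves, capacity):
    """Repack every row into fresh blocks, filling each block in turn."""
    rows = [row for leaf in leaves for row in leaf]
    return [rows[i:i + capacity] for i in range(0, len(rows), capacity)]


before = [["AUS", "BEL", "CAN"], ["DEN"], ["ENG"], ["SCO", "SPA"], ["USA"]]
after_coalesce = coalesce(before, capacity=3)
after_rebuild = rebuild(before, capacity=3)
```

On the slide's example the coalesce model leaves `[DEN ENG]` as a partly filled block because `[SCO SPA]` does not fit alongside it, while the rebuild model packs `[DEN ENG SCO]` and needs one fewer branch level in the diagram.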
Freelists
Index blocks can be returned to the freelist after all rows have been deleted
Empty index blocks can be recycled
Leaf blocks may be reused as branch blocks and vice versa
Blocks will not be reused until existing blocks on the freelist have been used
Empty blocks are initially returned to the transaction freelist. They are subsequently moved to the master freelist
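The two-stage freelist behaviour can be sketched with a small model. It uses the block numbers from the walk-through on the following slides (730 and 731 already on the master freelist, 723 to 726 emptied by deletes). The class is hypothetical, and two details are assumptions made to keep the sketch short: the transfer from the transaction freelist happens when the master freelist runs dry, and the most recently freed block is reused first.

```python
from collections import deque


class Freelists:
    def __init__(self, master):
        self.master = deque(master)   # reusable blocks, head first
        self.transaction = []         # blocks emptied by deletes, not yet reusable

    def block_emptied(self, block):
        """All rows in the block were deleted: park it on the transaction freelist."""
        self.transaction.append(block)

    def block_filled(self):
        """The head of the master freelist is full: unlink it and
        return the block that will take the next inserts."""
        self.master.popleft()
        if not self.master:
            # Assumed timing and order of the transfer (see lead-in).
            self.master.extend(reversed(self.transaction))
            self.transaction.clear()
        return self.master[0] if self.master else None


lists = Freelists(master=[730, 731])
for emptied in (723, 724, 725, 726):
    lists.block_emptied(emptied)
first = lists.block_filled()    # 730 is full
second = lists.block_filled()   # 731 is full
```

In this model the emptied blocks are not touched while 730 and 731 are still available, which is the rule stated above: blocks are not reused until the existing blocks on the freelist have been used.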
Freelists
Deleting all rows from blocks 723 to 726
Blocks are inserted in the transaction freelist. Branch block is not modified in case rows are re-inserted
[Diagram, before and after: root block 722; branch blocks 728 and 729; leaf blocks 723 to 727 and 730. After the delete the transaction freelist holds 723, 724, 725 and 726; the master freelist holds 730 and 731]
Freelists
Inserting new rows
Block 730 is full and is removed from master freelist. New rows added to block 731
[Diagram, before and after: master freelist goes from 730, 731 to 731; transaction freelist holds 723, 724, 725 and 726 throughout]
Freelists
Inserting new rows
Block 731 is full and is removed from master freelist. Blocks 723 to 726 moved from transaction freelist to master freelist. New rows added to block 726
[Diagram, before and after: master freelist goes from 731 to 726, 725, 724, 723; transaction freelist goes from 723, 724, 725, 726 to empty]
Freelists
Inserting new rows
Block 726 is full and is removed from master freelist. New rows added to block 725
Leaf block 724 becomes a new branch block
[Diagram, before and after: master freelist goes from 726, 725, 724, 723 to 725, 724, 723; transaction freelist is empty]
Freelists
Inserting new rows
Blocks 725 and 724 are removed from master freelist. Branch block 728 is moved to transaction freelist
New rows added to block 723
[Diagram, before and after: master freelist goes from 725, 724, 723 to 723; transaction freelist goes from empty to 728]
Freelists
Inserting new rows
Block 723 is full and is removed from master freelist. Branch block 728 becomes a leaf block
New rows added to block 728
[Diagram, before and after: master freelist goes from 723 to 728; transaction freelist goes from 728 to empty]
Virtual Indexes
In 8.1.5 and above it is possible to create virtual indexes
Virtual indexes have a data dictionary definition, but no associated segment
Effectiveness of new indexes can be tested by generating theoretical execution plans
The CBO will consider virtual indexes if the hidden parameter _use_nosegment_indexes is set to true
Virtual Indexes
Consider the following analysed table
CREATE TABLE t1 AS SELECT * FROM dba_objects WHERE ROWNUM < 1000;
ANALYZE TABLE t1 COMPUTE STATISTICS;
A virtual index can be created using the NOSEGMENT keyword
CREATE INDEX i1 ON t1 (owner, object_name) NOSEGMENT;
Statistics must be generated for the new index based on the existing statistics for the table
EXECUTE DBMS_STATS.GENERATE_STATS (USER,'I1');
Virtual Indexes
The statement
SELECT object_id FROM t1 WHERE owner = USER AND object_name = 'T1';
generates the plan
SELECT STATEMENT Optimizer = CHOOSE
  TABLE ACCESS (FULL) OF 'T1'
With the hidden parameter enabled
ALTER SESSION SET "_use_nosegment_indexes" = TRUE;
the same statement generates a different plan
SELECT STATEMENT Optimizer = CHOOSE
  TABLE ACCESS (BY INDEX ROWID)
    INDEX (RANGE SCAN) OF 'I1'
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is: [email protected]
My website address is:
www.juliandyke.com