F28DM Indexes in Oracle 1 F28DM : Database Management Systems Indexes in Oracle Monica Farrow...

32
F28DM Indexes in Oracle 1 F28DM : Database Management Systems Indexes in Oracle Monica Farrow [email protected] Room: EMG30, Ext: 4160 Material on Vision & my web page Content taken from HW lecturers, + books by Rob & Coronel, and by Connolly & Begg For more info on everything, see Oracle9i Database Concepts (on the web)

Transcript of F28DM Indexes in Oracle 1 F28DM : Database Management Systems Indexes in Oracle Monica Farrow...

F28DM Indexes in Oracle 1

F28DM : Database Management Systems

Indexes in Oracle

Monica [email protected]: EMG30, Ext: 4160

Material on Vision & my web pageContent taken from HW lecturers,

+ books by Rob & Coronel, and by Connolly & BeggFor more info on everything, see Oracle9i Database

Concepts (on the web)

F28DM Indexes in Oracle 2

Basic Use of Indexes

Primary, Secondary, CompositeWhen to use

F28DM Indexes in Oracle 3

Indexes

• An index helps speed up retrieval. • A column that is frequently referred to in

the WHERE clause is a potential candidate for indexing.

• Why use indexing?• It's easy and provides immediate value • It can be used to speed up existing applications

without changing any of their logic or queries • One of the main reasons for the success of

relational databases is that they have efficient indexing systems.

F28DM Indexes in Oracle 4

Index recap

• An index typically consists of the search key (one or more attributes) and the row ID(s) of record(s) which match this key

• An index on a file speeds up selections on the search key fields specified for that index• Any subset of the fields of a relation can be the

search key for an index on the relation• A primary index allows efficient searching on

the primary key of a relation• A secondary index allows efficient searching

on other attributes which are often used in queries

F28DM Indexes in Oracle 5

Example: the “Sailors and Boats” Domain

• Information needs to be stored about sailors, boats and reservations that sailors are making for boats

• There are three relations with the following schemas:

• Sailor (id:int, name:string, rating:int, age:int, base:string)

• Boat (id:int, name:string, colour:string, base:string)

• Reservation (sid:int, bid:int, day:date)

F28DM Indexes in Oracle 6

Automatically Created Indexes

• On every relation, Oracle creates a non-clustered B+tree index for the primary key

• A B-tree is shown on the next slide• A B-tree stays balanced when items are added and

removed• Indexes will be created on

• Sailor table, id column• Boat table, id column• Reservations table, a composite index on sid, bid and

day

• Additionally Oracle will also create an index for every UNIQUE constraint

F28DM Indexes in Oracle 7

B-tree index

Diagram just to give you an idea, you don’t have to memorise this

F28DM Indexes in Oracle 8

Composite index keys

• If the key consists of more than one attribute, the index is a composite or concatenated index• i.e. an index for several attributes

• A composite index is sorted lexicographically according to the order of the attributes• E.g for a composite key (name,age), we have

(Kelly, 22) < (Kelly, 63) < (Smith, 18) < (Smith,36)

• This is different to a composite key (age, name)

F28DM Indexes in Oracle 9

Secondary indexes

• Create secondary indexes for other attributes in your tables which are frequently searched by• In the SQL WHERE clause, not the SQL SELECT

line

• In this case, you must explicitly create the index, as shown on the next slide

F28DM Indexes in Oracle 10

Explicitly Created Indexes

• An index • Has a name. Choose an informative one.• Is created for a sequence of attributes over a

table which must already exist• Can be DROPped

• Examples• CREATE INDEX sailor_name_idx

ON Sailor(name);• DROP INDEX sailor_name_idx;• CREATE INDEX sailor_name_and_age_idx

ON Sailor(name, age)

F28DM Indexes in Oracle 11

Using CREATE INDEX (1) 

• The CREATE INDEX statement creates a sorted index on the specified columns of the named table.

• Indexes are automatically used to improve the performance of queries issued to the database, and to sort queries with an ORDER BY clause.

• Once an index is created, it is never referenced in a SQL statement again except to validate it (VALIDATE INDEX) or delete it (DROP INDEX).

• Indexes cannot be created for views. • Add an index to the underlying base table

F28DM Indexes in Oracle 12

Using CREATE INDEX (2)

• Indexes are always owned by the owner of the table. The index name must be unique for each table and owner.

• Exclusive table use.   • You can’t create an index if the table is in use.• CREATE INDEX can be time consuming and the server

will not process requests referencing the same table while the statement is being processed.

• As with all other CREATE commands, store the command in a text file

• To remind you what the index is like• So it can be rerun if you decide to re-create your DB

F28DM Indexes in Oracle 13

Overheads v performance

• There is an overhead involved in the maintenance and use of secondary indexes.• The index must be updated whenever the table

is updated• Disk space is needed for the index

• Balance the overhead against the performance benefits• Faster data retrieval

F28DM Indexes in Oracle 14

How to choose secondary indexes (1)

• Add a secondary index to a foreign key if it is frequently accessed• E.g. Boat id in Reservations, if we frequently

want to know the name of the boat in a reservation

• Add a secondary index to any attribute that is frequently used as a search key• E.g. day in Reservations (what reservations do

we have for today or tomorrow?)• E.g. name in Sailor (what is the rating for the

sailor called ‘Lubber’?). People often search by name rather than id.

F28DM Indexes in Oracle 15

How to choose secondary indexes (2)

• Add a secondary index to any attributes frequently used in order by, group by, min, max, avg• E.g. age in Sailor if frequently want list in age

order

• Add a composite secondary index to a small set of attributes if the index would provide the full details for a frequently used query without having to search the base table• E.g. rating and age in Sailor if a frequent query is

SELECT rating, AVG(age) FROM SailorGROUP BY rating;

F28DM Indexes in Oracle 16

When not to use secondary indexes

• If the relation is small – not many rows

• If the relation or attribute is frequently updated• If periodic large updates, drop the index,

update the data, re-create the index

• If the attribute is always used in queries that retrieve a significant proportion of the rows in the relation• E.g. If the attribute has low sparsity (i.e. the

number of different values is small, such as gender F or M).

F28DM Indexes in Oracle 17

More Advanced Indexing Topics

Index-organised tablesBitmap indexesClusters

F28DM Indexes in Oracle 18

Index organised tables

• Index organised tables• Entries of index are table records rather than

row IDs• Useful if frequent access to complete table via

the index• In Oracle, the default table organisation is heap• To store the table in primary key order, add the

words ORGANIZATION INDEX to the end of the CREATE TABLE command

• In the B-tree, the leaf nodes are replaced by all the non-primary key attributes in the row

F28DM Indexes in Oracle 19

.Bitmap indexes

• Bitmap indexes• An example on the next slide• For each value of a domain, there is a bitmap

identifying the row Ids of satisfying tuples• Usually applied to attributes with a sparse

domain• In Oracle, <100 distinct values• E.g. bitmaps for all tuples with sex = male and for

sex=female

• Updating a bitmap takes a lot of time, so use for tables with hardly any updates, inserts, deletes

• Ideal for data warehousing• See later lecture, if time

F28DM Indexes in Oracle 20

Bitmap indexes example

• The first table is the Sailors table• The second table shows a bitmap index for

rating, assuming values are only from 1-3

1 2 3

1 0 0

0 1 0

0 0 1

1 0 0

Id Rating etc

22 1 Other data

23 2 Other data

31 3 Other data

35 1 Other data

F28DM Indexes in Oracle 21

Clusters in Oracle

• Oracle supports 2 types of clusters• Indexed clusters• Hashed clusters

F28DM Indexes in Oracle 22

Clusters

• “Clusters are groups of 1 or more tables physically stored together because they share common columns and are often used together” (Connolly & Begg)

• This improves disk access

• The related columns of the tables are called the cluster key

F28DM Indexes in Oracle 23

Saiilor/Reservation cluster example

• Sailor and Reservation relations could be stored together, with the cluster key = Sailor id

• Each relation can be retrieved independently• Whenever the joined relations are required,

access is faster

name rating age id bid day

Dustin 7 45 22 101 2/2/08

103 3/4/08

101 4/5/08

Rusty 10 35 58 102 4/5/08

103 6/6/08

F28DM Indexes in Oracle 24

Should we cluster Sailors and Reservations?

• A cluster for Sailors and Reservations could be created if many queries in the application joined these tables together.

• A cluster for Sailors and Reservations would be inefficient if reservations were only a minor part of the application, and many more queries involved Sailors joined with other tables

F28DM Indexes in Oracle 25

Indexed clusters

• Use when • Queries retrieve records over a range of cluster key

values• Clustered tables may grow unpredictably

• With an indexed table or index cluster, Oracle locates the rows in a table using key values that Oracle stores in a separate index.

• E.g. • No idea how many reservations there are going to be• Queries like

SELECT * FROM Sailors S, Reservations RWHERE S.id = R.sidAND S.id > 50;

F28DM Indexes in Oracle 26

Defining a indexed cluster in SQL(1)

• Create a cluster first• CREATE CLUSTER sailor_id_cluster (id int);• A cluster can be made for a fixed number of

typed attributes (here just one)

• Define an index on the cluster• CREATE INDEX sailor_id_idx

ON CLUSTER sailor_id_cluster;

F28DM Indexes in Oracle 27

Defining an indexed cluster in SQL (2)

• Create table(s) for the cluster

• CREATE TABLE Sailor (id INT NOT NULL,

name VARCHAR(20),

ranking INT,

age INT)CLUSTER sailor_id_cluster(id);

• CREATE TABLE Reservation (sid INT, bid INT.........as before....)CLUSTER sailor_id_cluster(sd);

F28DM Indexes in Oracle 28

Hash Clusters

• The key of a hash cluster, like the key of an index cluster, can be a single column or composite key (multiple column key).

• A hash function is applied to the cluster key to obtain a hash value.

• The hash value determines the location where the rows of a table are physically stored. So all records with the same cluster key are stored together

F28DM Indexes in Oracle 29

Creating a hash cluster

• Create a hash cluster• CREATE CLUSTER sailor_id_hashcluster (id int)

HASH IS id HASHKEYS 10000;

• Add table(s) as before• CREATE TABLE Sailor (id INT NOT NULL,

name VARCHAR(20),

ranking INT,

age INT)CLUSTER sailor_id_hashcluster(id);

F28DM Indexes in Oracle 30

Use hashing when (1)

• Most queries are equality queries on the cluster key: SELECT ... WHERE cluster_key = ...;

• the cluster key in the equality condition is hashed, and the corresponding hash key is usually found with a single read.

• In comparison, for an indexed table the key value must first be found in the index (usually several reads), and then the row is read from the table (another read).

• AND WHEN . . .

F28DM Indexes in Oracle 31

Use hashing when (2)

AND• The tables in the hash cluster are primarily

static in size so that you can determine the number of rows and amount of space required for the tables in the cluster.• If tables in a hash cluster require more space

than the initial allocation for the cluster, performance degradation can be substantial because overflow blocks are required.

F28DM Indexes in Oracle 32

Hashing v Indexing

• To find or store a row in an indexed table or cluster, a minimum of two (there are usually more) I/Os must be performed:• One or more I/Os to find or store the key value

in the index • Another I/O to read or write the row in the table

or cluster

• In contrast, if a hash function is used to locate a row in a hash cluster; no I/O is required. • As a result, a minimum of one I/O operation is

necessary to read or write a row in a hash cluster.