Creating and Managing Indexes Using Proc SQL Chapter 6 1.
-
Upload
ernest-may -
Category
Documents
-
view
222 -
download
0
Transcript of Creating and Managing Indexes Using Proc SQL Chapter 6 1.
Creating and Managing Indexes Using Proc SQL
Chapter 6
1
What is An Index?
An index is an auxiliary file that stores the physical location of values for one or more specified columns in a table.
An index stores unique values for a specified column(s) in ascending order.
An index includes value/identifier pairs that allow you to access a row directly, by value.
You cannot create an index on a view.
2
Why Use An Index?
When a query is submitted on a table, PROC SQL accesses rows in a table in the order in which they are stored in the table, beginning with the first row and reading all rows in the table.
For large tables, it can be time consuming for PROC SQL to read all the rows in a table.
3
Why Use An Index?
An index on a table may allow a PROC SQL query to execute more efficiently.
We can refer to the LOG to view real time and CPU time (we will learn about STIMER in Chapter 8) for a partial measure of efficiency gains
4
Simple and Composite Indexes
Two types of indexes can be created– Simple – based on one column. The
name of the index must be the name of the indexed column.
– Composite – based on two or more columns. The name of the composite index cannot be the name of any existing column or index in the table.
5
Example - Creating A Simple Index
Suppose we had a table (meddb) containing all doctors claims for 100,000 patients. If we wanted to create an index on the patient’s Social Security Number (ssn), we would use the following code.
proc sql; create index ssn on meddb(ssn); quit;
6
Example - Creating A Composite Index
If we wanted to create a composite index on the patient’s Social Security Number (ssn) and the type of service the patient received (tos), we would use the following code proc sql; create index ssntos on meddb(ssn,tos); quit;
7
Unique Indexes
Simple and composite indexes can be defined as unique indexes.
Add the word unique before the word index when creating a unique index
If a table contains multiple occurrences of the same value in a column, a unique index cannot be defined on that column.
8
Costs of Using an Index
Additional CPU time Additional input/output requests
may be required Additional memory Additional disk space to store the
index file
9
Guidelines for Creating Indexes
Keep the number of indexes to a minimum
Do not create an index for small tables
Do not create an index on a column that has a small number of distinct values (low cardinality)
Use indexes for queries that return a small subset of rows (< 15%)
10
Describe Table Statement
Use to display a CREATE INDEX statement in the SAS log for each index that is defined
If no indexes are defined for the specified table, a CREATE INDEX statement will not appear in the SAS log
11
MSGLEVEL = Option
To determine if an index is used, specify the SAS system option MSGLEVEL=I.
The default for the MSGLEVEL= option is MSGLEVEL=N. This will display notes, warnings, and error messages only.
OPTIONS MSGLEVEL=I;
12
IDXWHERE = Option
IDXWHERE=YES– Tells SAS to choose the best index to
process the WHERE expression IDXWHERE=NO
– Tells SAS to ignore all indexes IDXNAME= index name
– Tells SAS to use the specified index
13
Dropping Indexes
Use the drop index statement to drop(delete) one or more indexes
proc sql; drop index ssn from claims; quit;
14