Module 5 Planning for SQL Server® 2008 R2 Indexing.

Module Overview

• Core Indexing Concepts

• Data Types and Indexes

• Single Column and Composite Indexes

Lesson 1: Core Indexing Concepts

• How SQL Server Accesses Data

• The Need for Indexes

• Index Structures

• Selectivity, Density and Index Depth

• Index Fragmentation

• Demonstration 1A: Viewing Index Fragmentation

How SQL Server Accesses Data

• Table Scan: SQL Server reads all table pages

• Index: SQL Server uses index pages to find rows
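
The two access paths can be sketched in a few lines of Python. This is a toy model, not SQL Server's implementation: the `rows` list stands in for table pages, the sorted `index` list stands in for index pages, and all names and data are illustrative assumptions.

```python
from bisect import bisect_left

# Toy "table": rows stored in no particular key order (hypothetical data).
rows = [(42, "Ng"), (7, "Adams"), (19, "Lopez"), (3, "Chen"), (88, "Patel")]

def table_scan(rows, key):
    """Read every row and keep those that match: work grows with table size."""
    return [row for row in rows if row[0] == key]

# Toy "index": (key, row position) pairs kept in key order.
index = sorted((row[0], pos) for pos, row in enumerate(rows))

def index_seek(rows, index, key):
    """Binary-search the ordered index, then fetch only the matching rows."""
    i = bisect_left(index, (key,))   # first entry whose key is >= the search key
    matches = []
    while i < len(index) and index[i][0] == key:
        matches.append(rows[index[i][1]])
        i += 1
    return matches
```

Both functions return the same rows; the difference is that `table_scan` touches every row while `index_seek` touches only the index entries near the key.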

The Need for Indexes

• ANSI SQL does not mention indexes

Generally considered to be external to the logical data model

• All queries can be executed without indexes

Primary reason for indexes is performance

• Some constraints are implemented via indexes

Indexes are used to make constraints efficient but in theory could be implemented in other ways

• Analogy: Physical library holding books

Index by author is useful

Additional indexes would also be useful

Index Structures

• Indexes are commonly based on tree structures

Not just a binary tree, as nodes can have more than two children

• Top node is called the root node

• Bottom level nodes are called leaf nodes
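
A minimal sketch of a lookup in such a tree, assuming a tiny two-level structure in which the root has four children. The node layout and key values are hypothetical simplifications and do not reflect the real format of an index page.

```python
from bisect import bisect_right

# Leaf nodes: each holds a sorted run of keys (hypothetical values).
leaves = [[1, 4, 7], [10, 13, 16], [20, 25, 30], [40, 50, 60]]

# Root node: separator keys, here the first key of leaves 1, 2 and 3.
root_separators = [10, 20, 40]

def find(key):
    """Visit the root, pick one of its four children, then search that leaf."""
    leaf_no = bisect_right(root_separators, key)
    return leaf_no, key in leaves[leaf_no]
```

Because the root has more than two children, a single comparison step at the root narrows the search to a quarter of the keys rather than half.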

Selectivity, Density and Index Depth

• Three core concepts when working with indexes

• Selectivity

A measure of how many rows are returned compared to the total number of rows

High selectivity means a small number of rows relative to the total number of rows

• Density

A measure of the lack of uniqueness of data in the table

High density indicates a large number of duplicates

• Index Depth

Number of levels within the index

It is a common misconception that indexes are deep; in practice most indexes have only a few levels
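
The three measures can be expressed as small formulas. The sketch below assumes one common definition of density (1 divided by the number of distinct values) and an assumed fan-out of 500 child pages per non-leaf page; real fan-out depends on key size, and the function names are illustrative.

```python
def selectivity(rows_returned, total_rows):
    """Fraction of the table a predicate returns; a smaller fraction means higher selectivity."""
    return rows_returned / total_rows

def density(distinct_values):
    """1 / number of distinct values; a larger result means more duplicates."""
    return 1 / distinct_values

def index_depth(leaf_pages, fanout):
    """Levels needed when each non-leaf page points to `fanout` pages below it."""
    depth = 1
    pages = leaf_pages
    while pages > 1:
        pages = -(-pages // fanout)   # ceiling division: pages in the level above
        depth += 1
    return depth
```

With the assumed fan-out of 500, `index_depth(100_000, 500)` gives 3 levels and `index_depth(10_000_000, 500)` gives 4, which illustrates why even large indexes tend to be shallow.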

Index Fragmentation

How does fragmentation occur?

• SQL Server reorganizes index pages when data modifications cause index pages to split

Types of fragmentation:

• Internal – pages are not full

• External – pages are out of logical sequence

Detecting fragmentation:

• SQL Server Management Studio – Index Properties

• System function – sys.dm_db_index_physical_stats
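
The two types of fragmentation can be illustrated with a toy page layout. The measures below are simplified assumptions for illustration only; they are not the formulas used by sys.dm_db_index_physical_stats, and `PAGE_CAPACITY` and the layout are hypothetical.

```python
# Each tuple is (logical_order, rows_stored); list position is physical order.
# Hypothetical layout after some page splits.
PAGE_CAPACITY = 100
pages = [(1, 100), (2, 50), (4, 100), (3, 50), (5, 80)]

def internal_fragmentation(pages):
    """Fraction of total page space left empty (pages are not full)."""
    used = sum(rows for _, rows in pages)
    return 1 - used / (PAGE_CAPACITY * len(pages))

def external_fragmentation(pages):
    """Fraction of physically adjacent page pairs that are not logically consecutive."""
    pairs = list(zip(pages, pages[1:]))
    broken = sum(1 for (a, _), (b, _) in pairs if b != a + 1)
    return broken / len(pairs)
```

In this layout about 24% of the page space is empty, and three of the four adjacent page pairs are out of logical sequence.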

Demonstration 1A: Viewing Index Fragmentation

In this demonstration, you will see how to identify fragmented indexes

Lesson 2: Data Types and Indexes

• Numeric Index Data

• Character Index Data

• Date-Related Index Data

• GUID Index Data

• BIT Index Data

• Indexing Computed Columns

Numeric Index Data

• Indexes with numeric keys work efficiently

Many values fit in a small number of index pages

Sorts and comparisons are quick

• Exact numeric types are most efficient

Integer types are the most efficient

INT and BIGINT commonly used

• Approximate data types (float and real) are much less efficient

Character Index Data

• Character data types are much less efficient when used in index keys

• Character values tend to be much larger than numeric values

• Even short character values are slow to compare unless binary comparisons are being made

Most SQL Server applications use collations other than binary

Rules for collations need to be applied whenever comparisons are made
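
As a rough analogy for why collation rules add work, the sketch below contrasts a byte-for-byte comparison with a case-insensitive one. Python's `casefold` is only a stand-in for a case-insensitive collation and is not how SQL Server implements collations; the function names are illustrative.

```python
def binary_equal(a, b):
    """Byte-for-byte comparison: no language rules are applied."""
    return a.encode("utf-8") == b.encode("utf-8")

def case_insensitive_equal(a, b):
    """Each value must be normalized by a rule before it can be compared."""
    return a.casefold() == b.casefold()
```

The second function has to transform both strings on every comparison, which is the kind of extra step a non-binary collation requires.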

Date-Related Index Data

• Date data types are generally good candidates for index keys

Very commonly used in business applications

• Only slightly less efficient than integer data

• Size of the data is important

date is more efficient than datetime
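
The effect of key size can be estimated with simple arithmetic. The sketch below uses the storage sizes of a few SQL Server types and the 8 KB page with its 96-byte header, but by default it ignores per-entry overhead such as row pointers, so the results are upper bounds rather than exact figures. The function and parameter names are assumptions.

```python
PAGE_BYTES = 8192     # SQL Server page size
HEADER_BYTES = 96     # page header

# Storage sizes in bytes for some SQL Server data types.
KEY_BYTES = {"date": 3, "int": 4, "datetime": 8, "bigint": 8, "uniqueidentifier": 16}

def keys_per_page(type_name, overhead_per_entry=0):
    """Upper bound on index keys per page; real per-entry overhead is ignored by default."""
    usable = PAGE_BYTES - HEADER_BYTES
    return usable // (KEY_BYTES[type_name] + overhead_per_entry)
```

Under these assumptions a page holds up to 2698 date keys but only 1012 datetime keys, so the smaller type needs fewer index pages for the same number of rows.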

GUID Index Data

• GUIDs are becoming very common in new business applications

• Moderate efficiency

Size is 128 bits or 16 bytes

Comparison performance is reasonable

• Problems arise

Randomness of generation causes fragmentation problems

Very common problem in many current applications
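
The fragmentation effect of random keys can be shown with a small simulation. This is a toy model under stated assumptions: pages hold 10 keys, a full page splits in half when a key lands in the middle of the index, and a key added past the end of the index simply starts a new page. Seeded random 128-bit integers stand in for GUIDs so the run is repeatable.

```python
import random
from bisect import bisect_right, insort

def load(keys, capacity=10):
    """Insert keys one by one into ordered pages; return (page splits, average fill)."""
    pages = [[]]
    splits = 0
    for key in keys:
        firsts = [p[0] for p in pages[1:]]
        i = bisect_right(firsts, key)        # page whose key range covers `key`
        page = pages[i]
        if len(page) < capacity:
            insort(page, key)
        elif i == len(pages) - 1 and key > page[-1]:
            pages.append([key])              # insert at the end: new page, no split
        else:
            insort(page, key)
            half = len(page) // 2
            pages[i:i + 1] = [page[:half], page[half:]]   # page split
            splits += 1
    fill = sum(len(p) for p in pages) / (capacity * len(pages))
    return splits, fill

sequential = load(range(1, 1001))
rng = random.Random(0)
random_keys = load([rng.getrandbits(128) for _ in range(1000)])
```

In this model the ever-increasing keys cause no splits and leave every page full, while the random keys cause repeated splits and leave pages partly empty.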

BIT Index Data

• BIT columns have only two possible values

• BIT columns are efficient as index keys

• Common misconception that BIT columns are not useful in indexes

Many valid usage scenarios exist

Filtered indexes often useful with BIT column indexes
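
One such scenario is a flag where the interesting value is rare. The sketch below, using hypothetical order data, compares a full index on the flag with a filtered index that only contains the unprocessed rows. It models the idea only and is not SQL Server syntax.

```python
# Hypothetical order rows: (order_id, is_processed). Almost all are processed.
orders = [(i, True) for i in range(1, 1001)]
orders[9] = (10, False)
orders[499] = (500, False)

# Full "index" on the flag: one entry for every row in the table.
full_index = sorted((flag, order_id) for order_id, flag in orders)

# Filtered "index": entries only for rows where is_processed is False.
filtered_index = sorted(order_id for order_id, flag in orders if not flag)
```

The filtered index holds 2 entries instead of 1000, yet it answers the common question "which orders still need processing?" directly.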

Indexing Computed Columns

• Computed columns are based on expressions

Values are typically derived from other columns

• Indexing the computed value rather than any underlying values can be useful

Can assist with improving performance on poorly designed databases

Example: a column that is used to hold values that should have been held in separate columns

• Persisted computed columns

Avoid the overhead of computing values each time they are selected

Are computed during INSERT and UPDATE statements
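
A minimal sketch of the persisted computed column idea, assuming a hypothetical product code such as "RED-100" that packs a colour and a number into one column. The class, its methods, and the code format are all illustrative assumptions.

```python
class ProductTable:
    """Toy table with a persisted computed column `colour`, derived from `code`."""

    def __init__(self):
        self.rows = {}           # product_id -> (code, colour)
        self.colour_index = {}   # colour -> set of product_ids

    @staticmethod
    def _colour(code):
        """The computed-column expression: the text before the first hyphen."""
        return code.split("-", 1)[0]

    def upsert(self, product_id, code):
        """INSERT or UPDATE: compute the value once, store it, and maintain its index."""
        if product_id in self.rows:
            old_colour = self.rows[product_id][1]
            self.colour_index[old_colour].discard(product_id)
        colour = self._colour(code)
        self.rows[product_id] = (code, colour)
        self.colour_index.setdefault(colour, set()).add(product_id)

    def find_by_colour(self, colour):
        """Look up rows through the index on the computed value; nothing is recomputed."""
        return sorted(self.colour_index.get(colour, set()))
```

The expression runs only when a row is written, so reads by colour use the stored value and its index instead of parsing every code.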

Lesson 3: Single Column and Composite Indexes

• Single Column vs. Composite Indexes

• Ascending vs. Descending Indexes

• Index Statistics

• Demonstration 3A: Viewing Index Statistics

Single Column vs. Composite Indexes

• Indexes are not always constructed on a single column

Multi-column indexes are called "composite" indexes

• Composite indexes are often useful

Tend to be more useful than single column indexes in most typical business applications

Having an index sorted first by customer then by order date makes it very easy to find orders for a particular customer on a particular date.

A query might involve multiple search predicates.

Two columns together might be selective while neither is selective on its own.

• Index on A,B is not the same as an index on B,A

Typically index the most restrictive column first
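
Why column order matters can be seen by treating a composite index as a list sorted on several columns. The sketch below uses hypothetical order data and illustrative function names; it models the ordering only, not SQL Server's storage.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical orders: (customer_id, order_date, order_id).
orders = [
    (2, date(2011, 3, 1), 101), (1, date(2011, 3, 2), 102),
    (2, date(2011, 3, 2), 103), (1, date(2011, 3, 1), 104),
    (3, date(2011, 3, 2), 105),
]

# Composite "index" sorted by customer first, then by order date.
by_customer_date = sorted(orders)

def seek(index, prefix):
    """Return the contiguous run of entries whose leading columns equal `prefix`."""
    keys = [entry[:len(prefix)] for entry in index]
    return index[bisect_left(keys, prefix):bisect_right(keys, prefix)]

def positions_for_date(index, order_date):
    """Positions of entries with the given date; found only by checking every entry."""
    return [i for i, entry in enumerate(index) if entry[1] == order_date]
```

Seeking on customer, or on customer and date, finds one contiguous run of entries. Entries for a single date are scattered at positions 1, 3 and 4, so a search on date alone cannot use this ordering and would need an index that leads with the date.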

Ascending vs. Descending Indexes

• Indexes can be constructed in ascending or descending order

• In general, for single column indexes, both are equally useful

Each level of a SQL Server index is doubly linked (i.e., linked in both directions)

SQL Server can start at either end and work towards the other end

• Each component of a composite index can be ascending or descending

Might be very useful for avoiding sort operations
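
The point about mixed directions can be checked with sorted lists standing in for index orderings. The data and variable names are hypothetical.

```python
# Hypothetical rows: (category, price).
rows = [("A", 5), ("B", 9), ("A", 8), ("B", 2), ("A", 1)]

# "Index" on (category ASC, price ASC).
asc_asc = sorted(rows)

# "Index" on (category ASC, price DESC).
asc_desc = sorted(rows, key=lambda r: (r[0], -r[1]))

# Order a query asks for: category ascending, price descending.
wanted = [("A", 8), ("A", 5), ("A", 1), ("B", 9), ("B", 2)]

forward_matches = asc_asc == wanted            # False
backward_matches = asc_asc[::-1] == wanted     # False
mixed_matches = asc_desc == wanted             # True
```

Reading the all-ascending index forwards or backwards never produces the mixed order, so a sort would still be needed; the index built with mixed directions already holds the rows in the requested order.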

Index Statistics

• SQL Server needs to have knowledge of the layout of the data in a table or index before it optimizes and executes queries

Needs to create a reasonable plan for executing the query

Important to know the usefulness of each index

Selectivity is the most important metric

• By default, SQL Server automatically creates statistics on indexes

Can be disabled

Recommendation is to leave auto-creation and auto-update enabled
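
How statistics feed a row estimate can be sketched with a simple per-value count. This is a simplified assumption for illustration: SQL Server's actual statistics use a bounded histogram and other details not modelled here, and the data and function names are hypothetical.

```python
from collections import Counter

def build_statistics(values):
    """Summarise a column: row count, distinct count, and rows per value."""
    counts = Counter(values)
    return {"rows": len(values), "distinct": len(counts), "histogram": counts}

def estimate_from_density(stats):
    """Average rows per value: total rows multiplied by density (1 / distinct values)."""
    return stats["rows"] * (1 / stats["distinct"])

def estimate_from_histogram(stats, value):
    """Rows recorded for one specific value."""
    return stats["histogram"].get(value, 0)

# Hypothetical skewed column: one value dominates.
stats = build_statistics(["NSW"] * 90 + ["VIC"] * 9 + ["TAS"])
```

For this skewed column the density-based estimate is about 33 rows for any value, while the per-value counts show 90 rows for "NSW" and 1 row for "TAS". Estimates that differ this much are what lead an optimizer to choose different plans for different parameter values.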

Demonstration 3A: Viewing Index Statistics

• In this demonstration, you will see how to work with index statistics

Lab 5: Planning for SQL Server Indexing

• Exercise 1: Explore existing index statistics

• Challenge Exercise 2: Design column orders for indexes (Only if time permits)

Logon information

Virtual machine: 623XB-MIA-SQL

User name: AdventureWorks\Administrator

Password: Pa$$w0rd

Estimated time: 45 minutes

Lab Scenario

You have been asked to explain the concept of index statistics and selectivity to a new developer. You will explore the statistics available on an existing index and determine how selective some sample queries would be.

One of the company developers has provided you with a list of the most important queries that will be executed by the new marketing management system. Depending upon how much time you have available, you need to determine the best column orders for indexes to support each query. Complete as many as possible within the allocated time. In later modules, you will consider how these indexes would be implemented. Each query is to be considered in isolation in this exercise.

Lab Review

• Which types of queries would most likely lead to widely-differing query plans?

• If you have an equality predicate and a LIKE predicate in your most important query, which predicate would you try to satisfy as the first column of a composite index?

Module Review and Takeaways

• Review Questions

• Best Practices
