With Michelle Ufford. Sr. DBA for GoDaddy.com Working with SQL for 6 years, focused on performance...

27
SQL Server for the .NET Developer with Michelle Ufford

Transcript of With Michelle Ufford. Sr. DBA for GoDaddy.com Working with SQL for 6 years, focused on performance...

SQL Server for the .NET Developer

with Michelle Ufford

About Michelle

Sr. DBA for GoDaddy.com

Working with SQL for 6 years, focused on performance tuning and VLDB’s

Member of the PASS Performance SIG

President, 380PASS SQL User Group -

http://380pass.org

Active blogger at http://sqlfool.com

Twitters at http://twitter.com/sqlfool

Table Structures

The general syntax for creating a table is:

CREATE TABLE [schema].[tableName] ( <column definition> );

i.e.

CREATE TABLE dbo.myTable( id int NOT NULL

, lastName varchar(20) NOT NULL, firstName varchar(20) NULL );

In general, narrow tables perform better.

What the Heck is a Heap?

Heap Table

LastName

FirstName

Carson

Amanda

Smith

John

Adams Mike

Thomson Laura

Johnson Brian

Heap Table

LastName

FirstName

Carson

Amanda

Smith

John

Adams Mike

Thomson Laura

Johnson Brian

Ordered Table (Clustered

Index)

LastName FirstName

Adams Mike

Carson Amanda

Johnson Brian

Smith John

Thomson Laura

Ordered Table (Clustered

Index)

LastName FirstName

Adams Mike

Carson Amanda

Johnson Brian

Smith John

Thomson Laura

• un-ordered data

• fast writes

• slow reads

• not typically recommended

• ordered data

• can have fast writes (depends on if clustering key is sequential)

• fast reads (fastest reads when searching on the clustering key)

• recommended table structure!

Creating a Clustered Index General syntax for creating a clustered index:

CREATE [UNIQUE] CLUSTERED INDEX <index name> ON <table name>(<column list>);

i.e.CREATE UNIQUE CLUSTERED INDEX CIX_myTable

ON dbo.myTable(id);

Best practice is to create a clustered index on a narrow column with static values

Try to create a unique clustered index for a little better performance

Creating a Primary Key

A primary key enforces uniqueness within a table

General syntax for creating a primary key:CREATE TABLE [schema].[tableName]( <column definition>

CONSTRAINT <constraint name> PRIMARY KEY [NONCLUSTERED|CLUSTERED](<column list>));

i.e.CREATE TABLE dbo.myTable( id int NOT NULL, lastName varchar(20) NOT NULL, firstName varchar(20) NOT NULL

CONSTRAINT PK_myTable PRIMARY KEY NONCLUSTERED(id));

Clustered Index vs Primary Key

Pay attention to where you’re placing the primary key!

CREATE TABLE dbo.employee ( id int NOT NULL

, ssn int NOT NULL PRIMARY KEY, lastName varchar(20) NOT NULL, firstName varchar(20) NOT NULL );

SSN ID LastName

FirstName

123-45-6789 8 Carson

Amanda

212-12-1212 3 Smith John

323-23-2323 5 Adams Mike

424-24-2424 1 Thomson Laura

505-05-0505 4 Johnson Brian

SSN ID LastName

FirstName

123-45-6789 8 Carson

Amanda

212-12-1212 3 Smith John

323-23-2323 5 Adams Mike

424-24-2424 1 Thomson Laura

505-05-0505 4 Johnson Brian

SSN ID LastName FirstName

123-45-6789 8 Carson Amanda

200-98-7653 9 Edwards

Frank

212-12-1212 3 Smith John

323-23-2323 5 Adams Mike

424-24-2424 1 Thomson Laura

505-05-0505 4 Johnson Brian

SSN ID LastName FirstName

123-45-6789 8 Carson Amanda

200-98-7653 9 Edwards

Frank

212-12-1212 3 Smith John

323-23-2323 5 Adams Mike

424-24-2424 1 Thomson Laura

505-05-0505 4 Johnson Brian

New Hire

Clustered Index vs Primary Key

Creating a table with a non-clustered primary key and separate clustered index:

CREATE TABLE dbo.employee( id int Identity(1,1) NOT NULL

, ssn int NOT NULL, lastName varchar(20) NOT NULL, firstName varchar(20) NOT NULL

CONSTRAINT PK_employee PRIMARY KEY NONCLUSTERED(ssn));

CREATE UNIQUE CLUSTERED INDEX CIX_employeeON dbo.employee(id);

Choosing The Right Data Types

Use the most appropriate data type to meet the requirements of the column

Use the smallest data type while allowing a little room for growth

If the data type is too big, you’re… wasting space wasting IO wasting time

The WRONG Data Type

Assume you have 1 million records to store:

a unique id, a number 1 to 100, and a timestampCREATE TABLE dbo.myBigTable (

myID INT IDENTITY(1,1), myNumber INT, myDate DATETIME );

Bytes per row: 4 + 4 + 8 + 10 = 26 bytes  (myID + myNumber + myDate + overhead)    

Rows per page: 8,060 / 26 = 310 rows Pages per 1mm rows: 1,000,000 / 310 = 3225

pages

The RIGHT Data Type

We’re still storing 1 million records:CREATE TABLE dbo.mySmallTable (

myID INT IDENTITY(1,1) , myNumber TINYINT , myDate SMALLDATETIME );

Bytes per row: 4 + 1 + 4 + 10 = 19 bytes   (myID + myNumber + myDate + overhead)    

Rows per page: 8,060 / 19 = 424 rows Pages per 1mm rows: 1,000,000 / 424 = 2357

pages

The Results

myBigTable consumed 3,225 pages

mySmallTable consumed 2,357 pages

mySmallTable used 868 less pages!

Space savings of 27%

A single IO can now return 114 more rows Performance improvement of 37%

Non-Clustered Indexes

Text Book example:

Clustered Index Non-Clustered Index

Book Page Appendix

Adams, Mike

Carson, Amanda

Edwards, Frank

Johnson, Brian

Adams, Mike

Carson, Amanda

Edwards, Frank

Johnson, Brian

Roberts, Amanda

Smith, John

Thomson, Laura

Roberts, Amanda

Smith, John

Thomson, Laura

Amanda 2, 5

Brian 4

Frank 3

Amanda 2, 5

Brian 4

Frank 3

John 6

Laura 7

Mike 1

John 6

Laura 7

Mike 1

Creating a Non-Clustered Index General syntax for creating a non-

clustered index:CREATE [UNIQUE] NONCLUSTERED INDEX <index

name> ON <table name>(<index keys>)

INCLUDE (<column list>);

i.e.CREATE NONCLUSTERED INDEX IX_employee_ssn

ON dbo.employee(ssn)INCLUDE (lastName, firstName);

Too may indexes can make writes expensive

Included Columns

Non-key columns on the leaf level of an index

Not used as search criteria in an index SELECT employeeID, lastName, firstName

FROM dbo.employee WHERE hireDate < ‘2009-01-01’

CREATE NONCLUSTERED INDEX IX_employee_hireDateON dbo.employee(hireDate)INCLUDE (lastName, firstName);

hireDate = index key employeeID = clustered key pointerlastName, firstName = included column

Half-Time Show!

Want free hosting?

First person to tweet…

"@GoDaddyGuy I wantGoDaddy.com hosting!”

…will win free hostingfor a year!

What Is Fragmentation?

When the logical ordering of pages does not match the physical ordering of pages

Both clustered and non-clustered indexes can become fragmented

Fragmentation is caused by insert/update/delete operations

How Does Fragmentation Occur?

Page 3

Page 4

Page 5

Page 3

Page 4

Page 5

Page 4

Page 5

Page 6

Page 4

Page 5

Page 6

Page 3

Page 4

Page 742

Page 3

Page 4

Page 742

Page 4

Page 742

Page 5

Page 4

Page 742

Page 5

Page 742

Page 5

Page 6

Page 742

Page 5

Page 6

Insert

Page 3

Record 1

Record 2

Record 3

Record 4

Page 5

Page 3

Record 1

Record 2

Record 3

Record 4

Page 5

Page 3

Record 1

Record 2

Page 742

Page 3

Record 1

Record 2

Page 742

Page 4

Record 5

Record 3

Record 4

Page 5

Page 4

Record 5

Record 3

Record 4

Page 5

Insert

Page 4 Page 4 Page 742

How Do I Find Fragmentation?

SQL Server 2000 DBCC ShowContig

SQL Server 2005 / 2008 Dynamic Management Views (DMV) sys.dm_db_index_physical_stats Execute after normal business hours or

use “Limited” to avoid impacting the environment

How Do I Fix Fragmentation?

SQL Server 2000 DBCC IndexDefrag - quicker DBCC dbReindex - more thorough

SQL Server 2005 Alter Index {Reorganize | Rebuild}

My Index Defrag Script: http://sqlfool.com/2009/03/automated-index-

defrag-script/

Stored Procedures

Stored procedures offer: enhanced security better performance easier maintenance less network traffic

Best practice for accessing data in SQL Server

Tuning Stored Procs

Select only the columns you actually need

Try to use an existing covering index whenever possible, but…

Don’t create an index for every query!

Place Data Definition (DDL) statements first, then Data Modification (DML)

Specify the schema owner (i.e. “dbo”, “sys”) for every referenced object

Tuning Stored Procs

Consider MAXDOP restrictionsConsider NOLOCK hints (only if dirty

reads are acceptable)Use “SET NOCOUNT ON;”Avoid nested proc callsKeep transactions small

Tips for Developing Applications

Minimize database calls whenever possible

When working with static data, retrieve data once and store it in the app (i.e. hash table)

When developing high-volume applications: Batches are key! More manageable

updates Use table-valued parameters (TVP) or

XML to handle bulk operations Staging tables vs single row updates

Where To Go When You Need Help

Books Online,http://msdn.microsoft.com/en-us/library/ms130214.aspx

SQLServerPedia, http://sqlserverpedia.com

SQL Server Performance, http://www.sql-server-performance.com

SQL Server Central, http://www.sqlservercentral.com/

My blog! http://sqlfool.com

Other Resources

Batch Compiliation, Recompliation, and Plan Caching Issues in SQL Server 2005http://technet.microsoft.com/en-us/library/cc966425.aspx

SQL Server 2005 Waits & Queues (Google)

Grant Fritchey’s SQL Server Execution Plans

Kalen Delaney’s SQL Server 2008 Internals

Itzik Ben-Gan’s SQL Server 2008 T-SQL Fundamentals

Questions?

Thank you for attending my presentation!

I can be reached at:

Michelle [email protected]://sqlfool.com