DBAs Behaving Badly... Worst Practices for Database Administrators, Rod Colledge
Transcript of DBAs Behaving Badly... Worst Practices for Database Administrators, Rod Colledge
- 1. DBAs Behaving Badly
Worst Practices for Database Administrators - 2. About the writer
Rod Colledge
Independent SQL Consultant
Based in Brisbane, Australia
Web; www.sqlCrunch.com
Blog; www.rodcolledge.com
MVP Deep Dives Book
Twitter @rodcolledge
linkedin.com/in/rodcolledge - 3. About us.
Dubi Lebel
DBA Dubi Behind All
D.B.A Don't Bother Asking
Shahar Bar
SQL Consultant and CEO at Valinor - 7. Session Overview
Disaster Recovery (DR) Planning
Backup & Restore
Change Control
Storage Configuration
File Configuration
Indexing
Administration Techniques - 8. Disaster Recovery (DR) Planning
- 9. 1 / 20; Not having SLAs
SLAs provide context for everything. e.g.;
Database available 24/7 @ 99.999% uptime
Zero data loss
Sub-second response time
Use option papers during SLA negotiations - 10. SLA Option Papers
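An option paper is easier to negotiate when each availability target comes with a concrete number. The sketch below converts an uptime percentage, such as the 99.999% example above, into a yearly downtime budget. The function name, the list of candidate targets, and the 365-day year are illustrative assumptions and do not come from the original deck.

```python
# Downtime budget implied by an availability SLA (illustrative sketch).
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Return the minutes of downtime per year allowed by an uptime percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_pct) / 100


if __name__ == "__main__":
    # Hypothetical options to lay side by side in an option paper.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% uptime allows {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, "five nines" leaves a little over five minutes of downtime per year, which is the kind of figure that helps a business decide what it is willing to pay for.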
- 11. 2 / 20; Not having/testing DR plans
Do you have DR Plans?
How do you know your plans will work?
DR fire drills
All/new DBAs trained in recovery procedures?
Location of recovery documents & scripts?
Documents/scripts up to date? - 12. 2 / 20; Not having/testing DR plans
- 13. 3 / 20; Narrow definition of disaster
Types of disasters;
Complete environmental destruction
Air conditioning failure
Disk crash
Accidentally dropping a table/database
Security breach; what data was accessed?
The next disaster will be unanticipated. Are your DR plans pessimistic enough? - 14. Backup & Restore
- 15. Argh! ... who would have thought we needed backups?
- 16. 4 / 20; Not Taking Backups
Huh?
Less obvious variations;
File system backups only
No transaction log backups
SAN snapshots; recoverability? - 17. 5 / 20; Not Verifying Backups
How do you know they worked?
Verification options
RESTORE VERIFYONLY FROM
Restore to a Reporting Server
Log shipping (log backup verification) - 18. 6 / 20; Designing for Backups only
Design for restoration!
What is the data loss exposure?
How long will the recovery take?
Script, test & document various restore scenarios - 19. Backup Compression
BACKUP DATABASE AdventureWorks2008
TO DISK = 'G:\SQL Backup\AWorks.bak'
WITH COMPRESSION - 20. Change Control
- 21. 7 / 20; Insufficient Test Environments
- 22. 8 / 20; No Performance Baseline
- 23. 9 / 20; No Standard Build/Change Log
Without a change log, how can you answer;
Why is something different?
Who made the change?
When was the change made?
Was the change successful?
What will happen if the change is rolled back? - 24. Policy Based Management
- 25. Configuration File.ini
- 26. Demo; Configuration Changes Report
- 27. Storage Configuration
- 28. 10 / 20; Capacity-Centric Design
200GB database; how many 73GB disks?
Capacity Centric;
200 / 73 = 3 disks (rounded up)
Performance Centric;
(reads per sec + (writes per sec * RAID write penalty)) / IOPS per disk
(1200 + (400 * 2)) / 125 = 16 disks!
~ 1.1TB or 500GB after RAID - 29. Preface: Many Factors Affect Disk I/O Perf
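The capacity-centric versus performance-centric arithmetic from the storage sizing slide can be sketched as follows. The function names and the RAID write-penalty table are assumptions added for illustration; the multipliers are common rules of thumb, and the slide's own example uses a multiplier of 2, which matches RAID-1 or RAID-10.

```python
import math

# Back-end I/Os generated per logical write (rule-of-thumb assumptions,
# not figures from the slides; the slide's worked example uses 2).
RAID_WRITE_PENALTY = {"RAID-0": 1, "RAID-1": 2, "RAID-10": 2, "RAID-5": 4}


def disks_for_capacity(db_size_gb: float, disk_size_gb: float) -> int:
    """Capacity-centric sizing: just enough disks to hold the data."""
    return math.ceil(db_size_gb / disk_size_gb)


def disks_for_performance(reads_per_sec: float, writes_per_sec: float,
                          raid_level: str, iops_per_disk: float) -> int:
    """Performance-centric sizing: enough disks to serve the I/O load."""
    backend_iops = reads_per_sec + writes_per_sec * RAID_WRITE_PENALTY[raid_level]
    return math.ceil(backend_iops / iops_per_disk)


if __name__ == "__main__":
    print("capacity-centric:", disks_for_capacity(200, 73), "disks")
    print("performance-centric:",
          disks_for_performance(1200, 400, "RAID-10", 125), "disks")
```

With the slide's inputs this gives 3 disks by capacity and 16 by performance. Swapping in a higher write penalty, such as the assumed 4 for RAID-5, raises the disk count further for the same workload.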
There are myriad best practices & considerations for optimal disk I/O subsystem performance.
Be mindful of factors such as:
RAID level
File allocation unit size
Number, size, & speed of disks
Configuration & capacity of HBAs & fabric switches
Consider increasing HBA Queue Depth
Network bandwidth
Cache on disk, controllers, & SAN
Whether disks are dedicated, shared, or virtualized
Bus speed
Number of paths from disk I/O subsystem to server
Driver versions for all components
Stripe size
Stripe unit size
Workload - 30. HDD Architecture: 3-D
This image is from a contemporary & otherwise excellent document, but it represents disks as they were over two decades ago!
The disk deities at Microsoft won't allow me to perpetuate such myths.
Graphics source: Veritas Storage Foundation 5.0 for Windows Best Practices for Storage Management
http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_vsfw_5.0_best_practices_for_storage_mgmt_02-2007.en-us.pdf
- 31. Partition Alignment Graphic: NTFS 4KB Cluster: Default vs. Aligned RAID Array ***This has CONTEMPORARY RELEVANCE***
This is a very simplified graphic
Contemporary relevance
Corresponds to default NTFS file allocation unit of 4KB
Given common 64KB stripe unit size
See the Notes for details
Graphics Source: Jimmy May - 32. Partition Alignment Graphic: RAID Array: Default vs. Optimized for SQL Server ***This has CONTEMPORARY RELEVANCE***
This is a very simplified graphic
Mark Licata, Senior Technology Architect
The worst scenario? Random operations using 64K IO and 64K chunk size. One sector off and you are hitting two disks for every IO thus halving the random performance potential.
Note: On a RAID array this means accessing two different stripe units on two separate disks.
Graphics Source: Jimmy May - 33. 11 / 20; Using Unaligned Partitions
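The effect described in the alignment slides can be checked with simple arithmetic. The sketch below tests whether a partition's starting offset is a multiple of the stripe unit size, and counts how many stripe units a single I/O touches. The function names are hypothetical. The two sample offsets are assumptions not stated in the deck: 63 sectors of 512 bytes (the legacy default on older Windows versions) and 1024KB (the default from Windows Server 2008 onward).

```python
SECTOR_BYTES = 512          # assumption: legacy 512-byte sectors
KB = 1024


def is_aligned(partition_offset_bytes: int, stripe_unit_bytes: int) -> bool:
    """A partition is aligned when its offset is a multiple of the stripe unit."""
    return partition_offset_bytes % stripe_unit_bytes == 0


def stripe_units_touched(partition_offset_bytes: int, io_offset_bytes: int,
                         io_size_bytes: int, stripe_unit_bytes: int) -> int:
    """Count the stripe units spanned by one I/O issued inside the partition."""
    start = partition_offset_bytes + io_offset_bytes
    end = start + io_size_bytes - 1  # last byte of the I/O
    return end // stripe_unit_bytes - start // stripe_unit_bytes + 1


if __name__ == "__main__":
    stripe_unit = 64 * KB
    legacy_offset = 63 * SECTOR_BYTES   # 31.5KB, assumed legacy default
    aligned_offset = 1024 * KB          # 1MB, assumed newer default
    for name, offset in (("legacy", legacy_offset), ("aligned", aligned_offset)):
        touched = stripe_units_touched(offset, 0, 64 * KB, stripe_unit)
        print(f"{name}: aligned={is_aligned(offset, stripe_unit)}, "
              f"stripe units per 64KB I/O={touched}")
```

Under these assumptions a 64KB I/O on the legacy offset spans two stripe units, and therefore two disks, while the same I/O on the 1MB offset stays within one.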
- 34. Which of the following RAID levels is not a good choice for write-intensive DBs?
RAID-0
RAID-1
RAID-5
RAID-10 - 35. File Configuration
- 36. 12 / 20; Relying on Autogrowth
- 37. 13 / 20; Shrinking Files
- 38. 14 / 20; Full recovery + no log backups
When are records removed from the t-log file?
Full recovery model; ONLY after t-log backup
Simple recovery model; On checkpoint
When to use full recovery model?
When point in time recovery is required
Back up the log file!
Take care when moving DBs from/to production - 39. Indexing
- 40. 15 / 20; Too many/not enough indexes
Small dev DB moved to production (not enough)
Loaded with unused indexes (too many)
Watch for duplicate or overlapping indexes
DMVs to the rescue
sys.dm_db_missing_index_%
sys.dm_db_index_usage_stats
sys.dm_db_index_physical_stats - 41. Demo; Indexing
- 42. 16 / 20; Inappropriate index maintenance
Code in Books Online: sys.dm_db_index_physical_stats - 43. 17 / 20; Update stats after index rebuild
- 44. Administration Techniques
- 45. 18 / 20; Manual Administration
Automation enables more things to be achieved with fewer mistakes in a given amount of time - 46. 19 / 20; Not defining alerts
Manage by exception
SQL Agent Alerts;
Job failures
Performance conditions
High severity errors (level 19 +)
What about error 825 (level 10)?
http://www.karaszi.com/SQLServer/util_agent_alerts.asp - 47. 20 / 20; No task lists/check lists
- 48. Demo; Administration techniques
- 49. Summary
Be cautiously pessimistic
Design backups from a restore perspective
Establish & maintain performance baselines
Validate the I/O chain
Use a performance-centric design
Don't rely on all out-of-the-box settings
Understand the indexing DMVs
Automate & manage by exception - 50.
- 52. Thank you
for attending this session and the 2009 PASS Summit in Seattle