Microsoft SQL Server Data Compression: Experience and Changes
Sunil Agarwal
Senior Program Manager
Email: [email protected]
Microsoft Corporation
SESSION CODE: DAT309
Agenda
Customer Experience and Feedback
Unicode Compression in SQL2008R2
Future Directions
Data Compression Overview
Two types of compression
ROW
  Fixed-length columns stored as variable length
  Recommendation: DML-heavy workload
PAGE
  Column prefix and page dictionary compression
  Recommendation: Read-mostly workload
Can be enabled on a table, index, or partition
Estimate data compression savings with sp_estimate_data_compression_savings
Can be enabled/disabled ONLINE
No application changes
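As a rough illustration of what "fixed length columns stored as variable length" means for ROW compression, the Python sketch below counts the fewest bytes that can hold a signed integer and compares that with the 4 bytes a fixed-length INT always occupies. The helper name `min_signed_bytes` is hypothetical, and this shows only the idea, not SQL Server's actual on-disk format.

```python
def min_signed_bytes(value: int) -> int:
    """Smallest number of bytes that holds `value` in two's complement."""
    n = 1
    # Widen one byte at a time until the value fits the signed range.
    while not (-(1 << (8 * n - 1)) <= value < (1 << (8 * n - 1))):
        n += 1
    return n


FIXED_INT_BYTES = 4  # a fixed-length 32-bit INT always uses 4 bytes

for v in (5, 300, 100_000, 2_000_000_000):
    print(v, "fixed:", FIXED_INT_BYTES, "variable:", min_signed_bytes(v))
```

Small values dominate many real columns, which is why storing only the bytes actually needed can save space without any application change.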
Data Compression and Space Savings
Your mileage will vary.
Data Compression Savings Achieved By Customers

Customer              Space Savings  Notes
Bank Itau             70%            PAGE. Data Warehouse application.
BWIN.com              40%            PAGE. OLTP Web application.
NASDAQ                62%            PAGE. DW application.
GE Healthcare         38%, 21%       PAGE, ROW.
Manhattan Associates  80%, 50%       PAGE, ROW.
First American Title  52%            PAGE.
SAP ERP               50%, 15%       PAGE, ROW.
MS Dynamics AX        81%            PAGE. ERP application.
ServiceU              35%            PAGE.
Data Compression and Workload Performance
Workload Performance

Customer              Performance Impact  Notes
BWIN.com              5%                  PAGE compression. OLTP Web application. Large volume of transactions.
NASDAQ                40%-60%             PAGE compression. Large sequential range queries. DW application.
GE Healthcare         -1%                 PAGE compression. 500 users, 1500 transactions/sec. OLTP with some reporting queries.
Manhattan Associates  -11%                PAGE compression. A lot of insert, update and delete activity.
First American Title  2%-3%               PAGE compression. OLTP application.
MS Dynamics AX        3%                  PAGE compression. ERP application, small transactions.
Top Customer Question - 1
Question: Data compression increases the size of my database?

[Figure: a 20 GB file holding TB-1 (16 GB) and TB-2 (4 GB). Compressing TB-1 first (to 8 GB) grows the file to 28 GB and leaves 16 GB of empty space; compressing TB-2 (to 2 GB) afterwards leaves 14 GB and 4 GB of empty space. Compressing TB-2 first grows the file to 22 GB and frees 4 GB; compressing TB-1 afterwards grows the file to 26 GB and leaves 16 GB of empty space.]

Suggestions:
o Do nothing if the object needs to grow
o Start by compressing the smaller object first
o Use shrink, but it fragments the data
o Bulk export/import into an empty compressed table. Data availability?
o Move the object to a new filegroup
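The file sizes in the diagram (20 GB growing to 28 GB or 26 GB) can be reproduced with simple bookkeeping. The Python sketch below assumes, as an inference from the numbers rather than something the slide states, that the compressed copy of an object is written while the original still exists, that the file grows only when free space runs out, and that the original's space is released once the rebuild finishes. `file_size_after_rebuilds` is a hypothetical helper, not a SQL Server API.

```python
def file_size_after_rebuilds(initial_file_gb, tables):
    """Track data-file size while tables are rebuilt compressed, in order.

    tables: list of (name, uncompressed_gb, compressed_gb) tuples.
    Returns (final_file_gb, final_free_gb).
    """
    file_gb = initial_file_gb
    # Assumption: whatever the tables do not occupy is free space.
    free_gb = initial_file_gb - sum(size for _, size, _ in tables)
    for _name, uncompressed_gb, compressed_gb in tables:
        # The compressed copy is written while the original still exists,
        # so the file grows by whatever does not fit in current free space.
        growth_gb = max(0, compressed_gb - free_gb)
        file_gb += growth_gb
        free_gb += growth_gb - compressed_gb
        # Once the rebuild finishes, the original's space becomes free.
        free_gb += uncompressed_gb
    return file_gb, free_gb


tb1 = ("TB-1", 16, 8)
tb2 = ("TB-2", 4, 2)
larger_first = file_size_after_rebuilds(20, [tb1, tb2])
smaller_first = file_size_after_rebuilds(20, [tb2, tb1])
print("TB-1 first:", larger_first)
print("TB-2 first:", smaller_first)
```

Under these assumptions the larger-first order ends with a 28 GB file and the smaller-first order with a 26 GB file, which is the reasoning behind the suggestion to start with the smaller object.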
Top Customer Question - 2
Question: I am getting little or no compression?
ROW compression:
  No fixed-length columns
  Fixed-length columns, but all bytes are used
  Compressed row > 4K
PAGE compression:
  No column prefix savings
  No common values for page dictionary
  Large row size, implying one to a few rows per page
Mostly LOB data
Top Customer Question - 3
Question: How do I get PAGE compression on a HEAP?
Problem: Ad hoc inserts on a new page will not be PAGE compressed in a HEAP
Suggestions
  Rebuild HEAP periodically (ONLINE available)
  Use TABLOCK when bulk importing into a HEAP
[Figure: pages labeled PAGE and ROW; a BTREE PAGE with header, CI structure, and rows R1 to R5]
Top Customer Question - 4
Index Related
Question: It is taking longer to rebuild an index or heap
  Rebuilding with ROW compression takes approx. 1.5 times the CPU time of an uncompressed index rebuild
  Rebuilding with PAGE compression takes approx. 2 to 5 times the CPU time of an uncompressed index rebuild
  Your mileage may vary
Question: Do I need to take the object offline to enable compression?
  ONLINE operations supported
  Few unique values for the leading column of the index may reduce parallelism. This is similar to a regular index
  Compressing a heap with ONLINE = ON uses a single CPU for compression (or rebuild)
Top Customer Question - 5
Question: What is the impact of compression on Bulk Import?
[Figure: bar chart by compression type (NONE, ROW, PAGE) showing time to BULK INSERT 50M rows (minutes, axis 0 to 36) and table size after load (GB, axis 0 to 6)]
Top Customer Question - 6
Question: What object(s) should I compress?
Evaluate compression savings
General: DML heavy (ROW) vs. query heavy (PAGE), for both table and partition
Don’t compress all objects in the database without evaluation
  If a table is relatively small, don’t bother compressing
  Consider compression if the table/partition is accessed rarely
Look at index usage
  Used rarely?
  Singleton lookup
  Range access
Example: Enabling Compression
Unpartitioned table
[Figure: a table and its indexes, each marked PAGE compressed or uncompressed]
Example: Enabling Compression
Latest partition uncompressed
[Figure: quarterly partitions (Jan-Mar, Apr-June, July-Sept, Oct-Dec), each marked PAGE compressed, ROW compressed, or uncompressed]
Customer Example: An SAP Deployment
Table Compression Strategies
Table      Size (GB)  ROW save %  PAGE save %  1-row Read  >1-row Read  Update   Delete  Insert   % Scan   % Update  Plan  Notes
COSP       398        80%         90%          2,797       58,735       886,187  15,747  584,017  3.80%    57.27%    ROW   Updates!
GLPCA      123        15%         89%          0           929,637      0        16,802  9,020    92.46%   0.0%      PAGE  Scan mostly, read-mostly
COEP       185        30%         81%          0           19,036       2,927    0       48,182   27.14%   4.17%     ROW   Light use, but stay low risk
RESB       243        38%         83%          9,837       7,977,629    943,380  1,321   14,877   89.16%   10.54%    ROW   #updates
ACCTIT     210        21%         87%          0           0            0        0       54,580   0.00%    0.0%      PAGE  Append only
MSEG       183        28%         87%          3,441,918   24,684,252   28       0       70,797   87.54%   0.0%      PAGE  Scan mostly
FAGLFLEXA  98         29%         88%          0           298          0        0       58,882   0.50%    0.0%      PAGE  Append only
BSIS       148        30%         90%          0           9,069        67       5,773   64,366   11.44%   0.08%     PAGE  Append mostly
COSB       150        84%         92%          0           88           0        0       0        100.00%  0.00%     ROW   ROW ~= PAGE
GLFUNCA    40         15%         89%          0           6            0        0       0        100.00%  0.00%     PAGE  Read only
Inputs: sp_estimate_data_compression_savings, sys.dm_db_index_operational_stats, SAP knowledge
Computed: S = % scans; U = % updates

Rules of thumb:
  ROW ~= PAGE => ROW
  High Update, Low Scan => ROW
  High Scan => PAGE
  Append Only => PAGE
  Read-only => PAGE
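The % Scan and % Update columns appear to be each operation count divided by the total of all five operation counts. That formula is inferred from the table's figures (it reproduces the COSP and RESB rows), not stated on the slide, and `scan_update_percentages` is a hypothetical helper name. A minimal Python sketch:

```python
def scan_update_percentages(single_row_reads, range_reads, updates, deletes, inserts):
    """Return (S, U): range reads and updates as a percentage of all operations."""
    total = single_row_reads + range_reads + updates + deletes + inserts
    if total == 0:
        # No recorded activity: report zero rather than divide by zero.
        return 0.0, 0.0
    return 100.0 * range_reads / total, 100.0 * updates / total


# Operation counts copied from the COSP and RESB rows of the table.
cosp = scan_update_percentages(2_797, 58_735, 886_187, 15_747, 584_017)
resb = scan_update_percentages(9_837, 7_977_629, 943_380, 1_321, 14_877)
print("COSP S=%.2f%% U=%.2f%%" % cosp)
print("RESB S=%.2f%% U=%.2f%%" % resb)
```

The slide gives no numeric thresholds for "high" or "low", so choosing ROW or PAGE from S and U remains a judgment call guided by the rules of thumb.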
Top Customer Question - 7
Question: How do I compress Unicode data?
SQL Server uses the UCS-2 encoding scheme.
  NCHAR and NVARCHAR data always take 2 bytes of storage per character
  Waste of 1 byte/char for commonly deployed locales (e.g. ASCII)
  Existing ROW compression ineffective
  PAGE compression only helps for exact match
Sample representation
  'a' = 0x61 (ASCII) and 0x0061 (UCS-2)
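The 1-byte versus 2-byte difference is easy to check in Python. The sketch below uses the `utf-16-be` codec as a stand-in for UCS-2, which is equivalent for characters in the Basic Multilingual Plane; big-endian is chosen only to match the slide's 0x0061 notation and says nothing about SQL Server's on-disk byte order. The helper name `byte_sizes` is hypothetical.

```python
def byte_sizes(text):
    """Return (ascii_bytes, ucs2_bytes) needed to store `text`."""
    ascii_len = len(text.encode("ascii"))
    # UTF-16 without surrogates is the same as UCS-2 for BMP characters.
    ucs2_len = len(text.encode("utf-16-be"))
    return ascii_len, ucs2_len


print("a".encode("ascii").hex())      # single byte per character
print("a".encode("utf-16-be").hex())  # two bytes per character
print(byte_sizes("SQLSERVER"))
```

For pure ASCII text the two-byte form is exactly twice as large, which is the 50% saving the Unicode compression feature targets.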
Agenda
Customer Experience and Feedback
Unicode Compression in SQL2008R2
Future Directions
SQL2008R2: Unicode and Competitive challenge
Most ISVs are switching their customers to the UNICODE version of applications.
Competition
  Oracle supports UTF-8 encoding for Unicode. Results in 1-byte storage for ASCII and most European
  DB2 provides UTF-8 and Unicode compression as well
SQL2008R2: Unicode Solution
Use standard SCSU compression technique
  http://www.unicode.org/reports/tr6/tr6-4.html
No application change needed
Compression Achieved

Comparison of UNICODE compression with SCSU and UTF-8

Locale      SCSU  UTF-8
English     0.5   0.5
Japanese    0.85  1.0
Korean      1.0   1.0
Turkish     0.52  0.53
German      0.5   0.5
Vietnamese  0.61  0.68
Hindi       0.5   1.0
SQL2008R2: Enabling Compression
Enterprise Edition only
Types of data compression
ROW
  Stores fixed-length values as variable length
  Superset of vardecimal storage format
  Row metadata optimized
  BLOB/LOB is not ROW compressed
  Unicode data is compressed. For most locales 50% saving
  Supported types: NVARCHAR and NCHAR, but not NTEXT
PAGE (includes ROW)
  Column prefix
  Dictionary
  Only in-row BLOB/LOB can potentially benefit from PAGE compression
Example: Unicode Compression
-- Values carry the SQL prefix shown in the hex page dumps
CREATE TABLE SQLCOMP (Name NVARCHAR(20))
GO
INSERT INTO SQLCOMP VALUES (N'SQLSERVER')
INSERT INTO SQLCOMP VALUES (N'SQLENGINE')
INSERT INTO SQLCOMP VALUES (N'SQLLOADERS')
HEADER
Uncompressed (UCS-2)
  0x00530051004C005300450052005600450052 "SQLSERVER"
  0x00530051004C0045004E00470049004E0045 "SQLENGINE"
  0x00530051004C004C004F00410044004500520053 "SQLLOADERS"
ROW COMPRESSION
  0x53514C534552564552 "SQLSERVER"
  0x53514C454E47494E45 "SQLENGINE"
  0x53514C4C4F414445525310 "SQLLOADERS"
PAGE COMPRESSION
  Col-prefix: 0x53514C "SQL"
  0x03534552564552 "?SERVER"
  0x03454E47494E45 "?ENGINE"
  0x034C4F414445525310 "?LOADERS"
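The column-prefix step in the page dump can be mimicked in a few lines of Python. This is a simplified sketch under stated assumptions: each value is already reduced to one byte per character, the longest common prefix is stored once, and every row keeps a one-byte prefix length followed by its remaining bytes. `prefix_compress` is a hypothetical helper, the layout is not SQL Server's real page format, and the trailing 10 on the slide's third value is not reproduced.

```python
import os


def prefix_compress(values):
    """Return (prefix_bytes, rows) where each row is length byte + suffix."""
    # os.path.commonprefix compares character by character, so it works
    # for arbitrary strings, not only file paths.
    prefix = os.path.commonprefix(values)
    rows = [
        bytes([len(prefix)]) + value[len(prefix):].encode("ascii")
        for value in values
    ]
    return prefix.encode("ascii"), rows


prefix, rows = prefix_compress(["SQLSERVER", "SQLENGINE", "SQLLOADERS"])
print("Col-prefix: 0x" + prefix.hex().upper())
for row in rows:
    print("0x" + row.hex().upper())
```

Each row drops the three shared bytes and gains one length byte, so the saving grows with the length of the shared prefix and the number of rows on the page.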
Changes to Estimate Compression Stored Procedure
SQL2008 RTM
  Estimated compression savings = 0 if compression mode did not change
SQL2008R2
  Estimated compression savings non-zero if space can be further saved. Useful for estimating
    De-fragmentation space savings
    Unicode compression space savings
DEMO
Unicode Compression
Sunil Agarwal
Senior Program Manager
Microsoft
SQL2008R2: Unicode Compression Results
ROW Compression Savings with UNICODE Compression
Application           ROW Compression  ROW with UNICODE
SAP ERP Benchmark DB  9%               43%
Dynamics AX           30%              53.2%
****                  45%              64%
****                  30%              45%
Savings on Hardware Cost
Customer              Projected Storage Cost Reduction
Microsoft (MSIT/SAP)  $500K savings
****                  $500K
Upgrade to SQL2008R2
Scenarios
ROW compression enabled in SQL2008
  No database changes when upgraded
  Unicode value compressed only if it saves space. It happens when
    An existing value is updated
    A new row is inserted
    Index is rebuilt with ROW or PAGE compression
PAGE compression enabled in SQL2008
  Same as with ROW compression
No changes needed to existing scripts and DDL
Future Directions and ASKs
We are looking into
  Unicode compression for the in-row portion of NVARCHAR(MAX)
  LOB compression
  XML compression
  Making sp_estimate* available on all SKUs
Related Contents
http://sqlcat.com/whitepapers/archive/2009/05/29/data-compression-strategy-capacity-planning-and-best-practices.aspx
www.sqlcat.com
http://blogs.msdn.com/sqlserverstorageengine
http://blogs.msdn.com/sqlcat/
http://blogs.msdn.com/mssqlisv/
http://www.unisys.com/eprise/main/admin/corporate/doc/41371394.pdf
http://search.hp.com/redirect.html?type=REG&qt=sql+server+data+compression&url=http%3A//h71028.www7.hp.com/ERC/downloads/4AA1-8766ENW.pdf%3Fjumpid%3Dreg_R1002_USEN&pos=1
http://www.netapp.com/us/library/technical-reports/tr-3719.html
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
JUNE 7-10, 2010 | NEW ORLEANS, LA