
Transcript of Compression ow2009 r2

Page 1: Compression ow2009 r2


Data Compression in Oracle

Carl Dudley

University of Wolverhampton, UK

UKOUG Director

Oracle ACE Director

[email protected]

Page 2: Compression ow2009 r2


Introduction

Working with Oracle since 1986

Oracle DBA - OCP Oracle7, 8, 9, 10

Oracle DBA of the Year – 2002

Oracle ACE Director

Regular Presenter at Oracle Conferences

Consultant and Trainer

Technical Editor for a number of Oracle texts

UK Oracle User Group Director

Member of IOUC

Day job – University of Wolverhampton, UK

Page 3: Compression ow2009 r2


Main Topics

Oracle 9i and 10g Compression - major features

Compression in data warehousing

Sampling the data to predict compression

Pre-sorting the data for compression

Behaviour of DML/DDL on compressed tables

Compression Internals

Advanced Compression in Oracle11g (for OLTP operations)

Shrinking unused space

Page 4: Compression ow2009 r2


Oracle Data Compression – Main Features

Page 5: Compression ow2009 r2

Compression : Characteristics

Trades physical I/O against CPU utilization
– Transparent to applications
– Can increase I/O throughput and buffer cache capacity

Useful for 'read mostly' applications
– Decision Support and OLAP

Compression is performed only when Oracle considers it worthwhile
– Depends on column length and amount of duplication

Compression occurs only when duplicate values are present within and across columns within a single database block
– Compression algorithms have caused little change to the kernel code
  • Modifications only to block formatting and accessing rows and columns
– No compression within individual column values or across blocks
– Blocks are held in compressed format in the buffer cache

Page 6: Compression ow2009 r2

Getting Compressed

Building a new compressed table

CREATE TABLE <table_name> ... COMPRESS;

CREATE TABLE <table_name> COMPRESS AS SELECT ...

Altering an existing table to be compressed

ALTER TABLE <table_name> MOVE COMPRESS;
– No additional copy created, but temp space and an exclusive table-level lock are required for the compression activity

ALTER TABLE <table_name> COMPRESS;
– Future bulk inserts may be compressed – existing data is not

Compressing individual partitions

ALTER TABLE <table_name> MOVE PARTITION <partition_name> COMPRESS;
– Existing data and future bulk inserts compressed in a specific partition

Page 7: Compression ow2009 r2

Tablespace Level Compression

Entire tablespaces can be compressed by default
– All objects in the tablespace will be compressed by default

CREATE | ALTER TABLESPACE <tablespace_name> DEFAULT [COMPRESS | NOCOMPRESS] ...
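A minimal sketch of the idea (the tablespace name, datafile path and table are illustrative, not from the presentation):

CREATE TABLESPACE comp_data
  DATAFILE '/u01/oradata/comp_data01.dbf' SIZE 100M
  DEFAULT COMPRESS;

-- Tables placed here inherit compression unless created NOCOMPRESS
CREATE TABLE sales_hist (sales_id NUMBER(8)) TABLESPACE comp_data;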

Page 8: Compression ow2009 r2

Compressing Table Data

Uncompressed conventional emp table

CREATE TABLE emp
 (empno NUMBER(4)
 ,ename VARCHAR2(12)
 ...);

Compressed emp table

CREATE TABLE emp
 (empno NUMBER(4)
 ,ename VARCHAR2(12)
 ...)
COMPRESS;

Could consider sorting the data on columns which lend themselves to compression

Page 9: Compression ow2009 r2

Table Data Compression

Uncompressed emp table

7369 CLERK   2000 1550 ACCOUNTING
7782 MANAGER 4975 1600 PLANT
7902 ANALYST 4000 2100 OPERATIONS
7900 CLERK   2750 1500 OPERATIONS
7934 CLERK   2200 1200 ACCOUNTING
7654 PLANT   3000 1100 RESEARCH

Compressed emp table
– Similar to index compression

[SYMBOL TABLE] [A]=CLERK, [B]=ACCOUNTING, [C]=PLANT, [D]=OPERATIONS
7369 [A]     2000 1550 [B]
7782 MANAGER 4975 1600 [C]
7902 ANALYST 4000 2100 [D]
7900 [A]     2750 1500 [D]
7934 [A]     2200 1200 [B]
7654 [C]     3000 1100 RESEARCH

Page 10: Compression ow2009 r2

Scanning Compressed Tables – Tests (1)

Compressing can significantly reduce disk I/O – good for queries?
– Possible increase in CPU activity
– May need to unravel the compression, but logical reads will be reduced

SELECT table_name,compressed,num_rows FROM my_user_tables;

TABLE_NAME COMPRESSED   NUM_ROWS
---------- ---------- ----------
EMP_NC     DISABLED      1835008
EMP_DSS    ENABLED       1835008
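my_user_tables is the presenter's own view over the data dictionary; a plausible definition (an assumption, not shown in the slides) would be:

CREATE OR REPLACE VIEW my_user_tables AS
  SELECT table_name
        ,compression compressed   -- user_tables exposes this column as COMPRESSION
        ,num_rows
  FROM   user_tables;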

SELECT COUNT(ename) FROM emp_nc

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.31       1.88      10974      10978          0          1
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.31       1.88      10974      10978          0          1

SELECT COUNT(ename) FROM emp_dss

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.56       0.71       3057       3060          0          1
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.56       0.71       3057       3060          0          1

Page 11: Compression ow2009 r2

Scanning Compressed Tables – Tests (2)

Force all rows to be uncompressed
– Increases logical I/O?

SELECT ename FROM emp_nc

call      count       cpu    elapsed       disk      query    current       rows
-------  ------  -------- ---------- ---------- ---------- ---------- ----------
Parse         1      0.00       0.00          0          0          0          0
Execute       1      0.00       0.00          0          0          0          0
Fetch    122335      1.68       2.00      10974     132607          0    1835008
-------  ------  -------- ---------- ---------- ---------- ---------- ----------
total    122337      1.68       2.00      10974     132607          0    1835008

SELECT ename FROM emp_dss

call      count       cpu    elapsed       disk      query    current       rows
-------  ------  -------- ---------- ---------- ---------- ---------- ----------
Parse         1      0.00       0.00          0          0          0          0
Execute       1      0.00       0.00          0          0          0          0
Fetch    122335      2.17       2.24       3057     125286          0    1835008
-------  ------  -------- ---------- ---------- ---------- ---------- ----------
total    122337      2.17       2.24       3057     125286          0    1835008

Page 12: Compression ow2009 r2

Space Reduction Due to Compression

Space usage summary
– Summary routine adapted from Tom Kyte's example

Statistics for table EMP_NC            Statistics for table EMP_DSS
Unformatted Blocks ...........0        Unformatted Blocks ...........0
FS1 Blocks (0-25) ............0        FS1 Blocks (0-25) ............0
FS2 Blocks (25-50) ...........0        FS2 Blocks (25-50) ...........0
FS3 Blocks (50-75) ...........0        FS3 Blocks (50-75) ...........0
FS4 Blocks (75-100) ..........0        FS4 Blocks (75-100) ..........0
Full Blocks .............10,974        Full Blocks ..............3,057
Total Blocks ............11,776        Total Blocks .............3,200
Total Bytes .........96,468,992        Total Bytes .........26,214,400
Total Mbytes ................92        Total Mbytes ................25
Unused Blocks ..............653        Unused Blocks ...............85
Unused Bytes .........5,349,376        Unused Bytes ...........696,320
Last Used Ext FileId .........4        Last Used Ext FileId .........4
Last Used Ext BlockId ..264,841        Last Used Ext BlockId ..263,561
Last Used Block ............371        Last Used Block .............43
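A minimal sketch of how such a summary can be gathered with the DBMS_SPACE package (the segment name is illustrative; assumes an ASSM tablespace):

DECLARE
  l_unf_blocks NUMBER;    l_unf_bytes NUMBER;
  l_fs1_blocks NUMBER;    l_fs1_bytes NUMBER;
  l_fs2_blocks NUMBER;    l_fs2_bytes NUMBER;
  l_fs3_blocks NUMBER;    l_fs3_bytes NUMBER;
  l_fs4_blocks NUMBER;    l_fs4_bytes NUMBER;
  l_full_blocks NUMBER;   l_full_bytes NUMBER;
  l_total_blocks NUMBER;  l_total_bytes NUMBER;
  l_unused_blocks NUMBER; l_unused_bytes NUMBER;
  l_file_id NUMBER; l_block_id NUMBER; l_last_block NUMBER;
BEGIN
  -- block-level fill figures (unformatted, FS1-FS4, full)
  dbms_space.space_usage(USER, 'EMP_NC', 'TABLE',
      l_unf_blocks, l_unf_bytes,
      l_fs1_blocks, l_fs1_bytes, l_fs2_blocks, l_fs2_bytes,
      l_fs3_blocks, l_fs3_bytes, l_fs4_blocks, l_fs4_bytes,
      l_full_blocks, l_full_bytes);
  -- totals and space above the last used block
  dbms_space.unused_space(USER, 'EMP_NC', 'TABLE',
      l_total_blocks, l_total_bytes, l_unused_blocks, l_unused_bytes,
      l_file_id, l_block_id, l_last_block);
  dbms_output.put_line('Full Blocks ....' || l_full_blocks);
  dbms_output.put_line('Total Blocks ...' || l_total_blocks);
  dbms_output.put_line('Unused Blocks ..' || l_unused_blocks);
END;
/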

Page 13: Compression ow2009 r2


Compression in Data Warehousing

Page 14: Compression ow2009 r2

Compression in Data Warehousing

Fact tables are good candidates for compression
– Large and have repetitive values
– Repetitive data tends to be clustered

Dimension tables are often too small for compression

Large block size leads to greater compression
– Typical in data warehouses
– More rows available for compression within each block

Materialized views can be compressed (and partitioned) – see the sketch below
– Naturally sorted on creation due to GROUP BY
– Especially good for ROLLUP views and join views
  • Tend to contain repetitive data
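A minimal sketch of a compressed materialized view (the object names are illustrative):

CREATE MATERIALIZED VIEW sales_by_prod_mv
COMPRESS
AS
SELECT prod_id
      ,SUM(amount_sold) total_sold
FROM   sales
GROUP  BY prod_id;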

Page 15: Compression ow2009 r2

Compression of Individual Table Partitions

Partition level
– Partitioning must be range or list (or composite)
– Only the first partition below will be compressed
– Could consider compressing read-only partitions of historical data

CREATE TABLE sales
 (sales_id   NUMBER(8)
   : :
 ,sales_date DATE)
PARTITION BY RANGE(sales_date)
 (PARTITION sales_jan2009 VALUES LESS THAN (TO_DATE('02/01/2009','DD/MM/YYYY')) COMPRESS,
  PARTITION sales_feb2009 VALUES LESS THAN (TO_DATE('03/01/2009','DD/MM/YYYY')),
  PARTITION sales_mar2009 VALUES LESS THAN (TO_DATE('04/01/2009','DD/MM/YYYY')),
  PARTITION sales_apr2009 VALUES LESS THAN (TO_DATE('05/01/2009','DD/MM/YYYY')));

Page 16: Compression ow2009 r2

Effect of Partition Operations

Consider individual partitions compressed as shown

PARTITION p1 COMPRESS   VALUES LESS THAN 100
PARTITION p2 COMPRESS   VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400

Splitting a compressed partition
– Produces two new compressed partitions

ALTER TABLE s1 SPLIT PARTITION p1 AT (50) INTO (PARTITION p1a, PARTITION p1b);

PARTITION p1a COMPRESS  VALUES LESS THAN 50
PARTITION p1b COMPRESS  VALUES LESS THAN 100
PARTITION p2 COMPRESS   VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400

Page 17: Compression ow2009 r2

Effect of Partition Operations (contd)

Merge of two compressed partitions

PARTITION p1a COMPRESS  VALUES LESS THAN 50
PARTITION p1b COMPRESS  VALUES LESS THAN 100
PARTITION p2 COMPRESS   VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS VALUES LESS THAN 400

ALTER TABLE s1 MERGE PARTITIONS p1b,p2 INTO PARTITION p1b_2;

Effect of merging compressed partitions
– New partition p1b_2 is not compressed by default
  • Same applies if any of the partitions to be merged are initially uncompressed

PARTITION p1a COMPRESS     VALUES LESS THAN 50
PARTITION p1b_2 NOCOMPRESS VALUES LESS THAN 200
PARTITION p3 NOCOMPRESS    VALUES LESS THAN 300
PARTITION p4 NOCOMPRESS    VALUES LESS THAN 400

Page 18: Compression ow2009 r2

Forcing Compression During Partition Maintenance

Force compression of the new partition after a merge operation

ALTER TABLE s1 MERGE PARTITIONS p2,p3 INTO PARTITION p2_3 COMPRESS;

Force compression of the new partition(s) after a split operation

ALTER TABLE s1 SPLIT PARTITION p1 AT (50) INTO
 (PARTITION p1a COMPRESS, PARTITION p2a);

Partitions may be empty or contain data during maintenance operations involving compression

Page 19: Compression ow2009 r2

Effect of Partitioned Bitmap Indexes

Scenario :
– Table having no compressed partitions has locally partitioned bitmap indexes
– The presence of usable bitmap indexes will prevent the first operation that compresses a partition

SQL> ALTER TABLE sales MOVE PARTITION p4 COMPRESS;
ORA-14646: Specified alter table operation involving compression cannot be performed in the presence of usable bitmap indexes

SQL> ALTER TABLE part2 SPLIT PARTITION p1a AT (25) INTO
     (PARTITION p1c COMPRESS, PARTITION p1d);
ORA-14646: Specified alter table operation involving compression cannot be performed in the presence of usable bitmap indexes

Page 20: Compression ow2009 r2

Compression of Partitions with Bitmap Indexes in Place

Uncompressed partitioned table with bitmap index in 3 partitions

CREATE TABLE emp_part
PARTITION BY RANGE (deptno)
 (PARTITION p1 VALUES LESS THAN (11),
  PARTITION p2 VALUES LESS THAN (21),
  PARTITION p3 VALUES LESS THAN (31))
AS SELECT * FROM emp;

CREATE BITMAP INDEX part$empno ON emp_part(empno) LOCAL;

Page 21: Compression ow2009 r2

Compression of Partitions with Bitmap Indexes in Place (continued)

First compression operation requires the following

1. Mark bitmap indexes unusable (or drop them)

ALTER INDEX part$empno UNUSABLE;

2. Compress the first (and any subsequent) partition as required

ALTER TABLE emp_part MOVE PARTITION p1 COMPRESS;

3. Rebuild the bitmap indexes (or recreate them)

ALTER INDEX part$empno REBUILD PARTITION p1;
ALTER INDEX part$empno REBUILD PARTITION p2;
ALTER INDEX part$empno REBUILD PARTITION p3;

– Each index partition must be individually rebuilt

Page 22: Compression ow2009 r2

Compression of Partitions with Bitmap Indexes in Place (continued)

Oracle needs to know the maximum records per block
– Correct mapping of bits to blocks can then be done
– On compression this value increases

Oracle has to rebuild bitmaps to accommodate a potentially larger number of values, even if no data is present in the partition(s)
– Could result in larger bitmaps for uncompressed partitions
  • Increase in size can be offset by the actual compression

Once rebuilt, the indexes can cope with any compression
– Subsequent compression operations do not invalidate bitmap indexes

Recommended to create each partitioned table with at least one compressed (dummy/empty?) partition
– Can be subsequently dropped

Compression activity does not affect B-tree index usability

Page 23: Compression ow2009 r2

Table Level Compression for Partitioned Tables

Compression can be the default for all partitions

CREATE TABLE sales
 (sales_id NUMBER(8),
   : :
  sales_date DATE)
COMPRESS
PARTITION BY RANGE (sales_date)...

– Can still specify individual partitions to be NOCOMPRESS

Default partition maintenance actions on compressed tables
– Splitting non-compressed partitions results in non-compressed partitions
– Merging non-compressed partitions results in a compressed partition
– Adding a partition will result in a new compressed partition (see the sketch below)
– Moving a partition does not alter its compression
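For instance, a sketch of the add-partition case (partition name and bound are illustrative):

ALTER TABLE sales ADD PARTITION sales_may2009
  VALUES LESS THAN (TO_DATE('06/01/2009','DD/MM/YYYY'));
-- The new partition inherits the table-level COMPRESS default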

Page 24: Compression ow2009 r2

Finding the Largest Tables

Useful for finding candidates for compression

SELECT owner
      ,name
      ,SUM(gb)
      ,SUM(pct)
FROM  (SELECT owner
             ,name
             ,TO_CHAR(gb,'999.99') gb
             ,TO_CHAR((RATIO_TO_REPORT(gb) OVER())*100,'999,999,999.99') pct
       FROM  (SELECT owner
                    ,SUBSTR(segment_name,1,30) name
                    ,SUM(bytes/(1024*1024*1024)) gb
              FROM   dba_segments
              WHERE  segment_type IN ('TABLE','TABLE PARTITION')
              GROUP  BY owner, segment_name))
WHERE pct > 3
GROUP BY ROLLUP(owner, name)
ORDER BY 3;

Page 25: Compression ow2009 r2

Finding the Largest Tables (contd)

OWNER         NAME              SUM(GB)   SUM(PCT)
------------- -------------- ---------- ----------
SH            COSTS                 .03       8.23
SH            SALES                 .05      14.44
SH            SALES_HIST            .13      32.93
SH                                  .21      55.61
SYS           IDL_UB2$              .01       3.86
SYS           SOURCE$               .02       6.43
SYS                                 .03      10.29
                                    .24      65.90

Page 26: Compression ow2009 r2


Sampling Data to Predict Compression

Page 27: Compression ow2009 r2

Compression Factor and Space Saving

Compression Factor (CF)

  CF = non-compressed blocks / compressed blocks

Space Savings (SS)

  SS = ((non-compressed blocks - compressed blocks) / non-compressed blocks) * 100
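As a worked example, using the full-block counts from the earlier space-usage test (10,974 blocks non-compressed against 3,057 compressed):

  CF = 10974 / 3057 ≈ 3.6
  SS = ((10974 - 3057) / 10974) * 100 ≈ 72%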

Page 28: Compression ow2009 r2

Predicting the Compression Factor

CREATE OR REPLACE FUNCTION compression_ratio (tabname VARCHAR2)
RETURN NUMBER IS
  pct     NUMBER := 0.000099;  -- sample percentage
  blkcnt  NUMBER := 0;         -- original block count (should be < 10K)
  blkcntc NUMBER;              -- compressed block count
BEGIN
  EXECUTE IMMEDIATE 'CREATE TABLE temp_uncompressed PCTFREE 0 AS
                     SELECT * FROM ' || tabname || ' WHERE ROWNUM < 1';
  WHILE ((pct < 100) AND (blkcnt < 1000)) LOOP  -- until > 1000 blocks in sample
    EXECUTE IMMEDIATE 'TRUNCATE TABLE temp_uncompressed';
    EXECUTE IMMEDIATE 'INSERT INTO temp_uncompressed SELECT * FROM ' ||
                      tabname || ' SAMPLE BLOCK (' || pct || ',10)';
    EXECUTE IMMEDIATE
      'SELECT COUNT(DISTINCT(dbms_rowid.rowid_block_number(rowid)))
       FROM temp_uncompressed' INTO blkcnt;
    pct := pct * 10;
  END LOOP;
  EXECUTE IMMEDIATE 'CREATE TABLE temp_compressed COMPRESS AS
                     SELECT * FROM temp_uncompressed';
  EXECUTE IMMEDIATE
    'SELECT COUNT(DISTINCT(dbms_rowid.rowid_block_number(rowid)))
     FROM temp_compressed' INTO blkcntc;
  EXECUTE IMMEDIATE 'DROP TABLE temp_compressed';
  EXECUTE IMMEDIATE 'DROP TABLE temp_uncompressed';
  RETURN (blkcnt/blkcntc);
END;
/

Page 29: Compression ow2009 r2

Predicting the Compression Factor (continued)

CREATE OR REPLACE PROCEDURE compress_test(p_comp VARCHAR2) IS
  comp_ratio NUMBER;
BEGIN
  comp_ratio := compression_ratio(p_comp);
  dbms_output.put_line('Compression factor for table ' || p_comp ||
                       ' is ' || comp_ratio);
END;
/

Run the compression test for the emp table

EXEC compress_test('EMP')
Compression factor for table EMP is 1.6

Page 30: Compression ow2009 r2

Compression Test – Clustered Data

CREATE TABLE clust (col1 VARCHAR2(1000)) COMPRESS;

INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('VV...VV');
INSERT INTO clust VALUES ('WW...WW');
  : : :
INSERT INTO clust VALUES ('WW...WW');
INSERT INTO clust VALUES ('XX...XX');
  : : :
INSERT INTO clust VALUES ('YY...YY');
  : : :
INSERT INTO clust VALUES ('ZZ...ZZ');

CREATE TABLE noclust (col1 VARCHAR2(1000)) COMPRESS;

INSERT INTO noclust VALUES ('VV...VV');
INSERT INTO noclust VALUES ('WW...WW');
INSERT INTO noclust VALUES ('XX...XX');
INSERT INTO noclust VALUES ('YY...YY');
INSERT INTO noclust VALUES ('ZZ...ZZ');
INSERT INTO noclust VALUES ('VV...VV');
INSERT INTO noclust VALUES ('WW...WW');
INSERT INTO noclust VALUES ('XX...XX');
INSERT INTO noclust VALUES ('YY...YY');
INSERT INTO noclust VALUES ('ZZ...ZZ');
INSERT INTO noclust VALUES ('VV...VV');
  : : :
INSERT INTO noclust VALUES ('ZZ...ZZ');

Every value for column col1 is 390 bytes long

Both tables have a total of 25 rows stored in blocks of size 2K
– So a maximum of four rows will fit in each block

Both have the same amount of repeated values but the clustering is different

Page 31: Compression ow2009 r2

Compression Test (continued)

[Diagram : block-by-block contents of clust and noclust]

clust – 20 rows per block. Rows 2,3,4,5 are duplicates of the first row in the block. Rows 7,8,9,10 are duplicates of the 6th row in the block, and this pattern is repeated. The residual space in the first block is used by the compressed data.

noclust – 4 rows per block (7 blocks in total). The 5th row to be inserted must go in the next block as it contains different data.

Page 32: Compression ow2009 r2

Compression Test - Compression Factors

Compression test routine is accurate due to sampling of actual data
– Make sure the default tablespace is correctly set
  • Temporary sample tables are physically built for the testing

EXEC compress_test('NOCLUST')
Compression factor for table NOCLUST is 1

EXEC compress_test('CLUST')
Compression factor for table CLUST is 3.5

Page 33: Compression ow2009 r2

Testing Compression : Sampling Rows

Tables can be sampled at row or block level
– Row level (default) samples a random selection of rows
– Block level samples a random selection of whole blocks

SELECT * FROM emp SAMPLE (10);

• Selects a 10% sample of rows
• If repeated, a different sample will be taken

Samples can be 'fixed' in Oracle10g using SEED
– SEED can have integer values from 0-100
– Can also have higher numbers ending in '00'

SELECT * FROM emp SAMPLE (10) SEED (1);

– Shows a 10% sample of rows
– If repeated, the exact same sample will be taken
  • Also applies to block level sampling
  • The sample set will change if DML is performed on the table

Page 34: Compression ow2009 r2


Pre-Sorting the Data for Compression

Page 35: Compression ow2009 r2

Sorting the Data for Compression

Presort the data on a column which has : no. of distinct values ~ no. of blocks (after compression)

Information on column cardinality is shown in:

ALL_TAB_COL_STATISTICS
ALL_PART_COL_STATISTICS
ALL_SUBPART_COL_STATISTICS
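A quick way to spot low-cardinality candidate columns from the dictionary (the table name is illustrative):

SELECT column_name
      ,num_distinct
FROM   user_tab_col_statistics
WHERE  table_name = 'LARGE_EMP'
ORDER  BY num_distinct;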

Reorganize (pre-sort) rows in segments that will be compressed to cause repetitive data within blocks

For multi-column tables, order the rows by the low cardinality column(s)

CREATE TABLE emp_comp COMPRESS AS
  SELECT * FROM emp ORDER BY <some unselective column(s)>;

– For a single-column table, order the table rows by the column value

Page 36: Compression ow2009 r2

Sorting the Data for Compression (continued)

Presort data on a column having no. of distinct values ~ no. of blocks

SELECT COUNT(DISTINCT ename) FROM large_emp;   -- 170 enames

SELECT COUNT(DISTINCT job) FROM large_emp;     -- 5 jobs

SELECT * FROM large_emp;                       -- 114368 rows

 EMPNO ENAME        JOB
------ ------------ ----------
 43275 25*****      CLERK
 47422 128****      ANALYST
 79366 6******      MANAGER
     :      :           :

Page 37: Compression ow2009 r2


Sorting the Data for Compression (continued)

Sorting on the job column is not the most effective

Compressed table sorted on ename : Number of used blocks = 172

CREATE TABLE cename COMPRESS AS SELECT empno,ename,job FROM large_emp ORDER BY ename;

Compressed table sorted on job : Number of used blocks = 243

CREATE TABLE cjob COMPRESS AS SELECT empno,ename,job FROM large_emp ORDER BY job;

Non-compressed table : Number of used blocks = 360

CREATE TABLE nocomp AS SELECT empno,ename,job FROM large_emp;

Page 38: Compression ow2009 r2

Behaviour of DML/DDL on Tables with Default (Direct Load) Compression

Page 39: Compression ow2009 r2

Default Compressed Table Behaviour

Ordinary DML produces UNcompressed data

– UPDATE
  • Wholesale updates lead to large increases in storage (>250%)
  • Performance impact on UPDATEs can be around 400%
  • Rows are migrated to new blocks (default value of PCTFREE is 0)

– DELETE
  • Performance impact of around 15% for compressed rows

Creating a compressed table can take 50% longer

Page 40: Compression ow2009 r2

Default Compressed Table Behaviour (continued)

Operations which 'perform' compression
– CREATE TABLE ... AS SELECT ...
– ALTER TABLE ... MOVE ...
– INSERT /*+ APPEND */ (single threaded) – see the sketch below
– INSERT /*+ PARALLEL(sales,4) */
  • Requires ALTER SESSION ENABLE PARALLEL DML;
  • Both of the above inserts work with data from database tables and external tables
– SQL*Loader DIRECT = TRUE
– Various partition maintenance operations
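A minimal direct-path load sketch (the table names are illustrative):

ALTER SESSION ENABLE PARALLEL DML;   -- needed only for the parallel insert

INSERT /*+ APPEND */ INTO sales_hist
SELECT * FROM sales_staging;

COMMIT;  -- direct-path inserted rows are compressed and become visible here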

Cannot be used for:
– Certain LOB and VARRAY constructs (see 11g Docs)
– Index organized tables
  • Can use index compression on IOTs
– External tables, index clusters, hash clusters
– Tables with more than 255 columns

Page 41: Compression ow2009 r2


Compression Internals

Page 42: Compression ow2009 r2

Hexadecimal Dump of Compressed Data

[Hex dump figure : symbol table holding 14 unique names and 5 unique jobs, with the start of the next block marked]

Page 43: Compression ow2009 r2

Oracle Dump - Two Uncompressed Employee Rows

ALTER SYSTEM DUMP DATAFILE 8 BLOCK 35;
– Creates a trace file in USER_DUMP_DEST (10g) or the trace directory (11g)

...
block_row_dump:
tab 0, row 0, @0x4a1
tl: 41 fb: --H-FL-- lb: 0x2 cc: 8
col 0: [ 3] c2 03 38
col 1: [ 5] 43 4c 41 52 4b
col 2: [ 7] 4d 41 4e 41 47 45 52
col 3: [ 3] c2 4f 28
col 4: [ 7] 77 b5 06 09 01 01 01
col 5: [ 3] c2 19 33
col 6: *NULL*
col 7: [ 2] c1 0b
tab 0, row 1, @0x4ca
tl: 40 fb: --H-FL-- lb: 0x2 cc: 8
col 0: [ 3] c2 03 39
col 1: [ 5] 53 43 4f 54 54
col 2: [ 7] 41 4e 41 4c 59 53 54
col 3: [ 3] c2 4c 43
col 4: [ 7] 77 bb 04 13 01 01 01
col 5: [ 2] c2 1f
col 6: *NULL*
col 7: [ 2] c1 15
...

Page 44: Compression ow2009 r2

Oracle Dump of Table of empno,ename,job

Symbol table (tab 0) entries, with the character data decoded on the right :

tl: 18 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 9] 50 52 45 53 49 44 45 4e 54                 -- PRESIDENT
col 1: [ 4] 4b 49 4e 47                                -- KING
bindmp: 00 0b 02 d1 50 52 45 53 49 44 45 4e 54 cc 4b 49 4e 47
tab 0, row 1, @0x746
tl: 9 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 7] 41 4e 41 4c 59 53 54                       -- ANALYST
col 1: [ 4] 46 4f 52 44                                -- FORD
bindmp: 00 0a 02 11 cc 46 4f 52 44
.....
tab 0, row 13, @0x6cc
tl: 10 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 5] 43 4c 45 52 4b                             -- CLERK
col 1: [ 5] 53 4d 49 54 48                             -- SMITH
bindmp: 00 0b 02 0e cd 53 4d 49 54 48
tab 0, row 14, @0x780
tl: 8 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 5] 43 4c 45 52 4b                             -- CLERK
bindmp: 00 04 cd 43 4c 45 52 4b
.....
tab 0, row 17, @0x761
tl: 10 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 7] 41 4e 41 4c 59 53 54                       -- ANALYST
bindmp: 00 02 cf 41 4e 41 4c 59 53 54
tab 1, row 0, @0x6c3
tl: 9 fb: --H-FL-- lb: 0x0 cc: 3
col 0: [ 5] 43 4c 45 52 4b                             -- CLERK
col 1: [ 5] 53 4d 49 54 48                             -- SMITH
col 2: [ 3] c2 02 34
bindmp: 2c 00 02 02 0d cb c2 02 34

Page 45: Compression ow2009 r2

Oracle Avoids Unnecessary Compression

Create two tables with repeating small values in one column

CREATE TABLE tnocomp
 (col1 VARCHAR2(2)
 ,col2 VARCHAR2(6))
PCTFREE 0;

CREATE TABLE tcomp
 (col1 VARCHAR2(2)
 ,col2 VARCHAR2(6))
COMPRESS;

Insert data (320 rows) as follows
– Values unique in col2
– Values repeat in col1 every 5 rows

COL1 COL2
---- ------
1A   1ZZZZZ
2A   2ZZZZZ
3A   3ZZZZZ
4A   4ZZZZZ
5A   5ZZZZZ
1A   6ZZZZZ
2A   7ZZZZZ
...
4A   319ZZZ
5A   320ZZZ

Page 46: Compression ow2009 r2

Evidence of Minimal Compression

SELECT dbms_rowid.rowid_block_number(ROWID) block
      ,dbms_rowid.rowid_relative_fno(ROWID) file
      ,COUNT(*) num_rows
FROM   &table_name
GROUP  BY dbms_rowid.rowid_block_number(ROWID)
         ,dbms_rowid.rowid_relative_fno(ROWID);

tnocomp                       tcomp

BLOCK FILE NUM_ROWS           BLOCK FILE NUM_ROWS
----- ---- --------           ----- ---- --------
   66    8      128              34    8      126
   67    8      132              35    8      126
   68    8       60              36    8       68

Evidence of compression in the 'compressed' table

Page 47: Compression ow2009 r2

Further Evidence of Compression

ALTER SYSTEM DUMP DATAFILE 8 BLOCK 67;

block_row_dump:
tab 0, row 0, @0x783
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 31 41
bindmp: 00 1b ca 31 41
tab 0, row 1, @0x77e
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 32 41
bindmp: 00 1b ca 32 41
tab 0, row 2, @0x779
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 33 41
bindmp: 00 1b ca 33 41
tab 0, row 3, @0x774
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 34 41
bindmp: 00 1a ca 34 41
tab 0, row 4, @0x76f
tl: 5 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] 35 41
bindmp: 00 19 ca 35 41
tab 1, row 0, @0x763
tl: 12 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 2] 31 41
col 1: [ 6] 31 30 31 30 30 30
bindmp: 2c 00 02 02 00 c9 31 30 31 30 30 30
tab 1, row 1, @0x757
tl: 12 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 2] 32 41
col 1: [ 6] 31 30 32 30 30 30
bindmp: 2c 00 02 02 01 c9 31 30 32 30 30 30

HEX(A) = 41

Symbol table (tab 0) holds the five repeating values (1A, 2A, 3A, 4A, 5A)

Page 48: Compression ow2009 r2

Compression not Performed on Unsuitable Data

Both tables recreated with values in col1 now set to TO_CHAR(MOD(ROWNUM,50))
– Much less repetition of values (only every 50 rows) allowing less compression

COL1 COL2
---- ------
1    1ZZZZZ
2    2ZZZZZ
3    3ZZZZZ
4    4ZZZZZ
:    :
49   49ZZZZ
50   50ZZZZ
1    51ZZZZ
2    52ZZZZ
3    53ZZZZ

tnocomp                       tcomp

BLOCK FILE NUM_ROWS           BLOCK FILE NUM_ROWS
----- ---- --------           ----- ---- --------
   66    8      128              34    8      128
   67    8      128              35    8      128
   68    8       64              36    8       64

Oracle decides not to compress (if the compression factor is likely to be less than 1.03)

Page 49: Compression ow2009 r2


Comparison of Heap and IOT Compression

IOT = Index Organized table

Page 50: Compression ow2009 r2

Comparison of IOT and Heap Tables

Tests constructed using a standard set of data in emptest
– Six columns with absence of nulls

  EMPNO ENAME      JOB       HIREDATE    SAL DEPTNO
------- ---------- --------- --------- ----- ------
      1 KING       PRESIDENT 17-NOV-81  5000     10
      2 FORD       ANALYST   03-DEC-81  3000     20
      3 SCOTT      ANALYST   09-DEC-82  3000     20
      4 JONES      MANAGER   02-APR-81  2975     20
      5 BLAKE      MANAGER   01-MAY-81  2850     30
      6 CLARK      MANAGER   09-JUN-81  2450     10
      7 ALLEN      SALESMAN  20-FEB-81  1600     30
      8 TURNER     SALESMAN  08-SEP-81  1500     30
      9 MILLER     CLERK     23-JAN-82  1300     10
     10 WARD       SALESMAN  22-FEB-81  1250     30
     11 MARTIN     SALESMAN  28-SEP-81  1250     30
     12 ADAMS      CLERK     12-JAN-83  1100     20
     13 JAMES      CLERK     03-DEC-81   950     30
     14 SMITH      CLERK     17-DEC-80   800     20
     15 KING       PRESIDENT 17-NOV-81  5000     10
     16 FORD       ANALYST   03-DEC-81  3000     20
    ...    ...     ...       ...         ...    ...
2000000 SMITH      CLERK     17-DEC-80   800     20

Page 51: Compression ow2009 r2

Creation of IOTs

empi : Conventional IOT based on emptest data

CREATE TABLE empi
 (empno,ename,job,hiredate,sal,deptno,
  CONSTRAINT pk_empi PRIMARY KEY(ename,job,hiredate,sal,deptno,empno))
ORGANIZATION INDEX
PCTFREE 0
AS SELECT * FROM emptest;

empic : First five columns compressed

CREATE TABLE empic
 (empno,ename,job,hiredate,sal,deptno,
  CONSTRAINT pk_empic PRIMARY KEY(ename,job,hiredate,sal,deptno,empno))
ORGANIZATION INDEX
PCTFREE 0
COMPRESS 5
AS SELECT * FROM emptest;

Page 52: Compression ow2009 r2

Test tables

Four tables built having heap/IOT structures and compressed/non-compressed data

Table   Table   Compress   Blocks   Average
Name    Type                        row length
------  ------  --------  -------  ----------
emph    Heap    No           9669          33
emphc   Heap    Yes          3288          33
empi    IOT     No          10240          33
empic   IOT     Yes          2560          33

Average row length obtained from user_tables (avg_row_len)
– Compressed tables show no reduction in average row length

Page 53: Compression ow2009 r2

Compression Data

Number of blocks in IOTs obtained from index validation

VALIDATE INDEX &index_name;
SELECT * FROM index_stats;

Number of blocks in heap tables obtained using dbms_rowid

SELECT COUNT(DISTINCT sys.dbms_rowid.rowid_block_number(ROWID)) blocks
FROM   &table_name;

Compressed IOTs have compression shown as DISABLED in user_tables, but ENABLED in user_indexes

Page 54: Compression ow2009 r2

Timings to Scan Tables (1)

SELECT deptno FROM <table_name>;

Table   Table   Compress    CPU   Elapsed   Disk    Query
Name    Type               Time      Time    I/O
------  ------  --------  -----  --------  -----  -------
emph    Heap    No         2.40      2.34   9670   142386
emphc   Heap    Yes        2.95      3.13   3289   136316
empi    IOT     No         2.59      3.85   9507   142531
empic   IOT     Yes        2.60      2.63   2398   135590

Repeat queries on empi and empic have 0 physical reads
– Could be suffering from cold buffer flooding

Page 55: Compression ow2009 r2

Updates to Heap and IOT Tables

UPDATE <table_name> SET ename = 'XXXXXXX';
– Lengthens each employee name by at least one character

Note the 'explosion' in size of the compressed heap table

Table   Table   Compress   Blocks before   Blocks after        PCT      CPU   Elapsed
Name    type               update          update         Increase     time      time
------  ------  --------  --------------  -------------  ---------  -------  --------
emph    Heap    No                  9669          11020        13%    61.53    148.70
emphc   Heap    Yes                 3288          20010       509%   112.54    224.13
empi    IOT     No                 10240          19864        94%   155.06    275.73
empic   IOT     Yes                 2560           4792        87%    52.45    208.82

Results for the same update on Oracle9i with a 230,000 row table

Table   Table   Compress   Blocks before   Blocks after        PCT   Elapsed
Name    type               update          update         Increase   time
------  ------  --------  --------------  -------------  ---------  --------
emph    Heap    No                  1092           1280        17%   5 mins
emphc   Heap    Yes                  361           2291       535%   15 mins
empi    IOT     No                  1077           2218       106%   4 mins
empic   IOT     Yes                  261            527       101%   12 mins

Page 56: Compression ow2009 r2


Advanced Compression in Oracle11g (for OLTP Operations)

Page 57: Compression ow2009 r2

Advanced Compression in Oracle 11g

Conventional DML maintains the compression
– Inserted and updated rows remain compressed

The compression activity is kept to a minimum

Known as the Advanced Compression option – an extra cost option

Syntax :
COMPRESS [BASIC] | COMPRESS FOR OLTP
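For example (table and column names are illustrative; COMPRESS BASIC assumes 11g Release 2 syntax – in earlier releases plain COMPRESS gives the same direct-load behaviour):

CREATE TABLE emp_dss_comp
 (empno NUMBER(4)
 ,ename VARCHAR2(12))
COMPRESS BASIC;       -- direct-load (DSS) compression only

CREATE TABLE emp_oltp_comp
 (empno NUMBER(4)
 ,ename VARCHAR2(12))
COMPRESS FOR OLTP;    -- requires the Advanced Compression option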

Compression settings tracked in user_tables
– PCTFREE is 10 by default

SELECT table_name,compression,compress_for,pct_free FROM user_tables;

TABLE_NAME      COMPRESSION COMPRESS_FOR PCT_FREE
--------------- ----------- ------------ --------
EMP_TEST        DISABLED                       10
EMP_DSS_COMP    ENABLED     BASIC               0
EMP_OLTP_COMP   ENABLED     OLTP               10

Page 58: Compression ow2009 r2

Compressing for OLTP Operations

Requires COMPATIBILITY = 11.1.0 (or higher)

CREATE TABLE t1 ... COMPRESS FOR ALL OPERATIONS;

Conventionally inserted rows stay uncompressed until PCTFREE is reached
– Minimises compression operations

[Diagram : transaction activity against a single block]
1. Conventional inserts are not compressed
2. Block becomes full
3. Rows are now compressed, releasing free space in the block
4. More uncompressed inserts fill the block
5. Rows are compressed again

Page 59: Compression ow2009 r2

OLTP Compression Behaviour

Table of employee data
– Rows repeat every 7th row – lots of duplication

8K block filled up to PCTFREE with 166 rows
– Rows are uncompressed
– Block dump reports 826 bytes of free space

Additional 10 rows inserted into the table
– Block dump reports 5636 bytes of free space
– Shows evidence of compression of the 166 rows

Page 60: Compression ow2009 r2

OLTP Compression

Some early results
– Table containing 3 million parts records

No compression                         : 35M
Compression for direct path operations : 12M
Compression for all operations         : 16M

— Avoid large scale (batch) updates on OLTP compressed tables
  • Significant processing overheads

Page 61: Compression ow2009 r2

OLTP Compression Test

Table containing 500000 sales records

PROD_ID CUST_ID TIME_ID   CHANNEL_ID PROMO_ID AMOUNT_SOLD ACCOUNT_TYPE
------- ------- --------- ---------- -------- ----------- -------------
     13    5590 10-JAN-04          3      999     1232.16 Minor account
     19    6277 23-FEB-05          1      123     7690.00 Minor account
     16    6859 04-NOV-05          3      999       66.16 Minor account
      :       :         :          :        :           :             :

Three tables created

CREATE TABLE s_non_c PCTFREE 0 AS SELECT * FROM sales;
— Non-compressed table with fully packed blocks

CREATE TABLE s_c COMPRESS AS SELECT * FROM sales;
— Compressed table for DSS operations

CREATE TABLE s_c_all PCTFREE 0 COMPRESS FOR OLTP AS SELECT * FROM sales;
— Compressed table for OLTP operations

Page 62: Compression ow2009 r2

OLTP Compression Test (continued)

Update stress test – update 94% of the rows
— No lengthening of any data

UPDATE sales SET account_type = 'Major account' WHERE prod_id > 13;

                          Non-compressed   Compressed Table   Compressed Table
                          table            for DSS            for OLTP
Original size (blocks)    3043             848                848
Size after update         3043             5475               5570
Elapsed time for update   18 secs          40 secs            1:32:03 secs

Test somewhat unfair on OLTP compression
— Update is large and batch orientated
— I/O subsystem was a single disk
— But still interesting?

Page 63: Compression ow2009 r2

OLTP Compression Characteristics

Tests performed on a 500,000 row employee table with data repeating every 14th row

Transactions update 1500, 15000 and 50,000 rows with no increase in length of any data

Type of   Time to   Initial    Time    Update   Space after   Update   Space after    Update   Space after
Table     create    Space      to      1500     update        15000    update         50000    update
          (secs)    (blocks)   Scan    rows     (blocks)      rows     (blocks)       rows     (blocks)
-------   -------   --------   ----    ------   -----------   ------   -----------   -------   -----------
Regular     11.45       3712   1.29      0.07          3712     2.09          3712      3.30          3712
DSS          5.26        768   0.35      1.38           802     4.91           976      9.41          1280
OLTP         4.23        896   0.40      1.42           980    10.04          1024   1m18.50          1938

– Update performance of OLTP compression gets worse as we move more into a batch environment

Page 64: Compression ow2009 r2

Test for Conventional Inserts

Two empty tables created
– Uncompressed (emp_ins)
– Compressed for all operations (emp_ins_c_all)

1,900,000 employee rows inserted via conventional single row inserts
– Elapsed time to insert into emp_ins : 3m 54s
– Elapsed time to insert into emp_ins_c_all : 4m 33s

Further tests show a 40% overhead

Page 65: Compression ow2009 r2

Introducing OLTP Compression

Changing a non-compressed table to OLTP compressed

ALTER TABLE t1 COMPRESS FOR OLTP;

— Will not compress existing rows
— Compresses new rows, including those inserted into partially full blocks below the high water mark

Compression advisor for OLTP compression (test kit) can be obtained from
— http://www.oracle.com/technology/products/database/compression/download.html
— dbmscomp.sql package header build
— prvtcomp.plb package body build

Page 66: Compression ow2009 r2


Shrinking Unused Space in Oracle10g

Page 67: Compression ow2009 r2

Reclaiming Space in Oracle10g

Unused space below the High Water Mark can be reclaimed online
― Space caused by delete operations can be returned to free space
― Object must be in a tablespace with ASSM

Two-step operation

1. Rows are moved to blocks available for insert
   • Requires row movement to be enabled

ALTER TABLE <table_name> ENABLE ROW MOVEMENT;

2. High water mark (HWM) is repositioned, requiring a table-level lock

ALTER TABLE <table_name> SHRINK SPACE;

Newly freed blocks are not returned to free space if COMPACT is used
― HWM is not repositioned

ALTER TABLE <table_name> SHRINK SPACE COMPACT;

Page 68: Compression ow2009 r2

Repositioning of HWM During Shrinks

[Diagram : before the shrink, allocated blocks sit above the HWM; after the shrink the HWM moves down and the blocks above it become free blocks]

Rows are physically moved to blocks that have free space
– Shrinking causes ROWIDs to change
– Indexes (bitmap and B-tree) are updated accordingly

Shrinking does not work with compressed tables

Page 69: Compression ow2009 r2

Shrinking Objects

Why Shrink?
― To reclaim space
― To increase speed of full table scans
― To allow access to the table during the necessary reorganisation
― The shrink operation can be terminated/interrupted at any time
  • Can be continued at a later time from point of termination

Objects that can be SHRINKed
― Tables
― Indexes
― Materialized views
― Materialized view logs
― Dependent objects may be shrunk when a table shrinks

ALTER TABLE <table_name> SHRINK SPACE [CASCADE];

Page 70: Compression ow2009 r2

Compression Outside of the Database

RMAN backups and LOBs (SecureFiles) can be compressed at three levels: LOW, MEDIUM and HIGH

Data Pump exports can be compressed
― Inline operation with the actual imp/exp job

Log Transport services in Data Guard can compress redo data to reduce network traffic
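A few illustrative commands (a sketch; these assume the 11g Advanced Compression option is licensed, and the directory/file names are examples):

RMAN> CONFIGURE COMPRESSION ALGORITHM 'MEDIUM';    -- LOW | MEDIUM | HIGH
RMAN> BACKUP AS COMPRESSED BACKUPSET DATABASE;

expdp scott/tiger DIRECTORY=dp_dir DUMPFILE=emp.dmp COMPRESSION=ALL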

For further information, see http://www.oracle.com/technology/products/database/oracle11g/pdf/advanced-compression-whitepaper.pdf

Page 71: Compression ow2009 r2


Data Compression in Oracle

Carl Dudley

University of Wolverhampton, UK

UKOUG SIG Director

[email protected]