Sizing Basics Why Size? When to size Sizing issues: Bits and Bytes Blocks (aka pages) of Data ...


Sizing Basics

• Why Size?

• When to size

• Sizing issues:

– Bits and Bytes

– Blocks (aka pages) of Data

– Different Data types

– Row Size

– Table Sizing

– Index Sizing

Why and When

• Initially, to establish the scale of the required database, to help select the OS environment and DBMS

• Establish HDD requirements

• To get a “feel” for the data:

– which tables need special treatment: separate tablespaces? Partitioning? …

• Generate statistics which help in physical design

• Continually monitor

Sizing Basics

• Bit – “One of the two digits 0 and 1 used in binary notation. The word comes from Binary digit”

• Byte – “A set of binary digits usually representing one character, which is treated by the computer as one unit”

Common Data Types

Type                  Bytes    Range
bit                   (1 bit)  0 or 1
tinyint               1        0 to 255
smallint              2        -2^15 to 2^15 - 1
integer               4        -2^31 to 2^31 - 1
decimal(m,n)          8        -10^38 to 10^38 - 1
datetime              8        depends on DBMS
char(n), string(n)    n        Maximum 255
varchar2(n)           n        Maximum 4000

Oracle Number Data Type

Input Data      Specified As      Stored As
7,456,123.89    NUMBER            7456123.89
7,456,123.89    NUMBER (9)        7456124
7,456,123.89    NUMBER (9,2)      7456123.89
7,456,123.89    NUMBER (9,1)      7456123.9
7,456,123.89    NUMBER (6)        (not accepted, exceeds precision)
7,456,123.89    NUMBER (7, -2)    7456100
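The rounding behaviour in the table above can be mimicked with a short Python sketch. The helper `store_number` is hypothetical (it is not an Oracle API) and assumes half-up rounding to the declared scale followed by a precision check; it is only meant to make the rows of the table reproducible.

```python
from decimal import Decimal, ROUND_HALF_UP


def store_number(value, precision=None, scale=0):
    """Approximate how a NUMBER(precision, scale) column stores a value.

    Hypothetical helper for illustration only; not part of any Oracle API.
    """
    value = Decimal(value)
    if precision is None:
        # Plain NUMBER: the value is kept as given.
        return value
    # Round to `scale` digits after the decimal point. A negative scale
    # rounds to the left of the point (scale -2 means nearest hundred).
    quantum = Decimal(1).scaleb(-scale)
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    # After rounding, the magnitude must stay below 10^(precision - scale).
    if abs(rounded) >= Decimal(10) ** (precision - scale):
        raise ValueError("value exceeds the column's precision")
    return rounded
```

Calling `store_number("7456123.89", 9, 1)` follows the NUMBER (9,1) row, while `store_number("7456123.89", 6)` raises an error in the same way the NUMBER (6) row is rejected.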

Oracle Date Data Type

• Dates are renowned for causing problems when transferring data between DBMSs because the method used to store the data internally differs. For example:

– In Oracle the DATE datatype stores the century, year, month, day, hours, minutes, and seconds.

– Paradox date fields can contain any valid date from January 1, 9999 BC to December 31, 9999 AD.

Oracle Date Data Type

Example with Oracle:

CREATE TABLE Birthdays_tab (Bname VARCHAR2(20), Bday DATE);

INSERT INTO Birthdays_tab (bname, bday)
VALUES ('ANNIE', TO_DATE('13-NOV-92 10:56 A.M.', 'DD-MON-YY HH:MI A.M.'));

• Oracle uses its own internal format to store dates. Each DATE value is stored in a fixed-length field of seven bytes, one byte each for century, year, month, day, hour, minute, and second.

Oracle Varchar2 Data Type

• The VARCHAR2 datatype stores variable-length character strings.

• You must specify a maximum string length (in bytes or characters) for a VARCHAR2 column; the limit is between 1 and 4000 bytes.

• For each row, Oracle stores each value in the column as a variable-length field unless a value exceeds the column's maximum length, in which case Oracle returns an error.

• Using VARCHAR2 saves on space used by the table.

• For example, storing “PETER” in a column defined as VARCHAR2(50) will cost only 5 bytes of storage, not 50.

• More efficient, but more difficult for sizing!

Oracle LOB Data Types

• The LOB datatypes BLOB, CLOB, and BFILE enable you to store large blocks of unstructured data (such as text, graphic images, video clips, and sound waveforms) up to 4 gigabytes in size. They provide efficient, random access to the data.

• CLOB is roughly equivalent to a MEMO in Paradox.

• You can manipulate and search CLOB fields using special tools

• Again, sizing is difficult as only the space needed is taken

• There are lots of other data types, but these will do for the time being!

Row Sizing

• Maximum row size can be determined by ascertaining the data types of the columns of the table and adding together their respective numbers of bytes:

CREATE TABLE SizeDemo (id INTEGER, Name VARCHAR2(20), Dayte DATE);

• Max Row Length = 4 + 20 + 7 = 31 bytes

• Max row size is a safe estimate, but it can be a considerable overestimate.
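The same addition can be written as a small Python sketch. The size table and the helper `max_row_length` are assumptions made for illustration, using the byte counts quoted on the data type slides (4 for integer, 7 for an Oracle DATE, n for varchar2(n)).

```python
# Byte sizes as quoted in the slides; assumed here purely for illustration.
FIXED_SIZES = {"tinyint": 1, "smallint": 2, "integer": 4, "date": 7, "datetime": 8}


def max_row_length(columns):
    """Sum the maximum byte size of every column.

    `columns` is a list of (type_name, declared_length) pairs; the declared
    length is only used for character types such as varchar2(n) and char(n).
    """
    total = 0
    for type_name, length in columns:
        if type_name in ("varchar2", "char"):
            total += length  # worst case: the column is completely full
        else:
            total += FIXED_SIZES[type_name]
    return total


# The SizeDemo table: id INTEGER, Name VARCHAR2(20), Dayte DATE
size_demo = [("integer", None), ("varchar2", 20), ("date", None)]
```

`max_row_length(size_demo)` adds 4 + 20 + 7, matching the 31 bytes worked out by hand.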

Oracle Row Sizing (8i onwards)

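• As an alternative to manual calculation, the average row size can be discovered using the ANALYZE statement:

ANALYZE TABLE Member ESTIMATE STATISTICS;

• Then ask for the statistic (ANALYZE records the average row length in the AVG_ROW_LEN column of USER_TABLES). The average stored size of a single column can also be measured directly; summing the result over all columns gives the average row size:

SELECT AVG(NVL(VSIZE(SURNAME),1)) FROM member;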

Oracle Block Space

• block header = fixed header + variable transaction header + table directory + row directory

• block header = 57 + (23*1) + 4 + 2x = (84 + 2x) bytes, where x = number of rows in the block (assumes initrans=1)

• available data space = (block size - total block header) - ((block size - total block header) * (PCTFREE/100))

• For example, with PCTFREE = 10 and a block size of 2048, the total space for new data in a block is:

available data space = (2048 - (84 + 2x)) - ((2048 - (84 + 2x)) * (10/100))
= (1964 - 2x) - ((1964 - 2x) * (10/100))
= (1964 - 2x) - (1964 - 2x) * 0.1
= (1964 - 2x - 196 + 0.2x) bytes (taking 196.4 as 196)
= (1768 - 1.8x) bytes

Ref: Oracle Metalink support
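As a rough check of the block space formula, the following Python sketch evaluates it for any number of rows. The function names are hypothetical and the header constants (57, 23, 4 and 2 bytes per row) are taken directly from the formula above. The code keeps the unrounded figure, so it gives 1767.6 - 1.8x rather than the rounded 1768 - 1.8x.

```python
def block_header_bytes(rows, initrans=1):
    # fixed header + variable transaction header + table directory + row directory
    return 57 + 23 * initrans + 4 + 2 * rows


def available_data_space(block_size, pctfree, rows, initrans=1):
    # Space left after the header, minus the share reserved by PCTFREE.
    usable = block_size - block_header_bytes(rows, initrans)
    return usable - usable * pctfree / 100
```

For an empty 2048-byte block with PCTFREE = 10, `available_data_space(2048, 10, 0)` is about 1767.6 bytes, and every additional row removes a further 1.8 bytes of directory overhead.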

Sizing: Rows per Block

• The next step is to take your average row size and calculate the average number of rows that can fit into a database block:

• average number of rows per block = floor(available data space / average row size)

• E.g., for an average row size of 28 bytes in the previous example:

x = (1768 - 1.8x) / 28
28x = 1768 - 1.8x
29.8x = 1768
x ≈ 59.3, so 59 = average number of rows per block

• Make sure you round x, the average number of rows per block, DOWN.
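Because x appears on both sides of the equation, it can help to solve it once in closed form. The Python sketch below does that under the same assumptions as the worked example (header constants 57, 23 and 4, plus 2 bytes of row directory per row); `rows_per_block` is a hypothetical helper name.

```python
import math


def rows_per_block(block_size, pctfree, avg_row_size, initrans=1):
    keep = 1 - pctfree / 100        # fraction of the block not reserved by PCTFREE
    fixed = 57 + 23 * initrans + 4  # header bytes that do not depend on the row count
    # Solve x * avg_row_size = (block_size - fixed - 2x) * keep for x.
    x = (block_size - fixed) * keep / (avg_row_size + 2 * keep)
    return math.floor(x)            # always round down: a partial row does not fit
```

With a 2048-byte block, PCTFREE = 10 and 28-byte rows, this evaluates 1767.6 / 29.8 and rounds down to the same 59 rows per block.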

Table Sizing

• Once you know the number of rows that can fit inside the available space of a database block, you can calculate the number of blocks required to hold the proposed table:

• number of blocks for the table = number of rows / average number of rows per block, rounded up

• Using 10,000 rows for table test:

• number of blocks for table test = 10000 rows / 59 rows per block

• ≈ 169.5, i.e. 170 blocks (rounded up, since a partly filled block still occupies a whole block)
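A minimal Python sketch of this step follows. It rounds the quotient up, on the assumption that a partly filled block still takes a whole block, so 10000 / 59 (about 169.5) becomes 170. The function name is hypothetical.

```python
import math


def table_blocks(row_count, rows_per_block):
    # Round up: the last, partly filled block still has to be allocated.
    return math.ceil(row_count / rows_per_block)
```

Multiplying the result by the block size gives the table size in bytes, for example `table_blocks(10000, 59) * 2048`.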

Index Sizing

• The method is the same, but there are some differences in the numbers for index blocks:

– INITRANS is usually = 2

– Fixed Header = 113

• So block header size = 113 + (23 * 2) bytes = 159 bytes

• Available data space is still = (block size - block header size) - ((block size - block header size) * (PCTFREE/100))

• Assuming a block size of 2048 bytes and PCTFREE of 10:

available data space = (2048 bytes - 159 bytes) - ((2048 bytes - 159 bytes) * (10/100))
= 1889 bytes - 188.9 bytes
= 1700.1 bytes
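The index variant of the block space calculation can be sketched in Python as below. The defaults (INITRANS = 2, fixed header = 113 bytes, 23 bytes per transaction entry) come from the figures above; the function name is an assumption.

```python
def index_available_space(block_size, pctfree, initrans=2, fixed_header=113):
    # Index block header: fixed part plus one 23-byte entry per INITRANS slot.
    header = fixed_header + 23 * initrans
    usable = block_size - header
    # Remove the share of the block reserved by PCTFREE.
    return usable - usable * pctfree / 100
```

`index_available_space(2048, 10)` reproduces the figure of roughly 1700.1 bytes.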

Index Sizing cont...

• Now find the total average column widths of the columns used in the index.

• E.g., put an index on the NAME column of SizeDemo, assuming an average width of 22 bytes.

• Take that into our calculation of bytes per index entry:

• bytes per entry = entry header + ROWID length + F + V + D

– entry header = 1 byte

– ROWID length = 6 bytes

– F = total length bytes of all columns with 1 byte column lengths (CHAR, NUMBER, DATE, and ROWID types)

– V = total length bytes of all columns with 3 byte column lengths (VARCHAR2 and RAW datatypes)

– D = 22 (from above)

Index Sizing cont...

• bytes per entry = 1 + 6 + (0 * 1) + (1 * 3) + 22 bytes

• = 32 bytes

• To calculate the number of blocks and bytes required for the index, use:

• number of blocks for index = 1.1 * ((number of not null rows * avg. entry size) / avail. data space)

• The additional 10% added to this result accounts for branch blocks of the index.

• number of blocks for index = 1.1 * ((10000 * 32 bytes) / 1700)

• = 208 blocks (rounded up)
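The two index formulas can be combined in a short Python sketch. Both function names are hypothetical; the constants (1-byte entry header, 6-byte ROWID, 1 or 3 length bytes per column, 10% allowance for branch blocks) are the ones stated above.

```python
import math


def index_entry_bytes(fixed_cols, variable_cols, data_width,
                      entry_header=1, rowid_length=6):
    # F = 1 length byte per CHAR/NUMBER/DATE/ROWID column,
    # V = 3 length bytes per VARCHAR2/RAW column, D = average data width.
    return entry_header + rowid_length + fixed_cols * 1 + variable_cols * 3 + data_width


def index_blocks(not_null_rows, entry_bytes, available_space, branch_overhead=1.1):
    # The 1.1 factor adds 10% for branch blocks; round up to whole blocks.
    return math.ceil(branch_overhead * (not_null_rows * entry_bytes) / available_space)
```

For one VARCHAR2 column of average width 22, `index_entry_bytes(0, 1, 22)` gives 32 bytes, and `index_blocks(10000, 32, 1700)` rounds 207.06 up to 208 blocks.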

Database Sizing

• Repeat this exercise for all your major tables and indexes

• The 80/20 rule applies: don’t waste time on lookup tables, for example; just make an appropriate, but safe, guess

• Add all the table and index sizes (in blocks) together and you have the disk space required

• To get this value in bytes, multiply by the database block size.
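The final roll-up is a simple sum, sketched here in Python. The dictionary of block counts is illustrative only: it reuses the worked examples (the 10,000-row table rounded up to 170 blocks and the 208-block index), and the object names are made up.

```python
def database_bytes(object_blocks, block_size):
    # Total blocks across all sized tables and indexes, converted to bytes.
    return sum(object_blocks.values()) * block_size


# Illustrative block counts taken from the worked examples.
objects = {"test (table)": 170, "index on NAME": 208}
```

`database_bytes(objects, 2048)` multiplies the 378 blocks by the 2048-byte block size to give the space needed for these two objects.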