12/11/2008 SEG 3550

26
12/11/2008 SEG 3550 SEG3550 Fundamentals of Information System Tutorial 8 Storage and File Structure

Transcript of 12/11/2008 SEG 3550

Page 1: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

SEG3550 Fundamentals of Information System

Tutorial 8

Storage and File Structure

Page 2: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Overview

Storage Hierarchy Physical Storage Media RAID RAID Levels File Organization

Page 3: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Why do we need to know about storage/file structure

Many database technologies are developed to utilize the storage architecture/hierarchy

Data in the database needs to be organized and stored/retrieved efficiently

Page 4: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Physical Storage Media

Physical storage media classified according to:

Data access speed

Cost per unit of data

Reliability Can differentiate storage into:

Volatile storage: Loses contents when power is switched off

Non-volatile storage: Contents persist even when power is switched off

Page 5: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Storage Hierarchy

Magnetic Tape

Optical Disk

Magnetic Disk

Flash Memory

Cache

unit priceMemory

Volatile

primary storage

Non-volatile speed

Secondary storage

Tertiary storage

Page 6: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Primary Storage Cache Volatile - managed by hardware

Speed: 7 to 20 ns (1 nanosecond = 10–9 seconds) Capacity:

A typical PC level 2 cache 64KB-2 MB. Within processors, level 1 cache usually ranges in size from 8 KB

to 64 KB.

Main memory: Volatile

Speed: 10s to 100s of nanoseconds; Capacity: Up to a few Gigabytes widely used currently

per-byte costs have decreased roughly factor of 2 every 2 to 3 years)

Page 7: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Secondary Storage

Flash memory Non-volatile

Speed: read speed similar to main memory. But writes are slow (few microseconds), erase is slower.

Capacity: 32M to a few Gigabytes currently Forms: SmartMedia, memory stick, secure digital, BIOS Cost: roughly same as main memory

Magnetic-disk Non-volatile

Capacities: up to roughly 1 TB(1000 GB) currently Data must be moved from disk to main memory for access, and

written back for storage. Growing constantly and rapidly with technology improvements (factor of 2 to 3 every 2 years)

Page 8: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Tertiary Storage (Non-volatile)

Optical storage CD-ROM (640 MB) and DVD (4.7 to 17 GB) most popular

forms Reads and writes are slower than with magnetic disk

Tape storage used primarily for backup (to recover from disk failure), and

for archival data sequential-access – much slower than disk very high capacity (40 to 300 GB tapes available)

Page 9: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Between Memory and Disk

The permanent residency of database is mostly on disk In database, cost is usually measured by the number of disk I/O But disks are too slow and we need memory to be the buffers …

but memory is volatile this introduced a number of issues

Page 10: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

RAID

RAID: Redundant Arrays of Independent Disks Disk organization techniques that manage a large

numbers of disks. high capacity and high speed by using multiple disks in

parallel, and high reliability by storing data redundantly

Page 11: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Mean time to failure (MTTF)

Average time the disk is expected to run continuously without any failure. Typically 3 to 5 years (1 year = 8,760 hours) MTTF = 30,000 to 1,200,000 hours for a new disk

an MTTF of 1,200,000 hours for a new disk means that given 1000 relatively new disks, on an average one will fail every 1200 hours

(assuming by Exponential Distribution)

When number of disks increase, the chance of some disk failure increase proportionally

Page 12: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Parallelism

Two main goals of parallelism in a disk system: 1. Load balance multiple small accesses to increase throughput

2. Parallelize large accesses to reduce response time.

Basic strategy: Stripping Compare and contrast bit stripping and byte stripping

Page 13: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Redundancy

store extra information that can be used to rebuild information lost in a disk failure

Basic strategy: mirroring, parity

Mean time to data loss depends on mean time to failure, and mean time to repair

10010010 1

Data Parity

Page 14: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

RAID Levels

Twice the Read transaction rate of single disks, same Write transaction rate as single disks.

Due to its cost and complexity, level 2 never really "caught on". Therefore, much of the information below is based upon theoretical analysis, not empirical evidence.

Page 15: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

RAID Levels (cont)

Very high read data transfer rate and very high write data transfer rate, but Controller design is fairly complex.

Very high read data transaction rate,but quite complex controller design. Difficult and inefficient data rebuild in the event of disk failure.

Page 16: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

RAID Levels (cont)

Highest read data transaction rate and medium write data transaction rate. Most complex controller design. Difficult to rebuild in the event of a disk failure (compared to RAID level 1)

RAID 6 is essentially an extension of RAID level 5 which allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity)

Page 17: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Choice of RAID Levels

Level 1 provides much better write performance than level 5 Level 5 requires at least 2 block reads and 2 block writes to write a

single block, whereas Level 1 only requires 2 block writes Level 1 preferred for high update environments such as log disks

Level 1 had higher storage cost than level 5 disk drive capacities increasing rapidly (50%/year) whereas disk

access times have decreased much less (x 3 in 10 years) I/O requirements have increased greatly, e.g. for Web servers When enough disks have been bought to satisfy required rate of I/O,

they often have spare storage capacity so there is often no extra monetary cost for Level 1!

Level 5 is preferred for applications with low update rate,and large amounts of data

Level 1 is preferred for all other applications

Page 18: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Buffer Management

Database can not fit entirely in memory, needs memory as a buffer for speed reasons

LRU is used in many OS Spatial and temporal locality due to loops

Database has a more predictable behavior Example: join

Page 19: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

File Organization

The database is stored as a collection of files. Each file is a sequence of records. A record is sequence of fields

Approaches:

– assume record size is fixed

– each file has records of one particular type only

– different files are used for different relations

Page 20: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Fixed-Length Records

Simple approach:

– Store record i starting from byte n *(i − 1), where n is the size of each record

– Record access is simple but records may cross blocks

Deletion of record i – Several alternatives:

– move records i + 1, . . . , n to i, . . . , n − 1

– move record n to i

– link all free records on a free list

Page 21: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Free Lists

Store the address of the first record whose contents are deleted in the file header

Use this first record to store the address of the second available record, and so on

Can think of these stored addresses as pointers since they “point” to the location of a record

More space efficient representation: reuse space for normal attributes of free records to store pointers. (No pointers stored in in-use records)

Page 22: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Variable-Length Records

Variable-length records arise in database systems in several ways:

– Storage of multiple record types in a file

– Record types that allow variable lengths for one or more fields

– Record types that allow repeating fields (used in some older data models)

Byte string representation

– Attach an end-of-record ( ) control character to the end of each record

Page 23: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Organization of Records in Files

Sequential File Organization Suitable for applications that require sequential processing of the

entire file. The records in the file are ordered by a search-key. Need to reorganize the file from time to time to restore sequential order.

Deletion – use pointer chains Insertion – must locate the position in the file where the record is

to be inserted – If there is free space insert there – If no free space, insert the record in an overflow block – In either case, pointer chain must be updatedClustering File Organization Simple file structure stores each relation in a separate file Can instead store several relations in one file using a clustering

file organization

Page 24: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Exercises (1)

Show the structure of the file below after each of the following steps:

a. Insert(Brighton, A-323,1600) b. Delete record 2 c. Insert( Brighton, A-626,2000)

Page 25: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Exercises (2)

Show the structure of the file below after each of the following steps:

a. Insert(Mianus, A-101, 2800) b. Insert( Brighton, A-323, 1600) c. Delete( Perryridge, A-102, 400)

Page 26: 12/11/2008 SEG 3550

12/11/2008 SEG 3550

Thanks!