Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
Provisioning Storage for Oracle Database with ZFS and NetApp
Mike Carew
Oracle University
UK
Content
In this presentation:
• Background: Some interesting things about disks.
• Key Features of ZFS managed storage
• Key Features of NetApp managed storage
• Provisioning storage for Oracle DB using ZFS
• Provisioning storage for Oracle DB using NetApp

A few interesting things about disks …
Two categories for all disks (including FC, SAS, SATA, PATA, SCSI, SSD):
• Failed
• … and Failing
Always a disappointment:
• size
• speed
• reliability
… we know this already; this is why we have RAID systems.

Trends in Storage
As disk capacity increases:
• MTBF decreases
• the disk bottleneck increases
Uncorrectable bit error rates have stayed roughly constant:
• 1 in 10^14 bits (~12 TB) for desktop-class drives
• 1 in 10^15 bits (~120 TB) for enterprise-class drives (allegedly)
• Bad sector every 8-20 TB in practice (desktop and enterprise)
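
The capacity figures quoted for the bit error rates can be checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal terabytes (10^12 bytes), and the function name is made up for this example.

```python
# Convert an uncorrectable bit error rate (one error per N bits read)
# into the approximate volume of data read per expected error.

BITS_PER_BYTE = 8
BYTES_PER_TB = 10**12  # assumption: decimal terabytes, as drive vendors quote


def tb_read_per_error(bits_per_error: int) -> float:
    """Terabytes read, on average, per uncorrectable bit error."""
    return bits_per_error / BITS_PER_BYTE / BYTES_PER_TB


if __name__ == "__main__":
    for label, rate in (("desktop-class", 10**14), ("enterprise-class", 10**15)):
        print(f"{label}: one error per ~{tb_read_per_error(rate):.1f} TB read")
```

With these assumptions the results are 12.5 TB and 125 TB, which the slide rounds to ~12 TB and ~120 TB.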

Some facts: Measurements at CERN
How valuable is my data? How secure is my data on disk?
• They wrote a simple application to write and verify a 1 GB file:
  – Write 1 MB, sleep 1 second, and so on until 1 GB has been written
  – Read 1 MB, verify, sleep 1 second, and so on
• Ran continuously on servers with traditional HW RAID
• After 3 weeks, found 152 instances of silent data corruption
  – Previously thought “everything was fine”
• Traditional HW RAID only detected “noisy” data errors
• Need end-to-end verification to catch silent data corruption
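
A write/verify probe of the kind described above might look roughly like the following. This is a sketch, not the CERN tool: the function names, the SHA-256 derived test pattern and the parameters are all assumptions made for illustration. Note that reading a file back immediately is likely to be served from the OS page cache, so a real probe would need to bypass or flush that cache to exercise the disk.

```python
import hashlib
import os
import tempfile
import time


def chunk_pattern(index: int, size: int) -> bytes:
    """Deterministic content for chunk `index`, so it can be regenerated on verify."""
    seed = hashlib.sha256(str(index).encode()).digest()
    return (seed * (size // len(seed) + 1))[:size]


def write_pattern_file(path: str, chunks: int, chunk_size: int, pause: float) -> None:
    """Write `chunks` chunks of `chunk_size` bytes, pausing between writes."""
    with open(path, "wb") as f:
        for i in range(chunks):
            f.write(chunk_pattern(i, chunk_size))
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to storage
            time.sleep(pause)


def verify_pattern_file(path: str, chunks: int, chunk_size: int, pause: float) -> list:
    """Re-read the file and return the indexes of chunks that no longer match."""
    corrupted = []
    with open(path, "rb") as f:
        for i in range(chunks):
            if f.read(chunk_size) != chunk_pattern(i, chunk_size):
                corrupted.append(i)
            time.sleep(pause)
    return corrupted


if __name__ == "__main__":
    # Small demo values; the slide describes 1 MB chunks, 1 s pauses, 1 GB total.
    path = os.path.join(tempfile.mkdtemp(), "probe.bin")
    write_pattern_file(path, chunks=8, chunk_size=1024 * 1024, pause=0.0)
    print("corrupted chunks:", verify_pattern_file(path, 8, 1024 * 1024, 0.0))
```

The point of regenerating the pattern rather than trusting the storage stack is that the probe itself supplies the end-to-end check.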

ZFS Key Features
• Pooled storage – defines the physical aspects of capacity and redundancy
• Transactional object store – the file system is always consistent
  – The application still has to deal with file content consistency, but ZFS manages the file system consistency.
• End-to-end data integrity authentication: recognition of, and recovery from, bit rot, lost writes, misdirected writes and phantom writes
• Snapshot backup through copy-on-write
  – Lightweight, fast, low cost
• Unparalleled scalability

ZFS Data Authentication
• Checksum of data stored with the parent data structure
  – Isolates the checksum from the data, and can therefore validate the data
• Safeguards against: bit rot, phantom writes, misdirected reads and writes, DMA parity errors, driver bugs, accidental overwrite
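
To illustrate why keeping the checksum in the parent matters, here is a toy model (not ZFS code). The class names are invented for the example and SHA-256 simply stands in for whichever checksum the file system uses. Because the expected checksum travels with the pointer rather than with the block, a block that is stale or misplaced, yet internally intact, is still caught.

```python
import hashlib
from dataclasses import dataclass


def checksum(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the file system's own checksum.
    return hashlib.sha256(data).digest()


@dataclass
class BlockPointer:
    """Held by the parent: where the child block lives and what it must hash to."""
    address: int
    expected: bytes


class ToyDisk:
    """A dictionary pretending to be a disk; purely illustrative."""

    def __init__(self) -> None:
        self.blocks = {}

    def write(self, address: int, data: bytes) -> BlockPointer:
        self.blocks[address] = data
        return BlockPointer(address, checksum(data))

    def read(self, pointer: BlockPointer) -> bytes:
        data = self.blocks.get(pointer.address, b"")
        if checksum(data) != pointer.expected:
            raise IOError(f"checksum mismatch at block {pointer.address}")
        return data


if __name__ == "__main__":
    disk = ToyDisk()
    ptr = disk.write(7, b"new contents")
    # Simulate a phantom write: the device acknowledged the write but kept old data.
    disk.blocks[7] = b"old contents"
    try:
        disk.read(ptr)
    except IOError as err:
        print("detected:", err)
```

A checksum stored next to the data would have matched the old contents and reported nothing wrong.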

ZFS Self Healing
With redundant storage, ZFS detects a bad block using the checksum stored in the parent structure, reconstructs the data from an alternative copy, and rewrites the defective block to heal the data.
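
The self-healing read can be sketched as follows. This is a simplified model assuming a mirror whose copies are held in a Python list; the function name and the use of SHA-256 are assumptions for illustration, not the ZFS implementation.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the file system's own checksum.
    return hashlib.sha256(data).digest()


def read_with_self_heal(copies: list, expected: bytes) -> bytes:
    """Return a copy matching the parent's checksum and repair any that do not.

    `copies` holds the contents of the same logical block on each mirror side;
    it is modified in place when a bad copy is rewritten.
    """
    good = next((data for data in copies if checksum(data) == expected), None)
    if good is None:
        raise IOError("no copy matches the expected checksum")
    for i, data in enumerate(copies):
        if checksum(data) != expected:
            copies[i] = good  # rewrite the defective copy from the good one
    return good


if __name__ == "__main__":
    mirror = [b"bit-rotted data", b"payload"]
    print(read_with_self_heal(mirror, checksum(b"payload")), mirror)
```

Without redundancy the error can still be detected, but there is no alternative copy to repair from.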

Virtual Devices and Dynamic Striping
ZFS dynamically stripes data across all of the top-level virtual devices.
[Figure: two panels. "Stand-alone Devices": data striped (stripes 1 to 3) across three 36 GB devices. "Mirrored Devices": data striped (stripes 1 and 2) across two mirror devices, each built from a pair of 36 GB disks.]

RAID-Z Dynamic Stripe Width
• All writes are full-stripe writes
  – Adjusted to the size of the IO
• Each logical block is its own stripe
• Stripes are written to vdevs
  – Avoids read-modify-write
• Record size/block size/stripe size needs consideration for database use

NetApp Key Features
• Write Anywhere File Layout (WAFL)
  – Coalesces otherwise random writes into contiguous sequential IO
  – Snapshots by reference: lightweight, low cost, fast
  – Write optimized (correspondingly not read optimized)
• NVRAM write cache for write performance and commitment
• Mature data management applications: data backup, DR replication, application integration (e.g. SnapManager for Oracle), all based around the snapshot
• ONTAP 8 Cluster-Mode has scale-out capabilities, which offer very high scalability options.

NetApp Data Authentication
• Block checksums co-located with data
  – Not as extensive as ZFS measures
• Safeguards against:
  – Bit rot
• Other measures (RAID scrubbing) needed to safeguard against:
  – Phantom writes
  – Misdirected writes

NetApp Disk Aggregation Technology
• Dual-parity RAID groups: RAID-DP
  – The RAID group is the protection boundary.
  – RAID-DP protects against dual concurrent disk failure within the RAID group. DP is the only practical choice!
• Everything operates against a fixed-size file system block: 4 KB
  – (The 4 KB WAFL block size is not configurable or negotiable)
[Figure: a RAID group of six data disks, one parity disk and one double-parity disk.]

NetApp Disk Aggregation Concept
The Aggregate
• Aggregates are constructed from 1 or more RAID groups
[Figure: aggregate aggr0 built from RAID groups rg0, rg1 and rg2, each containing data (D), parity (P) and double-parity (DP) disks.]

NetApp Disk Space Allocation: Flexible Volumes
[Figure: WAFL aggregate space allocation. Labels: WAFL overhead (10%); WAFL aggregate space, i.e. FlexVol space plus aggregate Snapshot reserve (90%); aggregate Snapshot reserve (5%, adjustable); FlexVol space (95%); FlexVol1 to FlexVol#n, each split into 80% and a 20% .snapshot area.]
Aggregates
The NetApp aggregate is equivalent to the ZFS pool. It represents the usable capacity of the disks.
Flexible Volumes
Flexible volumes are the means of using space. They contain NAS file systems or SAN LUNs (you choose) and can be resized easily.
Snapshot Reserve
Snapshot backup space is managed through the snapshot reserve.

Disk and Data Protection
Data ONTAP protects against media flaws, misdirected writes and lost writes in several ways:
• RAID-4 and RAID-DP protect against disk failure
• Media scrubbing: periodic checking of block data against checksums
  – Bit rot
• RAID scrubbing: periodic checking of parity; checking in two dimensions is good
  – Lost writes
  – Misdirected writes
  – Phantom writes

Provisioning Storage for Oracle with ZFS
Array Considerations
• ZFS is designed to work with JBOD and disk-level caches
• Arrays with an NVRAM write cache should ignore ZFS cache flush requests
• General ZFS rules apply regarding the use of whole disks
• If using a HW RAID storage array to present LUNs, the number of LUNs should equal the number of physical disks
• Avoid dynamic space provisioning arrays when allocating LUNs for ZFS. ZFS uses the whole LUN space quickly, negating the benefits of thin and dynamic provisioning.

Provisioning Storage for Oracle with ZFS
Pool Considerations
• If the array technology gives enough redundancy, then use it.
  – Duplicating the protection may work against you.
  – ZFS may offer higher protection and recovery, but your array may give enough.
• RAID-Z is not recommended where IOPS performance is important; use mirrors instead.

Provisioning Storage for Oracle with ZFS
ZFS Record Size Considerations
• Match the ZFS record size to the Oracle database block size. The general rule is to set recordsize = db_block_size for the file system that contains the Oracle data files. This sets the maximum ZFS block size equal to DB_BLOCK_SIZE, which improves read performance and buffer cache occupancy.
• When db_block_size is less than the OS memory page size (8 KB on SPARC systems and 4 KB on x86 systems), set the record size to the page size.
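
The record size rule above reduces to taking the larger of the database block size and the OS page size. The following sketch shows one way to express it. The helper names and the dataset name dbpool/oradata are hypothetical, and the function only builds the zfs set command as a string rather than running it.

```python
def recommended_recordsize(db_block_size: int, os_page_size: int) -> int:
    """Apply the rule: recordsize = db_block_size, but never below the page size."""
    return max(db_block_size, os_page_size)


def recordsize_command(dataset: str, db_block_size: int, os_page_size: int) -> str:
    """Build (not run) the command that would set the property, sizes in bytes."""
    size = recommended_recordsize(db_block_size, os_page_size)
    return f"zfs set recordsize={size} {dataset}"


if __name__ == "__main__":
    # Hypothetical dataset; 8 KB database blocks on a system with 8 KB pages.
    print(recordsize_command("dbpool/oradata", 8192, 8192))
    # A 2 KB database block on a system with 4 KB pages is rounded up to the page size.
    print(recordsize_command("dbpool/oradata", 2048, 4096))
```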

Provisioning Storage for Oracle with ZFS
ZFS Record Size Considerations (contd.)
• Modifying the record size is not retroactive
  – Files must be copied after a record size change for the change to take effect.
• Performance may be optimized with different block sizes for different DB components
  – Set appropriate record sizes for the file systems that contain the respective files using different block sizes.

Provisioning Storage for Oracle with ZFS
Improving Writing and Caching Performance
• logbias ZFS property: latency or throughput
  – Redo: latency
  – Data: throughput
  – Unless storage throughput is saturated; then set the redo logbias to throughput. This avoids the double IO of writing first to the ZIL and subsequently to the file system, which improves overall performance.
• primarycache ZFS property: controls what is cached in main memory (the primary ARC, Adaptive Replacement Cache)
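
The logbias guidance can be written down as a small decision function. This is a sketch of the rule as stated on the slide, not a tuning tool: the function names and the dataset names are hypothetical, and the commands are only generated as text.

```python
def logbias_for(file_type: str, storage_saturated: bool = False) -> str:
    """Pick a logbias value: redo favours latency, data favours throughput."""
    if file_type == "data":
        return "throughput"
    if file_type == "redo":
        # When storage throughput is saturated, avoid the double write via the ZIL.
        return "throughput" if storage_saturated else "latency"
    raise ValueError(f"unknown file type: {file_type}")


def logbias_commands(datasets: dict, storage_saturated: bool = False) -> list:
    """Build (not run) one `zfs set logbias=...` command per dataset."""
    return [
        f"zfs set logbias={logbias_for(kind, storage_saturated)} {name}"
        for name, kind in datasets.items()
    ]


if __name__ == "__main__":
    # Hypothetical layout: one file system for data files, one for redo logs.
    layout = {"dbpool/oradata": "data", "redopool/redo": "redo"}
    for command in logbias_commands(layout):
        print(command)
```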

Provisioning Storage for Oracle with ZFS
Use secondarycache (L2ARC)
• Available since Solaris 10 10/09
• Stores a cached copy of data for fast access
• SSD devices recommended
• Use the secondarycache ZFS property to determine which file systems will use the secondary cache and what the cache contents should be.
• For read-latency-sensitive workloads

Provisioning Storage for Oracle with ZFS
Separation of Data from Redo Logs
• Consider physical separation of data files from redo logs by placing them in separate pools.
• Reduces conflict between sometimes opposite storage needs:
  – Large storage for data files requires emphasis on throughput
  – Small storage for redo logs requires emphasis on latency

Provisioning Storage for Oracle with NetApp
Write Performance
• Write performance is primarily achieved with NVRAM
  – Remember, NetApp is write-optimized storage
  – However, the physical disks must be able to keep up; otherwise we lose the benefits of NVRAM and fall back from memory performance to disk performance.
• Aggregated write throughput is achieved with a single large aggregate for all volumes of all types: data, redo, control files.

Provisioning Storage for Oracle with NetApp
Read Performance
• Read performance is achieved through a large aggregate with as many disks as possible/necessary
  – Many small disks are better than a few large disks
• ONTAP 7.x is a 32-bit system and suffers limits on aggregate size (16 TB)
  – Large databases may need to span aggregates
  – Use ONTAP 8.x (64-bit aggregates)

Provisioning Storage for Oracle with NetApp
SAN or NAS?
• SAN and NAS are both supported
  – SAN: FC, iSCSI, FCoE
  – NAS: NFS
• SAN implies some need for RAID management
  – ZFS-managed LUNs provisioned from NetApp? Suggest not doing this; ZFS is suited to JBOD.
  – If you must, then focus on one or the other; do not try to use all features of both. It is unnecessarily complicated.
  – Oracle ASM-managed LUNs are a good solution, using ASM external redundancy. It is not necessary to mirror when the storage is already highly redundant.
• NFS on NetApp is a perfect solution: thin-provisioned file system space, good performance, and easy management

Provisioning Storage for Oracle using ZFS
Backup & Recovery Integration
• Home-grown, self-engineered solutions
  – Can use snapshots and clones
• Replication of snapshots to a secondary DR/backup location
  – zfs send operation
  – Fast and efficient
• Recommend using granular objects for ease of management, i.e. snapshot and send several small objects rather than a single massive file system, which may never succeed.
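
A home-grown replication script built on this advice might generate one snapshot and one send per small file system, along the lines below. This is only a sketch: the function name, dataset names, snapshot name, host name drhost and target pool backup are all hypothetical, and the commands are produced as text for review rather than executed.

```python
def replication_commands(datasets: list, snapshot_name: str,
                         remote_host: str, remote_pool: str) -> list:
    """Build (not run) a snapshot and a full send/receive per dataset.

    Each dataset is handled separately, so a failure affects one small
    stream rather than a single massive one.
    """
    commands = []
    for dataset in datasets:
        snapshot = f"{dataset}@{snapshot_name}"
        # Keep the path below the source pool name under the remote pool.
        target = f"{remote_pool}/{dataset.split('/', 1)[-1]}"
        commands.append(f"zfs snapshot {snapshot}")
        commands.append(
            f"zfs send {snapshot} | ssh {remote_host} zfs receive {target}"
        )
    return commands


if __name__ == "__main__":
    for command in replication_commands(
        ["dbpool/oradata", "dbpool/redo"], "nightly", "drhost", "backup"
    ):
        print(command)
```

A production script would also need incremental sends, error handling and coordination with the database (for example, placing it in backup mode), none of which is shown here.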

Provisioning Storage for Oracle using NetApp
Backup & Recovery Integration
• Mature data management tools
• Application layer integration with backup snapshots
  – SnapManager for Oracle (SMO) offers hot backup integration for OS image copy backup.
  – SMO offers some integration with RMAN (snapshot image copy cataloging)
  – Supports DB cloning
  – Supports snapshot management of ASM disk groups built upon NAS files or SAN devices.
• Storage layer replication with mature tools
  – SnapMirror (async/sync/semi-sync)

Summary
• ZFS is a very powerful file system with unsurpassed scalability and many very interesting features.
• Deploying Oracle on ZFS requires detailed knowledge of the demands Oracle places on the storage system, and of how to configure ZFS to meet them.
• NetApp is not so fully featured, but is a more mature environment.
• It has a simpler aggregation approach, although there are some severe size limits if restricted to modern large disks on ONTAP 7.3.