MySQL with Enterprise Storage

Presented By Peter Teitelbaum

Peter’s Background

• Working with MySQL since 2003

• Programming since 1995

• Working with Linux since 1999

• Presently Dir. of Database Ops at Clear Channel Media and Entertainment

– With Clear Channel since 2008; leads the DB team

– Users generate around 200 million page views per month

– MySQL used as primary data store

– Many instances of MySQL deployed

– Primary applications generate around 15K queries/sec

Introduction to Storage

• Your approach to storage greatly affects availability, scalability, replication, backup/recovery, and disaster recovery (DR).

• SAN vs. attached storage

• Storage performance/architecture:

– Spindles (IOPS) vs. disk capacity

– Write caching

• Using a hosting facility and leasing SAN storage:

– Disks/networking shared with other customers?

– Is an API/interface available to take/restore snapshots?

What is LVM

• Logical Volume Management is a way to virtualize storage

• An abstraction layer over disk storage

LV / VG / PV Relationship

[Diagram] Physical volumes: /dev/sda1 (500GB), /dev/sdb1 (500GB), /dev/sdc1 (500GB)

Volume group: /dev/myvolgroup (1500GB)

Logical volumes: /dev/myvolgroup/images (250GB), /dev/myvolgroup/logs (10GB), /dev/myvolgroup/data (600GB)
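The layout in the diagram could be built with the standard LVM2 tools (pvcreate, vgcreate, lvcreate). The following is a minimal sketch, assuming three 500GB partitions and the volume names from the diagram. The helper name `lvm_commands` is hypothetical, and the script only prints the commands rather than running them.

```python
def lvm_commands(vg, pvs, lvs):
    """Return the LVM2 commands that build volume group `vg`.

    pvs: {device: size_gb}, lvs: {name: size_gb}. Sizes are used only for
    the capacity check; metadata and extent rounding overhead are ignored.
    """
    if sum(lvs.values()) > sum(pvs.values()):
        raise ValueError("logical volumes exceed volume group capacity")
    cmds = [f"pvcreate {dev}" for dev in pvs]                # physical volumes
    cmds.append(f"vgcreate {vg} {' '.join(pvs)}")            # volume group
    cmds += [f"lvcreate -L {size}G -n {name} {vg}"           # logical volumes
             for name, size in lvs.items()]
    return cmds


commands = lvm_commands(
    "myvolgroup",
    {"/dev/sda1": 500, "/dev/sdb1": 500, "/dev/sdc1": 500},
    {"images": 250, "logs": 10, "data": 600},
)
for c in commands:
    print(c)
```

The logical volumes use 860GB of the 1500GB group, which leaves unallocated space for growing a volume or for snapshot storage.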

Logical / Physical Mapping

[Diagram] Logical blocks in a volume map to physical blocks on disk/LUN:

VA LB1 → PB1

VA LB2 → PB2

VA LB3 → PB3

VA = Volume A

Snapshots vs. Clones

• A snapshot is a static, read-only image of the data at a specific point in time.

• A clone is a dynamic, writable image of the data at a specific point in time.

• A clone can appear to be a snapshot when mounted read-only.

Creating a snapshot or clone

[Diagram] VA LB1 and VB LB1 both map to PB1.

VA = volume A, VB = snapshot of VA

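Copy-On-Write (COW)

[Diagram] VA LB1 and VB LB1 both map to PB1. A write of new data (XXX) triggers a copy of the block to PB2, labeled (copy) and (write) in the diagram.

VA = volume A, VB = snapshot of VA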
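The logical-to-physical mapping and the copy-on-write step can be modeled in a few lines. This is a toy sketch of one common scheme (copy the old block aside for the snapshot, then overwrite in place). It is an assumption about the mechanism, not a description of any particular SAN or LVM implementation, and the class names are hypothetical.

```python
class BlockStore:
    """Physical blocks (PB), addressed by an integer id."""

    def __init__(self):
        self.physical = {}
        self.next_pb = 1

    def alloc(self, data):
        pb = self.next_pb
        self.next_pb += 1
        self.physical[pb] = data
        return pb


class Snapshot:
    """Read-only view: its own LB -> PB map, sharing blocks with the parent."""

    def __init__(self, store, mapping):
        self.store = store
        self.map = dict(mapping)

    def read(self, lb):
        return self.store.physical[self.map[lb]]


class Volume:
    def __init__(self, store):
        self.store = store
        self.map = {}          # logical block (LB) -> physical block (PB)
        self.snapshots = []

    def snapshot(self):
        # Creating a snapshot copies only the map, never the data.
        snap = Snapshot(self.store, self.map)
        self.snapshots.append(snap)
        return snap

    def read(self, lb):
        return self.store.physical[self.map[lb]]

    def write(self, lb, data):
        pb = self.map.get(lb)
        if pb is None:
            self.map[lb] = self.store.alloc(data)
            return
        # Copy-on-write: preserve the old contents for snapshots still
        # sharing this physical block, then overwrite the block in place.
        sharers = [s for s in self.snapshots if s.map.get(lb) == pb]
        if sharers:
            copy_pb = self.store.alloc(self.store.physical[pb])
            for s in sharers:
                s.map[lb] = copy_pb
        self.store.physical[pb] = data


store = BlockStore()
va = Volume(store)
va.write(1, "old")
vb = va.snapshot()      # VA LB1 and VB LB1 both map to PB1
va.write(1, "XXX")      # old data copied to PB2, new data written to PB1
print(va.read(1), vb.read(1), va.map, vb.map)
```

The first write to a shared block costs an extra copy, which is why write overhead grows with the number of blocks that differ between a snapshot and its parent.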

LVM doesn’t actually use blocks

• File systems – block size: A block is a logical container of data with a configurable, fixed size in bytes. A block is read from or written to disk in a single disk I/O operation.

• RAID – stripe size: The smallest unit of storage allocation that can be written to each disk. It is a fixed size measured in bytes (e.g. RAID 0, 4, 5, 6).

• LVM – extent size: The smallest unit of storage that can be allocated from a physical volume. An extent is a fixed size measured in bytes.
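Because LVM allocates whole extents, a logical volume's size is effectively rounded up to a multiple of the extent size. A small worked example, assuming the LVM2 default extent size of 4 MiB (the function name is hypothetical):

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def extents_needed(size_bytes, extent_bytes=4 * MIB):
    """Number of whole extents required to hold size_bytes (rounds up)."""
    return -(-size_bytes // extent_bytes)  # integer ceiling division


print(extents_needed(10 * GIB))      # 2560 extents of 4 MiB
print(extents_needed(10 * GIB + 1))  # one extra byte costs a whole extent: 2561
```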

How it works: Device Mapper

• Device Mapper is a Linux framework used to create block devices that are mapped to other block devices.

• It is the foundation for:

– Linux software RAID (dm-raid)

– Linux LVM2

– File system encryption

– and more

For more info: http://mbroz.fedorapeople.org/talks/DeviceMapperBasics/dm.pdf
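A Device Mapper device is described by a table in which each line maps a range of sectors onto a target. As a rough illustration, the sketch below generates a table for the `linear` target that concatenates two devices. The device names and sizes are made up, the helper name is hypothetical, and sizes are in 512-byte sectors.

```python
def linear_table(segments):
    """Build a Device Mapper table that concatenates devices end to end.

    segments: list of (device, size_in_sectors). Each line has the form
    "<logical start> <length> linear <device> <offset on device>".
    """
    lines, start = [], 0
    for device, sectors in segments:
        lines.append(f"{start} {sectors} linear {device} 0")
        start += sectors
    return "\n".join(lines)


table = linear_table([("/dev/sdb1", 2048), ("/dev/sdc1", 4096)])
print(table)
```

A table like this is the kind of input `dmsetup create` accepts. LVM2 generates equivalent tables automatically from its extent allocations.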

Backing up MySQL

• Backup basics:

– Online vs. offline & physical vs. logical

• http://dev.mysql.com/doc/refman/5.5/en/backup-types.html

– mysqldump is good for portability, bad for fast recovery

• Sequential export, not globally atomic, holds locks, long-running process

• Can take a long time to restore

• A file system copy is impractical for a large dataset

• Snapshot is extremely fast

• Quiescing the database

• Can be more frequent than traditional backups

• Transfer backups to another medium

Quiescing MySQL

[Flowchart]

Main flow: Initiate backup; Set session timeout; Call lock monitor; Acquire global lock; Take snapshot; Capture binlog position and processlist; Release lock; Backup complete

Lock monitor: Sleep; Hanging lock? (yes: Kill global lock; no: Lock monitor terminates)

Dirty page handling: Get dirty page %; Set % to zero; Check dirty pages; Dirty pages? (yes: Sleep and check again; no, or timeout: continue); Restore dirty page %
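The quiesce flow might be scripted roughly as follows. This is a sketch under several assumptions. `run_sql` and `take_snapshot` are hypothetical callables supplied by the caller: a function that runs a statement on one persistent MySQL session and returns rows, and whatever triggers the storage snapshot. The ordering of steps is one plausible reading of the flowchart. The session timeout and the separate lock monitor are left out for brevity.

```python
import time


def quiesce_and_snapshot(run_sql, take_snapshot,
                         flush_timeout=30.0, poll=0.5, sleep=time.sleep):
    """Flush InnoDB dirty pages, hold a global read lock, take a snapshot.

    run_sql(sql) must execute on a single session, because the global read
    lock is released as soon as the session that holds it disconnects.
    """
    # Get the current dirty page % so it can be restored afterwards.
    old_pct = run_sql("SELECT @@GLOBAL.innodb_max_dirty_pages_pct")[0][0]
    run_sql("SET GLOBAL innodb_max_dirty_pages_pct = 0")
    try:
        run_sql("FLUSH TABLES WITH READ LOCK")  # acquire global lock
        try:
            waited = 0.0
            while waited < flush_timeout:       # dirty pages? or timeout
                rows = run_sql(
                    "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty'")
                if int(rows[0][1]) == 0:
                    break
                sleep(poll)
                waited += poll
            binlog = run_sql("SHOW MASTER STATUS")
            processlist = run_sql("SHOW PROCESSLIST")
            take_snapshot()
        finally:
            run_sql("UNLOCK TABLES")            # release lock
    finally:
        run_sql(f"SET GLOBAL innodb_max_dirty_pages_pct = {old_pct}")
    return binlog, processlist
```

The nested try/finally blocks ensure that the lock is released and the dirty page setting restored even if the snapshot call fails. The binlog position returned here is what a later point-in-time recovery or slave reclone would start from.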

Recovery & Future Backups

[Diagram: backups]

Recovery

• Reverting to a backup cannot be undone

• Future backups lost

• Near-instant recovery: no need to copy data files, untar, or source in a dump file

• Any slaves will need to be rebuilt

• Point-in-time recovery (PITR) can be automated or performed manually
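For PITR, the binary logs are replayed from the position captured at backup time up to just before the unwanted change. A minimal sketch that builds the `mysqlbinlog` command line follows. The file names, position, and timestamp are made-up example values, and the helper name is hypothetical.

```python
def pitr_command(binlogs, start_position, stop_datetime):
    """Build a mysqlbinlog argv that replays events recorded after a backup.

    --start-position applies to the first binlog file listed;
    --stop-datetime uses the form 'YYYY-MM-DD hh:mm:ss'.
    """
    return [
        "mysqlbinlog",
        f"--start-position={start_position}",
        f"--stop-datetime={stop_datetime}",
        *binlogs,
    ]


cmd = pitr_command(
    ["mysql-bin.000042", "mysql-bin.000043"],  # example file names
    107,                                       # position captured at backup
    "2013-04-22 14:59:00",                     # just before the bad change
)
print(" ".join(cmd))
```

The output of this command would then be piped into the `mysql` client on the restored instance.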

File System Architecture

• Isolated

– Only datadir is restored, binlogs left untouched

– Simpler recovery process

– Transactions will be duplicated during PITR

– Slaves maintain binlog file and position

• Unified

– Datadir and binlogs are restored

– Binlogs must be copied to another location for PITR

– More complicated recovery process

– Slave replication will fail

[Diagram] Unified: datadir and binary logs both in /var/lib/mysql/

Isolated: datadir in /var/lib/mysql/, binary logs in /var/lib/mysql-binlog

Granular Recovery

• Table or row level recovery

• Use a dedicated data recovery host

– Create a clone of the snapshot (MySQL needs a read/write file system)

– Don’t circumvent this step by using clones as backups

• Extract via mysqldump, then source into production

– Use a WHERE clause (-w 'id=123')

– Pipe to sed to convert INSERT to REPLACE, etc.

• MyISAM: copy the three table files (.frm, .MYD, .MYI), rename them, then RENAME TABLE

• InnoDB: create a dump file, source it in with a new table name, then RENAME TABLE
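The extract step might look like the sketch below, which builds a `mysqldump` command restricted by a WHERE clause and then does in Python what the sed pipe would do. The database and table names are placeholders and the helper names are hypothetical.

```python
import re


def dump_command(db, table, where):
    """mysqldump argv that exports only the matching rows, without DDL."""
    return ["mysqldump", "--no-create-info", f"--where={where}", db, table]


def insert_to_replace(dump_text):
    """Rewrite INSERT statements as REPLACE so existing rows are overwritten.

    Only statements at the start of a line are rewritten, so the text
    'INSERT INTO' inside a data value is left alone.
    """
    return re.sub(r"^INSERT INTO", "REPLACE INTO", dump_text, flags=re.MULTILINE)


print(" ".join(dump_command("appdb", "users", "id=123")))
print(insert_to_replace("INSERT INTO `users` VALUES (123,'alice');"))
```

The dump is taken on the recovery host against the clone, and the rewritten statements are then sourced into production.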

Retention

• Process to purge snaps based on age

– Example: 10d; 7d, 4w, 2m; etc.

• Move archive backups to another medium

– Will never be restored

– Disk usage grows as a snapshot ages

– The greater the difference from the parent, the greater the I/O overhead
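An age-based purge process could be sketched as below. It assumes that a policy such as "7d, 4w, 2m" means: keep every snapshot for 7 days, then one per ISO week up to 4 weeks, then one per calendar month up to 2 months (approximated as 30 days each). That interpretation and the function name are assumptions.

```python
from datetime import datetime, timedelta


def snapshots_to_purge(snapshots, now, days=7, weeks=4, months=2):
    """Return the snapshot timestamps that fall outside the retention policy."""
    keep = set()
    seen_weeks, seen_months = set(), set()
    for ts in sorted(snapshots, reverse=True):        # newest first
        age = now - ts
        if age <= timedelta(days=days):
            keep.add(ts)                              # daily tier: keep all
        elif age <= timedelta(weeks=weeks):
            key = tuple(ts.isocalendar())[:2]         # (ISO year, ISO week)
            if key not in seen_weeks:
                seen_weeks.add(key)
                keep.add(ts)                          # newest in that week
        elif age <= timedelta(days=30 * months):
            key = (ts.year, ts.month)
            if key not in seen_months:
                seen_months.add(key)
                keep.add(ts)                          # newest in that month
    return [ts for ts in sorted(snapshots) if ts not in keep]


now = datetime(2024, 3, 31, 12, 0)
daily = [now - timedelta(days=i) for i in range(90)]  # 90 daily snapshots
purge = snapshots_to_purge(daily, now)
print(len(daily) - len(purge), "kept;", len(purge), "to purge")
```

The function only decides what to purge. Deleting the snapshots themselves would go through the storage system's own interface.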

Replication – shared disk

[Diagram] One aggregate holds the volumes /vol/mysql_master, /vol/mysql_slave1, /vol/mysql_slave2, and /vol/mysql_slave3, used by Master, Slave1, Slave2, and Slave3.

Underlying physical disks are shared for master/slave volumes

Resync a slave

[Diagram] The old /vol/mysql_slave1 used by Slave1 is removed (destroy clone volume), then a new /vol/mysql_slave1 is created from /vol/mysql_master (create new volume as clone).

Re-syncing a slave or reclaiming disk space is fast and easy

Slave Reclone

• Create a clone of a backup snapshot

– Will require time for replication to catch up

• Clone from live master

– Will need to quiesce master, as during backup

• Consider urgency and timing
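A reclone from a backup snapshot could be automated along these lines. The sketch only produces an ordered plan. The storage and host steps are plain-text placeholders because the actual calls depend on the SAN's interface, and the volume and snapshot names are examples. It also assumes the slave's server ID is configured outside the cloned datadir.

```python
def reclone_plan(slave_volume, source_snapshot, binlog_file, binlog_pos):
    """Ordered steps to rebuild a slave as a clone of a backup snapshot.

    binlog_file and binlog_pos are the master coordinates captured when
    the snapshot was taken.
    """
    change_master = (
        f"CHANGE MASTER TO MASTER_LOG_FILE='{binlog_file}', "
        f"MASTER_LOG_POS={binlog_pos}"
    )
    return [
        ("mysql", "STOP SLAVE"),
        ("host", "stop mysqld and unmount the slave volume"),
        ("storage", f"destroy clone volume {slave_volume}"),
        ("storage", f"create {slave_volume} as clone of {source_snapshot}"),
        ("host", "mount the volume and start mysqld"),
        ("mysql", change_master),
        ("mysql", "START SLAVE"),
    ]


plan = reclone_plan("/vol/mysql_slave1", "mysql_master@backup",
                    "mysql-bin.000042", 107)
for layer, step in plan:
    print(f"[{layer}] {step}")
```

After START SLAVE, replication still has to catch up from the snapshot's binlog position to the master's current position.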

Replication

• Run slaves on clone of parent (master)

• Shared blocks on parent and slaves

• Add slaves quickly to add capacity

• Removes need for slave resync tools

• Entire reclone process should be automated

• No need to back up slaves

• Reclone regularly to reclaim space

• Reclone after recovering from backup

• Mostly static or r/o data? Consider no MySQL replication

– Replication is inefficient for large volumes of static data

– Consider a reclone instead

• Easier to rely on SAN for replication to DR facility than replicating multiple MySQL instances

Other Things to Consider

• Monitor for most recent backup

• Automated recovery testing host

• Monitor for backup quality

– Recoverable

– Dirty

• ETL host with automated reclones from backup
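Monitoring for the most recent backup can be as simple as comparing its timestamp against a threshold. A minimal sketch, assuming a 24-hour threshold (the value and the function name are illustrative only):

```python
from datetime import datetime, timedelta


def backup_is_stale(latest_backup, now, max_age=timedelta(hours=24)):
    """True when the newest backup is older than the allowed age."""
    return now - latest_backup > max_age


now = datetime(2024, 3, 31, 12, 0)
print(backup_is_stale(datetime(2024, 3, 31, 2, 0), now))   # taken 10 hours ago
print(backup_is_stale(datetime(2024, 3, 29, 2, 0), now))   # taken 58 hours ago
```

A check like this only proves that a backup exists. The automated recovery testing host is what shows that it is recoverable.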

Thanks!

Questions/thoughts?
