MySQL with Enterprise Storage
Presented By Peter Teitelbaum
Peter’s Background
• Working with MySQL since 2003
• Programming since 1995
• Working with Linux since 1999
• Presently Dir. of Database Ops at Clear Channel Media and Entertainment
– With Clear Channel since 2008, leading the DB team
– Users generate around 200 million page views per month
– MySQL used as primary data store
– Many instances of MySQL deployed
– Primary applications generate around 15K queries/sec
Introduction to Storage
• Approach to storage will greatly affect things like availability, scalability, replication, backup/recovery, and DR.
• SAN vs. attached storage
• Storage performance/architecture:
– Spindles (IOPS) vs. disk capacity
– Write caching
• Using a hosting facility and leasing SAN storage:
– Disks/networking shared with other customers?
– API/interface available to take/restore snapshots?
What is LVM
• Logical Volume Management is a way to virtualize storage
• An abstraction layer over disk storage
LV / VG / PV Relationship
• Physical Volumes: /dev/sda1 (500GB), /dev/sdb1 (500GB), /dev/sdc1 (500GB)
• Volume Group: /dev/myvolgroup (1500GB)
• Logical Volumes: /dev/myvolgroup/images (250GB), /dev/myvolgroup/logs (10GB), /dev/myvolgroup/data (600GB)
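The sizes in the relationship above can be checked with a toy model. This is only a sketch of the bookkeeping, not of how LVM is implemented; the `VolumeGroup` class and its methods are hypothetical names introduced for illustration, and sizes are whole GB as on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VolumeGroup:
    """Toy model of a volume group; all sizes are in GB."""
    name: str
    physical_volumes: Dict[str, int]                      # PV device -> size
    logical_volumes: Dict[str, int] = field(default_factory=dict)

    @property
    def capacity(self) -> int:
        # A VG pools the capacity of all of its physical volumes.
        return sum(self.physical_volumes.values())

    @property
    def free(self) -> int:
        return self.capacity - sum(self.logical_volumes.values())

    def create_lv(self, name: str, size: int) -> str:
        # An LV is carved out of the pooled space, not out of one disk.
        if size > self.free:
            raise ValueError(f"only {self.free}GB free in {self.name}")
        self.logical_volumes[name] = size
        return f"/dev/{self.name}/{name}"


vg = VolumeGroup("myvolgroup",
                 {"/dev/sda1": 500, "/dev/sdb1": 500, "/dev/sdc1": 500})
for lv_name, lv_size in (("images", 250), ("logs", 10), ("data", 600)):
    vg.create_lv(lv_name, lv_size)

print(vg.capacity, vg.free)
```

With the slide's numbers, 860GB of the 1500GB group is allocated, so 640GB stays free for growing volumes or holding snapshots.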
Logical / Physical Mapping
• Logical blocks in a volume map to physical blocks on disk/LUN (VA = Volume A)
– VA LB1 → PB1
– VA LB2 → PB2
– VA LB3 → PB3
Snapshots vs. Clones
• A snapshot is a static/read-only image of the data at a specific point in time.
• A clone is a dynamic/writeable image of the data at a specific point in time.
• A clone can appear to be a snapshot when mounted read-only.
Creating a snapshot or clone
• VA = volume A, VB = snapshot of VA
• Logical blocks VA LB1 and VB LB1 both point to the same physical block on disk/LUN (PB1)
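Copy-On-Write (COW)
• VA = volume A, VB = snapshot of VA
• Logical blocks VA LB1 and VB LB1 initially share physical block PB1
• On a write to the shared block, the data is copied (PB1 → PB2) before the write is applied, so the two volumes no longer share it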
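A minimal sketch of copy-on-write, assuming one particular reading of the diagram: when the origin volume writes to a block it still shares with a snapshot, the old contents are first copied to a new physical block for the snapshot, and only then is the original block overwritten. `BlockStore` and `Volume` are hypothetical names, and real implementations work on extents or chunks rather than Python dictionaries.

```python
class BlockStore:
    """Physical blocks on disk/LUN, keyed by an id such as 'PB1'."""

    def __init__(self):
        self.blocks = {}
        self._next_id = 1

    def allocate(self, data: bytes) -> str:
        block_id = f"PB{self._next_id}"
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id


class Volume:
    """A volume is only a mapping from logical to physical blocks."""

    def __init__(self, store: BlockStore, mapping=None):
        self.store = store
        self.mapping = dict(mapping or {})
        self.snapshots = []

    def read(self, logical: str) -> bytes:
        return self.store.blocks[self.mapping[logical]]

    def snapshot(self) -> "Volume":
        # Creating a snapshot copies the mapping, not the data.
        snap = Volume(self.store, self.mapping)
        self.snapshots.append(snap)
        return snap

    def write(self, logical: str, data: bytes) -> None:
        physical = self.mapping.get(logical)
        if physical is None:
            self.mapping[logical] = self.store.allocate(data)
            return
        sharing = [s for s in self.snapshots
                   if s.mapping.get(logical) == physical]
        if sharing:
            # (copy) preserve the old contents for the snapshots
            preserved = self.store.allocate(self.store.blocks[physical])
            for snap in sharing:
                snap.mapping[logical] = preserved
        # (write) overwrite the origin's block in place
        self.store.blocks[physical] = data


store = BlockStore()
va = Volume(store)
va.write("LB1", b"original")   # VA LB1 -> PB1
vb = va.snapshot()             # VB LB1 -> PB1, shared with VA
va.write("LB1", b"changed")    # PB1 copied to PB2 for VB, then PB1 rewritten
print(va.read("LB1"), vb.read("LB1"))
```

Only blocks that change after the snapshot consume extra space, which is why a snapshot is fast to create and why its disk usage grows as it ages.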
LVM doesn’t actually use blocks
• File systems – block size
– A block is a logical container of data with a configurable fixed size in bytes. A block is written/read from disk in a single disk I/O operation.
• RAID – stripe size
– The smallest unit of storage allocation that can be written to each disk. It is a fixed size measured in bytes (e.g. RAID 0, 4, 5, 6).
• LVM – extent size
– The smallest unit of storage that can be allocated from a physical volume to a logical volume. An extent is a fixed size measured in bytes.
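Because LVM allocates whole extents, a logical volume's size is always rounded up to a multiple of the extent size. The arithmetic can be sketched as follows, assuming the LVM2 default extent size of 4 MiB; the function names are hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def extents_needed(size_bytes: int, extent_bytes: int = 4 * MIB) -> int:
    """Number of whole extents required to hold size_bytes."""
    if size_bytes <= 0 or extent_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-size_bytes // extent_bytes)   # ceiling division on integers


def allocated_bytes(size_bytes: int, extent_bytes: int = 4 * MIB) -> int:
    """Space actually consumed once rounded up to whole extents."""
    return extents_needed(size_bytes, extent_bytes) * extent_bytes


print(extents_needed(10 * GIB))       # an exact multiple of 4 MiB
print(extents_needed(10 * GIB + 1))   # one extra byte costs a whole extent
```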
How it works: Device Mapper
• Device Mapper is a Linux framework used to create block devices which are mapped to other block devices.
• It is the foundation for:
– Linux LVM2
– Linux software RAID via dmraid (mdadm uses the separate md driver)
– File system encryption
– and more
• For more info: http://mbroz.fedorapeople.org/talks/DeviceMapperBasics/dm.pdf
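To make "block devices mapped to other block devices" concrete, here is a sketch that resolves a sector of a mapped device to a sector on an underlying device. It assumes the device-mapper `linear` target, whose table lines have the form `start length linear device offset` (all in sectors); the table contents, device names and helper names below are made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class LinearTarget(NamedTuple):
    start: int      # first sector of this segment on the mapped device
    length: int     # number of sectors in the segment
    device: str     # underlying block device
    offset: int     # first sector used on the underlying device


def parse_table(text: str) -> List[LinearTarget]:
    """Parse lines of the form: start length linear device offset."""
    targets = []
    for line in text.strip().splitlines():
        start, length, kind, device, offset = line.split()
        if kind != "linear":
            raise ValueError(f"unsupported target type: {kind}")
        targets.append(LinearTarget(int(start), int(length), device, int(offset)))
    return targets


def resolve(targets: List[LinearTarget], sector: int) -> Tuple[str, int]:
    """Translate a sector on the mapped device to (device, sector)."""
    for t in targets:
        if t.start <= sector < t.start + t.length:
            return t.device, t.offset + (sector - t.start)
    raise ValueError(f"sector {sector} is outside the mapped device")


# Hypothetical table: one mapped device spanning two underlying devices.
table = parse_table("""
0 1000 linear /dev/sda1 2048
1000 500 linear /dev/sdb1 2048
""")
print(resolve(table, 0), resolve(table, 1200))
```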
Backing up MySQL
• Backup basics:
– Online vs. offline & physical vs. logical
• http://dev.mysql.com/doc/refman/5.5/en/backup-types.html
– mysqldump is good for portability, bad for fast recovery
• Sequential export, not globally atomic, holds locks, long-running process
• Can take a long time to restore
• A file system copy is impractical for a large dataset
• Snapshot is extremely fast
• Quiescing the database
• Can be more frequent than traditional backups
• Transfer backups to another medium
Quiescing MySQL
• Main flow: initiate backup → set session timeout → call lock monitor → acquire global lock → take snapshot, capture binlog position and processlist → release lock → backup complete
• Lock monitor: sleep → hanging lock?
– yes: kill global lock
– no: lock monitor terminates
• Dirty pages: get dirty page % → set % to zero → check dirty pages → dirty pages?
– yes: sleep, then check again
– no (or timeout): continue
– afterwards: restore dirty page %
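The quiescing flow can be sketched in code. This is an illustration under several assumptions rather than the presenter's script: it flushes dirty pages before taking the global lock (the flattened diagram does not make the relative order certain), it treats "dirty page %" as the `innodb_max_dirty_pages_pct` variable, and it leaves out the session timeout and the separate lock monitor. `query` and `take_snapshot` are hypothetical callables supplied by the caller: `query(sql)` runs a statement on one dedicated connection and returns rows as tuples, and `take_snapshot()` triggers the LVM or SAN snapshot.

```python
import time
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]


def quiesce_and_snapshot(query: Callable[[str], Rows],
                         take_snapshot: Callable[[], None],
                         flush_timeout: float = 60.0,
                         poll_interval: float = 1.0,
                         sleep: Callable[[float], None] = time.sleep,
                         clock: Callable[[], float] = time.monotonic) -> Dict[str, Rows]:
    # Get dirty page %, then set it to zero so InnoDB flushes aggressively.
    original_pct = query("SELECT @@GLOBAL.innodb_max_dirty_pages_pct")[0][0]
    query("SET GLOBAL innodb_max_dirty_pages_pct = 0")
    try:
        # Check dirty pages until none remain or the timeout expires.
        deadline = clock() + flush_timeout
        while clock() < deadline:
            rows = query("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty'")
            if int(rows[0][1]) == 0:
                break
            sleep(poll_interval)

        # Acquire the global lock; it is held only on this connection.
        query("FLUSH TABLES WITH READ LOCK")
        try:
            binlog = query("SHOW MASTER STATUS")
            processlist = query("SHOW FULL PROCESSLIST")
            take_snapshot()
        finally:
            query("UNLOCK TABLES")          # release lock
    finally:
        # Restore dirty page % even if something above failed.
        query(f"SET GLOBAL innodb_max_dirty_pages_pct = {original_pct}")
    return {"binlog": binlog, "processlist": processlist}
```

The `finally` blocks reflect the point of the lock monitor in the diagram: a backup that dies midway must never leave the global lock held or the flush setting changed.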
Recovery
• Reverting to a backup cannot be undone
• Future backups lost (backups taken after the one being restored)
• Near-instant recovery: no need to copy data files, untar, or source in a dumpfile
• Any slaves will need to be rebuilt
• PITR (point-in-time recovery) can be automated or performed manually
Recovery & Future Backups
(Diagram: backups)
File System Architecture
• Isolated
– Only datadir is restored, binlogs left untouched
– Simpler recovery process
– Transactions will be duplicated during PITR
– Slaves maintain binlog file and position
• Unified
– Datadir and binlogs are restored
– Binlogs must be copied to another location for PITR
– More complicated recovery process
– Slave replication will fail
• Example layout, isolated:
– Datadir: /var/lib/mysql/
– Binary logs: /var/lib/mysql-binlog
• Example layout, unified:
– Datadir & binary logs: /var/lib/mysql/
Granular Recovery
• Table or row level recovery
• Use a dedicated data recovery host
– Create clone of snap (MySQL needs r/w file system)
– Don’t circumvent with clones as backups
• Extract via mysqldump then source into production
– Use a WHERE clause (-w 'id=123')
– Pipe to sed to convert INSERT to REPLACE, etc.
• MyISAM: copy the 3 files (.frm, .MYD, .MYI), rename, then RENAME TABLE
• InnoDB: create a dumpfile, source it in with a new table name, then RENAME TABLE
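The INSERT-to-REPLACE conversion mentioned above, which the slide does with sed, can be sketched in Python. The function name is hypothetical. It rewrites only statements that start a line, on the assumption that the dump follows the usual mysqldump layout, where each `INSERT INTO` statement begins a line and newlines inside values are escaped.

```python
import re

# Match INSERT INTO only at the start of a line, so the same words
# appearing inside a quoted value are left alone.
_INSERT_AT_LINE_START = re.compile(r"^INSERT INTO ", re.MULTILINE)


def insert_to_replace(dump_sql: str) -> str:
    """Turn mysqldump INSERT statements into REPLACE statements."""
    return _INSERT_AT_LINE_START.sub("REPLACE INTO ", dump_sql)


dump = (
    "LOCK TABLES `t` WRITE;\n"
    "INSERT INTO `t` VALUES (123,'text mentioning INSERT INTO x');\n"
    "UNLOCK TABLES;\n"
)
print(insert_to_replace(dump))
```

REPLACE overwrites rows that still exist in production with the recovered version, while INSERT would fail on the duplicate key. mysqldump also has a `--replace` option that writes REPLACE statements directly.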
Retention
• Process to purge snaps based on age
– Example: 10d; 7d, 4w, 2m; etc.
• Move archive backups to another medium
– Will never be restored
– Disk usage grows while aging
– The greater the difference from the parent, the greater the I/O overhead
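An age-based purge such as the "10d" example could be sketched like this. The helper names are hypothetical, the spec format is an assumed reading of the slide's shorthand (a number followed by d, w or m), and a month is approximated as 30 days.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, List

# Assumption: d = days, w = weeks, m = months approximated as 30 days.
_DAYS_PER_UNIT = {"d": 1, "w": 7, "m": 30}


def parse_age(spec: str) -> timedelta:
    """Convert a spec such as '10d', '4w' or '2m' to a timedelta."""
    match = re.fullmatch(r"(\d+)([dwm])", spec.strip())
    if match is None:
        raise ValueError(f"unrecognised age spec: {spec!r}")
    return timedelta(days=int(match.group(1)) * _DAYS_PER_UNIT[match.group(2)])


def snapshots_to_purge(snapshots: Dict[str, datetime],
                       max_age: str,
                       now: datetime) -> List[str]:
    """Names of snapshots taken earlier than now minus max_age."""
    cutoff = now - parse_age(max_age)
    return sorted(name for name, taken in snapshots.items() if taken < cutoff)


snaps = {
    "snap_a": datetime(2013, 1, 30),
    "snap_b": datetime(2013, 1, 20),
    "snap_c": datetime(2013, 1, 21),
}
print(snapshots_to_purge(snaps, "10d", now=datetime(2013, 1, 31)))
```

A tiered policy (for example daily, weekly and monthly snapshots) would need a rule per tier; this sketch covers only the single-age case.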
Replication
• Run slaves on clone of parent (master)
Replication – shared disk
• One aggregate holds a volume per server:
– Master: /vol/mysql_master
– Slave1: /vol/mysql_slave1
– Slave2: /vol/mysql_slave2
– Slave3: /vol/mysql_slave3
• Underlying physical disks are shared for master/slave volumes
Replication
• Shared blocks on parent and slaves
• Add slaves quickly to add capacity
• Removes need for slave resync tools
Resync a slave
• Re-syncing a slave or reclaiming disk space is fast and easy
– Destroy the old clone volume /vol/mysql_slave1 (Slave1)
– Create a new /vol/mysql_slave1 volume as a clone of /vol/mysql_master (Master1)
Slave Reclone
• Create a clone of a backup snapshot
– Will require time for replication to catch up
• Clone from live master
– Will need to quiesce master like during backup
• Consider urgency and timing
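Once a slave is running on a fresh clone, it has to be pointed at the master using the binlog file and position captured when the snapshot was taken. A sketch of building those statements, assuming classic file/position replication; the function names, host name and coordinates are illustrative, and replication credentials and a unique server-id for the slave are deliberately left out.

```python
from typing import List


def _quote(value: str) -> str:
    # Keep the sketch simple: refuse values that would need escaping.
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in {value!r}")
    return f"'{value}'"


def repoint_slave_statements(master_host: str,
                             log_file: str,
                             log_pos: int) -> List[str]:
    """SQL to run on the recloned slave, using the captured coordinates."""
    if log_pos < 0:
        raise ValueError("log_pos must be non-negative")
    change_master = (
        "CHANGE MASTER TO "
        f"MASTER_HOST={_quote(master_host)}, "
        f"MASTER_LOG_FILE={_quote(log_file)}, "
        f"MASTER_LOG_POS={log_pos}"
    )
    return [change_master, "START SLAVE"]


# Hypothetical values as captured by SHOW MASTER STATUS during the backup.
for statement in repoint_slave_statements("db-master", "mysql-bin.000042", 107):
    print(statement + ";")
```

The older the snapshot behind the clone, the more binlog the slave must replay from that position, which is the catch-up time noted above.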
Replication
• Entire reclone process should be automated
• No need to back up slaves
• Reclone regularly to reclaim space
• Reclone after recovering from backup
• Mostly static or r/o data? Consider no MySQL replication
– Replication is inefficient for large volumes of static data
– Consider a reclone instead
• Easier to rely on SAN for replication to DR facility than replicating multiple MySQL instances
Other Things to Consider
• Monitor for most recent backup
• Automated recovery testing host
• Monitor for backup quality
– Recoverable
– Dirty
• ETL host with automated reclones from backup
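A check for the most recent backup can be very small. The sketch below assumes the monitoring system can obtain the timestamps of existing snapshots; the 24-hour threshold and the status strings are arbitrary choices for illustration, not values from the talk.

```python
from datetime import datetime, timedelta
from typing import Iterable


def backup_status(backup_times: Iterable[datetime],
                  now: datetime,
                  max_age: timedelta = timedelta(hours=24)) -> str:
    """Return 'missing', 'stale' or 'ok' based on the newest backup."""
    latest = max(backup_times, default=None)
    if latest is None:
        return "missing"
    return "stale" if now - latest > max_age else "ok"


now = datetime(2013, 1, 31, 12, 0)
print(backup_status([datetime(2013, 1, 31, 2, 0)], now))
print(backup_status([datetime(2013, 1, 29, 2, 0)], now))
print(backup_status([], now))
```

Freshness says nothing about whether a backup is recoverable, which is why the automated recovery testing host above is listed as a separate item.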
Thanks!
Questions/thoughts?