Best Practices for Backing Up Your System
Luca Ravazzolo, Technology Architect


Types of backups

• Cold file-level backup
  • Caché shutdown
  • Server-level copy to disk/tape
  • Caché restarted
• Caché online backup
  • Caché’s backup tool copies data blocks from CACHE.DAT files to disk file or tape.
  • Full or various incremental backups

Types of backups

• SAN or disk array backup
  • Backup I/O stays within the SAN or the array
  • Block-level copy from device to device (disk, tape, virtual tape)
  • All vendors have some type of software to control backups.
  • To back up a consistent image, a point-in-time snapshot or clone is made of the source device.

Types of backups: others

• CDP: Continuous Data Protection (Near-CDP)
  • Uses a separate appliance to journal changes out-of-band, allowing recovery to any point in time.
  • Depending on the space available, can restore to almost any point in time.
• SAN-based replication
  • Provides a disk-to-disk copy within the SAN, perhaps over long distances.
  • Destination can be archived to tape.

Advantages and Challenges

Caché online backup

• Advantages:
  • Caché stays up, users continue to work
  • Simple to implement, may not need 3rd-party software
• Challenges:
  • Only backs up the CACHE.DAT data; journals and other files must also be backed up.
  • Restores typically take multiple steps:
    • Create a Caché instance
    • Restore “*<date>.cbk” files from storage
    • Apply most recent full backup, then cumulative & incrementals
    • Apply journal files
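The restore sequence for a Caché online backup (most recent full backup, then cumulative and incremental backups, then journal files) can be sketched as a small selection routine. This is only an illustration, not InterSystems tooling: the `Backup` record, the `restore_chain` function and the kind labels are hypothetical, and the sketch assumes that a cumulative backup holds every change since the last full backup while an incremental backup holds the changes since the previous backup of any kind.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    """Hypothetical record describing one backup file."""
    taken: datetime
    kind: str  # "full", "cumulative" or "incremental"


def restore_chain(backups):
    """Return the backups to apply, in order, before applying journal files."""
    ordered = sorted(backups, key=lambda b: b.taken)
    fulls = [b for b in ordered if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available")

    # Start from the most recent full backup.
    chain = [fulls[-1]]
    later = [b for b in ordered if b.taken > chain[0].taken]

    # Assumption: the latest cumulative supersedes everything between it
    # and the full backup, so only that one cumulative is applied.
    start = chain[0].taken
    cumulatives = [b for b in later if b.kind == "cumulative"]
    if cumulatives:
        chain.append(cumulatives[-1])
        start = cumulatives[-1].taken

    # Then every incremental taken after that point, oldest first.
    chain.extend(b for b in later if b.kind == "incremental" and b.taken > start)
    return chain
```

Journal files written after the last backup in the returned chain would then be applied on top.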

Disk array/SAN-based snapshot

• Advantages:
  • Point-in-time copy of all data (Caché and otherwise)
  • Requires no downtime (when using Caché write daemon freeze and thaw)
• Challenges:
  • Requires snap/clone technology
  • Requires additional software to coordinate

CDP or replication

• Advantages:
  • CDP allows restore to nearly any point in time
  • Replication allows geographically separated backups
• Challenges:
  • Non-Caché technologies require coordination with Caché; without it, you
    • May end up with Caché in a crash-consistent state that requires recovery before use
  • Requires appliances and software

External Backup

Coordinating with Caché

Freeze the write daemon(s)

• For a consistent database image on your backup media (i.e. a CACHE.DAT without integrity errors), the write daemon’s cycle must be complete.
• Use the Backup.General.ExternalFreeze() method, which:
  • Keeps the write daemon from writing
  • Waits for the current write daemon cycle (if active) to finish
  • Switches the journal file
  • Logs information to the cconsole.log file

Freezing the write daemon

• ExternalFreeze command:

    %SYS>SET rc=##class(Backup.General).ExternalFreeze()
    # csession cache -U%SYS "##class(Backup.General).ExternalFreeze()"
    # echo $?

• OS command returns a code:
  • 5 – successful
  • 3 – failure
• While frozen, all updates are made as usual to the database cache.
• Processes continue to run normally UNLESS:
  • The number of available buffers in the database cache falls too low.
  • The ExternalFreeze lasts longer than the default limit (600 seconds).
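The exit codes listed for the OS-level freeze call lend themselves to a small wrapper that a backup script could use. The following is a minimal sketch, assuming the `csession` invocation shown on the slide and an instance named `cache`; the function names and the injectable `run` parameter are illustrative and not part of Caché.

```python
import subprocess

# Exit codes quoted on the slide for the OS-level call.
FREEZE_SUCCESS = 5
FREEZE_FAILURE = 3


def freeze_command(instance="cache"):
    """Build the csession call from the slide (the instance name is an assumption)."""
    return ["csession", instance, "-U%SYS",
            "##class(Backup.General).ExternalFreeze()"]


def freeze_succeeded(exit_code):
    """Translate the OS exit code into True (frozen) or False (failed)."""
    if exit_code == FREEZE_SUCCESS:
        return True
    if exit_code == FREEZE_FAILURE:
        return False
    # Anything else is outside what the slide documents; fail loudly.
    raise RuntimeError(f"unexpected exit code from ExternalFreeze: {exit_code}")


def freeze(run=subprocess.run, instance="cache"):
    """Run the freeze command; `run` can be replaced by a stub for testing."""
    result = run(freeze_command(instance))
    return freeze_succeeded(result.returncode)
```

Treating any code other than 5 or 3 as an error is a design choice of this sketch: it keeps a backup job from taking a snapshot when the freeze status is unknown.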

Thaw the write daemon

• Use Backup.General.ExternalThaw to allow the write daemon(s) to resume writing.
• Thaw command:

    %SYS>SET rc=##class(Backup.General).ExternalThaw()
    # csession cache -U%SYS "##class(Backup.General).ExternalThaw()"
    # echo $?

• OS-level command returns one of these codes:
  • 5 – success
  • 3 – failure

Another useful method

• Use Backup.General.ExternalSetHistory to log successful backups in the Backup History:

    %SYS>SET log="/var/logs/backup.log",desc="Full Backup"
    %SYS>SET rc=##class(Backup.General).ExternalSetHistory(log,desc)

  • log is the name of an externally created backup log
  • desc is free text

Who runs the freeze/thaw?

• The operating system user that executes the freeze/thaw command must have access to Caché.
  • In a normal install, the “backup” user must be a Caché user.
  • %Service_Terminal must allow OS-level authentication.
  • The Caché “backup” user needs RW on the %DB_CACHESYS resource as well as use of %Admin_Operate and %Service_Terminal.

Case Study: External Backup

Using snapshots, a de-duplication appliance and replication for an external backup of Caché

External backup 1: Caché & snaps

• Backup software initiates the backup process from the media server.
• Invoke script on the server running Caché to FREEZE the write daemon.
• Backup software initiates a clone or snapshot of all Caché arrays.
• Invoke script on the server running Caché to THAW the write daemon.

External backup 2: Mount & copy

• Backup software mounts the snapshot on the media server.
• Backup software does a file-level copy from the snapshot to a disk-based backup appliance.
• Backup software releases the snapshot via a command-line interface call to the disk controller.

Ext Backup 3: Replicate, verify & archive

• Backup software initiates a backup copy to a secondary data center.
• In the secondary data center, the replicated backup is restored and mounted in a Caché instance, and an integrity check is run to verify structural integrity.
• Depending on space and policy, the backup is kept online and/or archived to tape for long-term storage.

Timings and best practices

Backup software initiates the backup process from the media server.

• Backup software:
  • Must be able to call the freeze/thaw script on the Caché server
  • Must be able to initiate the snapshot
  • Most commercial backup software will work well, including EMC Networker, Symantec NetBackup, IBM Tivoli (TSM), etc.

Timings and best practices

Invoke script on the server running Caché to FREEZE the write daemon.

    04/02-02:30:00 (1098) 0 ExternalFreeze: Suspending system
    04/02-02:30:00 (1098) 0 ExternalFreeze: Description: Backup Performed by TSM at: 2013-04-02 02:30:00
    04/02-02:30:01 (1098) 0 ExternalFreeze: Start a journal restore for this backup with journal file: /jrn/20130402.003
    04/02-02:30:02 (1098) 0 ExternalFreeze: System suspended

• Sample scripts are available from the WRC.
• Time to freeze and return depends on:
  • Database activity
  • Current write daemon phase (i.e. is it writing to disk?)

Timings and best practices

Backup software initiates a clone or snapshot of all Caché arrays.

• Creating the clone or snapshot: this is the period during which the write daemon(s) are frozen.
  • Timing is based on array controller activity.
  • If it takes longer than a few minutes, there is a risk of running into the freeze timeout.

    04/02-02:30:02 (1098) 0 ExternalFreeze: System suspended
    04/02-02:30:52 (9109) 0 ExternalThaw: Resuming system

50 seconds frozen with IBM DS5300 using FlashCopy on a few TB of data with active systems
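The frozen interval can be read straight from the console log timestamps. The sketch below is a hedged illustration rather than a supported tool: `frozen_seconds` is a hypothetical helper, it assumes the log line layout shown on the slide, and it compares the result with the 600-second default limit quoted earlier in the deck.

```python
from datetime import datetime

DEFAULT_FREEZE_LIMIT_SECONDS = 600  # default limit quoted earlier in the deck


def _timestamp(line):
    """Parse the leading 'MM/DD-HH:MM:SS' stamp of a console log line."""
    stamp = line.split()[0]
    # The log carries no year; a fixed leap year is prepended so that 02/29
    # parses. A freeze spanning New Year is not handled by this sketch.
    return datetime.strptime("2000/" + stamp, "%Y/%m/%d-%H:%M:%S")


def frozen_seconds(log_lines):
    """Seconds between 'System suspended' and the next 'Resuming system'."""
    suspended = resumed = None
    for line in log_lines:
        if line.rstrip().endswith("System suspended"):
            suspended = _timestamp(line)
        elif "ExternalThaw: Resuming system" in line and suspended is not None:
            resumed = _timestamp(line)
            break
    if suspended is None or resumed is None:
        raise ValueError("log does not contain a complete freeze/thaw pair")
    return (resumed - suspended).total_seconds()


sample = [
    "04/02-02:30:02 (1098) 0 ExternalFreeze: System suspended",
    "04/02-02:30:52 (9109) 0 ExternalThaw: Resuming system",
]
seconds = frozen_seconds(sample)
print(seconds, seconds < DEFAULT_FREEZE_LIMIT_SECONDS)  # prints: 50.0 True
```

Tracking this number across nightly runs would show how much headroom remains before the freeze timeout.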

Timings and best practices

Invoke script on the server running Caché to THAW the write daemon.

• Thawing the write daemon takes seconds at most.
• Best practice is to be sure to thaw the database on any error along the way.
• Perhaps have an independent job that checks the database status and thaws it if frozen, so a failed backup will never leave Caché frozen.
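The "thaw on any error" practice maps naturally onto a try/finally block. This is a minimal sketch under stated assumptions: `freeze`, `take_snapshot`, `thaw` and `is_frozen` are hypothetical callables supplied by the backup script (for example, wrappers around the freeze/thaw commands and the array vendor's snapshot tool), and none of them is a Caché API.

```python
def snapshot_with_freeze(freeze, take_snapshot, thaw):
    """Freeze, snapshot, and always thaw, even if the snapshot step fails."""
    if not freeze():
        raise RuntimeError("freeze failed; snapshot not attempted")
    try:
        return take_snapshot()
    finally:
        # Runs on success and on any exception raised by take_snapshot().
        thaw()


def thaw_if_frozen(is_frozen, thaw):
    """Independent safety check, meant to be scheduled outside the backup job."""
    if is_frozen():
        thaw()
        return True
    return False
```

The separate `thaw_if_frozen` check covers the case the try/finally cannot, such as the backup process itself being killed while Caché is frozen.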

External backup 2: Mount & copy

Backup software mounts the snapshot on the media server, then does a file-level copy from the snapshot to a disk-based backup appliance.

• Using a de-duplication appliance as the file-level backup target speeds up the backup and saves space.
• Timings vary a lot here, depending on the disks used, the de-duplication rate, etc.

Ext Backup 3: Replicate, verify & archive

Backup software initiates a backup copy to a secondary data center.

• SAN-level replication or replication via a de-duplication appliance.
• Timings vary a lot here, based on bandwidth and de-duplication rate, if applicable.

Ext Backup 3: Replicate, verify & archive

In the secondary data center, the replicated backup is restored and mounted in a Caché instance, and an integrity check is run to verify structural integrity. Depending on space and policy, the backup is kept online and/or archived to tape for long-term storage.

• Integrity checks vary in timing.
• Another option is to have the media server in the primary data center run the check.

Final points

• Considering cost and effort, Caché online backup works well for small to medium-size databases (~100s of GB total) with generous RTOs.
• Use InterSystems Mirroring in conjunction with your backup mechanism.
  • Perhaps there will be no need to restore a backup.
  • If needed, the mirror destination will have CACHE.DAT files and journal files.

Final points

• Backup should have minimal impact on the live database.
• Using SAN/disk controller-based backups offloads the work to other appliances/servers.
• SAN/disk-based backups meet the fastest RTOs.
• When restoring from backup, the RPO is only as good as the most recently available journal file.
