Disaster Recovery Planning. Questions to the Audience.


Page 1: Disaster Recovery Planning. Questions to the Audience.

Disaster Recovery Planning

Page 2: Disaster Recovery Planning. Questions to the Audience.

Questions to the Audience

Page 3: Disaster Recovery Planning. Questions to the Audience.

What is an IT Disaster?

• ‘Disaster’: the unplanned interruption of normal business processes resulting from the interruption of the IT infrastructure components used to support them.

Common Types [1]:

Power outages 28%
Storm Damage 12%
Floods 10%
Hardware Error 8%
Physical Attack 7%
Hurricanes 6%
Fires 6%
Software Error 5%
Power surge/spike 5%
Earthquake 5%

[1] Healthcare Information and Management Systems Society (himss.org)

Page 4: Disaster Recovery Planning. Questions to the Audience.

What is an IT Disaster?

• ‘Disaster’: the unplanned interruption of normal business processes resulting from the interruption of the IT infrastructure components used to support them.

Common Types:

✔ Power outages 28%
✔ Storm Damage 12%
✔ Floods 10%
✔ Hardware Error 8%
Physical Attack 7%
Hurricanes 6%
✔ Fires 6%
✔ Software Error 5%
✔ Power surge/spike 5%
Earthquake 5%

Page 5: Disaster Recovery Planning. Questions to the Audience.

Business Continuity versus Disaster Recovery

• These are not the same thing!

• Business Continuity (BC): Considers the academic, research and business functioning of the institution as a whole. Includes risk assessment, and plans for functional units and business processes. Potentially wider variety of scenarios to consider.

• Disaster Recovery (DR): IT activities to enable recovery to an acceptable condition after a disaster. BC includes DR. DR requires guidance from BC to direct priorities and set scope.

Page 6: Disaster Recovery Planning. Questions to the Audience.

What is the York DR Plan?
Review 2008 Plan

• Project start: January 2003
• Sponsored by CIO and VP Finance and Administration
• Scope
  • Systems: “key information systems”
  • Scenarios: “localized disaster or failure”
• Intended to be a multi-phase, multi-year project

Page 7: Disaster Recovery Planning. Questions to the Audience.

What is the York DR Plan?

• Engaged functional unit leaders and IT support areas
• Asked to identify maximum tolerable outage and data loss
• Surprise: >50% of business processes ranked “critical”
• Reality check based on observed impacts from lesser-scale outages
• VP and AVP consultations were the final step to confirm criticality

Page 8: Disaster Recovery Planning. Questions to the Audience.

Risk Management

[Figure: Cost of Incidents and Cost of Countermeasures plotted against Degree of Assurance (Low to High), with the Optimal Cost/Benefit point marked.]
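The trade-off named on this slide can be sketched numerically. The snippet below is only an illustration: both cost curves, their coefficients, and the function names are made-up assumptions, not figures from the presentation. It treats the optimal cost/benefit point as the degree of assurance that minimises the combined cost of incidents and countermeasures, which is one common reading of this kind of chart.

```python
# Illustrative only: the curve shapes and coefficients are assumptions.

def incident_cost(assurance: float) -> float:
    """Expected cost of incidents; falls as assurance rises (hypothetical curve)."""
    return 1000.0 * (1.0 - assurance) ** 2


def countermeasure_cost(assurance: float) -> float:
    """Cost of countermeasures; rises as assurance rises (hypothetical curve)."""
    return 800.0 * assurance ** 2


def total_cost(assurance: float) -> float:
    """Combined cost at a given degree of assurance (0.0 = low, 1.0 = high)."""
    return incident_cost(assurance) + countermeasure_cost(assurance)


def optimal_assurance(steps: int = 100) -> float:
    """Scan evenly spaced assurance levels and return the cheapest one."""
    levels = [i / steps for i in range(steps + 1)]
    return min(levels, key=total_cost)


if __name__ == "__main__":
    best = optimal_assurance()
    print(f"lowest combined cost at assurance {best:.2f}: {total_cost(best):.2f}")
```

With real data the curves would come from the risk assessment; the point of the sketch is only that neither extreme (no countermeasures, or maximum assurance) is the cheapest overall.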

Page 9: Disaster Recovery Planning. Questions to the Audience.

What is the York DR Plan?

• DR Threat Assessment
  • Proximity to heavy industry: oil depot across the street
  • Freight train corridor (chemical spill 1980)
  • Near intersection of major highways (400 & 407)
  • York main campus on flight path of two airports
  • Main data centre in basement of old building with UPS but no generator
  • High pedestrian traffic (Science Library and washrooms upstairs) directly overhead
• Worst case scenario chosen:
  • Loss of building containing main data centre

Page 10: Disaster Recovery Planning. Questions to the Audience.

What is the York DR Plan?

• By 2008
  • Secured Telus site for secondary site
  • Identified 4 categories of information systems
    • Recovery Point Objectives (RPO)
    • Recovery Time Objectives (RTO)
    • Strategy defined on style of recovery for each
    • Business owners classified which systems belong in which categories
  • Large infrastructure upgrades identified to meet the RTO/RPOs
  • Planned to annually refresh DR plan

Page 11: Disaster Recovery Planning. Questions to the Audience.

2012 DRP Refresh

• It’s been 4 years
  • Big upgrade on storage and core network
  • Acquisition of second on-campus data centre
  • IT department merger
  • And …

Pages 12 to 19: Disaster Recovery Planning. Questions to the Audience.

2012 DRP Refresh

Page 20: Disaster Recovery Planning. Questions to the Audience.

Goals for 2012 Refresh
2012 Goals

• Focus on C1 business applications as of 2012
• IT staff / office space not in scope
• Scenario is the loss of a single data centre (not both)
• Validate the categorization of “information systems”
• Gap Analysis for C1 information systems
• Table-top recovery scenario for supporting infrastructure

Page 21: Disaster Recovery Planning. Questions to the Audience.

Methodology

• Produce the complete UIT-supported application inventory
  • How hard can this be?
  • The one list did not exist
• Categorize Applications and focus on 2012 C1 Applications
• Gap Analysis and Planning
• Tabletop Recovery of supporting infrastructure

Page 22: Disaster Recovery Planning. Questions to the Audience.

DR Categories
Categories and associated RTOs/RPOs

Category | Summary | Recovery Time Objective (RTO) | Recovery Point Objective (RPO)
Category 1 | Vital Communications and Emergency Services | <= 4 hours | <= 15 minutes
Category 2 | Critical Customer / Partner Interfaces and Emergency Systems | <= 48 hours | <= 15 minutes
Category 3 | Critical Customer / Partner Interfaces and Emergency Systems | <= 7 days | <= 24 hours
Category 4 | Critical Internal Departmental Services and Non-Critical Customer Interface | <= 14 days | <= 48 hours
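As a rough illustration of how these categories could be applied, the sketch below encodes the RTO/RPO values from the table and picks a category from a business owner's stated maximum tolerable outage and data loss. The function name `category_for` and the selection rule (choose the least demanding category that still fits both tolerances) are assumptions made for this example, not the method described in the presentation.

```python
from datetime import timedelta

# (category number, RTO, RPO), values copied from the table above.
CATEGORIES = [
    (1, timedelta(hours=4), timedelta(minutes=15)),
    (2, timedelta(hours=48), timedelta(minutes=15)),
    (3, timedelta(days=7), timedelta(hours=24)),
    (4, timedelta(days=14), timedelta(hours=48)),
]


def category_for(max_outage: timedelta, max_data_loss: timedelta):
    """Return the least demanding category whose RTO and RPO both fit
    within the stated tolerances, or None if even Category 1 is too slow.

    Hypothetical helper: the selection rule is an assumption.
    """
    fitting = [
        number
        for number, rto, rpo in CATEGORIES
        if rto <= max_outage and rpo <= max_data_loss
    ]
    return max(fitting) if fitting else None


if __name__ == "__main__":
    # A service that can be down for 3 days but may lose at most 1 hour of data.
    print(category_for(timedelta(hours=72), timedelta(hours=1)))
```

A result of None signals a requirement tighter than any defined category, which is the kind of case where IT and the business owner would need to revisit the real requirement.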

Page 23: Disaster Recovery Planning. Questions to the Audience.

Application Categorization

• CIO/Business owners re-categorized the application list
• Result:
  • “information systems” changed criticality

2008
• C1: 5 services; C2: none

2012
• C1: 5 different services; C2: 7 services

Page 24: Disaster Recovery Planning. Questions to the Audience.

C1/C2 Applications

• Gap Analysis
• Table-top recovery scenario
  • “That is still in service, why?”, “That does what? When did that start?”
• Documentation, documentation, documentation
• Update deployment and SOP for services
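A gap analysis of this kind can be sketched as a comparison between what each service can currently achieve and its category targets. In the example below, the Category 1 targets (4 hours, 15 minutes) come from the categories slide; the `Service` fields, the `find_gaps` helper, and any service names or estimates passed to it are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import timedelta

# Category 1 targets as given on the DR Categories slide.
C1_RTO = timedelta(hours=4)
C1_RPO = timedelta(minutes=15)


@dataclass
class Service:
    """Hypothetical record of what a service can achieve today."""
    name: str
    est_recovery_time: timedelta  # estimated time to restore service
    est_data_loss: timedelta      # estimated window of lost data


def find_gaps(services, rto=C1_RTO, rpo=C1_RPO):
    """Map each service name to the list of objectives it misses.

    Services that meet both objectives are left out of the result.
    """
    report = {}
    for service in services:
        issues = []
        if service.est_recovery_time > rto:
            issues.append(f"recovery time exceeds RTO by {service.est_recovery_time - rto}")
        if service.est_data_loss > rpo:
            issues.append(f"data loss exceeds RPO by {service.est_data_loss - rpo}")
        if issues:
            report[service.name] = issues
    return report
```

For instance, a made-up service estimated at 10 hours to recover and 1 hour of data loss would be reported with two gaps, while one estimated at 2 hours and 5 minutes would not appear at all.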

Page 25: Disaster Recovery Planning. Questions to the Audience.

Example Normal Service

Page 26: Disaster Recovery Planning. Questions to the Audience.

Example Recovered Service

Page 27: Disaster Recovery Planning. Questions to the Audience.

DR of Supporting Infrastructure

• The Business focuses on applications
• Document infrastructure service dependencies
  • Determine the services required by Infrastructure groups to complete a recovery
  • e.g. monitoring, secure access, system inventory, recovery documentation, etc.
• Some services are considered Category 0 services
  • i.e. storage, network, and power
• Tabletop recovery exercise
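Once infrastructure dependencies are documented, a recovery order can be derived from them. The sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+). The service names are taken from this slide, but every dependency edge in `DEPENDS_ON` is an assumption invented for the example, as is the helper name `recovery_order`.

```python
from graphlib import TopologicalSorter

# service -> services that must be running first.
# All edges here are hypothetical; a real map comes from the documentation effort.
DEPENDS_ON = {
    "power": set(),
    "network": {"power"},
    "storage": {"power", "network"},
    "secure access": {"network"},
    "monitoring": {"network", "storage"},
    "system inventory": {"storage"},
    "recovery documentation": {"storage", "secure access"},
}


def recovery_order(depends_on):
    """Return services in an order where each one follows its dependencies.

    TopologicalSorter raises graphlib.CycleError if the map contains a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, service in enumerate(recovery_order(DEPENDS_ON), start=1):
        print(step, service)
```

In this made-up map the Category 0 services (power, network, storage) naturally sort ahead of the tools that depend on them, and a cycle error would flag a circular dependency worth resolving before a real recovery.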

Page 28: Disaster Recovery Planning. Questions to the Audience.

Lessons Learned

• RTOs and RPOs are set by the business, not IT
  • IT helps in getting to the real requirement
• Services evolve and RTOs change
• Infrastructure capabilities change
• Identify key technologies
• Continual Improvement
• DR is big. Do it in small chunks
• DR is not Backup
• DR Planning can be used in more than just DR

Page 29: Disaster Recovery Planning. Questions to the Audience.

Next Steps

• Review the DR plan for remaining services
• Asking the DR question up front
• Disaster RTO/RPO versus Operational RTO/RPO
• Bring staff space and equipment into scope

Page 30: Disaster Recovery Planning. Questions to the Audience.

Questions

Chris Russell
Director of Information and Communication Technology Infrastructure, York [email protected]

Rick Smith
Lead Architect, York [email protected]