Transcript
Page 1: Samit Roy, President, SciCom Infrastructure Services, Inc

From Disaster Recovery to Business Continuity

September 21, 2006

New Challenges for CIOs

Page 2: Samit Roy, President, SciCom Infrastructure Services, Inc

Agenda

• Examples at DeKalb County

• Disaster Recovery to Business Continuity

• Developing a Business Continuity Plan

Page 3: Samit Roy, President, SciCom Infrastructure Services, Inc

From Simple Disaster Recovery to Business Continuity

Page 4: Samit Roy, President, SciCom Infrastructure Services, Inc

Threats, Risks to businesses

Tsunami strike

• Geopolitical, biological and technological uncertainties in today’s world

• But similar threats were always present in the past, so why are they an issue today?

Page 5: Samit Roy, President, SciCom Infrastructure Services, Inc

Quantifying the effect of Disaster

The following parameters are used to quantify the magnitude of the loss to businesses caused by a disaster:

• RPO (Recovery Point Objective) is the point in time that marks the end of the period during which data can still be recovered using backups and journals

• RTO (Recovery Time Objective) defines how quickly information systems, services and processes must be operational following an incident, including recovery of applications and data

[Chart: Business Process/Knowledge against Time, with RPO, Disaster, RTO and Resume Business marked]
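As a rough illustration of how these two parameters are applied, the sketch below measures one incident against an RPO and an RTO. The timestamps and the 24-hour and 8-hour targets are hypothetical values chosen for the example, not figures from this presentation.

```python
from datetime import datetime, timedelta

def recovery_metrics(last_recoverable, disaster, resumed):
    """Return (data_loss_window, downtime) for a single incident."""
    # Work performed after the last recoverable point is lost.
    data_loss_window = disaster - last_recoverable
    # Time between the disaster and the resumption of business.
    downtime = resumed - disaster
    return data_loss_window, downtime

def meets_objectives(data_loss_window, downtime, rpo, rto):
    """True only if both the RPO and the RTO targets are satisfied."""
    return data_loss_window <= rpo and downtime <= rto

# Hypothetical incident: nightly backup, mid-morning disaster, next-day resumption.
last_backup = datetime(2006, 9, 20, 23, 0)
disaster = datetime(2006, 9, 21, 10, 30)
resumed = datetime(2006, 9, 22, 8, 30)

data_loss, downtime = recovery_metrics(last_backup, disaster, resumed)
ok = meets_objectives(data_loss, downtime,
                      rpo=timedelta(hours=24), rto=timedelta(hours=8))
print(data_loss, downtime, ok)
```

In this made-up incident the data loss window is within the RPO, but the downtime exceeds the RTO, so the objectives are not met.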

Page 6: Samit Roy, President, SciCom Infrastructure Services, Inc

Early Justice Processes

[Chart: Information Sharing, Complexity and Technological Advances against Time; before the 1990s: Independent Process, Single System]

• Justice processes were independent
• Manageable number of cases and convictions
• Fewer agencies to interact with
• Mostly paper-based or mainframe-stored data

Page 7: Samit Roy, President, SciCom Infrastructure Services, Inc

How did we cope with disasters?

Disaster Recovery is a technology recovery often conducted in “reactive mode”

[Chart: Information Sharing, Complexity and Technological Advances against Time; before the 1990s: Independent Process, Single System]

Disaster Recovery

• Focus: Data Center/Paper Copies
• Technology: Mainframe/Mid Range Computing
• Behavior: Casual Reaction, resumption > 3 days
• Limited Business Scenarios

Business Recovery

• Focus: Data Center and Back Office
• Technology: Client Server Computing
• Behavior: Rapid Reactions

Page 8: Samit Roy, President, SciCom Infrastructure Services, Inc

Today’s Justice Processes

[Chart: Information Sharing, Complexity and Technological Advances against Time; 2000s onward: Interdependent Processes, Multiple Systems]

• Today, Government’s responsibility and responsiveness to society have increased
• Justice needs have changed with population growth, immigration, socio-economic changes and changing demographics
• The accuracy, timeliness and reliability of the Justice System rest on inter-agency dependence and integrated systems
• Any disruption in any part of the Justice System will interfere with the overall justice process and thus affect services to citizens

Page 9: Samit Roy, President, SciCom Infrastructure Services, Inc

Let’s take a sample scenario…

[Diagram: Sample Integrated Criminal Justice System]

Systems shown:

• Justice Information Server
• Statewide Warrant System
• Statewide Criminal History Records Repository
• National Criminal History Records Repository
• Victim Compensation Fund
• Court Information System
• Sheriff Information System
• Booking Information System
• Prison Information System
• State/County Treasurer System
• Police Information System
• Prosecutor Information System
• Public Defender System
• Pre-trial Services Info System
• Dept of Health and Human Services Info System
• Dept of Welfare Info System
• Dept of Education Info System
• Dept of Motor Vehicle Info System
• Medical Licensing Board Info System
• State and County Day Care Licensing
• Sex Offender Registry

Information exchanges shown:

1. Query Subject
2. Arrest Warrant
3. Arrest Report
4. Booking Fingerprint Images
5. Arrest Information Broadcast
6. Prosecution Case Intake Document
7. Charging Document
8. Calendar; Motions; Notification of hearings
9. Request Pre-Trial Report
10. Subpoenas to witnesses; Subpoenas to victims
11. Sentence

Page 10: Samit Roy, President, SciCom Infrastructure Services, Inc

Possible Disruptions

[Diagram: Sample Integrated Criminal Justice System, as on the previous page]

The following are typical disasters that may disrupt the prosecution of a defendant:

• County/State/GCIC network is down; wireless connection is down
• Bomb explosion in the Court House
• Precinct is flooded because of a water main burst, hurricane, etc.
• 911 Center is infected by biological agents
• Electrical grid failure
• Employee has lost the file folder containing the “Sentence Report”
• The “Court Information System” is corrupted
• Backup tape was stolen during transport
• Jail is infected with pandemic influenza
• Fire has burned down the Records Retention Center
• “Cyber crime” has broken security and altered criminal records

[Chart label: Severity]

Page 11: Samit Roy, President, SciCom Infrastructure Services, Inc

How do we recover from the disruption?

• Restoring a server from backup will not suffice
• We need to think about the whole Integrated Justice Process
• We need to plan for disaster with every possible scenario

[Diagram: Business Continuity cycle: Readiness, Prevention, Response, Recovery/Resumption, Test & Train, Evaluate & Maintain]
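One way to think about the whole integrated process rather than a single server is to trace which systems sit downstream of a failed one. The sketch below is a minimal illustration: the system names come from the sample diagram, but the `FEEDS` links between them are assumptions made for the example, not the actual interfaces.

```python
from collections import deque

# Hypothetical "feeds" relationships: system -> systems that consume its data.
FEEDS = {
    "Police Information System": [
        "Booking Information System",
        "Prosecutor Information System",
    ],
    "Booking Information System": ["Sheriff Information System"],
    "Prosecutor Information System": ["Court Information System"],
    "Court Information System": [
        "Prison Information System",
        "Public Defender System",
    ],
}

def affected_systems(failed, feeds):
    """Breadth-first walk of every system downstream of the failed one."""
    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for downstream in feeds.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    # The failed system itself is not reported as "affected".
    return seen - {failed}

for name in sorted(affected_systems("Prosecutor Information System", FEEDS)):
    print(name)
```

Under these assumed links, an outage in the prosecutor's system also reaches the court, prison and public defender systems, which is why restoring one server is not the same as restoring the process.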

Page 12: Samit Roy, President, SciCom Infrastructure Services, Inc

What is Business Continuity?

Business continuity goes beyond disaster recovery to ensure the continued availability of essential services, programs, and operations in the event of unexpected interruptions. Business continuity addresses enterprise-level and end-to-end solutions, from design and planning to implementation and management, with a focus on urgency.

Business continuity must be approached holistically, including supporting process management interdependent with availability and security, to manage operational risk effectively. Business Continuity Management is not just a job for your IT team; it is an operational issue.

A team effort is required to develop comprehensive plans for critical operations, including not just computing processes but also operational processes, building systems, suppliers, and other processes. Organizations should also consider long-term operations:

• What can they do with their displaced workers?
• How do they communicate with other stakeholders and partners?
• How can they get the job done without their existing support network?

By taking a holistic approach to business continuity and evaluating the solution at both the IT and the business level, internally and externally, organizations can help ensure that the business is always available, performing and secure.

Page 13: Samit Roy, President, SciCom Infrastructure Services, Inc

Developing a Business Continuity Plan for the Justice Organization

Page 14: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Basics

• Build business continuity awareness, plans and strategies as a part of the enterprise culture

• Business continuity planning is evolutionary. Maintenance of the plan and events experienced will necessitate revisions of and/or additions to the plans

• The Business Continuity Planning methodology is basically one. However, consultants and vendors twist it in different ways to sell their uniqueness

• Business Continuity Planning should be driven from inside the organization and not by the vendor

Source: ASIS International

Page 15: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Readiness Objective: Address the preparatory steps required to provide a strong foundation on which to build the BCP.

Tasks:

• Assign Accountability
  • Corporate Policy
  • Ownership of Systems, Processes and Resources
  • Planning Team
  • Communicate BCP
• Perform Risk Assessment
  • Review Types of Risks that can impact the business
• Conduct Business Impact Analysis
  • Identify Critical Processes
  • Assess Impact if there is a Crisis
  • Determine RPO and RTO
  • Identify Resources required for recovery
• Agree on Strategic Plans: Agreeable, Attainable, Probable, Verifiable and Cost Effective
• Crisis Management and Response Team Development

Source: ASIS International
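The Business Impact Analysis tasks above can be pictured as a small record per process, which is then ranked to decide what to recover first. This is only a sketch: the process names, hour values, the 1 to 5 impact scale and the sort rule (shortest RTO first, higher impact breaking ties) are all assumptions for illustration, not part of the ASIS guideline or the County's plan.

```python
from dataclasses import dataclass

@dataclass
class ProcessImpact:
    name: str
    rto_hours: float   # how quickly the process must be operational again
    rpo_hours: float   # how much recent data loss is tolerable
    impact: int        # hypothetical scale: 1 (low) to 5 (severe)

def recovery_order(processes):
    """Rank processes: shortest RTO first, higher impact breaks ties."""
    return sorted(processes, key=lambda p: (p.rto_hours, -p.impact))

# Hypothetical BIA results.
processes = [
    ProcessImpact("Payroll", rto_hours=72, rpo_hours=24, impact=2),
    ProcessImpact("Court calendaring", rto_hours=24, rpo_hours=4, impact=3),
    ProcessImpact("Booking", rto_hours=4, rpo_hours=1, impact=4),
    ProcessImpact("Warrant queries", rto_hours=4, rpo_hours=1, impact=5),
]

ordered = [p.name for p in recovery_order(processes)]
print(ordered)
```

A real analysis would also record the resources each process needs for recovery; the ranking here only shows how the RTO determined in the BIA can drive resumption priorities.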

Page 16: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Prevention Objective: Address those areas where good planning will allow an organization to avoid, prevent or limit the impact of a crisis occurring

Tasks:

• Compliance with Corporate Policy
• Mitigation Strategies
  • Devise Mitigation Strategies
  • Resources needed for Mitigation
  • Monitoring Systems and Resources
• Avoidance, Deterrence and Detection
  • Based on warnings
  • Employee behavior based on motivation
  • Security based programs for warnings

Source: ASIS International

Page 17: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Response Objective: Develop the steps required to respond effectively, appropriately and in a timely manner should a crisis occur

Tasks:

• Potential Crisis Recognition
  • Identification of Danger Signals
    • Civil or political instability
    • Hostile labour unions
    • Impending strikes and likely protests
  • Responsibility to Report Potential Crisis
• Notify the Teams
  • Parameters for Notification
  • Custody and Updates to contact information
  • Types of Notification
• Assess the Situation
• Declare a Crisis
• Execute the Plan: routine emergency incidents, minor/moderate/major business interruptions
• Communications: Audiences, Call Center, Media

Source: ASIS International

Page 18: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Response

• Resource Management
  • Accounting for all individuals
  • Notification of Next-of-Kin
  • Crisis Counseling
  • Crisis Management Center
  • Financial Support
  • Immediately functional Payroll Systems, Telephone Systems, Email System, Internet Sites
  • Alternate Worksites, Secondary Data Centers
  • Offsite Storage
  • Transportation: Air, Road, Rail
  • Supplies/Service Providers

Source: ASIS International

Page 19: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Recovery and Resumption Objective: Develop policies, procedures and plans to bring the organization out of crisis, recover/resume critical processes and finally return to normal operations

Tasks:

• Damage and Impact Assessment
  • Crisis involving Physical Damage
  • Crisis not involving Physical Damage
• Resumption of Critical and Remaining Processes
  • Process Resumption Prioritization
  • Resumption of Critical Processes
  • Resumption of Remaining Processes

• Return to Normal Operation

Source: ASIS International

Page 20: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Test and Train Objective: Train and educate team members as well as the general employee population, and validate and embrace the BCP

Tasks:

• Educate and Train
  • Educate and Train teams
  • Educate and Train all personnel
• Test the BCP
  • Benefits of Testing
  • Goals and Expectations
  • Planning and Development
  • Timeline
  • Scope of Testing
  • Test Monitoring
  • Test and Exercise Scenarios
  • Test and Exercise Roles
  • Test and Exercise Participation
  • Test and Exercise Evaluation
  • Ongoing evaluation of Test Schedules

Source: ASIS International

Page 21: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity Plan - Steps

Evaluate and Maintain Objective: Keep the BCP relevant to the organization through rigorous maintenance and evaluation programs

Tasks:

• Develop BCP Review Schedule
  • Risk Assessment
  • Sector/Industry Trends
  • Regulatory Requirements
  • Event Experience
  • Test/Exercise results
• Develop BCP Maintenance Schedule. The following are examples of changes to procedures, systems and processes that may affect the BCP:
  • System and application software changes
  • Changes to the organization and its business processes
  • Critical lessons learned in testing
  • Changes to the external environment

Source: ASIS International

Page 22: Samit Roy, President, SciCom Infrastructure Services, Inc

The BCP at DeKalb County

Page 23: Samit Roy, President, SciCom Infrastructure Services, Inc

Status as of the end of 2005

• A Gap Analysis was performed using a set of 25 questions that reflect the industry standard for a comprehensive BCP program
  • Result: Based on the study, it was estimated that the BCP practices, processes and infrastructure were operating at a maturity level of 35%

• The County operated in an NT domain, with a security rating of 2.0 against a Federal guideline of 8.0 on a scale of 0 to 10

• The County operated under individual departmental budgets, creating isolated IS operations, multiple servers and storage systems, a lack of policies and overall fragmented IT management

• The County operated six operating systems and older versions of applications with minimal data validation

• Multiple departments depended on a single employee’s knowledge of both business processes and applications

• Departments lacked awareness of recovery processes, BCP and security, and depended on reactive responses to any business disruption

Page 24: Samit Roy, President, SciCom Infrastructure Services, Inc

Actions taken since Jan 2006

• Immediate plans were developed to upgrade computing environments
  • Application upgrade for Court Systems: Banner
  • Application upgrade for PeopleSoft Payroll System
  • Hardware upgrade for Kronos clocks for time recording
  • Design and implementation of Active Directory and migration to Windows 2003
  • Clustering of firewalls and enhancement of security servers
  • Clustering of Exchange email and implementation of archiving technology
• Consolidation of Servers
  • AIX (Unix) servers were reduced from 11 to 1 enterprise-class P590 server
  • Introduced failover mechanism for enterprise applications
  • Consolidated storage into DS 8100
  • Windows servers are being reduced from 160 to 15 as follows:
    • Implementing SQL Cluster
    • Implementing File Server Cluster
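To put the consolidation figures above in proportion, the short calculation below converts the before and after server counts into percentage reductions. The counts are the ones stated on this page; the helper name `reduction_percent` is introduced here only for illustration.

```python
def reduction_percent(before, after):
    """Percentage by which a server count shrinks, rounded to one decimal."""
    return round(100 * (before - after) / before, 1)

aix = reduction_percent(11, 1)        # AIX (Unix) servers: 11 down to 1
windows = reduction_percent(160, 15)  # Windows servers: 160 down to 15 (in progress)

print(f"AIX reduction: {aix}%")
print(f"Windows reduction: {windows}%")
```

Both consolidations remove roughly nine out of ten servers, which also means far fewer machines to back up and recover.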

Page 25: Samit Roy, President, SciCom Infrastructure Services, Inc

Issues impeding BCP

• Currently the systems are isolated based on departmental needs and legacy

[Diagram: four isolated environments, each with its own storage and backup, using different offsite tape storage (C)]

• County IS: AIX, Windows, Storage, Backup
• Police: Windows, Storage, Backup
• Water and Sewer: Windows, Storage, Backup
• GIS and Other: Windows file server, Storage, Backup

Issues:

• Most of the file servers and SQL Server based systems are configured on a single machine with no failover and no disaster recovery mechanism
• Multiple storage devices and locations will be a challenge to move into a consolidated DR mechanism
• Multiple tape devices and tape storage locations make recovery a challenge in case of emergency

Page 26: Samit Roy, President, SciCom Infrastructure Services, Inc

Consolidation of Computing Resources

• Currently the systems are isolated based on departmental needs and legacy

[Diagram: the four isolated environments, which use different offsite tape storage (C), shown alongside a single colocated environment (A)]

• County IS: AIX, Windows, Storage, Backup
• Police: Windows, Storage, Backup
• Water and Sewer: Windows, Storage, Backup
• GIS and Other: Windows file server, Storage, Backup
• Colocated: AIX, Windows, Storage, Backup

Page 27: Samit Roy, President, SciCom Infrastructure Services, Inc

Business Continuity: multiple levels

• Build the West Exchange site
• Implement the Consistent Backup mechanism

[Diagram, with elements labeled A to D: the Callaway Bldg environment (AIX, Windows, Storage, Backup) is synchronized with the West Exchange environment (AIX, Windows, Storage, Backup); also shown are Offsite Tape Storage and a Distant Recovery Site]

Page 28: Samit Roy, President, SciCom Infrastructure Services, Inc

Next steps towards developing BCP

• A high-powered Planning Committee has been formed

• The “Readiness” phase of the BCP is currently in progress

• Initiatives are in progress to consolidate applications from multiple environments

• Cross training of 311 call agents with 911 call agents