Disaster Recovery 2015 Survey - CloudEndure · 2016-08-17 · 2015 Public Cloud Disaster Recovery...
2015 Public Cloud Disaster Recovery Survey
Disaster Recovery Challenges and Best Practices
CloudEndure 2015 © 2
Executive Summary
This benchmark survey presents challenges and best practices of companies that host web applications in the public cloud. The results presented here are based on responses from 109 IT professionals from North America and Europe collected through an online survey. Some of the key findings of the survey include:
• The #1 risk to system availability is Human Error, followed by Network Failures and Cloud Provider Downtime.
• While the vast majority of the organizations surveyed (83%) have a service availability goal of 99.9% or better, almost half of the companies (44%) had at least one outage in the past 3 months, and over a quarter (27%) had an outage in the past month.
• The cost of a day of downtime in 37% of the organizations is more than $10,000.
• When it comes to service availability, there is a clear gap between how organizations perceive their track record and the reality of their capabilities. While almost all respondents claim they meet their availability goals consistently (37%) or most of the time (50%), 28% of the organizations surveyed don’t measure service availability at all. It is hard to tell how these organizations claim to meet their goals when they are not able to measure them.
• The top challenges in meeting availability goals are budget limitations, insufficient IT resources, and lack of in-house expertise.
• There is a strong correlation between the cost of downtime and the average hours per week invested in backup / disaster recovery.
Table of Contents
Cloud Infrastructure Provider
Importance of Service Availability
Service Availability Goals
What is downtime?
Meeting Service Availability Goals
Last Downtime Event
Service Availability Measurement Tools and Services
Communicating System Availability Statistics to Customers
RPO (Recovery Point Objective) and RTO (Recovery Time Objective)
Meeting RPO and RTO Goals
How Frequently is the System Down for Scheduled Maintenance?
Average Monthly Planned Downtime
Primary Risks to System Availability
Primary Challenges in Meeting Availability Goals
Strategies and technologies to Ensure System Availability and Data Protection
Frequency of Backup / Replication Application Data
Who is Responsible for Disaster Recovery?
Average of Hours a Week Invested in Backup / Disaster Recovery
Annual Cost of Backup / Disaster Recovery
Cost of Downtime
Annual Budget for Backup / Disaster Recovery vs. Cost of Downtime
Average of Hours a Week Invested in Backup / Disaster Recovery vs. Cost of Downtime
Respondent Demographics: Customers
Respondent Demographics
Table of Figures
Cloud Infrastructure Provider
59% of the survey respondents host their applications in a public cloud, and 74% use Amazon for their cloud infrastructure. Other cloud infrastructure providers include Microsoft (7%), Google (6%) and Rackspace (4%).
Figure 1: Cloud infrastructure provider
• Public cloud (e.g. AWS, Azure, Google): 59%
• Private cloud: 14%
• Hybrid cloud: 14%
• Managed hosting: 12%
Figure 2: Cloud infrastructure provider
• Amazon: 74%
• Microsoft: 7%
• Google: 6%
• Rackspace: 4%
• Other: 9%
Importance of Service Availability
A vast majority of the survey respondents indicated service availability is highly critical to their customers. 33% rated it “most critical”, while 71% rated it 8 or higher on a scale of 1-10.
Figure 3: How critical is service availability to your customers?
• 10 - Most Critical: 33%
• 9: 27%
• 8: 11%
• 7: 13%
• 6: 2%
• 5: 7%
• 4: 2%
• 3: 2%
• 2: 2%
• 1 - Not Critical: 0%
Service Availability Goals
The vast majority of the organizations surveyed (83%) have a service availability goal of 99.9% (“three nines” or less than 9 hours of downtime a year) or better.
53% have a goal of 99.99% availability (“four nines” or less than 53 minutes a year) or better, and 23% have a “five nines” availability goal of 99.999% or above (less than 5 minutes of downtime a year).
Figure 4: Service availability goal
• 99.999% or above (less than 5 minutes of downtime a year): 23%
• 99.99% (less than 53 minutes a year): 30%
• 99.95% (less than 5 hours a year): 19%
• 99.9% (less than 9 hours a year): 11%
• 99.8% (less than 18 hours a year): 2%
• 99.5% (less than 44 hours a year): 2%
• 99% (less than 88 hours a year): 2%
• Less than 99%: 2%
• No defined goal: 9%
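The downtime figures attached to each availability goal follow from simple arithmetic: the unavailable fraction multiplied by the hours in a year. The sketch below reproduces them, assuming a 365-day year; the function name `max_downtime_hours_per_year` is illustrative and not part of the survey.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored in this sketch


def max_downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget (in hours) implied by an availability goal."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the downtime budget for each goal listed in the survey question.
    for goal in (99.999, 99.99, 99.95, 99.9, 99.8, 99.5, 99.0):
        hours = max_downtime_hours_per_year(goal)
        print(f"{goal}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Under this assumption, 99.9% works out to about 8.76 hours a year and 99.99% to about 52.6 minutes, in line with the survey labels. The "five nines" budget comes to roughly 5.3 minutes, so the survey's "less than 5 minutes" is best read as a round figure.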
What is downtime?
50% of the survey respondents define downtime as an inaccessible system. In addition, many of the organizations expand the definition of downtime to include instances when the system is accessible but some functions are not operational (24%) or performance is highly degraded (26%).
Figure 5: Definition of downtime
• The system is not accessible: 50%
• The system is accessible but performance is highly degraded: 26%
• The system is accessible but some functions are not operational: 24%
Meeting Service Availability Goals
The vast majority of the survey respondents meet their service availability goals consistently (37%) or most of the time (50%).
9% of the respondents indicate that they meet their goals only some of the time.
Figure 6: Meeting service availability goals
• We meet our goals consistently: 37%
• We meet our goals most of the time: 50%
• We meet our goals some of the time: 9%
• We are unable to meet our goals: 4%
Last Downtime Event
While almost all the respondents claim they meet their availability goals consistently or most of the time, almost half of the companies (44%) had an outage in the past 3 months and over a quarter (27%) had an outage in the past month.
Figure 7: Last downtime event
• Week: 6%
• Month: 21%
• 3 months: 17%
• 6 months: 13%
• Year: 15%
• More than a year: 19%
• Never: 9%
Service Availability Measurement Tools and Services
Close to half (49%) of the organizations surveyed use their own tools to measure service availability.
24% use third-party tools or services, while 28% don’t measure service availability at all.
Figure 8: Service availability measurement tools and services
• We use our own measurement tools: 49%
• We do not measure system availability: 28%
• We use a third party tool: 13%
• We use a third party service: 11%
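For organizations that keep a record of outages, measured availability is the share of a reporting period during which the system was up. The sketch below is one minimal way to compute it with the Python standard library; the function name, the `(start, end)` tuple format for outages, and the assumption that outages do not overlap one another are all illustrative choices, not something the survey prescribes.

```python
from datetime import datetime, timedelta


def measured_availability(outages, period_start, period_end):
    """Return availability in percent over a period, given (start, end) outage intervals.

    Assumes the outage intervals do not overlap each other.
    """
    period = period_end - period_start
    if period <= timedelta(0):
        raise ValueError("period_end must be after period_start")
    down = timedelta(0)
    for start, end in outages:
        # Clip each outage to the reporting window before adding it up.
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > timedelta(0):
            down += overlap
    return 100 * (1 - down / period)


if __name__ == "__main__":
    # Hypothetical example: one 43-minute outage in a 30-day window.
    window_start = datetime(2015, 1, 1)
    window_end = window_start + timedelta(days=30)
    outage = (datetime(2015, 1, 10, 9, 0), datetime(2015, 1, 10, 9, 43))
    print(f"{measured_availability([outage], window_start, window_end):.3f}%")
```

Comparing the result against the stated goal (for example 99.9%) gives a direct check on whether the goal was met for that period.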
Communicating System Availability Statistics to Customers
Only 15% of the organizations don’t share system availability statistics with their customers. Close to half (42%) notify customers when an event occurs, while 22% share statistics with their customers via regular email updates and 20% have a website that shows them.
Figure 9: Communicating system availability statistics to customers
• We notify customers when an event occurs: 42%
• We notify customers via regular email updates (e.g. monthly): 22%
• We have a site that shows it: 20%
• We do not share this information with customers: 15%
RPO (Recovery Point Objective) and RTO (Recovery Time Objective)
20% of the respondents have a Recovery Point Objective (RPO) of less than 5 seconds of data loss. At the other end of the spectrum, 7% have no RPO defined.
Figure 10: RPO
• Zero: 11%
• < 5 seconds: 9%
• 5-30 seconds: 7%
• 1-5 minutes: 17%
• 5-30 minutes: 26%
• 31-60 minutes: 24%
• More than 1 hour: 0%
• We don't have one: 7%
Just 9% of the respondents have a Recovery Time Objective (RTO) of less than 5 seconds of downtime. 15% have an RTO of more than one hour, and another 7% have no RTO defined.
Figure 11: RTO
• Zero: 7%
• < 5 seconds: 2%
• 5-30 seconds: 4%
• 1-5 minutes: 17%
• 5-30 minutes: 22%
• 31-60 minutes: 26%
• More than 1 hour: 15%
• We don't have one: 7%
Meeting RPO and RTO Goals
85% of the organizations meet their Recovery Point Objective consistently (58%) or most of the time (27%).
Figure 12: Meeting RPO goals
• Meet our goals consistently: 58%
• Meet our goals most of the time: 27%
• Meet our goals some of the time: 4%
• Unable to meet our goals: 4%
• N/A: 7%
76% meet their Recovery Time Objective consistently (35%) or most of the time (41%).
Figure 13: Meeting RTO goals
• Meet our goals consistently: 35%
• Meet our goals most of the time: 41%
• Meet our goals some of the time: 7%
• Unable to meet our goals: 7%
• N/A: 11%
How Frequently is the System Down for Scheduled Maintenance?
11% of the survey respondents never have their system down for scheduled maintenance.
Over half (59%) indicate that their system is down for scheduled maintenance less than once a month.
At the same time, 8% of the organizations take their system down for scheduled maintenance every 2 weeks or more frequently.
Figure 14: How Frequently is the system down for scheduled maintenance?
• More than once a week: 2%
• Once a week: 4%
• Once every 2 weeks: 4%
• Once every 3-4 weeks: 20%
• Less than once a month: 59%
• Never: 11%
Average Monthly Planned Downtime
Two thirds of the organizations surveyed (65%) have less than 30 minutes of monthly planned downtime. 24% average more than one hour of planned downtime a month.
Figure 15: Average monthly planned downtime
• Less than 30 minutes: 65%
• 30-60 minutes: 11%
• 1-5 hours: 22%
• 5-10 hours: 0%
• More than 10 hours: 2%
Primary Risks to System Availability
According to the survey respondents, the top five risks to system availability are:
1. Human errors
2. Network failures
3. Cloud provider downtime
4. External threats
5. Application scalability limitations
Figure 16: Primary risks to system availability (1-10 scale)
• Human errors (including application bugs): 8.1
• Network failures: 7.2
• Cloud provider downtime: 6.9
• External threats (e.g. hacking, denial of service attack): 6.7
• Application scalability limitations: 6.6
• Storage failures: 5.2
• Other: 1.9
Primary Challenges in Meeting Availability Goals
The top three challenges in meeting availability goals are:
1. Budget limitations
2. Insufficient IT resources
3. Lack of in-house expertise
Figure 17: Primary challenges in meeting availability goals
• Budget limitations: 7.6
• Insufficient IT resources: 7.5
• Lack of in-house expertise: 7.1
• Limit on our ability to prevent software bugs: 6.3
• Coping with peak demand: 6.0
• Limit on our ability to prevent configuration errors: 5.7
• Cloud resource capacity: 5.7
• Keeping up with the pace of change/growth: 5.5
Strategies and technologies to Ensure System Availability and Data Protection
Remote storage backup and Storage replication are the leading strategies to ensure system availability and data protection cited by 57% and 46% of the respondents respectively.
Figure 18: Strategies and technologies to ensure system availability and data protection
• Remote (multi region/zone) storage backup: 57%
• Storage replication (multi region/zone): 46%
• Multi-availability region/zone deployment: 43%
• Load balancing: 41%
• Local (single region/zone) storage backup: 37%
• Storage replication (single region/zone): 37%
• Auto scaling: 28%
• Other: 2%
Frequency of Backup / Replication Application Data
There are significant variations among organizations in the frequency of backup / replication of application data. At one end of the spectrum, 23% apply continuous data replication. At the other end, 34% perform a backup only every 12-24 hours.
Figure 19: Frequency of backup / replication application data
• Continuous: 23%
• Every 5 mins or less: 9%
• Every few minutes (at least: 7%
• Every 1-4 hours: 20%
• Every 4-12 hours: 5%
• Every 12-24 hours: 34%
• Other: 2%
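Backup frequency and RPO are linked: with periodic backups, a failure just before the next backup can lose up to one full interval of data. The sketch below checks a backup interval against an RPO under that simplifying assumption (backups complete instantly and always restore cleanly); the function names are hypothetical and the example values are taken from the answer ranges in the survey, not from any respondent.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case for periodic backups: everything written since the last backup is lost."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Return True if the worst-case data loss stays within the RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


if __name__ == "__main__":
    rpo = timedelta(minutes=30)
    for interval in (timedelta(minutes=5), timedelta(hours=4), timedelta(hours=24)):
        status = "meets" if meets_rpo(interval, rpo) else "misses"
        print(f"backup every {interval} {status} an RPO of {rpo}")
```

Under this model, a daily backup cannot satisfy an RPO measured in minutes; only continuous replication or very frequent backups can.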
Who is Responsible for Disaster Recovery?
In over half of the organizations (53%), IT is responsible for disaster recovery.
Cloud Operations are the ones in charge of disaster recovery in 27% of the organizations and 13% have a dedicated Business Continuity role.
Figure 20: Who is responsible for disaster recovery?
• IT: 53%
• Cloud Operations: 27%
• Business Continuity: 13%
• R&D / Application Development: 7%
Average of Hours a Week Invested in Backup / Disaster Recovery
11% of the organizations surveyed don’t invest any time at all in backup / disaster recovery. 41% of the organizations invest less than an hour a week in disaster recovery and 35% invest 2-10 hours.
Figure 21: Average of hours a week invested in backup / disaster recovery
• None: 11%
• < 1 hour: 41%
• 2-5 hours: 20%
• 5-10 hours: 15%
• > 10 hours: 7%
• Don’t know: 7%
Annual Cost of Backup / Disaster Recovery
20% of the respondents indicate that their annual cost of backup / disaster recovery is less than $1,000. 26% spend $1,000-$10,000 a year, 28% $10,001-$50,000, and 19% spend more than $50,000 a year.
Figure 22: Annual cost of backup / disaster recovery
• < $1K: 20%
• < $10K: 26%
• < $50K: 28%
• < $100K: 4%
• < $500K: 9%
• < $1M: 2%
• $1M+: 4%
• Don’t know: 7%
Cost of Downtime
Only 17% of the organizations surveyed indicated that a day of downtime costs less than $1,000. The cost of a day of downtime in 37% of the organizations is more than $10,000.
Figure 23: Cost of one day of downtime
• < $1K: 17%
• < $10K: 30%
• < $50K: 11%
• < $100K: 13%
• < $500K: 7%
• < $1M: 2%
• $1M+: 4%
• Don’t know: 15%
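A per-day downtime cost can be combined with an availability goal to get a rough yearly exposure. The sketch below assumes cost accrues linearly with downtime and that the full downtime budget of the goal is used, both simplifications; `annual_downtime_cost` is an illustrative name, and the $10,000 figure is simply the threshold used in the survey question.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 365 * 24  # leap years are ignored in this sketch


def annual_downtime_cost(cost_per_day: float, availability_percent: float) -> float:
    """Estimate the yearly downtime cost if the availability goal is met exactly."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    downtime_hours = HOURS_PER_YEAR * (1 - availability_percent / 100)
    # Linear assumption: an hour of downtime costs 1/24 of a day of downtime.
    return cost_per_day * downtime_hours / HOURS_PER_DAY


if __name__ == "__main__":
    for goal in (99.0, 99.9, 99.99):
        cost = annual_downtime_cost(10_000, goal)
        print(f"{goal}% availability at $10,000/day -> about ${cost:,.0f} a year")
```

With these assumptions, a $10,000-a-day service running at exactly 99.9% availability is exposed to roughly $3,650 of downtime cost a year, and ten times that at 99%.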
Annual Budget for Backup / Disaster Recovery vs. Cost of Downtime
The disaster recovery budget is highly correlated with the cost of downtime. 91% of the organizations where the cost of a day of downtime is less than $10,000 have an annual backup / disaster recovery budget of less than $100,000 (67% have less than $10,000), whereas 70% of the organizations with a day of downtime cost exceeding $10,000 have an annual backup / disaster recovery budget of over $10,000.
Figure 24: Annual budget for backup / disaster recovery vs. cost of one day of downtime
Annual budget for backup / DR     Cost of one day of downtime
                                  < $10,000      > $10,000
< $10,000                         67%            29%
$10,000 - $100,000                24%            41%
> $100,000                        10%            29%
Average of Hours a Week Invested in Backup / Disaster Recovery vs. Cost of Downtime
There is a strong correlation between the cost of downtime and the average hours a week invested in backup / disaster recovery. 68% of the organizations where the cost of a day of downtime is less than $10,000 invest less than an hour a week on average in backup / disaster recovery, while 35% of the organizations with a daily downtime cost exceeding $100,000 invest over 5 hours a week.
Figure 25: Average hours a week invested in backup / disaster recovery vs. cost of one day of downtime
Hours a week invested     Cost of one day of downtime
in backup / DR            < $10,000      $10,000 - $100,000      > $100,000
< 1 hour or none          68%            38%                     18%
2-5 hours                 24%            42%                     47%
> 5 hours                 8%             20%                     35%
Respondent Demographics: Customers
The majority of survey respondents serve more than one type of customer. 46% of survey respondents serve consumers and 48% serve small businesses, while 54% serve enterprises.
Figure 26: Type of customers served
• Consumers: 46%
• Enterprise: 54%
• SMB: 48%
Respondent Demographics
43% of the survey respondents come from organizations of up to 50 employees, 42% have 51-1,000 employees and 16% over 1,000.
Figure 27: Number of employees
• 1-50: 43%
• 51-100: 7%
• 101-1,000: 35%
• 1,001-5,000: 7%
• > 5,000: 9%
52% of the respondents have less than $10 million in revenues, 26% have $10-100M and 22% over $100M.
Figure 28: Annual revenues
• < $10M: 52%
• $10-50M: 13%
• $51-100M: 13%
• $101M-$1B: 15%
• > $1B: 7%
Respondent Demographics
20% of the survey respondents are responsible for application development and another 20% each for Cloud and IT operations.
Figure 29: Job responsibility
• Application Development: 20%
• Cloud: 20%
• IT Operations: 20%
• Infrastructure (cross-domain): 16%
• Development Operations: 7%
• Compliance: 2%
• Database: 2%
• Disaster Recovery: 2%
• Other: 11%
46% of the respondents are in Director or Manager positions and 17% are VPs or C-level.
Figure 30: Position in the organization
• Director: 23%
• Manager: 23%
• Team Leader: 23%
• C-level: 12%
• Team Member: 12%
• Vice President: 5%
• Supervisor: 2%
Table of Figures
Figure 1: Cloud infrastructure provider
Figure 2: Cloud infrastructure provider
Figure 3: How critical is service availability to your customers?
Figure 4: Service availability goal
Figure 5: Definition of downtime
Figure 6: Meeting service availability goals
Figure 7: Last downtime event
Figure 8: Service availability measurement tools and services
Figure 9: Communicating system availability statistics to customers
Figure 10: RPO
Figure 11: RTO
Figure 12: Meeting RPO goals
Figure 13: Meeting RTO goals
Figure 14: How Frequently is the system down for scheduled maintenance?
Figure 15: Average monthly planned downtime
Figure 16: Primary risks to system availability (1-10 scale)
Figure 17: Primary challenges in meeting availability goals
Figure 18: Strategies and technologies to ensure system availability and data protection
Figure 19: Frequency of backup / replication application data
Figure 20: Who is responsible for disaster recovery?
Figure 21: Average of hours a week invested in backup / disaster recovery
Figure 22: Annual cost of backup / disaster recovery
Figure 23: Cost of one day of downtime
Figure 24: Annual budget for backup / disaster recovery vs. cost of one day of downtime
Figure 25: Average hours a week invested in backup / disaster recovery vs. cost of one day of downtime
Figure 26: Type of customers served
Figure 27: Number of employees
Figure 28: Annual revenues
Figure 29: Job responsibility
Figure 30: Position in the organization
About CloudEndure
Business as Usual. Always.
CloudEndure provides Cloud Migration and Cloud Disaster Recovery for any application, allowing companies to mobilize entire applications with their data to and across clouds with near zero downtime and no data loss. CloudEndure enables truly consistent, block-level, real-time replication using continuous data protection (CDP). Founded in 2012, CloudEndure’s Cloud Workload Mobility technology creates an exact copy of the entire application at an alternative cloud location – at the touch of a button, within minutes, and with the latest data. CloudEndure supports physical, virtualized or cloud-based applications as the source and Amazon Web Services (AWS), Google Cloud Platform (GCP) and Microsoft’s Azure as target cloud locations. For more information, visit www.CloudEndure.com.