Still don’t have a disaster recovery plan, or haven’t tested yours recently? The risk of downtime is very real and the potential consequences have consistently worsened over time.
Recent research underwritten by Emerson Network Power produced some alarming statistics. The average cost of a minute of unplanned downtime is $7,900, up 41 percent from 2010. Ninety-one percent of data centers experienced an unplanned outage in the past 24 months, the average reported incident lasted 86 minutes, and the study puts the average cost per incident at approximately $690,200.
Many industry studies show similar conditions and impacts. The more that businesses rely on information technology – an inescapable fact of life – the greater the pain when that technology fails us. Given the particulars of your business (size, industry, location), it’s only a matter of degree as to just how much pain will be inflicted. And, the cause needn’t be a hurricane, flood or fire; 40 percent of outages are the result of power failure.
Three things can help you understand how downtime can affect your business and where your priorities need to be in order to implement an effective DR strategy.
- Determine your cost of downtime
- Determine which applications will do the greatest harm when not available
- Determine your recovery point objective and recovery time objective (RPO/RTO) for those applications
Under the Tip of the Iceberg
Most companies account only for the obvious costs incurred during an outage, such as the wages of workers idled for that period, but the picture is much more complicated. The value of your reputation, for example, is often overlooked. Other costs that may need to be considered include:
- Lost sales
- Materials lost/disposal and cleanup costs
- Financial impact of customer dissatisfaction
- Contract penalties
- Compliance violations
- Upstream and downstream value-chain ripple effects
- IT and employee recovery costs
- Potential litigation/loss of stock value
- Missed deadlines that result in employee overtime
- Priority shipping charges
While every application outage will be different for every company, they all have one thing in common. The contributing factors to the cost of downtime are more complex and extend much farther than many people think. Until you know what your cost is, making informed choices to ensure proper levels of uptime and recovery, as well as demonstrating the ROI of those investments, is next to impossible.
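The cost components listed above can be combined into a simple per-incident worksheet. A minimal sketch, in which every line item and dollar figure is purely illustrative:

```python
# Hypothetical per-incident cost worksheet; every figure is illustrative.
incident_costs = {
    "lost sales": 120_000,
    "idled-worker wages": 35_000,
    "contract penalties": 20_000,
    "IT and employee recovery costs": 15_000,
    "priority shipping charges": 5_000,
}

total = sum(incident_costs.values())
print(f"Estimated cost of this incident: ${total:,}")  # $195,000
```

Even a rough worksheet like this makes it possible to compare the cost of an outage against the cost of preventing one.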
This blog post from Zenoss gives a good example of how certain downtime costs could be calculated. The “Cost of Downtime” chart below is another way of looking at cost. It shows average annual downtime for different levels of availability in 24/7 operation, and how much money that represents given a certain downtime dollar value.
Cost of Downtime
| Availability | Time unavailable (per year) | $25,000 per hour of downtime | $50,000 per hour of downtime |
| --- | --- | --- | --- |
| 99.999% | 5.26 minutes | $2,192 | $4,384 |
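The chart's figures follow directly from the availability percentage: annual downtime is simply the unavailable fraction of the 8,760 hours in a year. A short sketch (the hourly cost figures are the chart's illustrative values, not industry constants):

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours of 24/7 operation

def annual_downtime_hours(availability_pct: float) -> float:
    """Expected hours of downtime per year at a given availability level."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

def annual_downtime_cost(availability_pct: float, cost_per_hour: float) -> float:
    """Annual downtime cost at a given hourly downtime value."""
    return annual_downtime_hours(availability_pct) * cost_per_hour

# "Five nines" in 24/7 operation: about 5.26 minutes of downtime per year
minutes = annual_downtime_hours(99.999) * 60
print(f"{minutes:.2f} minutes/year")                    # 5.26 minutes/year
print(f"${annual_downtime_cost(99.999, 25_000):,.0f}")  # ~$2,190
```

The small gap between the computed ~$2,190 and the chart's $2,192 comes from the chart rounding the downtime to 5.26 minutes before multiplying.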
No matter how you look at it, outages are costly. And different applications will produce different impacts.
More Equal than Others
Application mapping is intended to identify and profile all of your applications. It will pinpoint which business services are most critical to your operations, as well as the most costly when they are offline; that is, which applications need to be returned to service as quickly as possible in the event of a disaster and which ones can wait. To do this effectively, IT and business leaders must work together.
It’s essential to understand what an application does, who uses it, and what its value is to users and to the business. Are there interdependencies with other applications, departments or third parties? Does it have ties to regulatory or compliance requirements?
This application inventorying and assessment process is difficult and time consuming. But without it, taking the next step – establishing RPO/RTO for individual workloads – becomes a guessing game and can result in flawed data-backup and DR strategies. Or, worse.
Only a Matter of Time
How long can you be without a specific application? Recovery time objective (RTO) defines the maximum tolerable duration of an outage and, consequently, your data-replication or disk-versus-tape backup requirements for that application. If your RTO is zero (no tolerance for downtime), you may decide that nothing less than a fully redundant infrastructure with data replicated offsite will do. If your RTO is 48 or 72 hours, then tape backup may be acceptable for that specific application.
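The two endpoints described above (an RTO of zero versus one of 48–72 hours) can be expressed as a simple decision rule. A minimal sketch; the middle tier and its threshold are an illustrative assumption of mine, not something the article prescribes:

```python
def dr_approach(rto_hours: float) -> str:
    """Map an application's RTO to a recovery approach.
    The zero and 48+ hour tiers follow the article's examples;
    the middle tier's threshold is an illustrative assumption."""
    if rto_hours == 0:
        return "fully redundant infrastructure, data replicated offsite"
    if rto_hours < 48:
        return "disk-based backup or replication for faster restore"  # assumed tier
    return "tape backup may be acceptable"

print(dr_approach(0))   # fully redundant infrastructure, data replicated offsite
print(dr_approach(72))  # tape backup may be acceptable
```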
Recovery point objective (RPO) dictates how much data you can afford to lose. For example, if you run a nightly backup at midnight and the unthinkable happens at noon, then all the data generated in the intervening 12 hours is lost. The RPO in that scenario is the previous day's backup. If that is acceptable, then that is your RPO. If it's not, then you'll want to consider putting another data protection solution in place.
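The midnight-backup scenario reduces to measuring the gap between the last completed backup and the failure. A small sketch, using placeholder timestamps:

```python
from datetime import datetime

def data_loss_window_hours(last_backup: datetime, failure: datetime) -> float:
    """Data at risk: everything written since the last completed backup."""
    return (failure - last_backup).total_seconds() / 3600

# Nightly backup at midnight, outage at noon the same day (placeholder date)
exposure = data_loss_window_hours(datetime(2014, 1, 15, 0, 0),
                                  datetime(2014, 1, 15, 12, 0))
print(f"{exposure:.0f} hours of data lost")  # 12 hours of data lost

# If a 12-hour loss is unacceptable, the RPO demands more frequent
# backups or continuous replication.
```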
The tighter your RTO and RPO, the more the appropriate solution will cost to design. But now you have the financial facts you need to make a convincing case for investing in disaster recovery. And you know which of the hundreds or thousands of applications in use at your company are the priorities around which to develop your strategy, and what it will cost you if you fail to do so. Focusing on critical business processes first can mean the difference between a well-orchestrated, successful recovery and interminable chaos.
Businesses and organizations are becoming more social, mobile and cloud-based. As data centers evolve to support these game-changing developments, the need grows greater to minimize the risk of downtime and commit the necessary investment in infrastructure technology and resources. By taking the steps suggested here, you’ll be better equipped to make more informed business decisions regarding the cost associated with eliminating vulnerabilities compared to the costs of taking no action at all.