An unexpected ice storm knocked down power lines and iced up the roads, preventing your staff from getting to work. Or perhaps some unscrupulous soul launched a cyber-attack that shut down your servers.
You need access to your data to keep your business going, but for any number of reasons tied to the disaster at hand it isn’t accessible. And without a solid disaster recovery (DR) plan in place, it may not be accessible for a while, if ever.
Good Enough or Not Enough
The media is full of headlines that show what happens when disaster strikes and a company impacted by the event didn’t have a DR plan ─ or had one that proved to be insufficient.
The drama conveyed in those headlines is reinforced by statistics like those in the often-cited University of Texas study, which found that 94 percent of companies suffering a catastrophic data loss do not survive: 43 percent never reopen and 51 percent close within two years.
It seems obvious that for any business that uses electronic data in some manner, a solid DR plan is essential. Yet many companies avoid investing time or money in DR planning or have simply put together a “good enough” plan.
That’s not surprising. Few of those holding the purse strings are eager to spend money protecting against something that may never happen, particularly when there are so many other competing priorities. (See How to Sell Disaster Recovery to CFOs.) Traditional DR has also generally been considered expensive, time-consuming and error prone: not exactly a reassuring investment.
Even companies that do put together DR plans may have plans that are insufficient or incomplete. Many are written as a one-time fail-over process, and lack procedures for returning data to the production site once it has been re-established.
Annual DR testing is yet another often overlooked element. Due to the time-consuming nature of executing a DR plan, many tests are only partially run and not tested through a full fail-over.
RPO, RTO Then Go
The need for a well thought out DR plan is obvious, but where do you begin and what will it take to ensure that your DR plan does what you need it to do?
It’s the components of your plan that will determine how well ─ and how quickly ─ your company recovers from a manmade or natural disaster. That hinges on what your recovery point objectives (RPO) and recovery time objectives (RTO) are and the amount and complexity of the data you handle.
RPO defines the maximum amount of data loss your business can tolerate, measured as the window of time between the last backup and the point of failure. A smaller RPO means that less data is lost, which is critical for normal business operations.
To achieve a smaller RPO, you need to increase the frequency with which you back up your production environment. The cost and impact of frequent backups needs to be considered, however, as the more frequently you back up, the more copies you have to maintain.
RTO is the maximum amount of time tolerable between a failure and the restoration of business functions or application systems to acceptable levels of operation. Increasing the frequency of your backups does not always lead to a smaller RTO. If server startup takes a half hour, your recovery time won’t be less than a half hour, no matter how frequently you back up. You would have to get a faster server or reconfigure what you have.
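The trade-off between backup frequency and these two objectives can be sketched in a few lines of Python. All figures here are hypothetical, chosen only to illustrate the point that frequency bounds RPO while restore and startup time bound RTO:

```python
# Sketch only: hypothetical numbers, not a sizing tool.

def worst_case_data_loss_hours(backup_interval_hours, backup_duration_hours):
    """Worst case: a failure hits just before the next backup completes,
    so everything since the previous backup started is lost."""
    return backup_interval_hours + backup_duration_hours

def minimum_recovery_hours(restore_hours, server_startup_hours):
    """RTO can never be lower than restore time plus server startup time,
    no matter how often you back up."""
    return restore_hours + server_startup_hours

# Backing up every 4 hours, with each backup taking 1 hour:
print(worst_case_data_loss_hours(4, 1))   # → 5 (bounds achievable RPO)
# A 2-hour restore followed by a half-hour server startup:
print(minimum_recovery_hours(2, 0.5))     # → 2.5 (bounds achievable RTO)
```

Doubling the backup frequency in this sketch shrinks the first number but leaves the second untouched, which is exactly why faster backups alone cannot buy a smaller RTO.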
Conducting a business impact analysis (BIA) will help in determining RPO and RTO, as well as in establishing the foundation for your DR plan. The BIA involves identifying all of your systems and applications, and then determining their operational, financial and reputational impact to your business if they went down. Which are most important to ensuring your business can keep operating? Which would you need access to first following a disaster? Which can wait? How long could you wait for access? Asking these kinds of questions will help you determine the appropriate RPO and RTO.
You’ll also need to identify the minimal resources required to maintain business operations, and establish an order of priority for restoring business functions and related data or applications. Yet another critical step is noting any possible points of failure so you can address those vulnerabilities in your DR plan.
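The prioritization step of a BIA can be illustrated as a simple scoring exercise. The system names, impact categories and scores below are hypothetical; a real BIA would weigh these factors with far more nuance:

```python
# Toy BIA sketch: score each system's operational, financial and
# reputational impact, then sort to get a restore-priority order.
# All names and scores are made up for illustration.

systems = [
    {"name": "email",       "operational": 2, "financial": 1, "reputational": 2},
    {"name": "order-entry", "operational": 3, "financial": 3, "reputational": 3},
    {"name": "reporting",   "operational": 1, "financial": 1, "reputational": 1},
]

def total_impact(system):
    # Sum the three impact dimensions into a single priority score.
    return system["operational"] + system["financial"] + system["reputational"]

restore_order = sorted(systems, key=total_impact, reverse=True)
for s in restore_order:
    print(s["name"], total_impact(s))
# order-entry is restored first, reporting last
```

The output ordering is what feeds the DR plan: the highest-impact systems get the tightest RPO and RTO targets, while the lowest can wait.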
Pick Your Strategy
After the BIA comes the review and evaluation of different data backup, replication and recovery strategies. Will tape backup suffice, or should you go with disk backup? Should you keep your data backups on site or ship them off site? If you are going with off-site backup, do you want a facility equipped with everything you need to start recovery immediately, even though you pay for that equipment while it sits idle? Or does it make more economic sense to bring equipment into an off-site facility only when needed, even if that stretches total recovery time considerably? And what about cloud-based options? Each offers its own advantages and disadvantages.
To Tape or Not Tape
In terms of data backup options, tape backup is the least expensive strategy, with a lower cost per gigabyte of storage than other methods. A tape drive or a tape library also scales easily; all you have to do is buy more tapes.
Tape is also fairly efficient, as a drive runs only when data is being read or written, and it lends itself to multi-site backup. However, multi-site tape backup can become inefficient and costly once you factor in securely transporting tapes and storing them in data centers or other off-site facilities.
There are other disadvantages as well. Reading from and writing to tapes takes longer than other backup methods. There is a greater likelihood of data corruption and problems reading data from tapes. Recovering from tape backup is slow too, taking days or weeks if the hardware needs to be found before recovery can commence. It can also be error prone and difficult, depending on the number of tapes from which data has to be recovered.
Disk backup is considered a more efficient means of data transfer than tape, with faster recovery and more continuous protection. The problem is that it still resides in your on-site data location. That’s not good if a disaster makes your site inaccessible or nonexistent.
To cover yourself, you need a third-party provider for off-site backup or a replica of the on-site disk. Without an off-site provider, you face the mounting expense of buying disk space yourself, not to mention the extra time required to run the backups.
One way to avoid high prices for disk space is to back up your disk replicas to tape. However, any savings can be quickly diminished when you factor in the extra labor and infrastructure costs associated with maintaining the tape-based backup along with disks.
Yet another option is virtual tape. This storage technology makes it possible to save data as if it were being stored on tape although it may actually be stored on a hard disk or other storage medium. A special storage device manages less-frequently needed data so that it appears to be stored entirely on tape cartridges although some parts of it may actually be located in faster, hard disk storage.
Virtual tape can be used with a hierarchical storage management (HSM) system in which data is moved as it falls through various usage thresholds to slower, less costly forms of storage media. It can also be used as part of a storage area network (SAN) where less-frequently used or archived data can be managed by a single virtual tape server for a number of networked computers.
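The tiering idea behind HSM can be shown with a toy sketch. The access-age thresholds and tier names below are hypothetical, not taken from any particular product:

```python
# Toy HSM sketch: data that hasn't been accessed recently falls through
# usage thresholds to slower, cheaper storage tiers. Thresholds are
# hypothetical examples only.

def storage_tier(days_since_access):
    if days_since_access < 30:
        return "disk"          # hot data stays on fast disk
    if days_since_access < 180:
        return "virtual-tape"  # cooler data moves to the virtual tape layer
    return "tape"              # archive data lands on physical tape

print(storage_tier(5))    # → disk
print(storage_tier(90))   # → virtual-tape
print(storage_tier(400))  # → tape
```

The application still sees one logical store; the HSM system decides behind the scenes which tier actually holds each piece of data.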
Some Like It Hot, Cold or Warm
There is also the matter of on-site versus off-site backup. On-site backup keeps your data close at hand. But if your site is not accessible, or is damaged or destroyed, chances are your backups will be too. Off-site backup, particularly to a location a good distance from your primary site, obviously offers greater protection of your data. The question is what type of facility will you need? The three options are hot, warm and cold facilities, with the differences between them a matter of recovery time and the cost.
- Hot site: Ideal for the most critical applications, a hot site is a fully equipped data center with servers that can be online within hours. This is the most-expensive option.
- Warm site: A warm site provides basic infrastructure. However, it also requires some lead time to prepare servers, so it could take up to a week to bring online.
- Cold site: This is considered a bare-bones approach to DR. Cold site facilities have the basic infrastructure needed to run a data center, such as heating, ventilation and air conditioning (HVAC), power and network connectivity, but little else. Equipment must be brought in and configured, so it can take up to a month for a cold site to become operational.
What Changes in the Cloud?
The cloud allows for a very different approach to DR, largely because of virtualization. With virtualization, the entire server, including the operating system (OS), applications, patches and data is contained in a single virtual server. It can be copied or backed up to an off-site data center and spun up on a virtual host in a matter of minutes.
Since the virtual server is hardware independent, the OS, applications, patches and data can be safely and accurately transferred from one data center to another without having to reload each server component. This makes backup between data centers or other locations fast and more cost effective than other options.
Lower Costs, Faster Recovery
Cloud-enabled DR delivers a number of advantages over traditional DR, such as significantly reducing capital expenses because it eliminates the need for customers to invest in a remote DR facility. Ongoing operating expenses are also lowered because users do not have to pay to power and cool remote equipment. In addition, capacity and performance can be allocated on demand, so customers only have to pay for the resources consumed.
Cloud-based DR also increases flexibility and, because the cloud is designed for remote management, speeds up recovery. Compared to on- or off-site tape-based DR, which can be cumbersome and expensive, such capabilities can make routine testing more practical and ensure the DR solution works when needed. In cases where multi-year data archiving is needed for regulatory requirements, tape storage may be useful. However, the cost-effectiveness and recovery speed of online, offsite backup makes it difficult to justify tape backup.
In addition, the cloud makes low-cost cold site DR less desirable and warm site DR a more cost-effective option. Backups of critical servers can be spun up in minutes on a shared or dedicated host platform. With SAN replication between sites, hot site DR also becomes a more attractive, cost-effective option. SAN replication provides rapid fail-over to the DR site with very short recovery times, offering the capability to return to the production site when the DR test or disaster event is over ─ something rarely available with traditional DR due to the cost and testing challenges.
Yet another benefit of cloud-based DR is the ability to finely tune the costs and performance for the DR platform. Applications and servers that are considered less critical in a disaster can be tuned down with less resources, while still assuring that the most critical applications get the resources needed to keep the business running through the disaster.
Back Up to the Cloud
The cloud also changes data backup. Cloud-based backup is essentially off-site backup to a third-party service provider or to a customer-owned cloud infrastructure using cloud enablement technologies or on-site appliances. Multi-site data redundancy is integral to cloud-based data backup as a local data copy can live in an on-site appliance, while the enablement technology replicates data to the customer’s service provider or the customer’s own data center.
The appliances and enablement technologies continually run in the background of IT operations, eliminating some of the issues associated with manual IT processes. Administrators can provision virtual machines (VMs) on the appliances, while using the appliance to back up the VMs off site. There are no tapes or disks to buy or refresh. Nor is there a need to spend hours each week physically managing backups or transporting tapes. Replication is provided as a managed service, making it less labor intensive than tape or disk backups.
Of course, sufficient bandwidth is required to support off-site replication. This means either investing in network optimization or replicating less data. The time and effort needed to transition to the cloud should also be considered. However, the flexibility of cloud-based backup makes it easy to move only some of your more critical data to the cloud.
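A back-of-the-envelope calculation shows why bandwidth matters. The data volume and link speed below are hypothetical, and the formula ignores protocol overhead, compression and deduplication:

```python
# Rough replication-sizing sketch with hypothetical figures.

def replication_hours(changed_gb, link_mbps):
    """Hours to copy `changed_gb` gigabytes off site over a `link_mbps`
    link, assuming the link is fully available for replication."""
    bits = changed_gb * 8 * 1000 ** 3          # decimal gigabytes to bits
    seconds = bits / (link_mbps * 1000 ** 2)   # megabits/s to bits/s
    return seconds / 3600

# 50 GB of daily changes over a dedicated 100 Mbps link:
print(round(replication_hours(50, 100), 1))    # → 1.1
```

If the result approaches or exceeds the backup interval, replication can never catch up; that is the point at which you either optimize the network or replicate less data.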
Some people do have concerns that cloud-based data backup could expose their data to breaches by a third party or by other customers residing in the same data center. To assuage these fears, many cloud service providers (CSPs) build high-level security features into their clouds. Typically, CSPs that are audited against the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI DSS), Safe Harbor and other certifications employ security best practices to help ensure data safety and integrity.
Disasters happen. Ensuring that your business can continue operations and recover critical data and applications when they do is essential to both business survival and overall business success. The options are out there. It’s a matter of choosing the ones that are right for your business, incorporating them into a plan and testing that plan to ensure your company can weather the storm, or any other disaster that may strike.
To learn about Peak 10’s Recovery Cloud services, visit http://www.peak10.com/cloud/recovery-cloud.