Disaster Recovery
Analyze || Recommend || Implement || Test
Keep Your Business Running
What would you do if a storm flooded your data center? How would you respond if a power outage blacked out your servers? How would you recover your data and keep the business running after an unforeseen disaster? When disasters strike unprepared companies, the consequences range from prolonged system downtime and the resulting revenue loss to going out of business completely, yet many IT shops are not prepared to deal with such scenarios.
Risk Analysis
The first step in drafting a disaster recovery plan is conducting a thorough risk analysis of your computer systems. List all the possible risks that threaten system uptime and evaluate how imminent they are in your particular IT shop. Anything that can cause a system outage is a threat, from relatively common man-made threats like virus attacks and accidental data deletions to rarer natural threats like floods and fires. Determine which of your threats are the most likely to occur and prioritize them using a simple system: rank each threat in two important categories, probability and impact. In each category, rate the risks as low, medium, or high.
For example, a small Internet company (fewer than 50 employees) located in California could rate an earthquake threat as medium probability and high impact, while the threat of utility failure due to a power outage could rate high probability and high impact. In this company's risk analysis, a power outage would be a higher risk than an earthquake and would therefore be a higher priority in the disaster recovery plan.
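The ranking in this example can be sketched in a few lines of Python. This is only an illustration: the article asks for a low, medium, or high rating in each category but does not say how to combine the two, so the numeric weights and the multiplication in `risk_score` are assumptions, as are the names `LEVELS`, `risk_score`, and `prioritize`.

```python
# Assumed numeric weights for the low/medium/high ratings.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(probability, impact):
    """Combine the two ratings into one number (assumed: simple product)."""
    return LEVELS[probability] * LEVELS[impact]


def prioritize(threats):
    """Return threats sorted from highest to lowest combined score."""
    return sorted(threats, key=lambda t: risk_score(t[1], t[2]), reverse=True)


# The two threats from the California example: (name, probability, impact).
threats = [
    ("earthquake", "medium", "high"),
    ("power outage", "high", "high"),
]

ranked = prioritize(threats)
for name, probability, impact in ranked:
    print(f"{name}: probability={probability}, impact={impact}, "
          f"score={risk_score(probability, impact)}")
```

With these assumed weights the power outage scores 9 and the earthquake 6, which matches the ordering described above. A real plan may weight impact more heavily than probability, or use a lookup matrix instead of a product.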
Establish The Budget
Once you've figured out your risks, ask, "What can we do to mitigate them, and how much will it cost?" Can we detect a threat before it hits? How do we reduce the likelihood of it occurring? How do we minimize its impact on the business? For example, our small California Internet company could employ an emergency power supply to mitigate its power outage threat and have all its data backed up daily to tapes, which are stored at a remote site in case of an earthquake. The more preventative measures you establish up front, the better. Emerson says, "Dollars spent in prevention are worth more than dollars spent in recovery."
The result of Step 1 should be a comprehensive list of possible threats, each with its corresponding solution and cost. It is imperative that IT present all of these threats to the business operations units so they can make an informed decision regarding the size of the disaster recovery budget (i.e., which risks the company can afford to tolerate and which it must pay to mitigate). Emerson believes IT "falls down" when it fails to communicate the real risks of system downtime to the company's business operations units. He says, "It's okay for operations to say no; it's not okay for IT not to let them know the risks."
A good place to begin is by presenting the cost of downtime to the business. How long can your business afford to be without its computer systems should one of your threats occur?
Ultimately, the business operations unit decides which threats the business can tolerate. According to Emerson, when developing a disaster recovery plan (DRP), IT departments are "shooting in the dark without those business indications." Both IT and the business units must agree on which data and applications are most critical to the business and need to be recovered most quickly in a disaster. The management of our small Internet company, for example, may decide it can fund only the emergency power supply, and the company will have to assume the risk of an earthquake.
Disaster recovery budgets vary from company to company, but they typically run between 2 and 8 percent of the overall IT budget. Companies for which system availability is crucial are usually at the higher end of the scale, while companies that can function without it are at the lower end. However, these percentages may be too small. For a large IT shop, 15 percent is a best-practice rule of thumb.
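To see what these percentages mean in dollars, here is a minimal Python sketch of the arithmetic. The $1,000,000 IT budget is a made-up figure for illustration only, and the function name `dr_budget` is hypothetical; only the 2, 8, and 15 percent figures come from the text.

```python
def dr_budget(it_budget, percent):
    """Return the disaster recovery budget for a given share of the IT budget."""
    return it_budget * percent / 100


# Hypothetical overall IT budget, chosen only to make the arithmetic easy to follow.
it_budget = 1_000_000

typical_low = dr_budget(it_budget, 2)      # low end of the typical range
typical_high = dr_budget(it_budget, 8)     # high end of the typical range
large_shop = dr_budget(it_budget, 15)      # rule of thumb for a large IT shop

print(f"Typical range: ${typical_low:,.0f} to ${typical_high:,.0f}")
print(f"Large-shop rule of thumb: ${large_shop:,.0f}")
```

For the assumed budget, the typical range works out to $20,000 to $80,000, and the 15 percent rule of thumb to $150,000.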