Whenever accidents, disasters and natural events interrupt day-to-day business activities, one thing is certain: corporations lose money. How much they lose often depends on how well prepared they are to deal with interruptions. An up-to-date, well-planned and well-practiced disaster recovery plan often makes the difference between quickly returning to business as usual and struggling for months or even years with the devastating repercussions.
Any event that interrupts business through the loss of an operational ability required for normal operations qualifies as a disaster. A disaster recovery plan (DRP) is a blueprint for recovering from these events. A DRP does not seek to duplicate a business; rather, its intent is to increase the chances of survival and to minimize the effects of the loss.
Disaster recovery planning is a set of tasks that must be performed, and it is filled with potential hurdles that even the best-intentioned, most intelligent people in the organization can overlook. Even when the plan is developed using internal talent, external experts can help. Disaster recovery planning is an essential process for companies.
The basic tasks of achieving and maintaining recovery preparedness make good economic and business sense. In most cases, and with less effort than anticipated, disaster recovery planning can improve efficiency, reduce recurring issues and, through reduced downtime and better-managed processes, pay for itself.
Management must first understand the characteristics associated with a crisis. Any crisis has the following characteristics:
- Insufficient Information
- Escalating Flow of Events
- Loss of Control
- Intense Scrutiny from Outside
- Siege Mentality
- Short-term Focus.
One strategy used to put the crisis in the proper context is to establish an order of magnitude with respect to the crisis. Crises may be categorized into one of three levels:
Level I-Low Risk
No serious injuries, minimal physical damage, no disruption to critical business operations, minimal impact on routine business activities, minimal distress to employees.
Level II-Moderate Risk
Serious (life-threatening) injuries, significant number of minor injuries, minor damage to property and facilities, minor or impending disruption to critical business operations, moderate impact on routine business activities, moderate employee distress.
Level III-High Risk
Major human casualties including death, major physical damage, significant impact on critical and routine business activities, media visibility, potential customer and shareholder impact.
As part of the disaster recovery plan, an established Escalation Procedure should be tied to each of these levels so that, if the situation escalates to the next level, procedures are already in place.
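One way to tie a procedure to each level is a simple lookup table. The following is a minimal sketch only: the three levels come from the list above, but the procedure steps and the names `CrisisLevel`, `ESCALATION_PROCEDURES`, `escalate` and `procedure_for` are hypothetical placeholders that each organization would replace with its own.

```python
from enum import IntEnum


class CrisisLevel(IntEnum):
    """The three crisis levels described above."""
    LOW = 1       # Level I
    MODERATE = 2  # Level II
    HIGH = 3      # Level III


# Hypothetical steps for illustration; a real plan defines its own.
ESCALATION_PROCEDURES = {
    CrisisLevel.LOW: [
        "Notify the site manager",
        "Log the incident",
    ],
    CrisisLevel.MODERATE: [
        "Call emergency services",
        "Activate the incident management team",
    ],
    CrisisLevel.HIGH: [
        "Notify executive management",
        "Activate the full disaster recovery plan",
        "Brief the media spokesperson",
    ],
}


def escalate(current: CrisisLevel) -> CrisisLevel:
    """Return the next level up, staying at HIGH once it is reached."""
    return CrisisLevel(min(current + 1, CrisisLevel.HIGH))


def procedure_for(level: CrisisLevel) -> list[str]:
    """Look up the steps agreed in advance for a level."""
    return ESCALATION_PROCEDURES[level]


# If a Level I situation worsens, the Level II steps are already defined.
next_level = escalate(CrisisLevel.LOW)
for step in procedure_for(next_level):
    print(f"{next_level.name}: {step}")
```

Keeping the procedures in one table makes it easy to check that no level has been left without a defined response.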
As part of a disaster recovery plan, crisis events are defined in a slightly different manner. The least serious event could be described as a “serious incident”, such as a minor loss of data, a roof leak that drenches several shelves of replaceable books in a library, or a threat from a drunken employee.
The term “emergency” is used in the event of a single casualty, a moderate fire, or substantial vandalism that compromises the security of the site.
A “major emergency” classification covers serious damage at a single site and possibly several casualties.
A “disaster” is defined as an event that is beyond the powers of first responders to prevent or control, and that results in serious damage and prolonged service disruption at several sites and possibly a number of casualties.
In an information systems context, the term “disaster” means the interruption of business due to the loss or denial of the information assets required for normal operations. It refers to a loss or interruption of the company’s data processing functions or to a loss of the data itself. Loss of data could result from accidental or intentional erasure or destruction of the media on which the data was recorded. This loss could be caused by a variety of man-made or natural phenomena.
Loss of data could also refer to a loss of integrity or reliability either in the dataset (or database) itself, or in the means by which data is transported, manipulated or presented for use. Corruption of programs and networks could interrupt the normal schedule for processing and reporting data, wreaking as much havoc within a company as would the loss of the data itself.
The above conception of disaster may suggest that only a major calamity – a terrorist bombing, an earthquake, or even a war – would qualify as a disaster. Most people might imagine a smoking data centre rather than an accidental hard disk erasure at the small business office down the block. In either case, if the result is an unacceptable interruption of normal business operations, the event could be classified as a disaster. Disasters are relative and contextual.
Simply stated: A disaster is an occurrence that disrupts the functioning of the organization, resulting in loss of data, loss of personnel, loss of business or loss of time.
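Because the definition is relative, the same outage can be a disaster for one organization and a minor incident for another. The sketch below illustrates only the time dimension of that idea, assuming each organization has decided how long an interruption it can tolerate. The function name, its parameters and the hour values are illustrative assumptions, and the sketch ignores loss of data, personnel and business.

```python
def is_disaster(outage_hours: float, max_tolerable_hours: float) -> bool:
    """Treat an interruption as a disaster once it lasts longer than
    the organization has decided it can tolerate."""
    if outage_hours < 0 or max_tolerable_hours < 0:
        raise ValueError("durations must be non-negative")
    return outage_hours > max_tolerable_hours


# The same 8-hour outage judged against two hypothetical tolerances.
small_office = is_disaster(outage_hours=8, max_tolerable_hours=4)
large_archive = is_disaster(outage_hours=8, max_tolerable_hours=48)
print(small_office, large_archive)
```

The threshold, not the event itself, decides the classification, which is why it should be agreed before an incident rather than during one.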
Organization of the Disaster Recovery Plan
A written and tested disaster recovery program can determine whether or not your business fully resumes operations after a catastrophe. A sound program is actually a collection of specific action plans:
- A disaster avoidance plan to reduce or limit risks.
- An emergency response plan to ensure quick response to incidents.
- A recovery plan to guide the firm in resuming vital business functions.
- A business continuation plan to fully restore all business activities to normal.
Disaster avoidance is the cornerstone of any disaster recovery plan. The first step in creating an avoidance plan is to analyze the potential hazards/risks and how well you are protected against them. This step should be accomplished during a risk/business impact analysis. The next step is to develop procedures for protecting those vulnerable assets/processes that are identified.
Even the best avoidance plan cannot prevent every disaster. When a serious incident occurs, a company must have an emergency response plan. The focus of this plan should be the personnel and tasks necessary to immediately mitigate damage to people and company assets. After ensuring the safety of employees, visitors, and the public, the plan should also address public relations and advertising strategies to let your clients know that you are still in business and where they can reach you.
If the emergency escalates to the disaster level, a comprehensive disaster recovery plan must be ready to implement. This plan should contain two main sections that address the specific action plans for:
- recovering critical business functions, and
- restoring the business to pre-disaster conditions.
The scope of the disaster will also have an impact on the business’s recovery. Regional disasters could affect others with whom you do business, including clients, vendors or emergency personnel, so the plan should consider the possibility of competing for resources. Developing a comprehensive disaster recovery program that incorporates the four action plans takes foresight and commitment. But if catastrophe strikes, an effective written disaster recovery plan will provide a smooth, speedy return to business instead of diminished customer confidence, loss of clients and, ultimately, failure.
The objectives of a disaster recovery plan are four-fold:
- To limit the extent of the damage and to prevent the escalation of a disaster
- To prevent personal injury to the company personnel as well as the general public
- To prevent physical damage to company property, and
- To minimize a disaster’s economic impact on the business.
With properly organized plan documents, it should be possible to look at the disaster recovery plan’s table of contents for an overview of plan functions. The reason for this is simple: Most plans are compiled as sets of procedures that are developed to accommodate recovery strategies.
Body of the Disaster Recovery Plan
The complexity of a disaster recovery plan is directly related to the size of the organization. For very small firms, the development of the coordination and recovery procedures may include all employees. For larger companies, it may be necessary to assign functional business areas the full responsibility for developing and maintaining their section of the DRP. After all, it will be the recovery team for the functional business area that will be responsible for recovering the operation.
If each functional business area is responsible for developing its own disaster recovery plan, a central control area should also be established. This area should be responsible for coordinating the development effort, training the line planners, recruiting from within line operations and managing scarce resources.
As for the business resumption planners, each major area of the organization should make resources available, usually part-time resources (at least half time) familiar with the areas for which they would develop plans. Having planners from the areas for which they would develop plans is important in two ways. Much of the success in getting resumption plans developed in a timely fashion comes from the relationship that the planners have with the areas. Not only are the planners familiar with the areas, which gives them knowledge of the people and resources, but the areas are familiar with them. With this relationship already established, the development of the plans goes much more smoothly and quickly.
If the scope of the project is still quite large, the disaster recovery plan development efforts should initially focus on three critical areas:
- The first is the development of an Incident Management Plan and the Incident Management Team.
- Then, recovery procedures should be addressed.
- And finally, an infrastructure within the organization should be created to support any disaster recovery effort, for example replacement staff and key supplies.
General Issues to consider:
The Disaster Recovery Plan
Creating a business continuity plan is far from a trivial exercise.
Risk analysis is inextricably linked with disaster recovery. Assessment of the risks which may lead to disaster is essential in the determination of what controls are appropriate to the situation.
Disaster Recovery Audit
How do you ensure that your disaster recovery plan meets your actual needs? How do you know that it will all work? Do you audit it, and if so, how?
Equally fundamentally, do you know what your resource/service dependencies are and what their time criticalities are? What of your actual everyday contingency practices – do they measure up?
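One way to begin answering these questions is to record each dependency with its time criticality and whether its contingency arrangement has actually been tested. The sketch below shows one possible shape for such an inventory. The service names, hour values and the names `Dependency`, `recovery_order` and `audit_gaps` are invented for illustration and do not come from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    name: str
    max_downtime_hours: float   # time criticality: how long the business can cope without it
    has_tested_fallback: bool   # has the contingency arrangement been exercised?


def recovery_order(deps: list[Dependency]) -> list[Dependency]:
    """Most time-critical dependencies first."""
    return sorted(deps, key=lambda d: d.max_downtime_hours)


def audit_gaps(deps: list[Dependency]) -> list[str]:
    """Names of dependencies whose fallback has never been tested."""
    return [d.name for d in deps if not d.has_tested_fallback]


# Hypothetical inventory for illustration only.
inventory = [
    Dependency("payroll system", 72, True),
    Dependency("order processing", 4, False),
    Dependency("email", 24, True),
]

for dep in recovery_order(inventory):
    print(f"{dep.name}: restore within {dep.max_downtime_hours} hours")
print("Untested fallbacks:", audit_gaps(inventory))
```

An audit could then start from the untested entries with the shortest tolerable downtime, since those combine the highest urgency with the least evidence that recovery will work.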