Disaster Recovery Plan (DRP) / Business Continuity Plan (BCP)

July 16th, 2008

Computer systems are the core tool of today’s business and are vital to every organization, from the smallest firm to the largest enterprise. Money transactions and customer service are just two simple examples. Despite high hopes, disasters in one form or another eventually strike every organization. Whether it is a natural disaster like a hurricane or earthquake, or a man-made disaster like a street riot or explosion, every organization will encounter events that threaten its very existence.

We all work on our computer systems without thinking about “what if” scenarios. However, computers are not like other electronic devices such as TVs and DVD players. Because they depend on a combination of hardware and software, they may suddenly stop working. Even a power failure can cause a malfunction. To avoid such problems we need to draw up a Plan, or a number of alternate plans for possible scenarios, to help mitigate the effects a disaster has on the company’s continuing operations and to achieve a speedy return to normal operations.
Advance preparation saves time and money, and prevents the loss of clients and business reputation.

Objectives

  • Business Continuity Planning (BCP)
  • Disaster Recovery Plan (DRP)

Process Flow

  • Risk Management
  • Business Continuity Planning
  • Disaster Recovery Plan

Summary

Business Continuity Planning (BCP) and the Disaster Recovery Plan (DRP) are very important for businesses of every size. Before planning, the business should identify its assets and risks. This process is called Risk Management, and it is divided into four sections:

  • Risk Analysis
  • Asset Valuation
  • Calculating Safeguards
  • Handling Risk

These elements help to see the full picture before preparing the plans. Business Continuity Planning (BCP) helps a business recover one of its systems that has ceased to function. It is divided into four sections:

  • Project Scope and Planning
  • Business Impact Assessment
  • Continuity Planning Goals
  • Approval and Implementation

The Disaster Recovery Plan (DRP) deals with worst-case scenarios, when all systems or one major system have ceased to function. This Plan should be executed as if on autopilot.

Process Flow – Risk Management

Risk management is a detailed process of identifying factors that could damage or disclose data, evaluating those factors in light of the data’s value and the cost of countermeasures, and implementing cost-effective solutions for mitigating or reducing risk. Risk is the possibility of something occurring to interrupt business continuity.

The primary goal of risk management is to reduce risk to an acceptable level. The organization should decide what that level is, while assessing its assets, size, and budget. It is important to consider all possible risks when performing risk evaluation for an organization.

Risk management is carried out through risk analysis, which includes:

  • Analyzing an environment for risks
  • Evaluating each risk as to its likelihood of occurring and the cost of the damage it would cause if it did occur
  • Assessing the cost of various countermeasures for each risk
  • Creating a cost/benefit report for safeguards to present to the upper management
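
The likelihood-and-cost evaluation above is commonly expressed as an annualized loss: the estimated cost of one occurrence multiplied by the expected number of occurrences per year. The following is a minimal Python sketch of that arithmetic. The `Risk` class, the risk names, and every figure are made-up assumptions for illustration, not values from this article.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    single_loss: float   # estimated cost of one occurrence, in dollars (hypothetical)
    yearly_rate: float   # expected occurrences per year (hypothetical)

    def annualized_loss(self) -> float:
        """Expected loss per year: cost of one event times events per year."""
        return self.single_loss * self.yearly_rate


def rank_risks(risks):
    """Return risks sorted from highest to lowest annualized loss."""
    return sorted(risks, key=lambda r: r.annualized_loss(), reverse=True)


# Illustrative figures only.
risks = [
    Risk("server room flood", single_loss=200_000, yearly_rate=0.01),
    Risk("hard drive failure", single_loss=5_000, yearly_rate=2.0),
    Risk("extended power outage", single_loss=20_000, yearly_rate=0.25),
]

for risk in rank_risks(risks):
    print(f"{risk.name}: ${risk.annualized_loss():,.0f} per year")
```

Ranking by annualized loss rather than by single-event cost is one possible design choice: it lets a frequent, cheap failure outrank a rare, expensive one when its yearly total is higher.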

Risk Management

Risk management also requires evaluation, assessment, and the assignment of value for all assets within the organization. Without proper asset valuation, it is not possible to prioritize and compare risks with possible losses.

Risk Analysis

Risk analysis provides upper management with details necessary to decide which risks should be:

  • Mitigated
  • Rejected
  • Accepted

Asset Valuation

When valuing an asset, there are many aspects to consider. The goal of asset valuation is to assign a specific dollar value to each asset.

Calculating Safeguards

For each specific risk, one or more safeguards or countermeasures must be evaluated on a cost/benefit basis. The costs to consider include:

  • Cost of purchase, development, and licensing
  • Cost of implementation and customization
  • Cost of annual operation, maintenance, administration, etc
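
The cost items above can be combined into a simple cost/benefit figure: the yearly loss a safeguard is expected to prevent, minus what the safeguard costs per year. The sketch below is one hedged way to do that in Python. Spreading one-time costs evenly over a service life is an assumption of this example, and all function names and dollar amounts are hypothetical.

```python
def annual_safeguard_cost(purchase, implementation, yearly_operation, service_years):
    """Spread one-time costs over the service life and add yearly running costs."""
    if service_years <= 0:
        raise ValueError("service_years must be positive")
    return (purchase + implementation) / service_years + yearly_operation


def safeguard_net_benefit(loss_before, loss_after, yearly_cost):
    """Yearly loss avoided by the safeguard, minus what the safeguard costs per year."""
    return (loss_before - loss_after) - yearly_cost


# Hypothetical example: a UPS for the server room.
cost = annual_safeguard_cost(purchase=6_000, implementation=1_500,
                             yearly_operation=500, service_years=5)
benefit = safeguard_net_benefit(loss_before=5_000, loss_after=1_000, yearly_cost=cost)
print(f"yearly cost: ${cost:,.0f}, net yearly benefit: ${benefit:,.0f}")
```

A positive net benefit suggests the safeguard pays for itself; a negative one suggests the risk may be cheaper to accept or assign.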

Handling Risk

The results of risk analysis are:

  • Complete and detailed valuation of all assets
  • An exhaustive list of all threats and risks, rate of occurrence, and extent of loss if realized
  • A list of threat-specific safeguards and countermeasures that identifies their effectiveness
  • A cost/benefit analysis of each safeguard

Management must now address each specific risk, and decide on a response. There are four possible responses:

  • Reduce
  • Assign
  • Accept
  • Reject

Process Flow – Business Continuity Planning

Business continuity planning is a process that helps an organization recover one of its systems that has stopped working. It involves assessing risks and drawing up plans, policies, and procedures to reduce the impact of a disaster on the organization’s IT infrastructure. This process contains four elements.

Project Scope and Planning

A structured analysis from the business’s point of view is needed. The organization needs to set up a team to handle the crisis.

Business Impact Assessment

With the team ready, there is a need to identify resources that are critical for the organization’s ongoing viability and the threats posed to those resources.

Continuity Planning Goals

The next step is to describe the Plan’s goals. One important goal is to ensure continuous operation of the business in face of an emergency.

Approval and Implementation

Once the team has completed the planning process and the documentation, it is time for top management approval. Upon approval, the team should begin implementing the business continuity plan by setting up a time schedule. The Plan must then be maintained and tested to remain effective.

Process Flow – Disaster Recovery Plan

This process deals with worst-case scenarios such as hurricanes, earthquakes, power failures, fires, and terrorist attacks, any of which can deny access to the organization’s main server room. Personnel should be trained so that this Plan runs as if on autopilot when disaster strikes the organization.

Natural Disasters

Earthquakes

Earthquakes are caused by shifting tectonic plates and can occur almost anywhere in the world without warning. A well-known example is the San Andreas Fault, which poses a significant risk to portions of the western United States. Other regions face a lower but real risk: for example, Pennsylvania, New Jersey, and Delaware are considered moderate seismic hazard areas. The organization’s DRP should have a procedure in place that is implemented when a seismic event interrupts normal activities.

Floods

Flooding can occur almost anywhere. Some flooding results from the gradual accumulation of rainwater in rivers and lakes. According to government statistics, flooding is responsible for over $1 billion of damage to businesses and homes each year. The Plan should consider sufficient insurance coverage to protect the organization from the financial impact of a flood.

Storms

Storms pose high risks to a business. Hurricanes and tornadoes bring the possibility of severe winds exceeding 100 miles per hour that threaten the structural integrity of buildings.

Fires

Fires can start from natural or man-made sources. Businesses need to address fires in their DRPs.

Man-Made Disasters

Our sophisticated society depends on an information and communication infrastructure to support our daily activities. Business employees can be one source of intentional vandalism and unintentional man-made disasters.

Bombing/Explosions

Explosions may result from many man-made causes. For example, gas leaks can ignite and damage buildings.

Acts of Terrorism

The attacks of September 11, 2001 brought both new and old scenarios to our attention, in which small businesses can be crippled and large businesses can suffer long-term damage.

Power Outages

Businesses need electrical power to operate. What happens when there is no power? To address this scenario, an Uninterruptible Power Supply (UPS) is needed to take over and allow data to be saved before the systems shut down.

Hardware/Software Failures

Computer systems tend to fail without warning; this applies to hard drives, motherboards, and other components. Software may crash due to internal errors or a combination of hardware and software conflicts. The recovery team should address the issue of how replacement parts can be quickly obtained and installed.

Theft/Vandalism

Equipment may be stolen, and so may information leaked from your database, such as client lists or financial records crucial to business continuity.

Recovery Strategy

When a disaster interrupts business, the disaster recovery Plan should be triggered automatically, meaning that recovery operations should start immediately.

Business Unit Priorities

In order for a business to recover quickly, all business operations have to be prioritized. The highest-priority operation should be recovered first, then the next, and so forth. In some cases, recovering just 40 percent of the highest-priority operation is sufficient for a short period of time, after which the team can move on to lower-priority operations to reach a minimal level of business operation.
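
The prioritization described above can be captured as a simple ordered list. The following Python sketch shows one way to do it. The `BusinessUnit` class, the unit names, the priority numbers, and the capacity targets are all hypothetical examples, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class BusinessUnit:
    name: str
    priority: int            # 1 = recover first (hypothetical ranking)
    minimum_capacity: float  # fraction of normal capacity needed in the short term


def recovery_order(units):
    """Return units in the order they should be recovered, highest priority first."""
    return sorted(units, key=lambda u: u.priority)


units = [
    BusinessUnit("accounting", priority=3, minimum_capacity=0.25),
    BusinessUnit("order processing", priority=1, minimum_capacity=0.40),
    BusinessUnit("customer support", priority=2, minimum_capacity=0.50),
]

for step, unit in enumerate(recovery_order(units), start=1):
    print(f"{step}. restore {unit.name} to {unit.minimum_capacity:.0%} capacity")
```

Recording a minimum capacity next to each priority keeps the short-term target explicit, so the team knows when it can move on to the next unit.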

Crisis Management

Hard in training, easy on the battlefield: the business recovery team should be trained and organized at all times so that it is ready when a disaster strikes.

Emergency Communications

When disaster strikes, it is important that the business be able to communicate both internally and with the outside world.

Alternate Processing Sites

Alternate sites are set up for cases when the main site is not functioning. We will examine three options for alternate sites.

Cold Sites

Cold sites have minimal support: there are no computer systems, and only open space for work groups and some telephone lines are available. This option is inexpensive, but downtime is longer.

Hot Sites

A hot site is a working site, equipped with the necessary computer systems and communication lines. Data from the primary site is constantly being copied to the servers at the hot site. This option is expensive, but downtime is shorter.

Warm Sites

A warm site is almost a hot site: it has standby servers and some minimal communication lines. To make the site fully operational, a recent backup tape is needed from the main site. This option is a middle ground between hot and cold sites in both cost and downtime.
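
The trade-off between cost and downtime across the three site types can be framed as a simple selection rule: pick the cheapest option that becomes operational within the downtime the business can tolerate. The Python sketch below illustrates that rule only. The `SiteOption` class and every cost and downtime figure are invented for the example; real figures depend entirely on the organization and its vendors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteOption:
    kind: str
    yearly_cost: float      # hypothetical dollars per year
    downtime_hours: float   # hypothetical time needed to become operational


def cheapest_acceptable(options, max_downtime_hours) -> Optional[SiteOption]:
    """Cheapest site that is operational within the tolerated downtime, or None."""
    acceptable = [o for o in options if o.downtime_hours <= max_downtime_hours]
    return min(acceptable, key=lambda o: o.yearly_cost, default=None)


# Illustrative figures only.
options = [
    SiteOption("cold", yearly_cost=10_000, downtime_hours=168),
    SiteOption("warm", yearly_cost=40_000, downtime_hours=24),
    SiteOption("hot", yearly_cost=120_000, downtime_hours=2),
]

choice = cheapest_acceptable(options, max_downtime_hours=48)
print(choice.kind if choice else "no option meets the requirement")
```

Returning `None` when nothing qualifies makes the gap visible, so the business can either raise its budget or revisit how much downtime it can accept.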

Recovery Plan Development

Once the business has established prioritization and attained a good overview of appropriate alternative recovery sites, the time has come to prepare appropriate documentation for each audience.

Backups and Off-site Storage

Backups are the key component of a business’s DRP or BCP. With an effective backup strategy, a business can fully recover. Off-site storage is a physical location where all backup media are stored.
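
A backup only helps if the stored copy is intact. One common way to check this is to compare a cryptographic checksum of the original backup file with that of the copy held off-site. The sketch below uses Python’s standard `hashlib` module; the function names are this example’s own, and checksum comparison is offered as one possible verification step, not as a method prescribed by this article.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, offsite_copy):
    """True when both files have identical contents according to SHA-256."""
    return sha256_of(original) == sha256_of(offsite_copy)
```

A call such as `copies_match("backup.tar", "/mnt/offsite/backup.tar")` (both paths hypothetical) would return `False` if the off-site copy was truncated or corrupted in transit. A matching checksum does not prove the backup can be restored, so periodic restore tests are still needed.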

Logistics and Supplies

A business will suddenly face the problem of moving employees, equipment and supplies to an alternate site. The Plan must also address this issue.

Training and Documentation

As with the Business Continuity Plan, it is essential to provide training for all employees who will be involved in a disaster recovery effort. The DRP should be documented and modified according to business needs.

Testing and Maintenance

For the DRP to work, a business needs to test the Plan periodically to ensure it meets the requirements. There are five different tests that a business can use:

Checklist Test

The checklist test is the simplest. Its purpose is to make sure everything is in place, much like an inventory check. It also makes team members familiar with the Plan.

Structured Walk-Through

The structured walk-through is designed to “play” a disaster scenario and help team members rehearse their roles.

Simulation Test

The simulation test measures team response to a non-critical disaster scenario.

Parallel Test

The parallel test goes a level further: employees and supplies are relocated from the main office to the alternate site, along with current backup tapes for restoration on the backup servers.

Full-Interruption Test

The full-interruption test checks the Plan by shutting down the main office and shifting all activities to the alternate site.

Maintenance

The DRP is a living document. The business should keep it up to date throughout its lifetime.

Yigal Behar works as a computer security consultant at 2Secure Corp. Questions and ideas? Please contact us or call 646-666-9601

  1. Dan
    March 19th, 2009 at 10:28

    What about online backup and hot bare metal recovery solutions available these days?

    Thanks,
    Dan

  3. Yigal Behar
    June 2nd, 2009 at 09:29

    Hello,

    I have decided to add some thoughts to my article above.

    1. Online backup
    2. Fast recovery with Acronis and VMware
    3. Using HOT bare metal recovery

    1. Online backup: This is a good solution, but this method has some issues:
    a. You need an Internet connection for backup and restore operations.
    b. Bandwidth limits may prevent backup and restore operations from finishing in a reasonable time.
    c. Sensitive information is stored on non-trusted networks, even if the provider uses the best methods to keep your information safe.

    I would consider online backup if:
    1. The files are small.
    2. No sensitive information is being backed up.

    Acronis: This is a very good utility that makes hard drive operations easy, but its best feature is saving the backup in VMware file format, which allows fast recovery when a disaster hits your company. You can have the same system installation, with all applications and data, up and running.

    HOT bare metal recovery: Recent changes in technology give you the option to back up and restore a server or machine to its original state, and also to restore it onto another server brand (e.g., Dell to HP) using a CD.

    Be safe!

  4. hotman silalahi
    August 10th, 2009 at 01:00

    The company I currently work for is asking us to develop a BCP and DRP.

    Could you please send me the DRP / BCP template?

    Thank you for your help.

    Hotman Silalahi

  5. claude
    August 27th, 2009 at 14:34

    Can you help me please to send me the DRP / BCP template?

    Thank you for your help.

    Claude

  6. Mark Smith
    January 9th, 2010 at 00:37

    I’m a professional insurance agent and I can say in no uncertain terms that this is some great advice. Awesome stuff, dude.
