Mark Pearl
  • Identify & name critical people charged with responding to a crisis
  • Identify which systems are covered
  • Identify third party vendors and how to contact their support services
  • Identify priorities, what’s most important - not everything in your business is worth saving

Steps to putting a DR Plan together

Step 1 - Identify critical services

  • Compile list of cricital business services
  • Determine the services and applications used by these critical services

For each service list the following:

  • Service Name
  • Where it is located
  • What days / times it is critical for the service to be operational
  • Are there any manual work arounds
  • What is the maximum allowable outage time
  • What are the supporting services

Step 2 - Assess impact of service outage

  • Determine impact of service outages on organization
  • Identify maximum allowable down time, and data loss

For each service, list the following:

  • Service Name
  • Rate impact (Major, Moderate, Minor) if service is not available in each category: Safety Human Life / Financial / Reputation / Regulatory

Step 3 - Assess Risks

  • Assess the threats & vulnerabilities that could cause disruptions to the services and processes
  • List known risks
  • Identify risks

Step 4 - Prioritize Services

  • Categorize and prioritize services based on need and criticality (use the risk matrix high, medium, low)
  • Rate each service on a tier system, tier 0 is core infrastructure services, lower numbered tiers are recovered first as they are more critical to higher numbered tiers

Risk Matrix

Step 5 - Set Scope

  • Determine which services to develop recovery plans for first
  • Focus on Medium or High Risk area, usually you will not cover low risk or non-critical system functions

For each service, in your recovery plan cover

  • Whether to follow a strategy of prevention, or to focus on recovery options or workarounds
  • How to confirm if the actual cause
  • Estimated recovery period, how long it is expected to recover once a specific failure is identified
  • Guidelines for determining plan activation;
  • Guidelines for recovery procedures;
  • References to key technical dependencies;
  • Rollback procedures that will be implemented to return to standard operating state;
  • Checklists outlining considerations for escalation, incident management, and plan activation

Have regular practice drills

DR plans should be tested reguarly, find a cadence that works that keeps the team in sync

Keeping it up to date

Update DR Plan after changes are made to an internal system


Fire Drills for DR

blog comments powered by Disqus

Want to get my personal insights on what I learn as I learn it? Subscribe now!