Consistent Disaster Recovery (DR) testing is crucial to making it through an incident that takes your business offline.
But consistent testing is only half the battle. For the kinds of iterative changes that will improve your overall DR readiness, testing needs to be documented every step of the way. When you record the recovery steps, when and why you’re testing, and what you’re learning from current and previous tests, your DR program gets stronger over time.
Here are three key reasons to document your processes and findings for each disaster recovery test:
You don’t know for sure who will be doing the recovery
Creating a thorough, well-documented disaster recovery plan is one of the most critical factors in getting your business back up and running after a major event. Major events can take the form of power outages, system failures, or something much more widely felt, such as Hurricane Sandy, which blocked access to most of downtown Manhattan for days.
- Document exact step-by-step instructions that anyone with a general knowledge of your environment could follow, because you and your staff may be affected by the event as well.
- Documentation will also give you a snapshot of your environment’s recovery at a specific point in time. You can use that reference point to measure what has changed the next time you test.
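One way to make that point-in-time snapshot measurable is to record each recovery step in a structured form and compare one test against the next. The Python sketch below is only an illustration: the `RecoveryStep` fields, the `diff_snapshots` helper, and the sample steps are all hypothetical, not part of any particular DR tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryStep:
    """One documented step in a recovery runbook (hypothetical structure)."""
    name: str
    instructions: str
    minutes_taken: int


def diff_snapshots(previous, current):
    """Compare two test snapshots (lists of RecoveryStep) by step name."""
    prev = {step.name: step for step in previous}
    curr = {step.name: step for step in current}
    return {
        "added": sorted(curr.keys() - prev.keys()),
        "removed": sorted(prev.keys() - curr.keys()),
        # A step counts as changed if any recorded field differs.
        "changed": sorted(
            name for name in prev.keys() & curr.keys() if prev[name] != curr[name]
        ),
    }


# Illustrative data only: two snapshots taken at consecutive tests.
earlier_test = [
    RecoveryStep("restore-database", "Restore latest backup to standby host", 45),
    RecoveryStep("repoint-dns", "Update DNS records to the recovery site", 10),
]
later_test = [
    RecoveryStep("restore-database", "Restore latest backup to standby host", 60),
    RecoveryStep("repoint-dns", "Update DNS records to the recovery site", 10),
    RecoveryStep("verify-logins", "Confirm staff can sign in at the recovery site", 15),
]

report = diff_snapshots(earlier_test, later_test)
print(report)
```

In this example the report flags `verify-logins` as a new step and `restore-database` as changed, because its recorded duration grew between tests. Those are the kinds of differences worth discussing in the test write-up.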
Staying consistent in testing
Everybody involved in testing should be working from the same documented baseline, and each test should follow the same processes so that results can be compared from one test to the next.
Ideally, you’d run DR tests once a quarter, or at least twice a year. Your IT environment is like a living organism: it is always changing and evolving. Your DR documentation needs to change and evolve with it, or you won’t be able to complete replication and recovery effectively. As noted earlier, proper documentation gives you an exact snapshot in time and a repeatable, adjustable guide, so you’re not constantly reinventing the wheel.
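If test dates are recorded alongside the documentation, checking the cadence becomes a simple calculation. The sketch below is a rough illustration, assuming a quarter is approximated as 90 days and a half year as 182 days. The function names and thresholds are hypothetical and should be adjusted to your own policy.

```python
from datetime import date, timedelta

QUARTERLY_DAYS = 90     # assumed approximation of "once a quarter"
SEMIANNUAL_DAYS = 182   # assumed approximation of "twice a year"


def next_test_due(last_test: date, max_gap_days: int = QUARTERLY_DAYS) -> date:
    """Return the latest date by which the next DR test should run."""
    return last_test + timedelta(days=max_gap_days)


def is_overdue(last_test: date, today: date, max_gap_days: int = QUARTERLY_DAYS) -> bool:
    """True if more than max_gap_days have passed since the last test."""
    return today > next_test_due(last_test, max_gap_days)


# Illustrative dates only.
last_test = date(2024, 1, 15)
today = date(2024, 5, 1)

print(next_test_due(last_test))                       # 2024-04-14
print(is_overdue(last_test, today))                   # True (quarterly target missed)
print(is_overdue(last_test, today, SEMIANNUAL_DAYS))  # False (still within twice a year)
```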
Documentation also makes it easier for company stakeholders to run tabletop exercises that become dry runs of data center emergencies.
Testing upstream vendors
If you have critical business processes that depend on upstream vendors, you must coordinate your disaster recovery testing with them.
Coordination with vendors is tough even with strong documentation; imagine how much worse it could be without it. Vendors have to shift their workloads around to manage your tests internally. Poor documentation makes the process more complicated for them, which could threaten the health of your business relationship.
Thorough testing documentation gives your upstream vendors a playbook to work from. That can only strengthen your test results and minimize risks during an actual crisis.
Tests can also reveal weaknesses in your vendors’ ability to recover during a disaster. Putting those weaknesses on paper better prepares your company and establishes a rationale for asking a vendor to address them or for seeking out a better vendor.
To test or not to test — it’s not even a question
There’s an old military saying that ‘no battle plan survives contact with the enemy’. All commanders know this, but they still make plans before sending troops into combat.
Much the same is true of disaster recovery testing. It won’t cover every eventuality, but it will certainly cover a lot more than a DR plan that collects dust on a shelf waiting for its flaws to be exposed at the worst possible moment.
Documentation of your DR testing processes and results ensures the knowledge gained from your tests can be referenced later, and the lessons learned can then be folded into your DR program. That’s a lot better than facing a disaster unprepared.