At the risk of using a headline that borders on clickbait, what does the Zombie Apocalypse have to do with cloud and cloud technologies? At a recent meeting about IT disaster recovery and business continuity, the group started talking about their incident response teams and the plans they had in place. One attendee spoke up and talked about having backpacks of food and fresh water available in each office in case of emergency. They were prepared! They were very prepared… for a cataclysmic event!

At a previous company, we had a great Business Continuity Plan and annual tests of the plan that you actually looked forward to each year. (Yes, they were that fun!) The head of risk management had a fantastic imagination for cooking up the scenarios we would play out. Over the years we walked through a rail disaster that spilled toxic chemicals behind the building, an active shooter incident (unfortunately, one of my employees “died” that day), and a student who pulled a fire alarm, triggering the fire sprinkler system (uh, our server room was “protected” by a water-based fire prevention system).

These are all great scenarios and I am not diminishing the necessity of being prepared for “the big one”; I think you have to be prepared. But in my 35-year career in IT, do you know how many hours of downtime I have experienced because of an apocalyptic disaster? Zero. (I am now knocking on my head because it is the only “wood” I can find in my office.) The fact is, most disruptions to critical IT systems are not caused by tornadoes, fires, or tsunamis. They are caused by human error, hardware failure, application failure, or malicious attack.

Case in point, the most memorable outages in my career included:

  • Dual drive failure in our company’s only SAN device (3 days, entire business offline)
  • Application developer taking the wrong action to resolve a production batch failure (orders, shipping, and warehouse operations down for 75 hours)
  • Production SAP server overwritten during a Disaster Recovery test (7 hours of downtime and 6 days of lost data)
  • Blizzard shut down the city and knocked out power to the data center (3.5 hours of downtime)
  • Internet outage(s) (varying lengths, up to 8 hours, closing the entire business)
  • Payroll system compromised by an attacker (one-day delay in issuing payroll and hundreds of hours of data forensics)

Back in the day (jeez, I sound old!), the technology meant that your only mitigation strategy against disasters of all sizes was your tape backups. It could take hours or days to restore a single application, and it typically meant you lost at least a day’s worth of data. It was time-consuming and expensive. Recovery was relegated to an insurance policy against large events because businesses couldn’t justify the cost of recovering from anything smaller.

Today, the technology exists to recover all or part of your IT systems in minutes to hours with only seconds of lost data. This means your incident response plans can now cover everything from a zombie apocalypse to a single application failure or a ransomware attack. But, like a good Business Continuity Plan, these response plans need to include what the business will do to keep operating during these smaller, yet still impactful, disruptions. Your Business Continuity Plans, your IT Disaster Recovery Plans, and your Information Security Incident Response Plans should all be put through the same rigorous testing for all types of incidents.

One final point while I am talking about incident response plans, specifically Information Security Incident Response Plans (uh, you DO have one of those, don’t you?): protecting yourself from ransomware attacks can be very difficult. Yes, you should take preventive measures, including constant employee awareness and training, but you should also have reactive measures in place. What will you do when you get hit with a ransomware attack? If you have reason to believe the attack has spread to your server environment, one of the first steps you should take is to pause replication to your DR site or stop backups until you can identify the extent of the damage.

In this way, you increase the odds that you have a clean recovery point (replication) or a clean backup from which to recover your lost data.
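
For the more hands-on readers, here is a minimal Python sketch of the idea behind "find a clean recovery point." It is only an illustration, not any vendor's tooling: the `RecoveryPoint` record, the `latest_clean_point` function, the `safety_margin` parameter, and the sample dates are all hypothetical names and values I made up for the example. It assumes you have already paused replication and backups and that your forensics work has given you an estimate of when the attack first reached your servers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class RecoveryPoint:
    """Hypothetical record of one replication checkpoint or backup."""
    label: str
    taken_at: datetime


def latest_clean_point(
    points: Iterable[RecoveryPoint],
    earliest_suspected_compromise: datetime,
    safety_margin: timedelta = timedelta(0),
) -> Optional[RecoveryPoint]:
    """Return the newest point taken before the suspected compromise.

    safety_margin (an assumption for this sketch) pushes the cutoff
    earlier when you are unsure exactly when the attack began.
    Returns None when no point predates the cutoff.
    """
    cutoff = earliest_suspected_compromise - safety_margin
    clean = [p for p in points if p.taken_at < cutoff]
    return max(clean, key=lambda p: p.taken_at) if clean else None


# Illustrative data only: three nightly backups and an estimated attack time.
points = [
    RecoveryPoint("nightly-mon", datetime(2024, 1, 1, 2, 0)),
    RecoveryPoint("nightly-tue", datetime(2024, 1, 2, 2, 0)),
    RecoveryPoint("nightly-wed", datetime(2024, 1, 3, 2, 0)),
]
suspected = datetime(2024, 1, 2, 14, 30)

chosen = latest_clean_point(points, suspected)
cautious = latest_clean_point(points, suspected, safety_margin=timedelta(hours=24))
```

With this sample data, `chosen` is the Tuesday backup and `cautious` falls back to Monday. The wider the margin, the more data you give up in exchange for more confidence that the recovery point is clean, which is exactly the trade-off your response plan should decide ahead of time.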
