Disaster Recovery with Containers? You Bet!

disaster-recovery

Overview

In this article we will discuss the benefits containers bring to business continuance, reveal concepts for applying containers to disaster recovery and of course show disaster recovery of a live database between production and DR OpenShift environments. Business continuance of course is all about maintaining critical business functions, during and after a disaster has occurred.  Business continuance defines two main criteria: recovery point objective (RPO) and recovery time objective (RTO). RPO amounts to how much data loss is tolerable and RTO how quickly services can be restored when a disaster occurs. Disaster recovery outlines the processes as well as technology for how an organization responds to a disaster. Disaster recovery can be viewed as the implementation of RPO and RTO. Most organizations today have DR capabilities but there many challenges.

  • Cost – DR usually is at least doubles the price.
  • Efficiency – DR requires regular testing and in the event of a disaster, resources must be available. This leads to idle resources for 99.9% of the time.
  • Complexity – Updating applications is complex enough but DR requires a complete redeployment where the DR side almost never mirrors production due to cost.
  • Outdated – Business continuance only deals with one aspect,  disaster recovery but as mentioned cloud-native applications are active/active so to be effective today, business continuance architectures must cover DR and multi-site.
  • Slow – DR often is not 100% automated and recovery is often dependent on manual procedures that may not be up to date or even tested with the latest application deployment.

I would take these challenges even further and suggest that for many organizations business continuance and DR is nothing more than a false safety net. It costs a fortune and in the event of a true disaster probably won’t be able to deliver RPO and RTO for all critical applications. How could it when DR is not part of the continuous deployment pipeline and being tested with each application update? How could it with the level of complexity and scale that exists today and not 100% automation?

Continue reading