Disaster Recovery

Disaster recovery (DR) is the subset of business continuity that focuses on protecting the IT infrastructure and systems that support critical business functions. A DR plan covers the policies, procedures and tools that enable the recovery of mission-critical applications and data, so that the organization can remain operational following a natural or human-caused disaster. DR plans typically also include three kinds of control measures: preventive measures that keep events from occurring, detective measures that identify unwanted events before they cause problems, and corrective measures that restore systems after an event.

Disaster recovery plans depend on the organization’s business continuity plan for key metrics like recovery point objective (RPO) and recovery time objective (RTO) for the different business systems that IT supports, such as sales, HR, and financial applications. RPO is the maximum age of the files an organization needs to recover from backup storage, which is equivalent to the largest window of recent data the organization can afford to lose. This determines how often the files need to be backed up. For example, if the business sets an RPO of four hours for its sales records, the sales system must be backed up at least every four hours. RTO is the maximum amount of downtime an organization is willing to tolerate following a disaster. If the organization sets an RTO of one hour for its product download system, then the DR plan must provide a way to restore the download system and all the related files from backup storage and resume operations within one hour.
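The two metrics can be checked mechanically against a backup schedule and an outage timeline. The following is a minimal Python sketch, not part of any SUSE product: the function names `meets_rpo` and `meets_rto` and the sample timestamps are assumptions made for illustration, using the four-hour RPO and one-hour RTO from the examples above.

```python
from datetime import datetime, timedelta


def meets_rpo(backup_times, rpo):
    """Return True if no gap between consecutive backups exceeds the RPO."""
    ordered = sorted(backup_times)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return all(gap <= rpo for gap in gaps)


def meets_rto(outage_start, service_restored, rto):
    """Return True if the service was restored within the RTO."""
    return service_restored - outage_start <= rto


# Hypothetical sales-system backups checked against a four-hour RPO.
sales_rpo = timedelta(hours=4)
sales_backups = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 4, 0),
    datetime(2024, 1, 1, 9, 0),  # five hours after the previous backup
]
print(meets_rpo(sales_backups, sales_rpo))  # False: the last gap is 5 hours

# Hypothetical download-system outage checked against a one-hour RTO.
download_rto = timedelta(hours=1)
print(meets_rto(datetime(2024, 1, 1, 12, 0),
                datetime(2024, 1, 1, 12, 45),
                download_rto))  # True: restored after 45 minutes
```

A check like this only looks at timing. It says nothing about whether a backup can actually be restored, which a DR plan would need to verify separately.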

The RPO and RTO guide the DR team in choosing the right combination of data protection technologies for the organization. Tools are available to support different strategies, ranging from tape backups that are sent off-site at regular intervals, to private and hybrid cloud solutions that replicate data into storage domains, to high-availability systems that keep both the data and systems continuously replicated off-site for continuous access even after a disaster. For example, SUSE Linux Enterprise High Availability Extension can virtually eliminate unplanned downtime with fast failover to standby systems, clustering and replication. Geo Clustering for SUSE Linux Enterprise High Availability Extension will shift workloads to another cluster in a distant location in the event of a regional disaster like an earthquake, tsunami, blackout, or hurricane, keeping the disaster from disrupting critical systems.
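One way to picture how the metrics narrow the choice is to filter a catalog of strategies by the recovery point and recovery time each can typically achieve. The sketch below is illustrative only: the `Strategy` class, the `candidates` function, and every figure in `CATALOG` are assumptions invented for this example, not measured or vendor-published values. Real numbers depend on the specific products, data volumes, and network links involved.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Strategy:
    name: str
    typical_rpo: timedelta  # assumed worst-case age of recovered data
    typical_rto: timedelta  # assumed worst-case time to resume service


# Hypothetical figures chosen only to make the example concrete.
CATALOG = [
    Strategy("off-site tape backup", timedelta(hours=24), timedelta(hours=48)),
    Strategy("cloud replication", timedelta(hours=1), timedelta(hours=4)),
    Strategy("high-availability cluster", timedelta(0), timedelta(minutes=5)),
]


def candidates(rpo, rto, catalog=CATALOG):
    """Return names of strategies whose assumed RPO and RTO meet the targets."""
    return [
        s.name
        for s in catalog
        if s.typical_rpo <= rpo and s.typical_rto <= rto
    ]


# A system with a four-hour RPO and a one-hour RTO.
print(candidates(timedelta(hours=4), timedelta(hours=1)))
# ['high-availability cluster']
```

With these assumed figures, relaxing the RTO to eight hours would also admit cloud replication, which shows how looser targets leave room for less costly strategies.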