What are the differences between outsourced backup and managed disaster recovery?

Over the last ten years or so, IT solutions have been moving towards the "as a Service" model. Backup and disaster recovery solutions are no exception.
Outsourced backup is also known as Backup as a Service (BaaS), and managed disaster recovery as Disaster Recovery as a Service (DRaaS). Both terms refer to solutions that a service provider delivers to businesses.
This means that corporate IT teams do not need to install and maintain solutions locally in their own datacenters. Test management (reboot, DRP, network) and operational service maintenance can also be part of the service provider's offering.
CIOs are at the heart of the choice of these solutions. They need to understand the different options available to them, and the implications for data protection, before making their decision. These two modes of protection, often perceived as similar, do not cover the same risk scenarios.
Definition of Backup as a Service (BaaS)
More and more service providers are offering Backup as a Service solutions. These amount to purchasing an online backup service, generally hosted in a cloud (public or private).
BaaS can cover a number of different areas:
- backup of files and folders,
- backup of an entire disk,
- application backup (Domain Controller, Exchange) or database backup (SQL Server, PostgreSQL, Oracle, etc.).
Recent developments in BaaS have made it possible to automate restore tests and server restart tests (complete or partial). These tests can be triggered manually, via APIs or by automation tools.
Definition of Disaster Recovery as a Service (DRaaS)
The concept of Disaster Recovery as a Service is more recent, but it is expanding rapidly because it responds to new issues such as cyber threats.
It is a complete service offering, provided and administered by a supplier, based on the cloud model and offering a guaranteed recovery time (RTO).
These solutions exploit the main advantages of the cloud (elasticity, pay-per-use) and therefore reduce the costs associated with infrastructure sizing.
The scopes addressed by these DRaaS solutions are potentially very different:
- In terms of OS covered: while x86 architectures are always covered, less common operating systems (OS400, proprietary Unix, etc.) are only marginally supported.
- RTO timeframes (restart time once the DRP is activated): the technologies used vary widely, enabling RTOs ranging from a few tens of minutes (in which case this is a continuity plan rather than a recovery plan) to a few hours.
- Services provided: this can be either a partially managed DRP (the customer is responsible for maintaining operational conditions and carrying out DRP tests independently), or a fully managed DRP (regular server restart tests, monitoring of cloud backups, etc.).
These are all important factors to consider when choosing your solution. That's why a preliminary analysis is necessary to determine your needs in terms of servers to be protected, restart times and data freshness (RTO and RPO), and finally your management needs, based on the availability and skills of your technical teams.
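As a rough illustration of this preliminary analysis, the sketch below records RTO and RPO needs per server and checks them against what an offering promises. The server names, the figures and the `Offer` fields are hypothetical and chosen only for the example. It also assumes that the worst-case data loss is about one full interval between two backups.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    rto_hours: float  # maximum acceptable restart time
    rpo_hours: float  # maximum acceptable data loss (data freshness)


@dataclass
class Offer:
    name: str
    restart_hours: float          # restart time the provider commits to
    backup_interval_hours: float  # time between two backups or replications


def unmet_needs(servers, offer):
    """Return the names of servers whose RTO or RPO the offer cannot meet."""
    unmet = []
    for s in servers:
        # Assumption: worst-case data loss is one full backup interval.
        too_slow = offer.restart_hours > s.rto_hours
        too_stale = offer.backup_interval_hours > s.rpo_hours
        if too_slow or too_stale:
            unmet.append(s.name)
    return unmet


# Hypothetical inventory and offering, for illustration only.
servers = [
    Server("erp-db", rto_hours=4, rpo_hours=1),
    Server("file-server", rto_hours=24, rpo_hours=24),
]
nightly_backup = Offer("nightly backup", restart_hours=12, backup_interval_hours=24)
print(unmet_needs(servers, nightly_backup))
```

Servers that appear in the result are the ones for which a faster mechanism (more frequent replication, a managed restart commitment) would be worth discussing with the provider.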
Risks covered and not covered by these two solutions
To fully understand the difference between these two services, we first need to consider the different risks that each solution addresses.
We're going to analyze several types of risk to be covered by Backup (BaaS) and DRP (DRaaS), broken down into families.
Risk family | Risks | Potential sources | Main recovery mechanism |
Data loss or corruption | Loss or corruption of files; data, OS or DB corruption | User or procedural error | Backup |
Unavailability of server infrastructure | One server down; a set of servers down; entire infrastructure down | Hardware or software problem | Backup or DRP |
Datacenter unavailability | Long-term unavailability due to a disaster; unavailability of utilities (electricity, etc.); unavailability of telecoms | Fire, storm, terrorist attack, construction work, etc. | DRP |
Ransomware | Ransomware on a file server; ransomware at the OS level across the information system | Malware propagated by e-mail, vulnerability, etc. | Backup or DRP |
Cyber attack | Sophisticated attack; denial of service (DoS); Advanced Persistent Threat | Coordinated attack on IT infrastructure | DRP |
Risk scenarios: data loss or corruption
Loss or corruption of files: this may be due to user/computer error, hardware problems or procedural errors.
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery (DRaaS) |
This is the main risk covered by all outsourced backup solutions. Specific points to consider are: | Depends on the backup or replication mechanisms used by the DRP solution: |
Questions to ask in relation to the risk scenario:
- Cloud backup storage:
  - How many replicas of the backed-up data are kept in the cloud (1, 2 or 3)?
  - Are the cloud backup replicas stored in several remote datacenters?
- Whether different backup retention periods can be set:
  - By file type,
  - By keeping N versions of each file.
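To make the retention questions concrete, here is a minimal sketch of a policy that keeps the N most recent versions of each file, with a different N per file type. The function name, the data layout and the retention figures are assumptions made for the example. They do not describe how any particular BaaS product stores versions.

```python
import os
from collections import defaultdict


def prune_versions(backups, keep_by_extension, default_keep=3):
    """Keep only the most recent N versions of each file.

    backups: list of (path, version_number) tuples, higher number = newer.
    keep_by_extension: e.g. {".xlsx": 10} to retain more versions of a file type.
    """
    by_path = defaultdict(list)
    for path, version in backups:
        by_path[path].append(version)

    kept = []
    for path, versions in by_path.items():
        extension = os.path.splitext(path)[1]
        keep = keep_by_extension.get(extension, default_keep)
        # Newest versions first, then cut the list at the retention limit.
        for version in sorted(versions, reverse=True)[:keep]:
            kept.append((path, version))
    return sorted(kept)


# Hypothetical data: five versions of two files.
backups = [("report.xlsx", v) for v in range(1, 6)]
backups += [("notes.txt", v) for v in range(1, 6)]
print(prune_versions(backups, {".xlsx": 4}, default_keep=2))
```

With these figures, the spreadsheet keeps its four newest versions and the text file only its two newest.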
Risk scenarios: loss or corruption of operating system (OS) or database
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery (DRaaS) |
Coverage of this risk depends on the functional coverage of the outsourced backup: | Coverage of this risk depends on the backup or replication mechanisms used by the DRP solution: |
Questions to ask in relation to the risk scenario:
- Are there mechanisms for backing up Linux OSes in infrastructure contexts where hypervisor mechanisms cannot be used (typically in public or private clouds)?
- Can the solution back up only certain disks or partitions of the machine, to limit the amount of data to be backed up and speed up restoration?
Risk scenarios: complete unavailability of one or more servers
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery Plan (DRaaS) |
Depending on the coverage of the backup solution, this risk is covered. But you need to analyze: | In general, this risk is not well covered by a disaster recovery solution: |
Questions to ask in relation to the risk scenario:
- Without testing, there's no salvation: does the solution include regular server restart tests (either fully automatic or manual)? Restart tests at least once a year are recommended.
- What are the procurement lead times for on-site IT infrastructure? These are often incompatible with business needs (especially at present, with component shortages), which makes it impossible to recreate an on-site infrastructure within an acceptable timeframe.
Risk scenarios: datacenter unavailability
Datacenter completely unavailable, either following a disaster (fire, storm, flood, terrorist attack, etc.), or due to long-term unavailability of the network or utilities (electricity, air conditioning, etc.).
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery Plan (DRaaS) |
Not covered | This risk is fully covered by a DRP solution, as this is its main objective. The notions of RTO and RPO are paramount. So we need to ask ourselves the following questions: |
Questions to ask in relation to the risk scenario:
- Without DRP testing, there's no salvation, so you need to make sure that regular DRP tests are carried out: testing at least every six months is recommended.
- Your DRP tests should cover infrastructure recovery, network tests, user reconnection and functional testing of the recovery environment by end users.
Risk scenario: ransomware on a file server or OS servers
Infection by ransomware via malware propagated by e-mail, exploiting a vulnerability.
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery (DRaaS) |
Risk coverage depends on the ransomware-tightness of the backup: | This risk is fully covered by a DRP solution, as this is its main objective. The notions of RTO and RPO are paramount. So we need to ask ourselves the following questions: |
Questions to ask about the risk scenario:
- Is the chosen solution sealed off from a ransomware attack? The backup space must not be easily reachable by ransomware (e.g. through a Windows mount point).
- The time it takes to bring all cloud backups back on site over the network must match your business needs. The question to ask is: can the service provider ship the data back to you on dedicated appliances (NAS, SSD drive, etc.)?
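The network restore time mentioned in the last question can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation. The data volume, the link speed and the 80% usable-bandwidth factor are assumptions for the example, not measured values.

```python
def restore_hours(data_gb, bandwidth_mbps, efficiency=0.8):
    """Estimate the hours needed to download backups over the network.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable for the transfer.
    """
    data_megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    seconds = data_megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


# Hypothetical case: 5 TB of backups over a 1 Gbit/s link.
hours = restore_hours(data_gb=5000, bandwidth_mbps=1000)
print(f"{hours:.1f} h")  # prints "13.9 h"
```

Under the same assumptions, a 100 Mbit/s link would need roughly 139 hours, which is the kind of figure that makes physical shipment of the data worth asking about.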
Risk scenarios: sophisticated cyberattack combining several attack mechanisms
Constructed attack enabling the attacker to take control of the customer's infrastructure with privileged rights.
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery Plan (DRaaS) |
Depends on how impervious the backup is to attack: | Same risk coverage as for backup. |
Points to watch: how well cloud backups are sealed off from attackers has become a major issue in the event of a sophisticated cyber attack.
Risk scenarios: Advanced Persistent Threat or dormant attack
Infection by an APT or dormant malware that can be activated several months after infection, requiring long retention of OS data (more than 6 months).
Risk coverage with Outsourced Backup (BaaS) | Risk coverage with Managed Disaster Recovery Plan (DRaaS) |
Depends on the depth of OS backup. This requires the service provider to offer long-term archiving capabilities on cold storage. | Generally not covered by DRP solutions, unless the DRP solution offers long-term archiving on cold storage. |
Questions to ask in relation to the risk scenario:
- In this case, we're talking about archiving VMs over long periods (one archive per month kept for 24 months, for example).
- Completely rebuilding the OS is not always an option.
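A long-retention scheme of this kind can be sketched as a selection rule over existing backup dates. The example below keeps the earliest backup of each calendar month over the last 24 months. The function name, the "earliest of the month" rule and the sample dates are assumptions made for illustration, and real archiving policies vary by provider.

```python
from datetime import date


def monthly_archives(backup_dates, today, months=24):
    """Select one backup per calendar month (the earliest) over the last `months` months."""
    # Index each month as year * 12 + month so month arithmetic stays simple.
    current = today.year * 12 + today.month
    selected = {}
    for d in sorted(backup_dates):
        index = d.year * 12 + d.month
        in_window = current - months < index <= current
        if in_window and index not in selected:
            selected[index] = d
    return list(selected.values())


# Hypothetical history: backups on the 1st and 15th of every month since 2022.
today = date(2025, 6, 20)
history = [date(y, m, day) for y in range(2022, 2026)
           for m in range(1, 13) for day in (1, 15)
           if date(y, m, day) <= today]
archives = monthly_archives(history, today)
print(len(archives), archives[0], archives[-1])
```

The archive depth (`months`) is the parameter to compare against the dormancy period you want to be able to roll back past.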
To sum up, here are our top 3 tips
1 - Understand business issues
The first piece of advice, as with many IT projects, is to fully understand the challenges faced by the company's business units:
- their backup requirements (backup depths, data archiving mechanisms, etc.),
- their needs in terms of critical applications to be restarted in the event of a disaster or cyber-attack:
  - prioritize them (RTO),
  - define data freshness requirements (RPO, mainly for databases).
2 - Identify the risk scenarios to be covered
Next, we need to identify the risk scenarios to be covered for the company's business activities and infrastructure (data loss, ransomware, datacenter loss):
- From this risk mapping, a trend is bound to emerge: either a BaaS solution is sufficient, or there is a need for DRaaS;
- Have this risk coverage validated by management. Managers may not understand the details of Backup and DRP, but they are increasingly aware of the IT risks that need to be covered.
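The trend that emerges from the risk mapping can be sketched as a simple lookup. The example below encodes the main recovery mechanism per risk family as summarized in the risk table earlier in this article, and reports whether backup alone can address the selected risks. The dictionary keys and the function name are illustrative choices. The result is a first-pass indication only and does not replace the detailed analysis of each scenario.

```python
# Main recovery mechanism per risk family, following the article's risk table.
MECHANISMS = {
    "data loss or corruption": {"backup"},
    "server infrastructure unavailability": {"backup", "drp"},
    "datacenter unavailability": {"drp"},
    "ransomware": {"backup", "drp"},
    "cyber attack": {"drp"},
}


def recommend(risks_to_cover):
    """Return 'BaaS' if backup can address every listed risk, else 'DRaaS'."""
    for risk in risks_to_cover:
        if "backup" not in MECHANISMS[risk]:
            # At least one risk is only addressed by a disaster recovery plan.
            return "DRaaS"
    return "BaaS"


print(recommend(["data loss or corruption", "ransomware"]))
print(recommend(["ransomware", "datacenter unavailability"]))
```

The first call prints `BaaS` and the second prints `DRaaS`, because datacenter unavailability is only addressed by a DRP in the table.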
3 - Identify and express your requirements
Once the risks to be covered have been identified, it's time to identify the requirements for the solution:
- First of all, what are your expectations of the service provider: do you want a partially managed solution, or a fully managed solution with contractual commitments?
- In the case of backup requirements:
  - What is the scope to be covered: OS, DBMS types, etc.?
  - How should the initial data load be carried out (is a dedicated appliance available)?
- If a disaster recovery plan is required:
  - Which servers need to be protected in the event of a disaster, and which ones require only a backup solution?
  - What are the specific network requirements: how will sites (MPLS, SD-WAN) and mobile users (SSL VPN, etc.) be reconnected?
  - What are the specific security requirements: which security solutions are needed when running in recovery mode?
Article translated from French