Business Continuity and Disaster Recovery for Microsoft 365

To back up or not to back up

Microsoft explicitly states in its cloud responsibility matrix that the customer is responsible for their data. Microsoft 365 (M365) has many built-in data protection features that go a long way in protecting business data. However, none of the features are designed for business continuity and disaster recovery (BCDR) scenarios. In this article, we will go through:

  • M365 built-in data protection features

  • BCDR scenarios: cyberattacks and malicious or insider threats

  • Why built-in features may not be enough for fast recovery

  • Backup solution options

  • Recommendations

While most of the content is written from the SharePoint data perspective, the same principles apply to other workloads like Exchange emails, Teams chats, and messages.

M365 Built-in Data Protection

Microsoft 365 provides several built-in safeguards to protect against accidental deletions, malicious actions, and compliance risks. Some protections are always on, while others require deliberate configuration. Here’s a quick comparison of the protections available for the two primary workloads: SharePoint documents and Exchange emails.


SharePoint Document Protection Features

  • Document versioning: By default, a library keeps 500 major versions for each document.
  • Restoring an entire library: Allows restoration of a whole document library to a point in time within the last 30 days. The feature is based on the document version history and the deleted items in the recycle bin. It is unavailable if library versioning is turned off or if deleted items have been permanently removed.
  • Recycle bins: The first- and second-stage recycle bins collectively keep deleted items for up to 93 days.
  • Microsoft Support restore: Microsoft Support can restore an entire site. Deleted content can be restored by raising a ticket with Microsoft Support within 14 days of deletion.
  • Purview retention policies and labels: These can retain content for a set period or forever. Deleted files are kept in each site's 'Preservation Hold Library'. Retention policies and labels aren't enabled by default and require careful analysis and configuration.

Exchange Email Protection Features

  • Deleted and Recoverable Items folders: Deleted emails go to the "Deleted Items" folder. If an email is deleted from there, or removed with Shift + Delete, it moves to the "Recoverable Items" folder. By default, it stays in this folder for 14 days, which can be extended to 30 days. After this period, the email is permanently deleted.
  • Mailbox recovery: When a user leaves the organisation and their account is deleted, their mailbox is retained for 30 days, after which it is permanently deleted.
  • Purview retention policies: A retention policy can keep the entire mailbox for a set period or forever. When a user leaves and their account is deleted, the mailbox becomes inactive and can be searched through Purview eDiscovery.

Business Continuity & Disaster Recovery Scenarios

We use two common security threat scenarios to determine the impact on data and business continuity.

  • Ransomware attack scenario: If the business comes under a ransomware attack, data held on the user’s local device is encrypted, moved, and/or deleted. For SharePoint documents, any changes made locally are auto-synced to the cloud, meaning documents are being encrypted, moved and/or deleted in SharePoint Online.

  • Malicious user or insider attack scenario: An external malicious user or an organisation's staff member goes rogue and deletes documents in SharePoint. The level of damage depends on the level of access this actor has; for example, an actor with full administrator permissions for a site could delete data and even empty the recycle bins.

“From a BCDR perspective, the net effect of a security attack is the disruption to business through the loss of access to business-critical data.”

Data restores and recovery back to normal

Recovery depends on the volume of data affected and on how much time elapses before the breach is detected.

Recovery through Microsoft support

If many sites are affected, the best option may be to seek help from Microsoft. Create a support ticket to restore the site within 14 days of the event. No recovery may be possible after the 14-day window.

Recovery through library restore

The library can be restored to a point in time within the last 30 days, as long as versioning is turned on. You are out of luck if versioning is off, if the ransomware isn't detected within 30 days, or if the ransomware makes enough edits to exhaust the version limit. For example, if the library is configured to keep 500 versions and ransomware makes 501 edits within a few days, every retained version is corrupt and a library restore will not help. This option is also useless if an attacker deletes the documents and empties both recycle bins.
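The version-limit example above can be sketched as a small calculation. This is a simplified model, not a description of how SharePoint trims versions internally: it assumes every malicious edit creates exactly one new version and that the oldest version is dropped once the limit is reached. The function name and parameters are hypothetical.

```python
def clean_versions_left(version_limit: int, clean_versions: int, malicious_edits: int) -> int:
    """Return how many pre-attack versions of a document survive an attack.

    Simplified model (an assumption): each malicious edit adds one version,
    and the oldest version is trimmed once `version_limit` is exceeded.
    """
    if version_limit < 1 or clean_versions < 0 or malicious_edits < 0:
        raise ValueError("version_limit must be >= 1 and counts must be >= 0")

    # A library never holds more clean versions than its limit allows.
    clean_versions = min(clean_versions, version_limit)

    # Versions pushed out of history once the limit is exceeded.
    trimmed = max(0, clean_versions + malicious_edits - version_limit)

    # Trimming removes the oldest versions first, which are the clean ones.
    return max(0, clean_versions - trimmed)


if __name__ == "__main__":
    # The article's example: a 500-version limit and 501 malicious edits.
    print(clean_versions_left(500, 500, 501))
    # A higher limit leaves room for the clean history to survive.
    print(clean_versions_left(5000, 500, 501))
```

Under this model, raising the version limit buys time rather than immunity: a sufficiently persistent attacker can still exhaust any finite limit.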

Recovery through retention

The retention feature is designed for long-term retention for compliance reasons, not for fast data recovery. The 'Preservation Hold Library' that keeps deleted documents is hidden and has no intuitive user interface for recovering items in bulk. It holds individual copies of documents from across the entire site. While you may recover individual document versions, recovering all versions across thousands of encrypted or deleted files may be impractical, even with scripts.

A case for backup

The cyber-attack and recovery scenarios detailed above demonstrate that while built-in data protections go a long way, they are insufficient for meeting the business continuity requirements of many organisations, particularly those operating in highly regulated environments, such as professional services firms and government organisations.

The scenarios are just two samples of many cybersecurity threats facing organisations. We have not even discussed SharePoint lists, which are increasingly used as data stores for line-of-business apps running on Power Apps. Then there is the case of permissions, which are even harder to recover.

Backup options

There are plenty of vendor SaaS solutions available today for SharePoint backup. Popular ones include Veeam and AvePoint; I have experience implementing AvePoint Cloud Backup. Microsoft has also launched its own backup solution, M365 Backup. The comparison below covers M365 Backup and third-party SaaS backup solutions.

M365 Backup

  • Microsoft's own backup and restore solution promises ultra-fast backup and restore times, with restore rates of 1–3TB per hour.
  • Full-fidelity restore of SharePoint sites, meaning the entire site is restored to a point-in-time copy at the same or a different URL. Granular restores, such as restoring specific files, are coming soon.
  • Consumption-based billing: USD$0.15 per GB per month. Microsoft provides a pricing calculator. For example, to protect SharePoint sites with a total storage of 1TB, the cost is USD$150 per month.
  • Ability to choose which SharePoint sites to protect. A business may choose only the most critical sites to save on costs.

SaaS Solutions like AvePoint, Veeam

  • AvePoint and Veeam are a couple of popular backup options. Both offer a per-user licence pricing model with unlimited storage, meaning the cost depends on the number of participating licences in the tenant.
  • Example: The Veeam Flex Plan, which includes backups for SharePoint, OneDrive, Exchange, and Teams, is USD$2.63 per user. If the tenant has 50 E3 licences and 100 E5 licences, then the cost will be 150 × 2.63 = USD$394.50, or roughly USD$395, per month. The pricing varies by the number of user accounts.
  • AvePoint does not list its pricing on its website.
  • Protects multiple workloads like Exchange and Teams, not just SharePoint.
  • The unlimited storage offer alleviates the need to monitor storage in M365, although it is still best practice to have good information governance and retention and disposal processes in place to ensure the business is disposing of data it does not need.
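The two pricing models above can be compared with a few lines of arithmetic. The sketch below uses only the example rates quoted in this article (USD$0.15 per GB and USD$2.63 per user, per month) and assumes 1TB is counted as 1,000GB, which matches the USD$150 example. The function names are hypothetical, and real quotes will vary by vendor, plan, and volume.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes, matching the 1TB example


def m365_backup_monthly_cost(protected_gb: float, rate_per_gb: float = 0.15) -> float:
    """Consumption-based cost: pay for the storage you choose to protect."""
    return round(protected_gb * rate_per_gb, 2)


def per_user_monthly_cost(users: int, rate_per_user: float = 2.63) -> float:
    """Per-user cost: storage is unlimited, so only the user count matters."""
    return round(users * rate_per_user, 2)


def break_even_gb(users: int, rate_per_user: float = 2.63, rate_per_gb: float = 0.15) -> float:
    """Protected storage (GB) at which both models cost the same per month."""
    return round(users * rate_per_user / rate_per_gb, 2)


if __name__ == "__main__":
    users = 50 + 100  # the 50 E3 and 100 E5 licences from the example
    print("M365 Backup, 1TB:", m365_backup_monthly_cost(1 * GB_PER_TB))
    print("Per-user, 150 users:", per_user_monthly_cost(users))
    print("Break-even storage (GB):", break_even_gb(users))
```

With these example rates, a 150-user tenant breaks even at roughly 2.6TB of protected storage: below that, consumption-based billing is cheaper, and above it the per-user model is. Note that the per-user figure also covers Exchange, OneDrive, and Teams, so the comparison is not strictly like for like.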

Recommendations

Every business should conduct its own risk assessment and decide whether it requires a backup solution. If the risk assessment indicates that losing critical data would have a catastrophic impact and that such an event is at least possible, then backup makes sense. The next question is which backup option works best: M365 Backup, Veeam, AvePoint or another vendor. This is where pricing calculations come in.
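The decision rule described above (catastrophic impact combined with an event that is at least possible) can be written down as a simple likelihood and impact check. The five-point scales below are a common convention, not something this article prescribes, and the names are hypothetical; substitute the scales from your own risk framework.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    RARE = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    ALMOST_CERTAIN = 5


class Impact(IntEnum):
    INSIGNIFICANT = 1
    MINOR = 2
    MODERATE = 3
    MAJOR = 4
    CATASTROPHIC = 5


def backup_recommended(likelihood: Likelihood, impact: Impact) -> bool:
    """Apply the rule of thumb: catastrophic impact and at least possible."""
    return impact >= Impact.CATASTROPHIC and likelihood >= Likelihood.POSSIBLE


if __name__ == "__main__":
    # Assess each critical data set separately; only some may need backup.
    print(backup_recommended(Likelihood.POSSIBLE, Impact.CATASTROPHIC))
    print(backup_recommended(Likelihood.LIKELY, Impact.MODERATE))
```

Running the check per site or per data set, rather than once for the whole tenant, helps identify the small set of sites that justify backup costs.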

Overall, here are some recommendations for enhancing SharePoint’s data resiliency.

  • Identify your critical SharePoint libraries and ensure their versioning settings reflect your business needs. For example, increase the version limit to 5,000 or even 50,000 if your company needs to maintain document history for long periods.

  • Classify your data and determine its retention and disposal periods, and configure suitable retention policies and labels. Use M365 Purview Adaptive Scopes to identify specific SharePoint sites that require long-term retention and apply policies to retain data.

  • Use Microsoft Defender to send automatic alerts, such as bulk deletion notifications.

  • Practise good security management: use just-in-time access for highly privileged administrator roles and limit that access to a few trusted individuals.

  • Finally, decide on whether you need backup. If so, which backup option works best for you? Once the backup solution is deployed, conduct regular disaster recovery exercises. For example, every 6 months, restore a site and confirm that a successful restore occurs.

“Risk assessment is essential to identify mission-critical data and the likelihood and impact of an adverse event. You may find that built-in protections are sufficient and no backup is needed, or that only a small data set needs backups, which reduces cost. Even without backups, understanding your risks empowers you to be prepared.”