Azure PaaS Backup & Recovery: Consolidated Enterprise Guidance
Prepared by: Microsoft Cloud Solution Architecture
Date: April 2026
Audience: Enterprise Infrastructure & BCDR Teams
Context: Enterprise Scale Landing Zone, Multi-Region BCDR Strategy
Regions of Interest: West Europe, Sweden Central, Germany West Central
Table of Contents
- Executive Summary
- Question 1: Consolidated Guidance for Recovering Azure PaaS Services Across Multiple Regions
- Question 2: Recommended Backup Mechanisms by Azure Resource Type
- Question 3: Comprehensive List of Azure Resources with RSV Applicability
- Question 4: Where Third-Party Solutions Are Needed
- Architecture: Multi-Region BCDR Reference Design
- Decision Matrix: Recovery Strategy Selection
- Comparison Table: Paired vs Restricted-Pair Region BCDR
- Scenario Analysis
- Recommended Next Steps
- Microsoft Learn Reference Links
1. Executive Summary
Azure provides strong business continuity and disaster recovery (BCDR) capabilities, but recovery mechanisms vary significantly by service. Unlike traditional infrastructure backup models where a single agent or vault protects everything, Azure PaaS services use a combination of:
| Recovery Model | Description | Examples |
|---|---|---|
| Native geo-replication / cross-region failover | Built-in data replication to secondary regions | Azure SQL (failover groups), Cosmos DB (multi-region writes), Storage (GRS/GZRS) |
| Service-managed backups | Automatic backups managed by the service | Azure SQL (PITR), Cosmos DB (continuous backup), PostgreSQL Flexible Server |
| Azure Backup / Recovery Services Vault | Centralized backup for selected workloads | Azure VMs, SQL in VM, SAP HANA in VM, Azure Files, Azure Blobs, AKS |
| Infrastructure-as-Code redeployment | Recreate + restore pattern | App Service, Functions, Logic Apps, API Management |
| Third-party backup/orchestration | Enterprise backup tools for gaps | Cross-cloud, air-gapped, advanced Kubernetes, compliance-driven |
Key Insight: There is no single universal backup model for all PaaS services. A consolidated workload-based recovery framework is the correct enterprise approach.
Enterprise Context: European Multi-Region Deployment
For organizations operating primarily in West Europe with plans to expand to Sweden Central and Germany West Central, the following considerations are critical:
- Sweden Central is paired with Sweden South (restricted-access region): passive replication (GRS, Key Vault, Backup CRR) works to Sweden South, but you cannot deploy active workloads in Sweden South
- Germany West Central is paired with Germany North (restricted-access region): same dynamic, passive replication works and active deployments are restricted
- West Europe is paired with North Europe (full bi-directional pairing)
- For active DR (deploying workloads in a secondary region), you must choose an unrestricted region such as West Europe, Sweden Central, or Germany West Central
- Many Azure services now support geo-replication to any region, not just paired regions
Key Nuance: Having a restricted-access paired region means services like Storage GRS, Key Vault auto-replication, and Azure Backup CRR still function for passive data protection. However, for active failover (deploying applications, creating resources), you need a different unrestricted region.
Reference: Azure region pairs and nonpaired regions
Reference: Azure regions list
Reference: Multi-region solutions in nonpaired regions
2. Question 1: Consolidated Guidance for Recovering Azure PaaS Services Across Multiple Regions
Customer Question
"Provide consolidated guidelines for recovering PaaS services across multiple regions, including recommended backup mechanisms and third-party solutions where Microsoft support is limited."
Microsoft's Recommended Approach
Microsoft recommends workload-based resilience design anchored on:
| Concept | Description |
|---|---|
| Recovery Time Objective (RTO) | Maximum acceptable downtime |
| Recovery Point Objective (RPO) | Maximum acceptable data loss |
| Regional redundancy | Multi-region or multi-zone deployment |
| Paired/non-paired region strategy | Choose approach based on region capabilities |
| Regular failover testing | Validate DR plans through drills |
| Automated redeployment (IaC) | Use Bicep/ARM/Terraform for rapid recovery |
Recovery Models by Service Category
Category A: Native Geo-Failover Services (Automatic or Near-Automatic)
These services have built-in cross-region replication and failover capabilities:
| Service | Recovery Mechanism | RPO | RTO | Non-Paired Region Support |
|---|---|---|---|---|
| Azure SQL Database | Active geo-replication, Auto-failover groups | < 5 sec | < 30 sec (planned) | Yes, any region |
| Azure Cosmos DB | Multi-region writes, Automatic failover | ~0 (multi-write) | Seconds | Yes, any region |
| Azure Storage (Blob/Files) | GRS/GZRS/RA-GRS | ~15 min | Hours (failover) | GRS uses paired; Object Replication for non-paired |
| Azure Event Hubs | Geo-replication (Premium/Dedicated) | Near real-time | Minutes | Yes, configurable |
| Azure Service Bus | Geo-DR (metadata + optional data) | Metadata only or near real-time | Minutes | Yes, flexible |
| Azure Cache for Redis | Passive geo-replication (Premium), Active geo-replication (Enterprise) | Seconds to minutes | Minutes | Enterprise: any region |
| Azure Key Vault | Microsoft-managed cross-region replication (paired regions) | Near real-time | Minutes | Paired regions (incl. restricted-access pairs like Sweden South): auto-replication works; truly non-paired regions: manual backup/restore |
Category B: Backup + Restore Services
These services provide automated backups with restore capabilities:
| Service | Backup Type | Retention | Cross-Region | RPO |
|---|---|---|---|---|
| Azure SQL Database | Automated PITR + LTR | 7-35 days (PITR), up to 10 years (LTR) | Geo-restore from GRS backup | Minutes to hours |
| Azure Cosmos DB | Continuous (PITR) or Periodic | 7-30 days (continuous) | Backup stored per-region | 100 seconds |
| Azure PostgreSQL Flexible | Automated backups, geo-redundant backup | Up to 35 days | Yes (geo-redundant backup option) | Minutes |
| Azure MySQL Flexible | Automated backups | Up to 35 days | Geo-redundant backup option | Minutes |
| Azure Files | Azure Backup (snapshots) | Configurable | Cross-region restore with GRS vault | Daily/Hourly |
| Azure Blob Storage | Azure Backup (operational/vaulted) | Configurable | Vault tier supports CRR | Configurable |
Category C: Recreate + Restore Data (IaC-Driven Recovery)
These services require redeployment in the secondary region with data restore:
| Service | Recovery Strategy | Data Protection | Key Consideration |
|---|---|---|---|
| Azure App Service | Redeploy via IaC + CI/CD | Built-in backup to Storage Account | Multi-region with Front Door recommended |
| Azure Functions | Redeploy via IaC + CI/CD | Storage dependency protection | Source-controlled deployment |
| Azure Logic Apps | Active-passive or active-active multi-region | Integration account DR | Trigger-type-dependent strategy |
| Azure API Management | Backup/restore to storage + multi-region deployment | PowerShell backup (30-day expiry) | Premium tier supports multi-region natively |
| Azure Container Registry | Geo-replication (Premium tier) | Images replicated cross-region | No backup needed; replication is sufficient |
Architecture Diagram: PaaS Recovery Categories
3. Question 2: Recommended Backup Mechanisms by Azure Resource Type
Customer Question
"Provide a comprehensive list of Azure resources with their supported backup mechanisms."
Complete Azure Resource Backup & Recovery Matrix
Databases
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure SQL Database | PITR (7-35 days), LTR (up to 10 years) | No (not needed) | Active geo-replication, Auto-failover groups, Geo-restore | Yes: failover groups to any region; geo-restore not available in regions without pairs | Automated backups |
| Azure SQL Managed Instance | PITR (7-35 days), LTR | No | Failover groups | Yes, any region | Business continuity |
| Azure Cosmos DB | Continuous (PITR 7-30 days), Periodic (configurable) | No | Multi-region writes, automatic failover | Yes, any region | Disaster recovery |
| Azure PostgreSQL Flexible Server | Automated (up to 35 days) | No | Geo-redundant backup, Read replicas | Yes: geo-redundant backup to paired region; read replicas to any region | Backup & restore |
| Azure MySQL Flexible Server | Automated (up to 35 days) | No | Geo-redundant backup, Read replicas | Limited: geo backup uses paired region | Backup & restore |
| Azure Database for MariaDB | Automated (up to 35 days) | No | Geo-restore | Limited | Backup concepts |
Storage
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure Storage Accounts | Soft delete, Versioning | Yes (Blob backup via Backup Vault) | LRS/ZRS/GRS/GZRS/RA-GRS/RA-GZRS | GRS uses paired region; Object Replication for non-paired | Storage redundancy |
| Azure Files | Share snapshots | Yes (RSV) | GRS/GZRS via storage account | Cross-region restore if GRS vault | File share backup |
| Azure Managed Disks | Incremental snapshots | Yes (Backup Vault) | Cross-region copy via snapshot | Yes, snapshot to any region | Disk backup |
| Azure Data Lake Storage Gen2 | Soft delete, Versioning | Yes (via Blob backup) | GRS/GZRS | Same as Blob Storage | ADLS reliability |
Compute & Application Services
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure App Service | Built-in backup to Storage | No | No native geo-replication | Multi-region deploy via IaC + Front Door | App Service backup |
| Azure Functions | Source-controlled | No | No | Redeploy via CI/CD | Functions best practices |
| Azure Logic Apps | No built-in backup | No | No native geo-replication | Active-passive multi-region deploy | Logic Apps DR |
| Azure API Management | Backup/restore via PowerShell (30-day expiry) | No | Multi-region deployment (Premium tier) | Yes, deploy gateways to any region | APIM DR |
Containers & Kubernetes
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure Kubernetes Service (AKS) | No built-in | Yes (AKS Backup extension → Backup Vault) | No native replication | CRR to paired region (Vault tier); Multi-cluster via Fleet Manager | AKS Backup |
| Azure Container Registry | No backup needed | No | Geo-replication (Premium tier) | Yes, any region | ACR geo-replication |
| Azure Container Apps | No built-in | No | No native replication | Redeploy via IaC | Container Apps DR |
Messaging & Integration
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure Event Hubs | No backup | No | Geo-replication (Premium/Dedicated: data + metadata), Geo-DR (metadata only) | Yes, configurable regions | Event Hubs geo-replication |
| Azure Service Bus | No backup | No | Geo-DR (metadata), Application-level replication | Yes, flexible regions | Service Bus Geo-DR |
| Azure Event Grid | No backup | No | No native geo-replication | Multi-region deploy | Event Grid reliability |
Security & Identity
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure Key Vault | Individual secret/key/certificate backup (encrypted blob) | No | Microsoft-managed cross-region replication (to paired region, including restricted-access pairs) | For truly non-paired regions only: custom multi-vault solution required. Sweden Central and Germany West Central ARE paired (restricted), so auto-replication works. | Key Vault reliability |
| Azure Managed HSM | Backup/restore | No | Multi-master replication via Cosmos DB backend | Limited; contact Microsoft | Managed HSM BCDR |
Monitoring & Analytics
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure Monitor / Log Analytics | Data export | No | Workspace replication (preview) | Yes, configurable secondary regions within geography | Workspace replication |
| Application Insights | Data export | No | No native replication | Deploy separate instances per region | App Insights data retention |
Networking
| Resource | Native Backup | Azure Backup / RSV | Geo-Replication | Non-Paired Region Support | Key Docs |
|---|---|---|---|---|---|
| Azure Front Door | N/A (global service) | N/A | Global, inherently multi-region | Yes | Front Door overview |
| Azure Traffic Manager | N/A (global DNS) | N/A | Global, inherently multi-region | Yes | Traffic Manager overview |
| Azure DNS | N/A (global service) | N/A | Global | Yes | Azure DNS reliability |
| VNet / NSG / UDR | No backup | No | No | Redeploy via IaC | Use ARM/Bicep/Terraform |
4. Question 3: Comprehensive List of Azure Resources with RSV Applicability
Customer Question
"Provide a list specifying which resources are supported by Recovery Services Vault and where third-party solutions are required."
Recovery Services Vault (RSV): Supported Workloads
| Workload | Vault Type | Backup Frequency | Cross-Region Restore | Notes |
|---|---|---|---|---|
| Azure Virtual Machines | Recovery Services Vault | 1x/day | Yes (CRR with GRS) | Full VM snapshots, app-consistent |
| SQL Server in Azure VM | Recovery Services Vault | Every 15 min (log), daily (full) | Yes (CRR with GRS) | Full + differential + log backups |
| SAP HANA in Azure VM | Recovery Services Vault | Configurable | Yes (CRR with GRS) | Enterprise SAP support |
| Azure Files | Recovery Services Vault | Multiple/day | Yes (if GRS vault) | Share-level snapshots |
| On-premises (MARS Agent) | Recovery Services Vault | 3x/day | No | Files, folders, system state |
| DPM / MABS Workloads | Recovery Services Vault | 2x/day | No | App-aware, broad workload support |
Backup Vault: Supported Workloads
| Workload | Vault Type | Key Feature | Cross-Region Restore |
|---|---|---|---|
| Azure Blobs | Backup Vault | Operational (continuous) + Vaulted (scheduled) | Yes (Vault tier) |
| Azure Managed Disks | Backup Vault | Incremental snapshots | Limited |
| Azure Kubernetes Service (AKS) | Backup Vault | Cluster resources + PVs | Yes (Vault tier; CRR to paired region) |
| Azure PostgreSQL Server | Backup Vault | Long-term retention | Yes (Vault tier) |
| Azure Data Lake Storage | Backup Vault | Via Blob backup | Yes (Vault tier) |
| Azure Elastic SAN | Backup Vault | Volume snapshots | Limited |
Services Where RSV Does NOT Apply
These services use native backup models and RSV is NOT the protection mechanism:
| Service | Why RSV Doesn't Apply | What to Use Instead |
|---|---|---|
| Azure SQL Database | Has native PITR + LTR + geo-replication | Built-in automated backups + failover groups |
| Azure Cosmos DB | Has native continuous/periodic backup | Built-in PITR + multi-region replication |
| Azure App Service | Compute is stateless; data is in storage/DB | IaC redeployment + built-in backup to Storage |
| Azure Functions | Compute is stateless | IaC + CI/CD redeployment |
| Azure Logic Apps | Compute + orchestration | Multi-region deployment |
| Azure API Management | Configuration-based service | Backup/restore PowerShell + multi-region deploy |
| Azure Key Vault | Has native replication (paired) | Microsoft-managed replication or manual backup/restore |
| Azure Event Hubs | Streaming platform | Geo-replication or application-level replication |
| Azure Service Bus | Messaging platform | Geo-DR or application-level replication |
| Azure Container Registry | Image registry | Geo-replication (Premium) |
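The three tables in this section can be condensed into a simple lookup that application teams could embed in tooling. The following is a minimal sketch, not an official API: the function name `protection_mechanism` and the exact service-name strings are assumptions chosen to mirror the table rows above.

```python
# Workloads listed in the Recovery Services Vault table above.
RSV_WORKLOADS = {
    "Azure Virtual Machines", "SQL Server in Azure VM", "SAP HANA in Azure VM",
    "Azure Files", "On-premises (MARS Agent)", "DPM / MABS Workloads",
}

# Workloads listed in the Backup Vault table above.
BACKUP_VAULT_WORKLOADS = {
    "Azure Blobs", "Azure Managed Disks", "Azure Kubernetes Service (AKS)",
    "Azure PostgreSQL Server", "Azure Data Lake Storage", "Azure Elastic SAN",
}

# "What to Use Instead" column for services where RSV does not apply.
NATIVE_PROTECTION = {
    "Azure SQL Database": "Built-in automated backups + failover groups",
    "Azure Cosmos DB": "Built-in PITR + multi-region replication",
    "Azure App Service": "IaC redeployment + built-in backup to Storage",
    "Azure Functions": "IaC + CI/CD redeployment",
    "Azure Logic Apps": "Multi-region deployment",
    "Azure API Management": "Backup/restore PowerShell + multi-region deploy",
    "Azure Key Vault": "Microsoft-managed replication or manual backup/restore",
    "Azure Event Hubs": "Geo-replication or application-level replication",
    "Azure Service Bus": "Geo-DR or application-level replication",
    "Azure Container Registry": "Geo-replication (Premium)",
}


def protection_mechanism(service: str) -> str:
    """Return the protection mechanism this guide assigns to a service (hypothetical helper)."""
    if service in RSV_WORKLOADS:
        return "Recovery Services Vault"
    if service in BACKUP_VAULT_WORKLOADS:
        return "Backup Vault"
    if service in NATIVE_PROTECTION:
        return f"Native: {NATIVE_PROTECTION[service]}"
    # Anything not covered by the tables needs a manual decision.
    return "Not classified in this guide: review manually"
```

For example, `protection_mechanism("Azure Files")` returns `"Recovery Services Vault"`, while a service missing from all three tables falls through to the manual-review message rather than guessing.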
Visual Decision Tree: RSV vs Native vs Third-Party
5. Question 4: Where Third-Party Solutions Are Needed
Customer Question
"Specify where third-party solutions are required, to better inform application teams operating in a decentralized model."
| Scenario | Why Third-Party | Recommended Tools |
|---|---|---|
| Single backup console across Azure + AWS + On-prem | Azure Backup is Azure-only | Veeam, Commvault, Rubrik |
| Long retention compliance (> 10 years) | Some Azure services have limited retention | Commvault, Rubrik, Cohesity |
| Air-gapped / immutable backups | Regulatory requirement for offline copies | Veeam (hardened repository), Commvault |
| Granular Kubernetes workload restore | AKS Backup has limitations for complex stateful apps | Kasten by Veeam, Velero |
| Advanced reporting & governance | Centralized backup compliance dashboard | Veeam, Rubrik, Commvault |
| Cross-platform orchestration | DR orchestration across heterogeneous environments | Zerto, Commvault |
| Key Vault in non-paired regions | No Microsoft-managed replication (only applies to truly non-paired regions like Italy North, Poland Central) | Custom solution + scripts |
| Complex database-level restore orchestration | Multi-step recovery with pre/post scripts | Commvault, Rubrik |
Third-Party Integration Points by Service
| Azure Service | Microsoft-Native Protection | Third-Party Gap / Addition |
|---|---|---|
| Azure VMs | Azure Backup (RSV): Excellent | Third-party adds: cross-cloud, long retention, air-gap |
| Azure SQL Database | Native PITR + LTR + Failover Groups: Excellent | Rarely needed; only for cross-cloud compliance |
| Azure Cosmos DB | Continuous + Periodic backup: Excellent | Rarely needed |
| Azure Files | Azure Backup: Good | Third-party for advanced file-level restore, cross-platform |
| AKS | AKS Backup: Good (improving) | Kasten/Velero for complex stateful workloads, Helm-aware restore |
| Key Vault (truly non-paired regions) | Manual backup/restore: Limited | Custom scripts; no mainstream third-party tool for KV backup. Note: Sweden Central and Germany West Central ARE paired and have auto-replication. |
| App Service / Functions | Built-in backup: Basic | Usually not needed; IaC covers compute |
| Event Hubs / Service Bus | Geo-replication: Good | Application-level capture (Event Hubs Capture to Storage) |
Practical Recommendation for Enterprise Customers
Given a decentralized operating model:
- For IaaS workloads (VMs, SQL in VM, SAP HANA): Azure Backup (RSV) is fully sufficient
- For PaaS databases: Native backup mechanisms are best-in-class β no third-party needed
- For AKS: Start with AKS Backup extension; evaluate Kasten/Velero if complex stateful workloads exist
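- For Key Vault in Sweden Central / Germany West Central: Microsoft-managed replication to the restricted paired region covers passive protection; implement a custom multi-vault solution with automated backup/restore scripts only where writable, active multi-region access is required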
- For compliance-driven long retention: Evaluate Commvault or Rubrik if Azure LTR doesn't meet regulatory requirements
- For cross-cloud scenarios: Only if the organization has multi-cloud workloads requiring unified backup
6. Architecture: Multi-Region BCDR Reference Design
Enterprise Multi-Region Architecture (West Europe + Sweden Central)
Key Architecture Notes for Regions with Restricted-Access Pairs
| Concern | West Europe (Paired: North Europe) | Sweden Central (Paired: Sweden South, restricted) | Germany West Central (Paired: Germany North, restricted) |
|---|---|---|---|
| Storage GRS | GRS replicates to North Europe | GRS replicates to Sweden South (passive replication works) | GRS replicates to Germany North |
| Key Vault | Auto-replication to North Europe | Auto-replication to Sweden South (read-only failover works) | Auto-replication to Germany North |
| Azure SQL geo-restore | Available to paired region | Available to Sweden South (but cannot create resources there; use failover groups to another region instead) | Available to Germany North |
| Azure Backup CRR | Restores to North Europe | CRR to Sweden South should work for GRS vaults | Restores to Germany North |
| Active DR (deploy workloads) | Deploy in North Europe or any region | Cannot deploy in Sweden South; use West Europe or Germany West Central for active DR | Can deploy in Germany North (restricted; request access) or use another region |
| Cosmos DB | Any region | Any region | Any region |
Important: Sweden South is restricted-access, meaning you cannot create new resources there without special access. For passive data protection (GRS, Key Vault replication, Backup CRR), the pairing works. For active disaster recovery (deploying applications, AKS clusters, App Services), you need to use a different unrestricted region as your secondary.
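The passive-versus-active distinction can be expressed as a small region model. This is an illustrative sketch only: the `Region` dataclass, the `REGIONS` map, and both helper functions are hypothetical names, and the pairings and restricted flags simply restate what this guide says rather than querying Azure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    paired_with: str
    restricted: bool = False  # True if new deployments need special access


# Pairings and restrictions as described in this guide (not fetched from Azure).
REGIONS = {
    r.name: r
    for r in [
        Region("West Europe", "North Europe"),
        Region("North Europe", "West Europe"),
        Region("Sweden Central", "Sweden South"),
        Region("Sweden South", "Sweden Central", restricted=True),
        Region("Germany West Central", "Germany North"),
        Region("Germany North", "Germany West Central", restricted=True),
    ]
}


def passive_replication_target(primary: str) -> str:
    """Paired region that receives GRS, Key Vault, and Backup CRR replicas."""
    return REGIONS[primary].paired_with


def active_dr_candidates(primary: str) -> list[str]:
    """Unrestricted regions, other than the primary, where workloads can be deployed."""
    return sorted(
        r.name for r in REGIONS.values()
        if r.name != primary and not r.restricted
    )
```

With Sweden Central as primary, the passive target is Sweden South, while the active DR candidates are Germany West Central, North Europe, and West Europe. Keeping the two answers in separate functions mirrors the point above: the pair is fixed by the platform, the active secondary is a design choice.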
7. Decision Matrix: Recovery Strategy Selection
Workload Tiering Model
| Tier | Classification | RTO Target | RPO Target | Recommended Strategy | Cost Impact |
|---|---|---|---|---|---|
| Tier 0 | Mission Critical | < 1 min | Near-zero | Active-active multi-region, multi-write databases, global load balancing | $$$$$ |
| Tier 1 | Business Critical | < 15 min | < 5 min | Active-passive warm standby, auto-failover groups, geo-replication | $$$$ |
| Tier 2 | Important Internal | < 4 hours | < 1 hour | Active-passive cold standby, scheduled backups, IaC redeployment | $$ |
| Tier 3 | Dev/Test / Low Priority | < 24 hours | < 24 hours | Backup + restore only, redeploy from IaC | $ |
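A classification form could apply the tiering table mechanically. The sketch below assumes a workload states its maximum tolerable downtime and data loss, and picks the cheapest tier whose targets still meet both. The name `select_tier` is hypothetical, Tier 0's "near-zero" RPO is modelled as zero, and the table's "<" bounds are treated as inclusive for simplicity.

```python
from datetime import timedelta

# (tier, RTO target, RPO target), ordered from cheapest to most expensive.
# Assumption: "near-zero" RPO for Tier 0 is modelled as timedelta(0).
TIERS = [
    (3, timedelta(hours=24), timedelta(hours=24)),
    (2, timedelta(hours=4), timedelta(hours=1)),
    (1, timedelta(minutes=15), timedelta(minutes=5)),
    (0, timedelta(minutes=1), timedelta(0)),
]


def select_tier(max_downtime: timedelta, max_data_loss: timedelta) -> int:
    """Return the cheapest tier whose RTO and RPO targets satisfy the workload."""
    for tier, rto, rpo in TIERS:
        if rto <= max_downtime and rpo <= max_data_loss:
            return tier
    # Requirements stricter than Tier 0 still map to Tier 0, the strongest option.
    return 0
```

For instance, a workload that tolerates four hours of downtime but only 30 minutes of data loss lands in Tier 1, because Tier 2's one-hour RPO target is too loose for it.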
Strategy Selection per Service
| Service | Tier 0 Strategy | Tier 1 Strategy | Tier 2 Strategy | Tier 3 Strategy |
|---|---|---|---|---|
| Azure SQL DB | Auto-failover groups (active-active read) | Auto-failover groups | Active geo-replication | Geo-restore from backup |
| Cosmos DB | Multi-region multi-write | Multi-region single-write + auto-failover | Single-region + continuous backup PITR | Periodic backup |
| Storage | RA-GZRS + application-level routing | GRS/GZRS | GRS | LRS + scheduled backup |
| App Service | Active-active + Front Door | Active-passive + Front Door | Passive-cold + IaC | Redeploy manually |
| AKS | Multi-cluster Fleet Manager | Active-passive AKS + GitOps | AKS Backup + redeploy | AKS Backup only |
| Key Vault | Multi-vault active-active with app routing | Primary + manual sync secondary | Primary + backup scripts | Single vault |
| Event Hubs | Geo-replication active | Geo-replication | Geo-DR (metadata) | Single region |
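For self-service tooling, the strategy table can be stored as data and queried by service and tier. This is a minimal sketch: `STRATEGIES` copies the table cells verbatim, and `recommended_strategy` is a hypothetical helper name.

```python
# One tuple per service: (Tier 0, Tier 1, Tier 2, Tier 3), copied from the table above.
STRATEGIES: dict[str, tuple[str, str, str, str]] = {
    "Azure SQL DB": ("Auto-failover groups (active-active read)", "Auto-failover groups",
                     "Active geo-replication", "Geo-restore from backup"),
    "Cosmos DB": ("Multi-region multi-write", "Multi-region single-write + auto-failover",
                  "Single-region + continuous backup PITR", "Periodic backup"),
    "Storage": ("RA-GZRS + application-level routing", "GRS/GZRS", "GRS",
                "LRS + scheduled backup"),
    "App Service": ("Active-active + Front Door", "Active-passive + Front Door",
                    "Passive-cold + IaC", "Redeploy manually"),
    "AKS": ("Multi-cluster Fleet Manager", "Active-passive AKS + GitOps",
            "AKS Backup + redeploy", "AKS Backup only"),
    "Key Vault": ("Multi-vault active-active with app routing",
                  "Primary + manual sync secondary", "Primary + backup scripts",
                  "Single vault"),
    "Event Hubs": ("Geo-replication active", "Geo-replication", "Geo-DR (metadata)",
                   "Single region"),
}


def recommended_strategy(service: str, tier: int) -> str:
    """Look up the recommended strategy for a service at a given tier (0-3)."""
    if tier not in range(4):
        raise ValueError(f"tier must be 0-3, got {tier}")
    if service not in STRATEGIES:
        raise KeyError(f"no strategy recorded for {service!r}")
    return STRATEGIES[service][tier]
```

Raising on unknown services or tiers, instead of returning a default, keeps gaps in the matrix visible to whoever maintains it.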
8. Comparison Table: Paired vs Restricted-Pair Region BCDR
| Capability | Full Pair (e.g., West Europe ↔ North Europe) | Restricted-Access Pair (e.g., Sweden Central ↔ Sweden South) |
|---|---|---|
| Storage GRS/GZRS | Automatic replication to pair ✅ | Automatic replication to restricted pair ✅ (passive protection works) |
| Key Vault auto-failover | Microsoft-managed replication + failover ✅ | Microsoft-managed replication to restricted pair ✅ (read-only failover) |
| Azure Backup CRR | Cross-Region Restore to paired region ✅ | CRR to restricted paired region ✅ (passive restore) |
| Azure SQL geo-restore | Geo-redundant backup storage available ✅ | Geo-restore to restricted pair ✅ (but cannot create active resources there) |
| Active DR (deploy workloads in pair) | ✅ Can deploy in paired region | ❌ Cannot deploy in restricted pair; use another unrestricted region |
| Azure Site Recovery | Full support between paired regions ✅ | Full support: ASR works between any regions (global DR) ✅ |
| Cosmos DB | Works with any region | Works with any region (no pairing dependency) |
| Event Hubs Geo-replication | Configurable to any region | Configurable to any region |
| Service Bus Geo-DR | Configurable | Configurable |
| Sequential updates | Paired regions get staggered updates | Both regions in same geography; staggering applies |
| Data residency | Both regions in same geography | Both regions in same geography ✅ |
Implications for European Enterprise Deployments
When using Sweden Central as a secondary region:
- Azure SQL Database: Use auto-failover groups (works with any region). ✅ Fully supported
- Cosmos DB: Multi-region replication to any region. ✅ Fully supported
- Storage GRS: Replicates to Sweden South (restricted) for passive protection. ✅ Works; for active cross-region access use Object Replication to an unrestricted secondary
- Key Vault: Auto-replication to Sweden South works for passive failover. ✅ Supported; for active multi-region access, maintain a secondary Key Vault in an unrestricted region
- Azure Backup CRR: CRR to Sweden South should work for GRS vaults. ✅ Verify per workload type
- App Service / Functions: Multi-region deploy via IaC. ✅ Region-agnostic
- Active DR workloads: Cannot deploy in Sweden South (restricted). ⚠️ Use West Europe or Germany West Central as active secondary
9. Scenario Analysis
Scenario 1: Complete West Europe Region Outage
Impact: All primary services unavailable
Recovery actions by service:
| Service | Action | Time to Recover | Data Loss Risk |
|---|---|---|---|
| Azure SQL DB | Automatic failover to Sweden Central via failover group | < 30 seconds (planned), minutes (forced) | < 5 seconds RPO |
| Cosmos DB | Automatic failover to secondary region | Seconds | Near-zero (multi-write) |
| Storage | Initiate failover (GRS to North Europe) or use Object Replication copy in Sweden Central | Hours (GRS failover) or immediate (Object Replication) | ~15 min RPO (GRS) |
| Key Vault | Switch to Sweden Central Key Vault (manual failover in app config) | Minutes (depends on automation) | Depends on sync frequency |
| App Service | Front Door routes to Sweden Central deployment | Seconds (if pre-deployed) | Zero (stateless) |
| AKS | Activate standby cluster + restore from AKS Backup | 30 min to 2 hours | Last backup point |
| Event Hubs | Geo-replication auto-routes to secondary | Minutes | Near real-time |
Scenario 2: Key Vault Resilience for Sweden Central
Clarification: Sweden Central IS paired with Sweden South (restricted-access). This means:
- Microsoft-managed Key Vault replication to Sweden South DOES work: your secrets, keys, and certificates are replicated automatically
- In a prolonged region failure, Microsoft may initiate failover, after which the Key Vault becomes read-only in Sweden South
- Sweden South is restricted: you cannot create new Key Vaults there, but the failover replica is Microsoft-managed
For active multi-region scenarios (where you need writable Key Vaults in multiple regions):
- Maintain a secondary Key Vault in Germany West Central or West Europe
- Implement automated sync using Azure Functions or Logic Apps that:
  - Periodically export secrets/keys/certificates via backup API
  - Restore to secondary Key Vault
  - Note: Backups can only restore within the same Azure geography and subscription, so confirm the secondary vault's geography before relying on backup/restore
- Application-level configuration to fall back to secondary Key Vault
- Use Managed HSM if HSM-level protection is needed; it uses multi-master replication
Key Vault Backup Limitations:
- Backups are encrypted blobs that can only be restored within the same Azure subscription and geography
- Maximum 500 past versions per key/secret/certificate
- Backups are point-in-time snapshots (not continuous)
- Key Vault backup documentation
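The sync design and the backup limitations above suggest two checks a sync job could run before moving anything. The sketch below is pure Python with no Azure SDK calls: the `Vault` dataclass, `can_restore_backup`, and `plan_sync` are hypothetical names, the geography labels are illustrative values to be confirmed against the Azure geographies list, and the version-comparison approach is an assumption about how drift would be detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vault:
    name: str
    subscription_id: str
    geography: str  # illustrative label; confirm against the Azure geographies list


def can_restore_backup(source: Vault, target: Vault) -> bool:
    """A backup blob restores only within the same subscription and geography."""
    return (
        source.subscription_id == target.subscription_id
        and source.geography == target.geography
    )


def plan_sync(primary_versions: dict[str, str],
              secondary_versions: dict[str, str]) -> list[str]:
    """Names whose current version in the primary is missing or different in the secondary."""
    return sorted(
        name for name, version in primary_versions.items()
        if secondary_versions.get(name) != version
    )
```

If `can_restore_backup` returns `False` for the chosen vault pair, the backup/restore route is not available and the sync would have to copy values through another mechanism. How drifted items are then copied is deliberately left out of this sketch.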
Scenario 3: AKS Workload Recovery to Secondary Region
Challenge: Stateful Kubernetes workloads with persistent volumes
Recovery approach:
- Cluster configuration: Stored in Git (GitOps) → redeploy to any region
- Container images: ACR geo-replication → immediately available in secondary
- Persistent volumes: AKS Backup extension → snapshot + vault tier for CRR
- Stateful databases: Use external PaaS databases (SQL/Cosmos) with their own DR
- Custom hooks: Implement pre/post-snapshot hooks for database consistency
Scenario 4: Decentralized Team Onboarding (Providing Self-Service BCDR Guidance)
Challenge: Application teams in a decentralized model need clear self-service guidance
Recommendation:
Create a BCDR Self-Service Guide for application teams containing:
- Service classification form: teams categorize their workload tier (0-3)
- Pre-built Bicep/Terraform modules for each service's DR setup
- Runbook templates: failover and failback procedures per service
- Monitoring dashboards: Azure Monitor workbooks for backup health
- Policy enforcement: Azure Policy to enforce backup configuration
| Policy | Effect | Scope |
|---|---|---|
| VMs must have Azure Backup enabled | Audit / DeployIfNotExists | All subscriptions |
| Storage accounts must use GRS or GZRS | Audit | Production subscriptions |
| SQL databases must have LTR configured | Audit | Production subscriptions |
| Key Vaults must have soft delete and purge protection | Deny | All subscriptions |
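Before the real Azure Policy definitions exist, the four rules can be prototyped against an exported resource inventory. This is a rough sketch, not Azure Policy syntax: the inventory shape (dictionaries with `type`, `name`, and per-type flags) and the `audit` function are assumptions made for illustration, and read-access variants (RA-GRS, RA-GZRS) are counted as satisfying the GRS/GZRS rule.

```python
from typing import Callable

Resource = dict[str, object]  # hypothetical inventory record

# (policy name, resource type it applies to, compliance check)
POLICIES: list[tuple[str, str, Callable[[Resource], bool]]] = [
    ("VMs must have Azure Backup enabled", "vm",
     lambda r: bool(r.get("backup_enabled"))),
    ("Storage accounts must use GRS or GZRS", "storage",
     # Assumption: read-access variants are also geo-redundant, so they pass.
     lambda r: r.get("redundancy") in {"GRS", "GZRS", "RA-GRS", "RA-GZRS"}),
    ("SQL databases must have LTR configured", "sql",
     lambda r: bool(r.get("ltr_configured"))),
    ("Key Vaults must have soft delete and purge protection", "keyvault",
     lambda r: bool(r.get("soft_delete")) and bool(r.get("purge_protection"))),
]


def audit(resources: list[Resource]) -> list[tuple[str, str]]:
    """Return (resource name, violated policy) pairs in inventory order."""
    findings = []
    for res in resources:
        for policy, rtype, is_compliant in POLICIES:
            if res.get("type") == rtype and not is_compliant(res):
                findings.append((str(res["name"]), policy))
    return findings
```

A prototype like this only reports findings, which corresponds to the Audit effect in the table. Deny and DeployIfNotExists behaviour would need the actual policy engine.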
10. Recommended Next Steps
| # | Action | Owner | Priority |
|---|---|---|---|
| 1 | Create inventory of all Azure PaaS resources across subscriptions | BCDR Lead + App Teams | P0 |
| 2 | Classify each workload into Tiers 0-3 based on business criticality | BCDR Lead + Business | P0 |
| 3 | Map each service to its recovery mechanism using the matrix above | BCDR Lead | P0 |
| 4 | Identify Key Vault instances in regions with restricted-access pairs and verify replication; design active multi-region strategy where needed | BCDR Lead + Microsoft CSA | P1 |
Short-Term (Next 4-8 Weeks)
| # | Action | Owner | Priority |
|---|---|---|---|
| 5 | Implement Azure Policy for backup enforcement | BCDR Lead / Platform Team | P1 |
| 6 | Deploy IaC templates for DR infrastructure in Sweden Central | Platform Team | P1 |
| 7 | Set up AKS Backup for all production clusters | App Teams | P1 |
| 8 | Configure auto-failover groups for all Tier 0/1 Azure SQL databases | DBA Team | P1 |
Medium-Term (Next Quarter)
| # | Action | Owner | Priority |
|---|---|---|---|
| 9 | Conduct first DR drill / failover test | BCDR Lead + All Teams | P1 |
| 10 | Evaluate third-party tools (Kasten/Veeam) for complex AKS workloads | Platform Team | P2 |
| 11 | Create BCDR self-service guide for decentralized app teams | Microsoft CSA + BCDR Lead | P2 |
| 12 | Implement centralized backup monitoring dashboard (Azure Monitor Workbooks) | Platform Team | P2 |
11. Microsoft Learn Reference Links
Core BCDR Documentation
Azure Backup & Recovery Services
Database Services
Storage
Containers & Kubernetes
Security & Key Management
Messaging & Integration
Monitoring
Closing Note
As correctly identified by the customer's infrastructure team: current recovery mechanisms vary by resource, and a consolidated workload-based recovery framework is the right enterprise approach rather than assuming one universal backup model for all PaaS services.
The matrices and decision trees in this document provide a service-by-service mapping that enterprise application teams can use as a self-service reference in their decentralized operating model.
For regions with restricted-access pairs like Sweden Central (paired with Sweden South), passive data protection (GRS, Key Vault replication, Backup CRR) works automatically. However, for active disaster recovery (deploying workloads in a secondary region), an unrestricted secondary region must be used. This distinction should be clearly communicated to application teams.
Document prepared based on Microsoft Learn documentation as of April 2026. Service capabilities evolve; always verify against the latest Azure reliability guides.