12 - Design Tradeoffs

Well-Architected Framework tradeoffs and decision guidance


⚖️ Overview

The Azure Well-Architected Framework acknowledges that design decisions involve tradeoffs between the five pillars. This document captures the key tradeoffs specific to Azure API Management.


🏢 High Availability vs. Cost

WAF Tradeoff: Adding redundancy affects costs

Decision Matrix

| Requirement | Configuration | Est. Monthly Cost | SLA |
|---|---|---|---|
| Basic Production | Premium 1 unit | ~$2,800 | 99.95% |
| Zone Redundant | Premium 3 units | ~$8,400 | 99.99% |
| Multi-Region (A/P) | Premium 3+1 units | ~$11,200 | 99.99%+ |
| Multi-Region (A/A) | Premium 3+3 units | ~$16,800 | 99.99%+ |

Considerations

  • Zone redundancy: Requires minimum 3 units for full zone coverage
  • Multi-region: Adds operational costs for failover coordination
  • Backend coordination: DR must align with backend failover strategies
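
As a rough illustration, the Multi-Region (A/P) row of the decision matrix (3 zone-redundant units in the primary region plus 1 unit in a secondary region) might be expressed as follows. This is a minimal sketch, not a reference deployment: the parameter names, resource name, secondary region, and API version are assumptions, and using `disableGateway` to keep the secondary gateway passive is one possible interpretation of "A/P".

```bicep
// Hypothetical parameters; adjust names and defaults to your template.
param location string = resourceGroup().location
param secondaryLocation string = 'westus2' // assumed secondary region
param publisherEmail string
param publisherName string

resource apim 'Microsoft.ApiManagement/service@2022-08-01' = {
  name: 'apim-${uniqueString(resourceGroup().id)}' // placeholder name
  location: location
  sku: {
    name: 'Premium'
    capacity: 3 // one unit per availability zone
  }
  zones: [
    '1'
    '2'
    '3'
  ]
  properties: {
    publisherEmail: publisherEmail
    publisherName: publisherName
    additionalLocations: [
      {
        location: secondaryLocation
        sku: {
          name: 'Premium'
          capacity: 1 // the "+1" unit in the 3+1 configuration
        }
        // Assumption: keep the secondary gateway out of rotation until failover.
        disableGateway: true
      }
    ]
  }
}
```

For an active/active layout, the secondary `capacity` would rise to 3 and `disableGateway` would be left at `false`, which is where the extra cost in the last table row comes from.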

🏗️ Isolation vs. Operational Complexity

WAF Tradeoff: Isolating workloads adds operational complexity

Approaches Comparison

| Approach | Isolation Level | Cost | Complexity |
|---|---|---|---|
| Single Instance | None | $ | Low |
| Workspaces | Logical (RBAC, network) | $$ | Medium |
| Separate Instances | Physical | $$$ | High |

When to Use Each

| Scenario | Recommended Approach |
|---|---|
| Small team, low risk | Single Instance |
| Multi-team, shared costs | Workspaces |
| Strict compliance/data sovereignty | Separate Instances |
| Maximum blast radius isolation | Separate Instances |
| Cost optimization priority | Workspaces |

📈 Scale to Match Demand

WAF Tradeoff: Autoscaling handles malicious traffic too

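The Dilemma

Autoscaling cannot tell legitimate demand from malicious or abusive traffic. An attack that drives up load can trigger scale-out and raise costs, while capping scale-out protects the budget at the risk of throttling real users.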

Mitigation Strategies

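| Strategy | Benefit | Cost |
|---|---|---|
| WAF | Block malicious traffic before it reaches APIM | WAF costs |
| DDoS Protection | Block volumetric attacks | DDoS tier costs |
| Rate Limiting | Cap per-client requests | None |
| Scale Limits | Cap maximum units | Reliability risk |

Scale limits can be enforced through the autoscale setting:

```bicep
// Assumes an API Management resource with the symbolic name `apim`
// is declared elsewhere in the same template.
resource autoscale 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: 'apim-autoscale' // placeholder name
  location: apim.location
  properties: {
    enabled: true
    targetResourceUri: apim.id
    profiles: [
      {
        name: 'default'
        capacity: {
          minimum: '2' // Baseline
          maximum: '8' // Cap to control costs
          default: '2'
        }
        rules: [
          // Scale out for legitimate load
          {
            metricTrigger: {
              metricName: 'Capacity'
              metricResourceUri: apim.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT10M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT10M'
            }
          }
        ]
      }
    ]
  }
}
```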
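
The Rate Limiting row in the mitigation table above could be applied as a global policy deployed from the same kind of template. The sketch below is illustrative only: the `apimName` parameter, the API version, the limit of 100 calls per 60 seconds, and the choice of caller IP address as the counter key are all assumptions to adapt to your workload.

```bicep
// Hypothetical parameter: name of an existing API Management instance.
param apimName string

resource apim 'Microsoft.ApiManagement/service@2022-08-01' existing = {
  name: apimName
}

// Global (all APIs) policy; the child resource name must be 'policy'.
resource globalPolicy 'Microsoft.ApiManagement/service/policies@2022-08-01' = {
  parent: apim
  name: 'policy'
  properties: {
    format: 'rawxml'
    value: '''
<policies>
  <inbound>
    <!-- Illustrative limit: 100 calls per 60 seconds per caller IP -->
    <rate-limit-by-key calls="100"
                       renewal-period="60"
                       counter-key="@(context.Request.IpAddress)" />
  </inbound>
  <backend>
    <forward-request />
  </backend>
  <outbound />
  <on-error />
</policies>
'''
  }
}
```

Keying on the subscription ID instead of the IP address is a common alternative when clients sit behind shared proxies; either way, throttled callers are rejected at the gateway before they can push the capacity metric toward the scale-out threshold.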

🔄 Federated vs. Distributed

WAF Tradeoff: Colocation vs autonomous topology

Federated (Workspaces)

| Pros | Cons |
|---|---|
| ✅ Cost sharing across teams | ❌ Shared outage blast radius |
| ✅ Centralized governance | ❌ Misconfiguration impacts all |
| ✅ Economies of scale | ❌ Complex multi-tenant RBAC |
| ✅ Single control plane | ❌ Capacity planning for all |

Distributed (Separate Instances)

| Pros | Cons |
|---|---|
| ✅ Full isolation | ❌ Duplicative costs |
| ✅ Independent scaling | ❌ Redundant operations |
| ✅ Team autonomy | ❌ No cost sharing |
| ✅ Blast radius mitigation | ❌ Multiple control planes |

Decision Guidance


💾 Caching Tradeoffs

WAF Tradeoff: External cache can introduce failure points

| Caching Option | Performance | Reliability Risk | Cost |
|---|---|---|---|
| No Cache | Baseline | None | $ |
| Built-in Cache | Improved | Minimal | $ |
| External Redis | Best | Additional dependency | $$$ |

When to Use External Cache

  • Built-in cache capacity exceeded
  • Cache data > 50 MB (tier dependent)
  • Need for advanced Redis features
  • Multi-region cache consistency required
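
If one of the conditions above applies, an external Redis cache has to be registered with the API Management instance before policies can use it. The following is a hedged sketch of that registration; the parameter names, API versions, and the connection string layout are assumptions, and in practice the secret would more likely come from Key Vault than from `listKeys()`.

```bicep
// Hypothetical parameters: names of existing resources in this resource group.
param apimName string
param redisName string

resource apim 'Microsoft.ApiManagement/service@2022-08-01' existing = {
  name: apimName
}

resource redis 'Microsoft.Cache/redis@2023-08-01' existing = {
  name: redisName
}

// Registers the Redis instance as the external cache for the service.
resource externalCache 'Microsoft.ApiManagement/service/caches@2022-08-01' = {
  parent: apim
  name: 'default'
  properties: {
    description: 'External Redis cache (illustrative)'
    // 'default' applies the cache to all locations; a region name scopes it to one.
    useFromLocation: 'default'
    connectionString: '${redis.properties.hostName}:${redis.properties.sslPort},password=${redis.listKeys().primaryKey},ssl=True,abortConnect=False'
  }
}
```

For the multi-region case, one cache entry per region (using the region name for both `name` and `useFromLocation`) keeps each gateway talking to a nearby Redis instance, at the price of one more dependency per region.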

Mitigation for External Cache

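```xml
<!-- Graceful degradation if cache fails -->
<policies>
  <inbound>
    <base />
    <!-- vary-by-developer and vary-by-developer-groups are required attributes -->
    <cache-lookup vary-by-developer="false"
                  vary-by-developer-groups="false"
                  caching-type="external"
                  must-revalidate="false" />
  </inbound>
  <on-error>
    <base />
    <!-- on-error is a sibling of inbound; flag the failure for logging and diagnostics -->
    <set-variable name="cache-failed" value="true" />
  </on-error>
</policies>
```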

📊 Summary: Tradeoff Matrix

| Decision | Option A | Option B | Primary Tradeoff |
|---|---|---|---|
| Tier Selection | Premium | Standard v2 | Features vs. Cost |
| Redundancy | Zone/Multi-Region | Single Unit | Reliability vs. Cost |
| Isolation | Workspaces | Separate Instances | Cost vs. Blast Radius |
| Scaling | Unlimited | Capped | Reliability vs. Cost |
| Caching | Built-in | External Redis | Simplicity vs. Performance |
| Gateway | Cloud-hosted | Self-hosted | Simplicity vs. Latency |

✅ Tradeoff Checklist

  • Documented SLA requirements vs. budget constraints
  • Evaluated blast radius requirements
  • Assessed team autonomy vs. centralized governance needs
  • Defined scaling limits to control costs
  • Evaluated caching strategy reliability implications
  • Considered hybrid approaches for mixed requirements

| Document | Description |
|---|---|
| 01-Architecture | Tier selection |
| 02-Reliability | HA patterns |
| 09-Cost-Optimization | Cost strategies |
