
10 - Performance Efficiency

Caching strategies, autoscaling, and latency optimization

WAF Pillar


📋 WAF Workload Design Checklist

Based on Azure Well-Architected Framework - Performance Efficiency

| Scope | Recommendation |
|---|---|
| Service | Define performance targets: capacity, CPU, memory, request duration, throughput |
| Service | Dynamically scale to match demand with autoscale rules |
| Service | Collect performance data using built-in analytics, Azure Monitor, App Insights |
| Service | Test performance under production conditions with load testing |
| Service | Review documented limits and constraints for APIM tier |
| API | Minimize expensive processing (large payloads, WebSockets) with validate-content |
| API | Evaluate caching policies or external cache for performance improvement |
| API | Consider Azure Front Door / App Gateway for TLS offloading |
| Service & API | Evaluate logic placement impact between gateway, backend, and entry point |
| Service & API | Collect: request processing time, resource usage, throughput, cache hit ratio |

⚡ Validate-Content for Large Payloads

WAF Recommendation: Minimize expensive processing with validate-content policy

<!-- Validate and limit large request bodies (max-size is in bytes: 102400 = 100 KB) -->
<inbound>
    <base />
    <validate-content
        unspecified-content-type-action="prevent"
        max-size="102400"
        size-exceeded-action="prevent"
        errors-variable-name="validationErrors">
        <content type="application/json" validate-as="json" action="prevent" />
        <content type="application/xml" validate-as="xml" action="prevent" />
    </validate-content>

    <!-- Return friendly error for validation failures.
         Note: with action "prevent" the gateway stops processing and returns its own
         error as soon as validation fails, so this block is only reached when the
         actions above are set to "detect". -->
    <choose>
        <when condition="@(context.Variables.ContainsKey("validationErrors"))">
            <return-response>
                <set-status code="400" reason="Bad Request" />
                <set-body>@{
                    var errors = (List<string>)context.Variables["validationErrors"];
                    return new JObject(
                        new JProperty("error", "Validation failed"),
                        new JProperty("details", new JArray(errors))
                    ).ToString();
                }</set-body>
            </return-response>
        </when>
    </choose>
</inbound>

Payload Size Limits

| Tier | Max Request Size | Recommendation |
|---|---|---|
| Developer | 256 KB | For testing only |
| Basic | 256 KB | Small payloads |
| Standard | 256 KB | Small payloads |
| Premium | 256 KB | Use validate-content |
| v2 tiers | 2 MB | Larger payloads supported |

📍 Logic Placement Optimization

WAF Recommendation: Evaluate performance impact of logic placement

| Logic | Best Location | Rationale |
|---|---|---|
| TLS Termination | Front Door/App Gateway | Offload from APIM |
| Geo-Routing | Front Door | Edge processing |
| Response Caching | APIM (built-in) | Reduce backend load |
| Request Validation | APIM | Protect backends |
| Business Logic | Backend | Avoid gateway bloat |
| Complex Transforms | Backend | Better compute resources |

🎯 Performance Targets

| Metric | Target | Critical |
|---|---|---|
| P50 Latency | < 100 ms | < 500 ms |
| P95 Latency | < 200 ms | < 1000 ms |
| P99 Latency | < 500 ms | < 2000 ms |
| Cache Hit Rate | > 70% | > 50% |
| Availability | 99.95% | 99.9% |
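To make the latency targets concrete, here is a minimal Python sketch that computes P50/P95/P99 from a list of request durations and compares them with the thresholds in the table. It assumes one reading of the table: below the target is "ok", between target and critical is "warning", and at or above critical is "critical". The function names and the sample data are hypothetical, and the nearest-rank percentile used here may differ slightly from what a monitoring backend reports.

```python
# Thresholds copied from the table above, in milliseconds: (target, critical).
LATENCY_THRESHOLDS_MS = {50: (100, 500), 95: (200, 1000), 99: (500, 2000)}


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def classify_latency(samples):
    """Return {"P50": (value_ms, status), ...} where status is ok, warning, or critical."""
    report = {}
    for pct, (target, critical) in LATENCY_THRESHOLDS_MS.items():
        value = percentile(samples, pct)
        if value < target:
            status = "ok"
        elif value < critical:
            status = "warning"
        else:
            status = "critical"
        report[f"P{pct}"] = (value, status)
    return report


if __name__ == "__main__":
    durations_ms = [40, 55, 60, 72, 80, 85, 90, 95, 150, 620]  # made-up sample
    for name, (value, status) in classify_latency(durations_ms).items():
        print(f"{name}: {value} ms -> {status}")
```

A single slow request dominates P95 and P99 in a small sample like this one, which is why tail percentiles are best evaluated over a reasonably large window.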

🚀 Response Caching

Basic Response Caching

<policies>
    <inbound>
        <base />
        <!-- Check cache first -->
        <cache-lookup vary-by-developer="false"
                      vary-by-developer-groups="false"
                      caching-type="internal"
                      downstream-caching-type="public"
                      must-revalidate="true" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
        <!-- Store successful responses -->
        <choose>
            <when condition="@(context.Response.StatusCode == 200)">
                <cache-store duration="3600" />
            </when>
        </choose>
        <!-- Add cache headers -->
        <set-header name="X-Cache" exists-action="override">
            <value>@(context.Response.Headers.GetValueOrDefault("X-Cache", "MISS"))</value>
        </set-header>
    </outbound>
</policies>

Vary Cache by Query Parameters

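<!-- vary-by-query-parameter is a child element of cache-lookup, not an attribute -->
<cache-lookup vary-by-developer="false"
              vary-by-developer-groups="false"
              caching-type="internal">
    <vary-by-query-parameter>version</vary-by-query-parameter>
    <vary-by-query-parameter>region</vary-by-query-parameter>
    <vary-by-query-parameter>lang</vary-by-query-parameter>
</cache-lookup>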

Vary Cache by Headers

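<!-- vary-by-header is a child element of cache-lookup, one element per header -->
<cache-lookup vary-by-developer="false"
              vary-by-developer-groups="false"
              caching-type="internal">
    <vary-by-header>Accept</vary-by-header>
    <vary-by-header>Accept-Language</vary-by-header>
</cache-lookup>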
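The effect of the vary settings can be pictured as building a cache key from only the listed query parameters and headers. The Python sketch below is a conceptual model, not APIM's actual key format: `cache_key` and the example URLs are hypothetical, and it only illustrates why requests that differ in an unlisted parameter share one cache entry while requests that differ in a listed one do not.

```python
from urllib.parse import parse_qs, urlsplit


def cache_key(url, headers, vary_query=(), vary_headers=()):
    """Build a conceptual cache key from the path plus the listed query parameters and headers."""
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    query_part = tuple(
        (name, tuple(query.get(name, ()))) for name in sorted(vary_query)
    )
    header_part = tuple(
        (name.lower(), lowered.get(name.lower(), "")) for name in sorted(vary_headers)
    )
    return (parts.path, query_part, header_part)


if __name__ == "__main__":
    vary = ("version", "region", "lang")
    a = cache_key("https://api.example.com/items?version=1&trace=abc", {}, vary_query=vary)
    b = cache_key("https://api.example.com/items?version=1&trace=xyz", {}, vary_query=vary)
    c = cache_key("https://api.example.com/items?version=2&trace=abc", {}, vary_query=vary)
    print("unlisted parameter changes the key:", a != b)
    print("listed parameter changes the key:", a != c)
```

Listing fewer vary dimensions raises the hit rate but risks serving the wrong variant, so each listed parameter or header should correspond to something that really changes the response.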

🔴 External Redis Cache

Architecture with Redis

Redis Cache Configuration (Bicep)

// From customer scenario: modules/api-management-core.bicep
resource apim 'Microsoft.ApiManagement/service@2023-05-01-preview' = {
  name: apimName
  location: location
  properties: {
    // ... other config
  }
}

resource redisCache 'Microsoft.ApiManagement/service/caches@2023-05-01-preview' = {
  // Note: the cache identifier is expected to be 'default' or an Azure region
  // identifier; adjust this name if the deployment rejects it.
  name: 'redis-external'
  parent: apim
  properties: {
    // Placeholder only: pass the real connection string through a @secure()
    // parameter or Key Vault rather than hard-coding the password.
    connectionString: 'redis-host.redis.cache.windows.net:6380,password=xxx,ssl=True,abortConnect=False'
    useFromLocation: 'default'
    description: 'External Redis cache for high-performance caching'
    resourceId: 'https://management.azure.com/subscriptions/${subscription().subscriptionId}/resourceGroups/${resourceGroup().name}/providers/Microsoft.Cache/redis/redis-apim-cache'
  }
}

External Cache Policy

<policies>
    <inbound>
        <cache-lookup vary-by-developer="false"
                      vary-by-developer-groups="false"
                      caching-type="external"
                      downstream-caching-type="public" />
    </inbound>
    <outbound>
        <!-- The cache type is selected on cache-lookup; cache-store takes only the duration (seconds) -->
        <cache-store duration="600" />
    </outbound>
</policies>

⚡ Backend Performance

Connection Pooling

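<backend>
    <!-- Forward with a 30-second timeout; buffer the request body (needed if the
         request may be retried) and stream the response instead of buffering it.
         These attributes do not enable HTTP/2 by themselves. -->
    <forward-request timeout="30"
                     buffer-request-body="true"
                     buffer-response="false" />
</backend>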

Circuit Breaker Pattern

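<backend>
    <!-- This is a retry with exponential backoff rather than a true circuit breaker.
         The retry policy wraps the policies it should repeat, so forward-request
         goes inside it; the request body is buffered so it can be re-sent. -->
    <retry condition="@(context.Response.StatusCode >= 500)"
           count="3"
           interval="1"
           delta="2"
           max-interval="10"
           first-fast-retry="true">
        <forward-request buffer-request-body="true" />
    </retry>
</backend>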
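To see what the interval, delta, and max-interval settings imply, the following Python sketch lists the wait before each retry. It assumes the exponential formula min(interval + (2^n - 1) * delta * jitter, max-interval) for retry index n starting at 0, which is my reading of the retry policy documentation and should be verified against it. The function name is hypothetical, the random jitter factor (roughly 0.8 to 1.2) is fixed at 1.0 so the result is deterministic, and first-fast-retry is not modelled.

```python
def retry_waits(count, interval, delta, max_interval, jitter=1.0):
    """Seconds to wait before each of `count` retries under the assumed exponential formula.

    Assumption: wait(n) = min(interval + (2**n - 1) * delta * jitter, max_interval)
    for n = 0, 1, ..., count - 1. The jitter argument stands in for the random
    factor applied to delta and is held constant here.
    """
    return [
        min(interval + (2 ** n - 1) * delta * jitter, max_interval)
        for n in range(count)
    ]


if __name__ == "__main__":
    # Values from the policy above: count=3, interval=1, delta=2, max-interval=10.
    print(retry_waits(3, 1, 2, 10))
    # More retries show the cap taking effect.
    print(retry_waits(5, 1, 2, 10))
```

Under these assumptions the worst case adds the sum of the waits plus up to four backend timeouts to a request, which is worth comparing against the latency targets before raising count.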

Timeout Configuration

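<!-- Different timeouts per operation: set-backend-service has no timeout
     attribute, so pick the value in inbound and apply it on forward-request. -->
<inbound>
    <choose>
        <when condition="@(context.Operation.Id == "heavy-report")">
            <set-variable name="backendTimeout" value="@(120)" />
        </when>
        <otherwise>
            <set-variable name="backendTimeout" value="@(30)" />
        </otherwise>
    </choose>
</inbound>

<backend>
    <forward-request timeout="@(context.Variables.GetValueOrDefault<int>("backendTimeout", 30))" follow-redirects="true" />
</backend>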

📊 Request/Response Optimization

Compression

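<outbound>
    <!-- Do not set Content-Encoding here unless the body is actually compressed:
         labelling an uncompressed body as gzip makes clients fail to decode it,
         and reading the whole body just to measure it adds latency.
         Let the backend (or Front Door / App Gateway in front of APIM) compress,
         and advertise that the response varies by encoding. -->
    <set-header name="Vary" exists-action="append">
        <value>Accept-Encoding</value>
    </set-header>
</outbound>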

Minimize Payload

<outbound>
    <!-- Remove unnecessary fields -->
    <set-body>@{
        var body = context.Response.Body.As<JObject>();
        body.Remove("internalId");
        body.Remove("debugInfo");
        body.Remove("metadata");
        return body.ToString();
    }</set-body>
</outbound>

Conditional GET (ETag)

<outbound>
    <!-- Derive a strong ETag from an MD5 hash of the response body -->
    <set-header name="ETag" exists-action="override">
        <value>@{
            var body = context.Response.Body.As<string>(preserveContent: true);
            using (var md5 = System.Security.Cryptography.MD5.Create())
            {
                var hash = md5.ComputeHash(System.Text.Encoding.UTF8.GetBytes(body));
                return "\"" + BitConverter.ToString(hash).Replace("-", "").ToLower() + "\"";
            }
        }</value>
    </set-header>
</outbound>

<inbound>
    <!-- Check If-None-Match.
         The cachedETag variable is not set anywhere in this snippet; it must be
         populated earlier in the pipeline (for example from a cached value). -->
    <choose>
        <when condition="@(context.Request.Headers.GetValueOrDefault("If-None-Match", "") == context.Variables.GetValueOrDefault<string>("cachedETag"))">
            <return-response>
                <set-status code="304" reason="Not Modified" />
            </return-response>
        </when>
    </choose>
</inbound>
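The conditional GET flow can be sketched outside the gateway as well. The Python example below mirrors the idea: derive a quoted MD5 ETag from the body, then answer 304 with an empty body when the client's If-None-Match header already names that ETag. The function names are hypothetical, and weak validators (the `W/` prefix) are deliberately not handled.

```python
import hashlib


def compute_etag(body: bytes) -> str:
    """Quoted, lowercase hex MD5 of the body, matching the format built in the policy."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, body), using 304 when the client's ETag is still current."""
    etag = compute_etag(body)
    if if_none_match is not None:
        # If-None-Match may carry several comma-separated ETags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


if __name__ == "__main__":
    payload = b'{"id": 1, "name": "widget"}'
    status, headers, _ = conditional_get(payload)
    print(status, headers["ETag"])
    # A second request that echoes the ETag gets a 304 with no body.
    status, _, sent = conditional_get(payload, headers["ETag"])
    print(status, len(sent))
```

Hashing the body still requires producing the full response, so this saves bandwidth to the client rather than backend work; pairing it with response caching addresses the latter.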

📈 Autoscaling Configuration

Capacity-Based Scaling

resource autoscale 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: 'apim-autoscale-performance'
  location: location
  properties: {
    enabled: true
    targetResourceUri: apim.id
    profiles: [
      {
        name: 'Performance Profile'
        capacity: {
          default: '2'
          minimum: '2'
          maximum: '10'
        }
        rules: [
          // Scale OUT on high capacity
          {
            metricTrigger: {
              metricName: 'Capacity'
              metricResourceUri: apim.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 70
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT10M'
            }
          }
          // Scale IN when low
          {
            metricTrigger: {
              metricName: 'Capacity'
              metricResourceUri: apim.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT30M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 30
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT30M'
            }
          }
        ]
      }
    ]
  }
}
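The scaling rules above amount to a simple decision: add one unit when average capacity exceeds 70%, remove one when it drops below 30%, and stay within 2 to 10 units. The Python sketch below models only that decision; the function name is hypothetical, and it ignores the evaluation windows and cooldowns that the real autoscale engine applies.

```python
def next_instance_count(current, avg_capacity, minimum=2, maximum=10,
                        scale_out_above=70, scale_in_below=30):
    """Unit count after evaluating one averaged Capacity reading (percent)."""
    if avg_capacity > scale_out_above:
        return min(current + 1, maximum)  # scale out by one, capped at the maximum
    if avg_capacity < scale_in_below:
        return max(current - 1, minimum)  # scale in by one, floored at the minimum
    return current  # inside the 30-70% band: no change


if __name__ == "__main__":
    units = 2
    for reading in [85, 90, 75, 50, 20, 10, 5]:  # made-up capacity averages
        units = next_instance_count(units, reading)
        print(f"capacity {reading:>3}% -> {units} units")
```

The gap between the two thresholds is what prevents flapping: a reading that triggers a scale-out should not land below the scale-in threshold once the load is spread over one more unit.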

🌐 Geographic Distribution

Multi-Region Deployment

Traffic Manager Configuration

resource trafficManager 'Microsoft.Network/trafficmanagerprofiles@2022-04-01' = {
  name: 'tm-apim-global'
  location: 'global'
  properties: {
    profileStatus: 'Enabled'
    trafficRoutingMethod: 'Performance' // Route to closest
    dnsConfig: {
      relativeName: 'api-global'
      ttl: 60
    }
    monitorConfig: {
      protocol: 'HTTPS'
      port: 443
      path: '/status-0123456789abcdef' // APIM gateway health endpoint
      intervalInSeconds: 30
      timeoutInSeconds: 10
      toleratedNumberOfFailures: 3
    }
    endpoints: [
      {
        name: 'westeurope'
        type: 'Microsoft.Network/trafficManagerProfiles/azureEndpoints'
        properties: {
          targetResourceId: apimWestEurope.id
          endpointStatus: 'Enabled'
          // priority is only evaluated with 'Priority' routing; it has no effect
          // under 'Performance' routing
          priority: 1
        }
      }
      {
        name: 'eastus'
        type: 'Microsoft.Network/trafficManagerProfiles/azureEndpoints'
        properties: {
          targetResourceId: apimEastUS.id
          endpointStatus: 'Enabled'
          priority: 2
        }
      }
    ]
  }
}

📊 Performance Monitoring

Key Performance Metrics

| Metric | Alert Threshold | Action |
|---|---|---|
| BackendDuration | > 2000 ms | Scale backend |
| Duration | > 3000 ms | Investigate |
| Capacity | > 70% | Auto-scale |
| FailedRequests | > 5% | Alert team |

KQL Query for Latency Analysis

ApiManagementGatewayLogs
| where TimeGenerated > ago(1h)
| summarize
    P50 = percentile(TotalTime, 50),
    P95 = percentile(TotalTime, 95),
    P99 = percentile(TotalTime, 99),
    Avg = avg(TotalTime),
    Count = count()
    by bin(TimeGenerated, 5m), OperationId
| order by P95 desc
| render timechart

Performance Dashboard Query

// Backend vs Gateway latency breakdown
ApiManagementGatewayLogs
| where TimeGenerated > ago(1h)
| extend
    GatewayLatency = TotalTime - BackendTime,
    // The Cache column records cache involvement (hit, miss, none)
    CacheHit = Cache == "hit"
| summarize
    AvgGateway = avg(GatewayLatency),
    AvgBackend = avg(BackendTime),
    CacheHitRate = countif(CacheHit) * 100.0 / count()
    by bin(TimeGenerated, 5m)
| render timechart
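The same breakdown can be reproduced offline on exported log rows, which is handy for checking a dashboard against raw data. This Python sketch assumes each row is a dictionary with `TotalTime` and `BackendTime` in milliseconds plus a boolean `CacheHit` flag; that record shape and the function name are hypothetical, chosen to mirror the query's computed columns.

```python
def summarize_latency(records):
    """Average gateway and backend latency plus cache hit rate (percent) for a batch of rows."""
    if not records:
        raise ValueError("no records")
    total = len(records)
    # Gateway latency is whatever part of the total time was not spent in the backend.
    gateway = [row["TotalTime"] - row["BackendTime"] for row in records]
    backend = [row["BackendTime"] for row in records]
    hits = sum(1 for row in records if row["CacheHit"])
    return {
        "AvgGateway": sum(gateway) / total,
        "AvgBackend": sum(backend) / total,
        "CacheHitRate": hits * 100.0 / total,
    }


if __name__ == "__main__":
    rows = [  # made-up sample rows
        {"TotalTime": 120, "BackendTime": 100, "CacheHit": False},
        {"TotalTime": 20, "BackendTime": 0, "CacheHit": True},
        {"TotalTime": 220, "BackendTime": 180, "CacheHit": False},
        {"TotalTime": 15, "BackendTime": 0, "CacheHit": True},
    ]
    print(summarize_latency(rows))
```

Cache hits contribute zero backend time, so a rising hit rate lowers the backend average even when the backend itself has not become faster; comparing the two series side by side avoids misreading that effect.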

✅ Performance Checklist

Caching

  • Response caching enabled
  • Cache vary parameters configured
  • External Redis for high volume
  • Cache hit rate monitored

Backend

  • Connection timeouts set
  • Retry policies configured
  • Circuit breaker implemented
  • HTTP/2 enabled

Scaling

  • Autoscaling configured
  • Capacity alerts set
  • Multi-region deployed
  • Traffic Manager configured

Monitoring

  • Latency dashboards created
  • P95/P99 alerts configured
  • Backend latency tracked
  • Cache hit rate monitored

| Document | Description |
|---|---|
| 04-Policies | Caching policies |
| 06-Monitoring | Metrics setup |
| 09-Cost-Optimization | Scaling costs |

Next: 11-Monetization - Products, subscriptions, and billing
