Module 6: Workload-Specific Cost Optimization

Duration: 60 minutes | Level: Deep-Dive Technical
WAF Alignment: CO:05 (Best Rates), CO:07 (Component Costs), CO:08 (Environment Costs), CO:10 (Data Costs), CO:11 (Code Costs), CO:12 (Scaling Costs), CO:14 (Consolidation)

6.1 Workload Optimization Map

Each workload type has unique cost levers. This module provides prescriptive guidance, CLI commands, and decision trees for the most common Azure services.

6.2 Virtual Machines

Cost Optimization Matrix

Optimization	Savings Potential	Complexity	Implementation Time
Right-size underutilized VMs	10-50%	Low	Days
Azure Hybrid Benefit (Windows/SQL)	40-82%	Low	Hours
Reserved Instances (1yr/3yr)	40-72%	Low	Hours
B-series for burstable workloads	15-55%	Low	Days
Spot VMs for interruptible workloads	Up to 90%	Medium	Days
Auto-shutdown schedule for non-prod	30-70%	Low	Hours
Migrate to containers/serverless	30-60%	High	Months

VM Cost Optimization Decision Tree

Key Azure CLI Commands for VMs

# Inventory VMs by size as input for right-sizing. Resource Graph holds no
# utilization metrics, so pair this list with Azure Monitor or Azure Advisor data.
az graph query -q "
  Resources
  | where type =~ 'microsoft.compute/virtualMachines'
  | project name, resourceGroup,
            vmSize=tostring(properties.hardwareProfile.vmSize),
            location, os=properties.storageProfile.osDisk.osType
  | order by vmSize asc
"

# Resize a VM
az vm resize --resource-group myRG --name myVM --size Standard_D2s_v5

# Enable Azure Hybrid Benefit on a Windows VM
az vm update --resource-group myRG --name myVM --set licenseType=Windows_Server

# Deallocate a stopped VM to stop compute billing
az vm deallocate --resource-group myRG --name myVM

# Set auto-shutdown schedule. --time is UTC in hhmm format and the command has
# no time zone option; 0300 UTC corresponds to 19:00 Pacific Standard Time.
az vm auto-shutdown \
  --resource-group "DevTest-RG" \
  --name "dev-vm-01" \
  --time "0300"

# Find Windows Server VMs without Azure Hybrid Benefit enabled
# (tostring() turns an unset licenseType into an empty string so those VMs are included)
az graph query -q "
  Resources
  | where type =~ 'microsoft.compute/virtualMachines'
  | where properties.storageProfile.imageReference.publisher =~ 'MicrosoftWindowsServer'
  | where tostring(properties.licenseType) !~ 'Windows_Server'
  | project name, resourceGroup, vmSize=properties.hardwareProfile.vmSize
"

VM Right-Sizing: Metrics to Monitor

Metric	Right-Sized Range	Action if Below	Action if Above
CPU Average	30-70%	Downsize or deallocate	Upsize
Memory Average	30-70%	Downsize	Upsize
Network I/O	Varies by workload	Consider smaller SKU	Consider network-optimized
Disk IOPS	Varies by workload	Step down disk tier	Step up disk tier

Rule of thumb: If CPU average is below 5% for 14 days, the VM is likely not needed. If CPU is between 5-30%, it is over-provisioned and should be downsized.
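
The rule of thumb and the metrics table above can be sketched as a small classifier. This is a minimal illustration, not an Azure API: the function name `classify_vm`, its return labels, and the idea of passing in a precomputed 14-day average are assumptions; the 5%, 30%, and 70% thresholds are the ones stated in this section.

```python
def classify_vm(avg_cpu_percent: float, days_observed: int = 14) -> str:
    """Map a VM's average CPU over an observation window to a suggested action.

    Thresholds follow this section's rule of thumb (5%, 30%, 70%).
    The labels returned here are illustrative only.
    """
    if not 0 <= avg_cpu_percent <= 100:
        raise ValueError("avg_cpu_percent must be between 0 and 100")
    if days_observed < 14:
        return "insufficient-data"      # the rule assumes at least 14 days of data
    if avg_cpu_percent < 5:
        return "deallocate-candidate"   # likely not needed
    if avg_cpu_percent < 30:
        return "downsize"               # over-provisioned
    if avg_cpu_percent <= 70:
        return "right-sized"
    return "upsize"


for cpu in (2.0, 18.5, 55.0, 88.0):
    print(f"{cpu:5.1f}% -> {classify_vm(cpu)}")
```

CPU alone is rarely enough; in practice the same check would be repeated for memory before acting on a downsize suggestion.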

WAF Service Guide: VM Cost Optimization Configuration Recommendations

These come directly from the Azure Well-Architected Framework VM Service Guide:

Recommendation	Why It Matters
Choose the right VM plan size and SKU. Use the VM Selector to identify the best VM for your workload	SKUs are priced by capabilities. Don't overspend on SKUs with features you don't need
Use Spot VMs for batch processing, dev/test, and interruptible workloads	Up to 90% savings. Flexible orchestration lets you mix Spot and regular VMs by percentage
Scale in when demand decreases -- set a scale-in policy on VMSS	Scaling in reduces the number of running VMs and directly reduces costs
Stop VMs during off-hours using Azure Automation Start/Stop	This low-cost automation can significantly reduce idle instance costs for non-production
Use Azure Hybrid Benefit for Windows Server and SQL licenses	Reuse on-premises Software Assurance licenses at no extra cost on Azure
Use Azure Boost for CPU offloading on supported SKUs	Offloading virtualization frees up CPU for your workload -- better performance at same cost
Implement cost guardrails via Azure Policy	Restrict resource types, SKUs, configurations and locations to prevent overspend

VM Family Selection Guide for Cost Optimization

VM Series	Use Case	Cost Characteristic
B-series (Burstable)	Dev/test, low-traffic web servers, small DBs	15-55% cheaper than equivalent general purpose. Accumulate credits during idle, burst when needed
D-series (General Purpose)	Balanced CPU/memory for most production workloads	Standard pricing, good all-rounder. Use Dv5 for latest generation savings
E-series (Memory Optimized)	In-memory databases, SAP HANA, caching	Higher per-hour cost but fewer VMs needed for memory-heavy apps
F-series (Compute Optimized)	Batch, gaming servers, analytics	High CPU-to-memory. Cost-effective for CPU-bound workloads
L-series (Storage Optimized)	Big data, SQL, NoSQL, data warehousing	High throughput local NVMe storage. Avoid Premium SSD cost for temp data
N-series (GPU)	ML training, inference, rendering	Expensive per hour. Use Spot where possible and shut down when idle
Arm64 (Dpsv6, Epsv6)	Cloud-native, scale-out, containerized	Up to 50% better price-performance vs x86 for eligible workloads

Comprehensive VM Cost Checklist for Customers

#	Action	Quick Win?	Est. Savings
1	Identify and deallocate VMs with < 5% CPU for 14 days	Yes	100% of compute for those VMs (disks still bill)
2	Right-size VMs with 5-30% CPU to next smaller SKU	Yes	20-50% per VM
3	Enable Azure Hybrid Benefit on all eligible Windows/SQL VMs	Yes	40-82%
4	Set auto-shutdown on all dev/test VMs	Yes	30-70%
5	Move dev/test to B-series burstable VMs	Yes	15-55%
6	Purchase 1-year RI for stable production VMs	No	40-60%
7	Purchase 3-year RI for long-term stable VMs	No	60-72%
8	Use Spot VMs for batch, CI/CD, non-critical	No	Up to 90%
9	Evaluate Arm64 for scale-out workloads	No	Up to 50%
10	Apply Azure Policy to restrict allowed SKUs	No	Prevents sprawl
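
To see where the 30-70% auto-shutdown figure in the checklist comes from, the arithmetic can be sketched as the share of weekly hours a VM no longer runs. The function name `shutdown_savings` and the two example schedules are assumptions for illustration; the result covers compute hours only, since disks keep billing while a VM is deallocated.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def shutdown_savings(hours_on_per_day: float, days_on_per_week: int) -> float:
    """Fraction of compute hours avoided relative to running 24x7."""
    if not (0 <= hours_on_per_day <= 24 and 0 <= days_on_per_week <= 7):
        raise ValueError("schedule out of range")
    running_hours = hours_on_per_day * days_on_per_week
    return 1 - running_hours / HOURS_PER_WEEK


# 12 hours a day on weekdays only: 60 of 168 hours running
print(f"weekdays, 12h/day: {shutdown_savings(12, 5):.0%}")
# 12 hours a day, every day: 84 of 168 hours running
print(f"every day, 12h/day: {shutdown_savings(12, 7):.0%}")
```

The two schedules work out to roughly 64% and 50% of compute hours avoided, both inside the range quoted in the checklist.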

6.3 Azure Kubernetes Service (AKS)

AKS Cost Optimization Strategies

Strategy	Description	Savings Potential	Best For
Cluster Autoscaler	Scale nodes based on pod demand	20-40%	All production clusters
Spot Node Pools	Non-critical workloads on interruptible nodes	Up to 90%	Batch, dev/test
Cluster Start/Stop	Shut down dev/test clusters off-hours	50-70%	Non-production
Node Auto-Provisioning (NAP)	Auto-select optimal VM SKU for pending pods	10-30%	Complex workloads
HPA	Scale pods horizontally based on CPU/memory	Variable	Predictable demand
VPA	Right-size pod CPU/memory requests and limits	10-30%	Fluctuating resource needs
KEDA	Event-driven scaling, can scale to zero	Up to 100% idle	Sporadic workloads
Arm64 Nodes	Cost-efficient ARM-based processors	Up to 50%	Cloud-native apps
AKS Cost Analysis	Granular cluster cost breakdown by K8s construct	Visibility	All clusters

AKS Architecture for Cost Optimization

AKS CLI Commands

# Stop an AKS cluster (stops control plane and node compute charges; disks and other attached resources still bill)
az aks stop --name myAKSCluster --resource-group myRG

# Start an AKS cluster
az aks start --name myAKSCluster --resource-group myRG

# Add a Spot node pool for batch workloads
az aks nodepool add \
  --resource-group myRG \
  --cluster-name myAKSCluster \
  --name spotnodepool \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --enable-cluster-autoscaler \
  --min-count 0 --max-count 10

# Enable cluster autoscaler on existing node pool
az aks nodepool update \
  --resource-group myRG \
  --cluster-name myAKSCluster \
  --name nodepool1 \
  --enable-cluster-autoscaler \
  --min-count 1 --max-count 5

# Enable AKS cost analysis add-on
az aks update --resource-group myRG --name myAKSCluster --enable-cost-analysis

AKS Autoscaling Decision Guide

Autoscaler	Layer	Trigger	Scale to Zero?	Best For
HPA	Pod (horizontal)	CPU, memory, custom metrics	No	Predictable demand apps
VPA	Pod (vertical)	Historical CPU/memory usage	No	Right-sizing requests
KEDA	Pod (event-driven)	Queue length, HTTP traffic, events	Yes	Sporadic/event-driven
Cluster Autoscaler	Node	Pending pods, idle nodes	User pools only (system pools need min 1)	All clusters
NAP	Node	Pod requirements	No	Complex multi-SKU

Important: Do not use VPA and HPA on the same CPU or memory metrics simultaneously. You can use VPA for CPU/memory with HPA for custom metrics.
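
The VPA/HPA rule above can be checked mechanically before a deployment goes out. The sketch below is a hypothetical pre-deployment check, not part of Kubernetes or AKS: the function name `conflicting_metrics` and the plain string metric names are assumptions.

```python
def conflicting_metrics(hpa_metrics: set, vpa_resources: set) -> set:
    """Return the resource metrics that both HPA and VPA would act on.

    Only CPU and memory can conflict; custom HPA metrics (queue length,
    requests per second, and so on) are safe to combine with VPA.
    """
    resource_metrics = {"cpu", "memory"}
    return hpa_metrics & vpa_resources & resource_metrics


# HPA on CPU plus VPA on CPU and memory: conflict on cpu
print(conflicting_metrics({"cpu"}, {"cpu", "memory"}))
# HPA on a custom metric plus VPA on CPU and memory: no conflict
print(conflicting_metrics({"http_requests_per_second"}, {"cpu", "memory"}))
```

An empty result means the two autoscalers are not competing for the same signal.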

6.4 Storage & Databases

Storage -- WAF Service Guide Recommendations

These come from the Azure Well-Architected Framework Blob Storage Service Guide:

WAF Recommendation	Customer Action
Identify the meters used to calculate your bill (capacity, operations, optional features)	Run a billing review to understand which meters drive cost
Choose a billing model for capacity -- evaluate commitment-based (reserved) vs consumption	For stable storage, pre-purchase reserved capacity for up to 38% savings
Choose the most cost-effective default access tier	Set default to Cool if most blobs are infrequently accessed
Upload data directly to the most cost-efficient tier	Specify Cool/Cold/Archive at upload time instead of uploading to Hot first
Use lifecycle management policies to auto-tier data	Automate Hot-to-Cool (30d), Cool-to-Cold (90d), Cold-to-Archive (180d) transitions
Disable features you don't need (versioning, soft delete on high-churn accounts)	Every blob overwrite creates a version/snapshot -- this can explode storage cost silently
Create budgets and monitor usage	Use Storage insights to identify accounts with no or low use
Pack small files before moving to cooler tiers	Cooler tiers have higher per-operation costs. Fewer large files = fewer operations
Use standard-priority rehydration from archive (not high-priority)	High-priority rehydration costs significantly more

Storage Cost Optimization -- Actionable Checklist

#	Action	Impact	Azure CLI / Portal
1	Upgrade all Storage v1 accounts to GPv2	Enables tiering, reserved capacity	`az storage account update -g <rg> -n <account> --set kind=StorageV2 --access-tier=Hot`
2	Enable Lifecycle Management Policy	Auto-tier blobs by last-modified date	Portal > Storage Account > Lifecycle management
3	Find and delete unattached managed disks	Immediate storage savings	`az disk list --query "[?managedBy==null]"`
4	Delete old disk snapshots (30+ days)	Reduce snapshot storage cost	`az snapshot list --query "[?timeCreated < '...']"`
5	Move snapshots from Premium to Standard	60% savings on snapshot storage	Portal > Snapshot > Change tier
6	Review backup redundancy (GRS vs LRS for dev/test)	50% backup storage savings	Portal > Recovery Services vault > Properties
7	Disable blob versioning on high-churn accounts	Prevent silent cost explosion	Portal > Storage Account > Data protection
8	Enable reserved capacity for stable storage	Up to 38% savings on capacity	Portal > Reservations > Add > Storage
9	Review soft delete retention periods	Long retention on high-churn = high cost	Start with 7 days, increase as needed
10	Disable SFTP support when not actively transferring	Billed hourly even when idle	Portal > Storage Account > SFTP

Storage Tiering Flow

Key action: Enable Lifecycle Management Policy to automate blob transitions between tiers based on last-modified date.
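
As a sketch of what such a policy can look like, the snippet below builds a lifecycle management policy document with the 30/90/180-day transitions recommended earlier in this section. The helper name `lifecycle_policy`, the rule name `tier-by-age`, and the reading of the three ages as days since last modification are assumptions; the JSON field names follow the Blob Storage lifecycle policy schema as best understood here and should be checked against current documentation before use.

```python
import json


def lifecycle_policy(cool_after=30, cold_after=90, archive_after=180, prefix=None):
    """Build a lifecycle policy dict that tiers block blobs by last-modified age."""
    if not cool_after < cold_after < archive_after:
        raise ValueError("expected cool_after < cold_after < archive_after")
    base_blob = {
        "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
        "tierToCold": {"daysAfterModificationGreaterThan": cold_after},
        "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
    }
    filters = {"blobTypes": ["blockBlob"]}
    if prefix:
        filters["prefixMatch"] = [prefix]  # optional: limit the rule to one container/path
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-by-age",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {"actions": {"baseBlob": base_blob}, "filters": filters},
            }
        ]
    }


print(json.dumps(lifecycle_policy(), indent=2))
```

Saved as `policy.json`, a document like this can be applied with `az storage account management-policy create --account-name <account> --resource-group <rg> --policy @policy.json`.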

# Find unattached managed disks
az disk list \
  --query "[?managedBy==null].[name, diskSizeGb, sku.name, resourceGroup]" \
  --output table

# Find old disk snapshots (30+ days). Uses GNU date; on macOS use: date -u -v-30d '+%Y-%m-%dT%H:%M:%SZ'
threshold=$(date -u -d '30 days ago' '+%Y-%m-%dT%H:%M:%SZ')
az snapshot list \
  --query "[?timeCreated < '$threshold'].[name, diskSizeGb, resourceGroup]" \
  --output table

# Find legacy accounts (general-purpose v1 and BlobStorage) that should be upgraded to GPv2
az graph query -q "
  Resources
  | where type =~ 'microsoft.storage/storageaccounts'
  | where kind == 'Storage' or kind == 'BlobStorage'
  | project name, resourceGroup, kind, location
"

Azure SQL Database Cost Optimization -- Comprehensive Guide

Based on the Azure SQL Database Cost Management documentation:

Purchasing Model Decision

SQL Cost Optimization Strategies

Strategy	Description	Savings	When to Use
Serverless tier	Auto-pause when idle, pay per second of compute used	Up to 70% for sporadic	Dev/test, apps with idle periods
Elastic Pools	Share provisioned resources across multiple databases	30-50% vs individual DBs	Multi-tenant apps, varying DB workloads
Reserved Capacity	Pre-commit to vCores for 1 or 3 years	33-65%	Stable, predictable production databases
Azure Hybrid Benefit	Use existing SQL Server licenses	Up to 55%	Customers with Software Assurance
DTU to vCore migration	Move from DTU to vCore for more granular control	Variable	When DTU model is inefficient
Right-size compute	Match provisioned vCores/DTUs to actual usage	20-50%	Over-provisioned databases
Scale down non-prod	Use lower tiers for dev/test/staging	50-80%	Non-production environments
Free tier	100,000 vCore seconds + 32 GB storage per month free	100%	Small apps, prototyping

# Find Azure SQL Databases and their pricing tiers
az graph query -q "
  Resources
  | where type =~ 'microsoft.sql/servers/databases'
  | where name != 'master'
  | project name, resourceGroup,
            sku=sku.name, tier=sku.tier,
            capacity=sku.capacity,
            location
  | order by tier asc
"

# List provisioned General Purpose databases to review as serverless candidates.
# Serverless SKU names start with GP_S_, so those databases are excluded.
az graph query -q "
  Resources
  | where type =~ 'microsoft.sql/servers/databases'
  | where sku.tier == 'GeneralPurpose'
  | where tostring(sku.name) !startswith 'GP_S_'
  | project name, resourceGroup, sku=sku.name
"

Cosmos DB Cost Optimization

Strategy	Description	Savings
Autoscale throughput	Automatically scale RUs up/down based on demand	10-70% vs fixed provisioned
Reserved capacity	Pre-purchase RUs for 1 or 3 years	Up to 65%
Serverless mode	Pay per request for sporadic workloads	Up to 90% vs provisioned idle
Optimize query RU cost	Use indexing policies, avoid cross-partition queries	30-80% per query
Rate limiting	Implement client-side rate limiting to avoid 429s	Prevents over-provisioning
Time-to-live (TTL)	Auto-delete old documents to reduce storage	Variable
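
To make the autoscale row concrete, the sketch below compares a fixed provisioned container with an autoscale one over a single day. It rests on assumptions that should be verified against current Cosmos DB pricing: autoscale is billed each hour for the highest RU/s reached, never below 10% of the configured maximum, at 1.5x the standard rate (single write region), and the rate of 0.008 per 100 RU/s per hour is illustrative only. The function names are hypothetical.

```python
AUTOSCALE_MULTIPLIER = 1.5   # assumption: single-write-region autoscale rate vs standard
AUTOSCALE_FLOOR = 0.10       # assumption: autoscale never bills below 10% of max RU/s


def manual_cost(provisioned_rus, hours, price_per_100ru_hour):
    """Cost of fixed provisioned throughput for a number of hours."""
    return provisioned_rus / 100 * price_per_100ru_hour * hours


def autoscale_cost(hourly_peak_rus, max_rus, price_per_100ru_hour):
    """Cost of autoscale throughput given the peak RU/s observed in each hour."""
    floor = AUTOSCALE_FLOOR * max_rus
    total = 0.0
    for peak in hourly_peak_rus:
        billed = min(max(peak, floor), max_rus)  # clamp to [10% of max, max]
        total += billed / 100 * price_per_100ru_hour * AUTOSCALE_MULTIPLIER
    return total


# One day: 4 busy hours at 10,000 RU/s, 20 quiet hours at or below 1,000 RU/s
peaks = [10_000] * 4 + [800] * 20
price = 0.008  # illustrative rate per 100 RU/s per hour
fixed = manual_cost(10_000, len(peaks), price)
auto = autoscale_cost(peaks, 10_000, price)
print(f"fixed: {fixed:.2f}  autoscale: {auto:.2f}  saving: {1 - auto / fixed:.1%}")
```

Under these assumptions a steady workload that sits near its maximum most of the day would cost more on autoscale, which is why the savings range in the table starts low and depends on how spiky the demand is.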

PostgreSQL / MySQL Flexible Server

Strategy	Description
Burstable tier (B-series)	For dev/test and low-traffic workloads
Stopped server	Stop the server when not in use (no compute charges)
Reserved capacity	Pre-commit for 1 or 3 years for 30-65% savings
Scale compute tier	Match compute to actual workload requirements
Storage auto-grow	Prevent manual over-provisioning of storage

6.5 Networking -- The Hidden Cost Driver

Key insight: Data transfer (egress) is often the most overlooked cost in Azure. When the architecture is not designed with data flows in mind, egress can reach 10-20% of the bill without anyone noticing.

Networking Cost Drivers Ranked by Impact

#	Resource / Cost Driver	Typical Cost	How It Accumulates
1	Data Transfer (Egress)	$0.05-$0.087/GB	Cross-region, internet-bound data adds up fast
2	ExpressRoute	$55-$500/month + per-GB metered	Charges even when unused. Metered plans charge per GB
3	Azure Firewall	$912/month (Standard) or $1,827/month (Premium)	Base cost per hour + per-GB processed
4	Application Gateway / WAF	$250-$700/month	Fixed capacity units + per-GB processed
5	VPN Gateway	$139-$1,012/month	Charged per hour. VpnGw1 vs VpnGw5 varies 7x
6	Load Balancer (Standard)	$18/month + per-rule	Charged per rule per hour. Idle LBs still cost
7	Public IPs (Standard)	$3.65/month each	Charged even when unattached. Orphaned PIPs add up
8	NAT Gateway	$32/month + per-GB	Steady charge even with low traffic

Egress Cost Reduction Architecture

Networking Cost Optimization Checklist

#	Action	Azure CLI / Resource Graph	Savings
1	Delete orphaned Public IPs	`az graph query -q "Resources | where type =~ 'microsoft.network/publicIPAddresses' | where isnull(properties.ipConfiguration)"`	$3.65/month each
2	Remove idle Load Balancers	Check for empty backend pools	$18+/month each
3	Remove idle Application Gateways	Check for empty backend pools	$250+/month each
4	Delete unused VNet Gateways	Check for zero connections	$139+/month each
5	Downgrade Azure Firewall Premium to Standard	Verify no Premium features (TLS inspection, IDPS) used	$900+/month
6	Consolidate Azure Firewalls (hub-spoke)	One central Firewall per region, not per spoke	$900+/month per extra
7	Deprovision unused ExpressRoute circuits	Check traffic, confirm no active peerings	$55-500/month each
8	Use Private Endpoints for PaaS services	Eliminates public internet egress	Variable
9	Co-locate services in same region	Free intra-region data transfer	Variable
10	Use Azure CDN for static content delivery	Free Azure-to-CDN transfer	Variable

Key Egress Rules of Thumb

Intra-region transfer: Free (between services in the same Azure region)
Inter-region transfer: $0.02-$0.087/GB depending on regions
Internet egress: $0.05-$0.087/GB (first 5 GB/month free)
ExpressRoute metered egress: $0.025/GB for Zone 1
Azure CDN delivery: Free from Azure origin to CDN PoP
Target: No more than 15% of total Azure spend on data transfer
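
The rules of thumb above can be turned into a quick check of whether egress stays under the 15% target. This is a rough sketch: the function name `egress_share` and the example volumes are assumptions, and the per-GB rate is one of the illustrative figures listed above rather than a quote for any specific region.

```python
EGRESS_TARGET = 0.15  # target from this section: at most 15% of total spend


def egress_share(egress_gb, rate_per_gb, total_monthly_spend, free_gb=5.0):
    """Return (egress cost, share of total spend) for internet-bound traffic."""
    if total_monthly_spend <= 0:
        raise ValueError("total_monthly_spend must be positive")
    billable_gb = max(0.0, egress_gb - free_gb)  # first 5 GB/month are free
    cost = billable_gb * rate_per_gb
    return cost, cost / total_monthly_spend


cost, share = egress_share(egress_gb=20_000, rate_per_gb=0.087, total_monthly_spend=10_000)
status = "over target" if share > EGRESS_TARGET else "within target"
print(f"egress cost: ${cost:,.2f} ({share:.1%} of spend, {status})")
```

A workload that lands over the target is a candidate for the checklist items above, starting with same-region co-location and CDN offload.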

# Find orphaned Public IPs
az graph query -q "
  Resources
  | where type =~ 'microsoft.network/publicIPAddresses'
  | where isnull(properties.ipConfiguration)
  | project name, resourceGroup, sku=sku.name, location
"

# Find idle Load Balancers (empty backend pools)
az graph query -q "
  Resources
  | where type =~ 'microsoft.network/loadbalancers'
  | where array_length(properties.backendAddressPools) == 0
  | project name, resourceGroup, sku=sku.name, location
"

# Find Azure Firewalls and their SKU (Premium vs Standard)
az graph query -q "
  Resources
  | where type =~ 'microsoft.network/azureFirewalls'
  | project name, resourceGroup, sku=properties.sku.tier, location
"

# List VNet Gateways with type and SKU (this query does not check connections; confirm each gateway has none before deleting)
az graph query -q "
  Resources
  | where type =~ 'microsoft.network/virtualNetworkGateways'
  | project name, resourceGroup, gatewayType=properties.gatewayType,
            sku=properties.sku.name, location
"

6.6 Monitoring & Log Analytics

Log Analytics Cost Optimization

Optimization	Description	Savings Potential
Commitment Tiers	Pre-commit to daily ingestion volume (100, 200, 300+ GB/day)	15-30%
Basic Logs Plan	Lower cost for infrequently queried tables	60-80%
Data Collection Rules (DCR)	Filter and transform data before ingestion	Reduce volume 30-50%
Table-level Retention	Set different retention per table based on compliance needs	Variable
Dedicated Cluster	Aggregate 500+ GB/day across workspaces	Volume discount
Managed Prometheus	Replace Container Insights metrics for AKS	Significant
Transformations	Modify/filter control plane and data plane logs before storage	20-50%
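
Whether a commitment tier pays off depends on how close daily ingestion sits to the tier size. The sketch below compares options for a given daily volume. Every price in it is hypothetical and exists only to show the arithmetic, and it assumes that ingestion above a tier is billed at that tier's effective per-GB rate; check both against the current Azure Monitor pricing page. The function names are illustrative.

```python
def daily_cost_options(ingest_gb, payg_per_gb, tiers):
    """Return {option name: daily cost}. tiers maps commit GB/day -> flat daily price."""
    options = {"pay-as-you-go": ingest_gb * payg_per_gb}
    for commit_gb, flat_price in tiers.items():
        effective_rate = flat_price / commit_gb          # per-GB rate implied by the tier
        overage_gb = max(0.0, ingest_gb - commit_gb)     # volume above the commitment
        options[f"{commit_gb} GB/day tier"] = flat_price + overage_gb * effective_rate
    return options


def cheapest(options):
    """Return the (name, cost) pair with the lowest daily cost."""
    return min(options.items(), key=lambda item: item[1])


# Hypothetical prices, for illustration only
PAYG_PER_GB = 2.30
TIERS = {100: 196.0, 200: 368.0, 300: 540.0}

for volume in (60, 150, 260):
    name, cost = cheapest(daily_cost_options(volume, PAYG_PER_GB, TIERS))
    print(f"{volume} GB/day -> {name} at {cost:.2f}/day")
```

Reducing volume first with DCR filtering and Basic Logs, then sizing the commitment to what remains, avoids committing to data that should not be ingested at all.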

Log Analytics Decision Tree

AKS Monitoring Cost Optimization

Before	After	Impact
Container Insights (full metrics)	Managed Prometheus + Grafana	Significant cost reduction
All control plane logs enabled	Disable unnecessary log categories	Reduce Log Analytics ingestion
Default data collection	Custom DCR with filtering	Ingest only needed data

# Check Log Analytics workspace pricing tier
az monitor log-analytics workspace show \
  --resource-group myRG \
  --workspace-name myWorkspace \
  --query "{name:name, sku:sku.name, retentionDays:retentionInDays}" \
  --output table

6.7 App Service & Azure Functions

App Service Cost Optimization -- Deep Guide

Strategy	Description	Savings	Portal / CLI
Upgrade v2 to v3 SKU	v3 is cheaper AND supports RI/Savings Plan	10-30%	Portal > App Service Plan > Scale up
Delete Stopped apps	Stopped apps still occupy their App Service Plan, which bills regardless. Delete them, then scale in or remove the freed plan	Up to 100% of the freed plan capacity	`az webapp delete`
Enable autoscale	Scale instances based on CPU/memory/HTTP queue	20-50%	Portal > Scale out > Custom autoscale
Reserved Instances	Pre-commit for P1v3, P2v3, P3v3 plans	35-55%	Portal > Reservations > App Service
Use Free/Shared tier for dev prototyping	F1 (free) or D1 (shared) for non-production	100%	Portal > Scale up
Consolidate apps onto fewer plans	Multiple apps can share one App Service Plan	Variable	Move apps to shared plan
Use Deployment Slots only when needed	Each slot consumes resources on the plan	Reduce instance count	Portal > Deployment slots

# Find stopped App Services (still being charged)
az graph query -q "
  Resources
  | where type =~ 'microsoft.web/sites'
  | where properties.state == 'Stopped'
  | project name, resourceGroup, kind, location
"

# Find App Service Plans and their SKUs
az graph query -q "
  Resources
  | where type =~ 'microsoft.web/serverfarms'
  | project name, resourceGroup,
            sku=sku.name, tier=sku.tier,
            workers=properties.numberOfWorkers,
            location
  | order by tier asc
"

# Find App Service Plans with zero apps (wasted cost)
az graph query -q "
  Resources
  | where type =~ 'microsoft.web/serverfarms'
  | where properties.numberOfSites == 0
  | project name, resourceGroup, sku=sku.name, location
"

Azure Functions Cost Optimization

Plan	Best For	Cost Model	Key Optimization
Consumption	Sporadic, event-driven workloads	Pay per execution + execution time	First 1M executions/month free. Scale to zero
Flex Consumption	Variable workloads needing VNET	Per-second billing with always-ready instances	Set always-ready minimum low
Premium (EP)	Sustained load with VNET/unlimited execution	Pre-warmed instances, per-second billing	Right-size minimum instances
Dedicated	Apps sharing App Service Plan	App Service Plan pricing	Already paying for plan
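
For the Consumption plan, the bill is driven by execution count and by memory-time (GB-seconds). The estimate below is a sketch under stated assumptions: the default prices and the 400,000 GB-second free grant are taken from commonly published Consumption pricing and may differ by region or change over time; only the 1M free executions figure comes from the table above. The function name is hypothetical, and real billing also rounds memory and duration, which this sketch ignores.

```python
def consumption_monthly_cost(
    executions,
    avg_seconds,
    memory_gb,
    price_per_million_exec=0.20,     # assumed list price, verify before use
    price_per_gb_second=0.000016,    # assumed list price, verify before use
    free_executions=1_000_000,       # monthly free grant (from the table above)
    free_gb_seconds=400_000,         # assumed monthly free grant
):
    """Estimate the monthly cost of a function app on the Consumption plan."""
    gb_seconds = executions * avg_seconds * memory_gb
    billable_exec = max(0, executions - free_executions)
    billable_gb_seconds = max(0.0, gb_seconds - free_gb_seconds)
    return (
        billable_exec / 1_000_000 * price_per_million_exec
        + billable_gb_seconds * price_per_gb_second
    )


# 3M executions a month, 0.5 s each, 0.5 GB of memory
print(f"{consumption_monthly_cost(3_000_000, 0.5, 0.5):.2f}")
# A small app that stays inside the free grants
print(f"{consumption_monthly_cost(200_000, 0.2, 0.128):.2f}")
```

Comparing this figure with the fixed monthly price of the smallest Premium or Dedicated plan gives a first indication of when a sustained workload has outgrown Consumption.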

Logic Apps Cost Tips

Scenario	Recommendation
High-volume workflows (100k+ actions/month)	Standard tier (fixed, predictable pricing)
Sporadic triggers (< 1000 runs/month)	Consumption tier (pay per action)
Enterprise integration with ISE	Evaluate migration to Standard + VNET integration

6.8 AI & Azure OpenAI Services -- Cost Optimization Deep Dive

Azure OpenAI Pricing Models

Model	Pricing Approach	Best For
Pay-as-you-go (Standard)	Per token consumed	Variable, unpredictable workloads
Provisioned Throughput (PTU)	Fixed hourly rate for reserved capacity	Predictable, high-volume production
Batch API	Same tokens at reduced rate, async processing	Non-real-time inference (reports, analysis)
Global Standard	Lower per-token cost, Microsoft routes to optimal region	Latency-insensitive workloads

Token Cost Comparison (Illustrative)

Model	Input (per 1M tokens)	Output (per 1M tokens)	Use When
GPT-4o	$2.50	$10.00	Complex reasoning, code generation
GPT-4o-mini	$0.15	$0.60	Summarization, classification, simple Q&A
GPT-3.5-turbo	$0.50	$1.50	Legacy workloads, simple completions

Key insight: At the illustrative prices above, GPT-4o-mini is roughly 17x cheaper than GPT-4o on both input and output tokens, and it delivers excellent quality for most business tasks. Always test whether a cheaper model meets your quality bar.
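
The per-request effect of the prices in the table can be worked out directly. The sketch below uses only the illustrative prices listed above; the function name `request_cost` and the example token counts are assumptions.

```python
# Illustrative prices from the table above: (input, output) in USD per 1M tokens
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-3.5-turbo": (0.50, 1.50),
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request at the illustrative per-1M-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example request: 1,500 prompt tokens and 500 completion tokens
large = request_cost("gpt-4o", 1_500, 500)
small = request_cost("gpt-4o-mini", 1_500, 500)
print(f"gpt-4o:      ${large:.6f}")
print(f"gpt-4o-mini: ${small:.6f}")
print(f"ratio: {large / small:.1f}x, saving: {1 - small / large:.0%}")
```

At these prices the example request costs $0.008750 on GPT-4o and $0.000525 on GPT-4o-mini, a ratio of about 16.7x.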

Azure OpenAI Cost Optimization Strategies

Strategy	Description	Savings Potential
Model right-sizing	Use GPT-4o-mini instead of GPT-4o where quality allows	80-95% per request
Prompt engineering	Reduce system prompt tokens, use structured output, fewer examples	20-50% per request
Semantic caching	Cache embeddings of similar queries, return cached responses	50-90% for repeated queries
Batch API	Process non-urgent requests in batch at reduced rates	50% per token
PTU commitment	Pre-purchase provisioned throughput for predictable workloads	30-50% vs pay-as-you-go
Max tokens limit	Set `max_tokens` to prevent unnecessarily long responses	10-40% per request
Content filtering	Only enable needed content filters (custom filters cost more)	Variable
Chargeback by API key	Separate API keys per business unit to track and allocate costs	Accountability
Azure Budgets	Set budget alerts on OpenAI resource groups	Prevention

# Find all Azure OpenAI resources and their configurations
az graph query -q "
  Resources
  | where type =~ 'microsoft.cognitiveservices/accounts'
  | where kind == 'OpenAI'
  | project name, resourceGroup, sku=sku.name, location
"

Azure AI Search (formerly Azure Cognitive Search) Cost Tips

Strategy	Description
Right-size replicas	Use minimum replicas that meet query performance SLA
Right-size partitions	Match partition count to index size, not peak projections
Use Basic/Free tier for dev/test	Standard S1+ is expensive
Semantic ranker	Only enable when needed (per-query cost)

6.9 Implementation Timeline

Key Takeaways

Every workload type has unique cost levers -- there is no one-size-fits-all
VMs: Right-size first, then AHB, then Reservations/Savings Plans, then Spot for eligible workloads
AKS: Cluster Autoscaler + Spot node pools + Start/Stop for non-prod + KEDA for event-driven
Storage: Lifecycle policies are free money; delete orphaned disks and old snapshots
Networking: Egress is the hidden cost -- use Private Endpoints, CDN, same-region co-location
Monitoring: Basic Logs plan, commitment tiers, DCRs, and managed Prometheus for AKS
Start with quick wins (hours), then structured optimization (months), then FinOps culture (ongoing)

References

WAF Service Guides (Cost Optimization per Resource)

Resource	WAF Service Guide Link
Virtual Machines & VMSS	learn.microsoft.com/.../virtual-machines-review
Blob Storage	learn.microsoft.com/.../storage-accounts/cost-optimization
Azure SQL Database	learn.microsoft.com/.../azure-sql/database/cost-management
AKS	learn.microsoft.com/.../aks/best-practices-cost
Azure Firewall	learn.microsoft.com/.../azure-firewall
App Service	learn.microsoft.com/.../app-service-web-apps
Cosmos DB	learn.microsoft.com/.../cosmos-db
Log Analytics	learn.microsoft.com/.../azure-log-analytics

Other References

Previous Module: Module 5 — Usage Optimization & Waste Reduction
Next Module: Module 7 — AI Workload Cost Optimization
Back to Overview: README — Cost Optimization
