
Module 5: Usage Optimization & Waste Reduction

Duration: 60 minutes | Level: Tactical + Hands-on
WAF Alignment: CO:07 (Component Costs), CO:08 (Environment Costs), CO:12 (Scaling Costs)


5.1 The Cost of Waste​

Cloud waste (resources provisioned but not effectively used) typically represents 25–35% of total cloud spend in organizations without active FinOps practices. The most common waste categories are idle resources, orphaned components, and over-provisioned infrastructure.

Waste Impact by Category​

| Waste Category | Typical Monthly Cost Per Instance | Detection Difficulty | Remediation Effort |
|---|---|---|---|
| Idle VMs (not deallocated) | $50–$2,000+ | Easy | Low |
| Unattached Premium Disks | $20–$300 | Easy | Low |
| Orphan Public IPs (Standard) | ~$3.60/month each | Easy | Low |
| Idle Application Gateways | $175–$500+ | Medium | Medium |
| Idle Load Balancers (Standard) | ~$18/month base | Easy | Low |
| Over-provisioned VMs | 30–70% of VM cost | Medium | Medium |
| Old snapshots | $0.05/GB/month | Easy | Low |
| Idle Web Apps (on paid plans) | $50–$500+ | Medium | Medium |
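The unit costs in the table can be turned into a rough monthly estimate by multiplying each by the number of idle resources found. The following is a minimal sketch: the inventory counts are hypothetical, the category keys are illustrative names, and the unit costs are the approximate figures from the table rather than current list prices.

```python
# Approximate monthly unit costs (USD), copied from the table above.
UNIT_COST_USD = {
    "orphan_public_ip_standard": 3.60,
    "idle_standard_load_balancer": 18.00,
    "idle_application_gateway": 175.00,  # low end of the $175-$500+ range
}


def estimate_monthly_waste(inventory: dict[str, int]) -> float:
    """Sum count * unit cost over every category in the inventory."""
    total = sum(UNIT_COST_USD[category] * count for category, count in inventory.items())
    return round(total, 2)


# Hypothetical counts, for illustration only.
inventory = {
    "orphan_public_ip_standard": 40,
    "idle_standard_load_balancer": 5,
    "idle_application_gateway": 2,
}
print(estimate_monthly_waste(inventory))
```

Because the Application Gateway figure uses the low end of its range, the result is best read as a conservative floor.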

📖 Azure Well-Architected Framework: CO:07 – Component costs


5.2 Azure Advisor Cost Recommendations​

Recommendation Categories​

Azure Advisor provides recommendations across five categories. For cost optimization, focus on the Cost category, but other categories also impact cost indirectly:

| Advisor Category | Cost Relevance | Examples |
|---|---|---|
| Cost | Direct | Right-size VMs, shutdown idle resources, buy reservations, use AHB |
| Performance | Indirect | Avoid over-provisioning by matching SKU to performance needs |
| Reliability | Indirect | Prevent costly outages and over-provisioned HA configurations |
| Security | Indirect | Avoid cost of breach remediation |
| Operational Excellence | Indirect | Automate cost-saving operations |

Accessing Advisor Recommendations Programmatically​

# Get all cost recommendations
az advisor recommendation list --category Cost --output table

# Get right-sizing recommendations specifically
az advisor recommendation list \
--category Cost \
--query "[?shortDescription.solution=='Right-size or shutdown underutilized virtual machines']" \
--output table

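# Get reservation purchase recommendations
# (contains() is case-sensitive; 'reserv' matches both "reservation" and "reserved",
#  adjust the search string to the wording Advisor returns in your tenant)
az advisor recommendation list \
  --category Cost \
  --query "[?contains(shortDescription.solution, 'reserv')]" \
  --output table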

# Export all cost recommendations to JSON
az advisor recommendation list \
--category Cost \
--output json > advisor-cost-recommendations.json
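Once the recommendations are exported, they can be aggregated offline. The sketch below totals annual savings per recommendation text. It assumes each entry carries `shortDescription.solution` and `extendedProperties.annualSavingsAmount` (the same fields queried elsewhere in this module); the sample records are hypothetical and far smaller than a real export.

```python
from collections import defaultdict


def summarize_savings(recommendations: list[dict]) -> dict[str, float]:
    """Total annual savings per recommendation text; entries without an amount are skipped."""
    totals: dict[str, float] = defaultdict(float)
    for rec in recommendations:
        solution = (rec.get("shortDescription") or {}).get("solution", "unknown")
        amount = (rec.get("extendedProperties") or {}).get("annualSavingsAmount")
        if amount is None:
            continue
        totals[solution] += float(amount)  # the amount may arrive as a string
    return dict(totals)


# Hypothetical records shaped like the exported JSON.
sample = [
    {"shortDescription": {"solution": "Right-size or shutdown underutilized virtual machines"},
     "extendedProperties": {"annualSavingsAmount": "1200.50"}},
    {"shortDescription": {"solution": "Right-size or shutdown underutilized virtual machines"},
     "extendedProperties": {"annualSavingsAmount": "300"}},
    {"shortDescription": {"solution": "Hypothetical recommendation without a savings amount"},
     "extendedProperties": None},
]
print(summarize_savings(sample))
```

To run it against a real export, load the file with `json.load(open("advisor-cost-recommendations.json"))` and pass the resulting list in place of `sample`.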

Via Azure Resource Graph (for cross-subscription visibility):

// All Advisor cost recommendations across subscriptions
AdvisorResources
| where type == "microsoft.advisor/recommendations"
| where properties.category == "Cost"
| project
    subscriptionId,
    Resource = tostring(properties.resourceMetadata.resourceId),
    Recommendation = tostring(properties.shortDescription.solution),
    Impact = tostring(properties.impact),
    // Cast to a number so the sort below is numeric rather than alphabetical
    AnnualSavings = todouble(properties.extendedProperties.annualSavingsAmount),
    Currency = tostring(properties.extendedProperties.savingsCurrency)
| order by AnnualSavings desc

Via REST API:

# Get cost recommendations via REST API
# (\$filter is escaped so the shell does not expand it as a variable)
az rest --method GET \
  --url "https://management.azure.com/subscriptions/{sub-id}/providers/Microsoft.Advisor/recommendations?api-version=2022-10-01&\$filter=Category eq 'Cost'" \
  --output json

📖 Azure Advisor Cost recommendations
📖 Advisor REST API


5.3 Usage Optimization Workbook (Azure Advisor)​

The Cost Optimization Workbook in Azure Advisor provides a centralized view of waste across your entire Azure estate:

| Category | Checks Included |
|---|---|
| Compute | Stopped (not deallocated) VMs, Deallocated VMs, VMSS optimization, Advisor right-sizing recommendations |
| Storage | Storage v1 accounts, Unattached disks, Old snapshots, Premium storage snapshots, Idle backups |
| Networking | Azure Firewall Premium misuse, Firewall instances per region, Idle App Gateways, Idle Load Balancers, Orphan Public IPs, Idle VNet Gateways |
| Services | Web Apps, AKS clusters, Azure Synapse, Log Analytics workspaces |

How to Access​

  1. Navigate to Azure Portal > Azure Advisor
  2. Click Workbooks in the left menu
  3. Open Cost Optimization (Preview) workbook
  4. Use subscription and tag filters to scope your view

📖 Cost Optimization Workbook


5.4 Azure Orphaned Resources Workbook​

The Azure Orphaned Resources Workbook (by Dolev Shor, Microsoft) is a community-driven Azure Workbook that identifies orphaned and idle resources across your subscriptions. It goes beyond Advisor by checking a broader set of resource types.

What It Checks​

| Resource Type | Detection Logic |
|---|---|
| Managed Disks | Disks with diskState == "Unattached" |
| Public IP Addresses | PIPs with no ipConfiguration association |
| Network Interfaces | NICs not attached to any VM or service |
| Network Security Groups | NSGs not associated with any subnet or NIC |
| Route Tables | Route tables not associated with any subnet |
| Load Balancers | LBs with empty backend address pools |
| Application Gateways | AppGWs with empty backend pools |
| Front Door WAF Policies | WAF policies not linked to any Front Door |
| Traffic Manager Profiles | Profiles with no endpoints |
| Availability Sets | Availability sets with no VMs |
| Resource Groups | Empty resource groups (no child resources) |
| API Connections | API connections not used by any Logic App |
| Certificates | Expired certificates |
| App Service Plans | Plans with no apps deployed |

How to Deploy​

  1. Go to the Azure Orphaned Resources Workbook GitHub repo
  2. Click Deploy to Azure button or manually import the workbook JSON
  3. Select your subscription and resource group
  4. Once deployed, open the workbook from Monitor > Workbooks or Advisor > Workbooks

📖 Azure Orphaned Resources GitHub


5.5 Azure Optimization Engine (AOE)​

The Azure Optimization Engine (AOE) by Hélder Pinto (Microsoft) is an extensible solution that generates optimization recommendations by collecting and analyzing Azure resource metadata beyond what Azure Advisor natively covers.

Key Capabilities​

| Capability | Description |
|---|---|
| Custom Recommendations | Generates recommendations not found in Advisor (e.g., unused App Service Plans, idle Azure SQL DBs) |
| Programmatic Analysis | Runs on Azure Automation + Logic Apps + Log Analytics |
| Export & Reporting | Results stored in Log Analytics for querying, Power BI dashboards |
| Multi-subscription | Analyzes across an entire tenant or management group hierarchy |
| Extensible | Add custom recommendation scripts (PowerShell runbooks) |
| Scheduled | Automated weekly or daily runs with email notifications |

Recommendations AOE Generates​

  • VMs with AHB not enabled
  • VMs with auto-shutdown not configured
  • Unmanaged disks (Classic)
  • Storage accounts with suboptimal access tier
  • Unused App Service Plans
  • Unused or idle Azure SQL Databases
  • VMs with public IPs in production
  • Resource groups with no resources
  • NSGs with no associations
  • And many more via community modules

How to Deploy​

  1. Go to the Azure Optimization Engine GitHub repo
  2. Follow the deployment guide (ARM template deploys Automation Account, Log Analytics Workspace, Logic Apps)
  3. Configure scope (subscriptions/management groups)
  4. Results appear in the Log Analytics Workspace and can feed into Power BI

📖 Azure Optimization Engine GitHub


5.6 WACO Waste Reduction Scripts: Detailed Descriptions

The following PowerShell scripts are available in the knowledge base for automated waste cleanup. All scripts follow the same operational pattern:

Common Pattern:

  1. Set the CSV file path (exported from Azure Advisor or a custom query)
  2. Set the Tenant ID
  3. The script installs required Az PowerShell modules (Az.Resources, Az.Accounts, etc.)
  4. Authenticates to Azure via Connect-AzAccount
  5. Iterates through each resource in the CSV
  6. Performs the remediation action (delete, deallocate, stop)
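Because these scripts delete resources, it can help to validate the CSV before any remediation runs. The sketch below is a dry-run check written in Python rather than PowerShell: it parses each resource ID and keeps only rows of the expected type. The `ResourceId` column name and the sample IDs are hypothetical; the actual CSV layout depends on how the export was produced.

```python
import csv
import io


def parse_resource_id(resource_id: str) -> dict[str, str]:
    """Split an ARM resource ID into subscription, resource group, type, and name."""
    parts = resource_id.strip("/").split("/")
    # Expected shape: subscriptions/<sub>/resourceGroups/<rg>/providers/<namespace>/<type>/<name>
    well_formed = (
        len(parts) == 8
        and parts[0].lower() == "subscriptions"
        and parts[2].lower() == "resourcegroups"
        and parts[4].lower() == "providers"
    )
    if not well_formed:
        raise ValueError(f"Unexpected resource ID shape: {resource_id}")
    return {
        "subscription": parts[1],
        "resource_group": parts[3],
        "type": f"{parts[5]}/{parts[6]}".lower(),
        "name": parts[7],
    }


def plan_deletions(csv_text: str, expected_type: str) -> list[dict[str, str]]:
    """Return parsed rows matching expected_type. Nothing is deleted (dry run)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    parsed = [parse_resource_id(row["ResourceId"]) for row in rows]
    return [p for p in parsed if p["type"] == expected_type.lower()]


# Hypothetical CSV content with one disk and one public IP.
sample_csv = (
    "ResourceId\n"
    "/subscriptions/0000/resourceGroups/rg1/providers/Microsoft.Compute/disks/disk-01\n"
    "/subscriptions/0000/resourceGroups/rg1/providers/Microsoft.Network/publicIPAddresses/pip-01\n"
)
for item in plan_deletions(sample_csv, "Microsoft.Compute/disks"):
    print(item["resource_group"], item["name"])
```

Filtering by type guards against feeding, say, a public IP export into the disk cleanup script by mistake.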

Script Details​

| # | Script | Target Resource | What It Does | Required Modules | Input CSV Format |
|---|---|---|---|---|---|
| 1 | DeleteIdleAppGW_v2.ps1 | Application Gateways | Reads a CSV of idle Application Gateways (those with no backend pool targets or zero healthy backend instances). For each gateway, it removes the resource using Remove-AzResource. AppGWs can cost $175–$500+/month even when idle, making this one of the highest-value cleanup scripts. | Az.Resources 6.1.0, Az.Accounts 2.9.1 | IdleAppGw.csv (resource IDs from Advisor) |
| 2 | DeleteIdleDisk_v2.ps1 | Managed Disks | Reads a CSV of unattached managed disks (state = Unattached). Iterates through and deletes each disk using Remove-AzResource. Targets orphaned disks left behind after VM deletion or disk detachment. Premium disks can cost $20–$300/month each while completely unused. | Az.Resources 6.1.0, Az.Accounts 2.9.1 | UnattachedDisks.csv (resource IDs) |
| 3 | DeleteIdleLB_v2.ps1 | Load Balancers | Reads a CSV of idle Standard Load Balancers that have empty backend address pools (no VMs or NICs associated). Removes each LB using Remove-AzResource. Standard LBs incur a ~$18/month base cost plus data processing charges even with no traffic flowing. | Az.Resources 6.1.0, Az.Accounts 2.9.1 | IdleLB.csv (resource IDs) |
| 4 | DeleteIdlePIP_v2.ps1 | Public IP Addresses | Reads a CSV of orphaned Public IP addresses (not associated with any NIC, Load Balancer, or NAT Gateway). Removes each PIP using Remove-AzResource. Standard SKU PIPs cost ~$3.60/month each, and organizations often accumulate hundreds of orphaned PIPs over time. | Az.Resources 6.1.0, Az.Accounts 2.9.1 | PublicIPs.csv (resource IDs) |
| 5 | DeleteIdleWebApp_v2.ps1 | App Services (Web Apps) | Reads a CSV of idle or stopped Web Applications. Uses Remove-AzWebApp to delete idle web apps. Web apps on paid plans (Basic, Standard, Premium) continue to incur App Service Plan costs even when stopped, unless the entire plan is also removed. | Az.Websites 2.11.3, Az.Accounts 2.9.1 | WebApps.csv (resource group + name) |
| 6 | DeprovisionStoppedVM_v2.ps1 | Virtual Machines | Reads a CSV of VMs in a Stopped (but not deallocated) state. Uses Stop-AzVM -Force, which deallocates the VM by default (unless -StayProvisioned is specified). VMs in "Stopped" state (via guest OS shutdown) still incur full compute charges. Only Deallocated VMs stop billing for compute. | Az.Compute 4.30.0, Az.Accounts 2.9.1 | StoppedVMs.csv (resource group + name) |
| 7 | StopAksCluster.ps1 | AKS Clusters | Uses an Azure Automation Workflow runbook to stop non-production AKS clusters. Authenticates via system-assigned managed identity (Connect-AzAccount -Identity), then calls Stop-AzAksCluster to fully stop the cluster. Stopped AKS clusters incur no compute charges (only storage for disks). Ideal for dev/test clusters used only during business hours. | Az.Aks (implicit via Automation) | Hardcoded RG and cluster name in script |
| 8 | IdentifyingNotModifiedBlobs.ps1 | Blob Storage | Scans a storage account using a SAS token, enumerating all blobs via Get-AzStorageBlob with pagination (1000 blobs per batch). For each blob, checks its LastModified date against configurable thresholds: 30 days for Cool tier candidates and 365 days for Archive tier candidates. Reports total blob count and aggregate size for each tier recommendation. Helps build the business case for lifecycle management policies. | Az.Storage 4.2.0 | Parameterized: storage account name + SAS token |
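The age-based classification that the blob script performs can be illustrated in a few lines. This Python sketch is not the script itself: it applies the same 30-day and 365-day thresholds to a hypothetical list of (last modified, size) pairs, and it assumes a strict "older than" comparison, which may differ from the script's exact boundary handling.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def tier_candidate(last_modified: datetime, now: datetime,
                   cool_days: int = 30, archive_days: int = 365) -> Optional[str]:
    """Return 'Archive', 'Cool', or None based on time since last modification."""
    age = now - last_modified
    if age > timedelta(days=archive_days):
        return "Archive"
    if age > timedelta(days=cool_days):
        return "Cool"
    return None


def summarize(blobs, now: datetime) -> dict[str, list[int]]:
    """blobs: iterable of (last_modified, size_bytes). Returns {tier: [count, total_bytes]}."""
    report = {"Cool": [0, 0], "Archive": [0, 0]}
    for last_modified, size in blobs:
        tier = tier_candidate(last_modified, now)
        if tier:
            report[tier][0] += 1
            report[tier][1] += size
    return report


# Hypothetical blob listing, relative to a fixed reference time.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
blobs = [
    (now - timedelta(days=10), 100),
    (now - timedelta(days=45), 200),
    (now - timedelta(days=400), 300),
    (now - timedelta(days=31), 50),
]
print(summarize(blobs, now))
```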

Script Usage Pattern​

# General usage pattern for CSV-based scripts:

# 1. Export idle resources from Azure Advisor or Resource Graph to CSV
# 2. Set parameters in the script:
$CsvFilePath = "C:\Temp\UnattachedDisks.csv"
$tenantID = "<YourTenantID>"

# 3. Run the script:
.\DeleteIdleDisk_v2.ps1

# For AKS stop script, edit the hardcoded parameters:
# $ResourceGroupName = "Wasteful-rg"
# $Name = "WastefulK8S"
# Then run via Azure Automation as a scheduled runbook

# For blob identification, set:
$storageAccountName = "<StorageAccountName>"
$sasToken = "<SASToken>"
$daysBeforeCoolTier = 30
$daysBeforeArchiveTier = 365

Scripts location: knowledge_base/Module Usage Optimization/Usage Optimization PowerShell Scripts/


5.7 Azure Resource Graph Queries for Finding Idle Resources​

Azure Resource Graph enables querying resource metadata at scale across subscriptions. These queries can be run in the Azure Portal (Resource Graph Explorer), Azure CLI, or PowerShell.

Unattached Managed Disks​

Resources
| where type == "microsoft.compute/disks"
| where properties.diskState == "Unattached"
| project
name,
resourceGroup,
subscriptionId,
sku = tostring(properties.sku.name),
diskSizeGB = tostring(properties.diskSizeGB),
location,
tags
| order by sku desc

Orphaned Public IP Addresses​

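Resources
| where type == "microsoft.network/publicipaddresses"
// No NIC or load balancer association, and not attached to a NAT gateway
| where (properties.ipConfiguration == "" or isnull(properties.ipConfiguration))
    and (properties.natGateway == "" or isnull(properties.natGateway))
| project
    name,
    resourceGroup,
    subscriptionId,
    sku = tostring(sku.name),
    ipAddress = tostring(properties.ipAddress),
    location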

Stopped (Not Deallocated) VMs​

Resources
| where type == "microsoft.compute/virtualmachines"
| where properties.extended.instanceView.powerState.code == "PowerState/stopped"
| project
name,
resourceGroup,
subscriptionId,
vmSize = tostring(properties.hardwareProfile.vmSize),
location

Idle Network Interfaces​

Resources
| where type == "microsoft.network/networkinterfaces"
| where isnull(properties.virtualMachine)
| where isnull(properties.privateEndpoint)
| project
name,
resourceGroup,
subscriptionId,
location

Empty Resource Groups​

ResourceContainers
| where type == "microsoft.resources/subscriptions/resourcegroups"
// Join on the resourceGroup column (lowercased on both sides) rather than name,
// so groups with mixed-case names are not reported as empty by mistake
| join kind=leftouter (
    Resources
    | summarize resourceCount = count() by resourceGroup, subscriptionId
) on resourceGroup, subscriptionId
| where isnull(resourceCount) or resourceCount == 0
| project
    name,
    subscriptionId,
    location,
    tags

App Service Plans with No Apps​

Resources
| where type == "microsoft.web/serverfarms"
| where properties.numberOfSites == 0
| project
name,
resourceGroup,
subscriptionId,
sku = tostring(sku.name),
location

Idle Load Balancers (No Backend Pools)​

Resources
| where type == "microsoft.network/loadbalancers"
| where sku.name == "Standard"
| where array_length(properties.backendAddressPools) == 0
or properties.backendAddressPools == "[]"
| project
name,
resourceGroup,
subscriptionId,
location

Running Resource Graph queries via CLI:

# Run any of the above queries
az graph query -q "Resources | where type == 'microsoft.compute/disks' | where properties.diskState == 'Unattached' | project name, resourceGroup, subscriptionId" --output table

📖 Azure Resource Graph overview
📖 Resource Graph query samples


5.8 Right-Sizing VMs: Complete Workflow

Key Metrics to Monitor​

| Metric | Threshold | Action |
|---|---|---|
| CPU Average | < 5% for 14+ days | Strongly consider shutdown or downsize |
| CPU Average | 5–20% for 7+ days | Review and evaluate downsizing |
| CPU P95 | < 30% for 14+ days | Safe to downsize (even bursts are low) |
| Memory Average | < 10% for 14+ days | Consider downsizing or B-series |
| Memory Average | 10–30% for 7+ days | Review memory-optimized alternatives |
| Network I/O | Minimal traffic for 14+ days | Evaluate if VM is still needed |
| Disk IOPS | < 5% of provisioned for 14+ days | Consider Standard SSD or smaller disk |
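The CPU and memory thresholds in the table can be applied mechanically once the metrics are collected. The sketch below is one possible encoding: the `VmStats` type, the action labels, and the order in which the rules are checked are assumptions made for illustration, and the network and disk rows are left out.

```python
from dataclasses import dataclass


@dataclass
class VmStats:
    avg_cpu: float       # average CPU %, over the observation window
    p95_cpu: float       # 95th percentile CPU %
    avg_memory: float    # average memory %
    days_observed: int   # length of the observation window in days


def rightsizing_action(s: VmStats) -> str:
    """Map utilization to an action label; the strongest signal wins (assumed precedence)."""
    if s.days_observed >= 14:
        if s.avg_cpu < 5:
            return "shutdown-or-downsize"
        if s.p95_cpu < 30:
            return "safe-to-downsize"
        if s.avg_memory < 10:
            return "consider-downsize-or-b-series"
    if s.days_observed >= 7 and (5 <= s.avg_cpu <= 20 or 10 <= s.avg_memory <= 30):
        return "review"
    return "no-action"


# Hypothetical VMs.
print(rightsizing_action(VmStats(avg_cpu=3, p95_cpu=12, avg_memory=40, days_observed=30)))
print(rightsizing_action(VmStats(avg_cpu=15, p95_cpu=60, avg_memory=50, days_observed=7)))
```

A label from this function is a prompt for review, not a decision: workload seasonality and burst requirements still need a human check.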

Azure Monitor Metrics Queries (KQL)​

Use these queries in Azure Monitor Logs or Log Analytics to analyze VM utilization:

// Average CPU utilization per VM over the last 30 days
Perf
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| where TimeGenerated > ago(30d)
| summarize AvgCPU = avg(CounterValue), P95CPU = percentile(CounterValue, 95) by Computer
| where AvgCPU < 20
| order by AvgCPU asc

// Memory utilization per VM over the last 30 days
Perf
| where ObjectName == "Memory"
| where CounterName == "% Committed Bytes In Use"
or CounterName == "% Used Memory"
| where TimeGenerated > ago(30d)
| summarize AvgMemory = avg(CounterValue), P95Memory = percentile(CounterValue, 95) by Computer
| where AvgMemory < 30
| order by AvgMemory asc

// Combined CPU + Memory view for right-sizing decisions
Perf
| where TimeGenerated > ago(30d)
| where (ObjectName == "Processor" and CounterName == "% Processor Time")
or (ObjectName == "Memory" and (CounterName == "% Committed Bytes In Use" or CounterName == "% Used Memory"))
| summarize AvgValue = avg(CounterValue) by Computer, ObjectName
| evaluate pivot(ObjectName, take_any(AvgValue))
| project
Computer,
AvgCPU = column_ifexists("Processor", 0.0),
AvgMemory = column_ifexists("Memory", 0.0)
| where AvgCPU < 20 or AvgMemory < 30
| order by AvgCPU asc

Right-Sizing Workflow​

Right-Sizing via Azure Advisor​

# Get right-sizing recommendations from Azure Advisor
az advisor recommendation list \
--category Cost \
--query "[?shortDescription.solution=='Right-size or shutdown underutilized virtual machines']" \
--output table

# Get the detailed savings amount for each recommendation
az advisor recommendation list \
--category Cost \
--query "[?shortDescription.solution=='Right-size or shutdown underutilized virtual machines'].{Resource:resourceMetadata.resourceId, Savings:extendedProperties.annualSavingsAmount, Currency:extendedProperties.savingsCurrency, CurrentSKU:extendedProperties.currentSku, TargetSKU:extendedProperties.targetSku}" \
--output table

📖 VM right-sizing recommendations
📖 Azure Monitor for VMs


5.9 Autoscaling Deep Dive​

Autoscaling Options Matrix​

| Autoscaler | Level | Best For | Key Feature |
|---|---|---|---|
| Horizontal Pod Autoscaler (HPA) | Application (AKS) | Predictable demand | Scale pod replicas based on CPU/memory/custom metrics |
| Vertical Pod Autoscaler (VPA) | Application (AKS) | Fluctuating resource needs | Adjust CPU/memory requests automatically |
| KEDA | Application (AKS) | Event-driven | Scale to zero based on event sources (queues, topics, etc.) |
| Cluster Autoscaler | Infrastructure (AKS) | Node management | Add/remove nodes based on pending pod scheduling |
| Node Auto-Provisioning (NAP) | Infrastructure (AKS) | Optimal VM selection | Auto-select best VM SKU for workload requirements |
| VMSS Autoscale | Infrastructure | General compute | Scale VM instances based on metrics or schedules |
| App Service Autoscale | Platform | Web apps | Scale plan instances based on metrics or schedules |

VMSS Autoscale Rules: Deep Dive

VMSS autoscale supports three types of scaling triggers:

1. Metric-Based Scaling:

# Create a VMSS autoscale setting based on CPU
az monitor autoscale create \
--resource-group myRG \
--resource myVMSS \
--resource-type Microsoft.Compute/virtualMachineScaleSets \
--name myAutoscaleSetting \
--min-count 2 \
--max-count 10 \
--count 2

# Add a scale-out rule (add 1 VM when CPU > 70% for 10 minutes)
az monitor autoscale rule create \
--resource-group myRG \
--autoscale-name myAutoscaleSetting \
--condition "Percentage CPU > 70 avg 10m" \
--scale out 1

# Add a scale-in rule (remove 1 VM when CPU < 25% for 10 minutes)
az monitor autoscale rule create \
--resource-group myRG \
--autoscale-name myAutoscaleSetting \
--condition "Percentage CPU < 25 avg 10m" \
--scale in 1

2. Custom Metrics Scaling:

You can scale based on Application Insights metrics, Storage queue depth, Service Bus queue length, or any custom metric:

# Scale based on Storage queue message count
# (--resource points the rule at the resource that emits the metric)
az monitor autoscale rule create \
  --resource-group myRG \
  --autoscale-name myAutoscaleSetting \
  --condition "ApproximateMessageCount > 100 avg 5m" \
  --scale out 2 \
  --resource "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/{account}/services/queue/queues/{queue}"

3. Schedule-Based Scaling:

# Add a schedule profile for business hours (Mon-Fri 8am-6pm: 5 instances)
az monitor autoscale profile create \
--resource-group myRG \
--autoscale-name myAutoscaleSetting \
--name "BusinessHours" \
--min-count 5 \
--max-count 15 \
--count 5 \
--recurrence week Mon Tue Wed Thu Fri \
--start "08:00" \
--end "18:00" \
--timezone "Pacific Standard Time"

# Add a schedule profile for nights/weekends (min 1 instance)
az monitor autoscale profile create \
--resource-group myRG \
--autoscale-name myAutoscaleSetting \
--name "OffHours" \
--min-count 1 \
--max-count 3 \
--count 1 \
--recurrence week Mon Tue Wed Thu Fri Sat Sun \
--start "18:00" \
--end "08:00" \
--timezone "Pacific Standard Time"

Autoscaling Best Practices​

| # | Practice | Why |
|---|---|---|
| 1 | Always set both scale-out AND scale-in rules | Prevent "scale-up only" cost leaks |
| 2 | Use a cooldown period (5–10 min) | Avoid flapping between scale actions |
| 3 | Set reasonable min/max instance counts | Prevent runaway scaling costs |
| 4 | Combine metric-based + schedule-based | Proactive scaling for known patterns, reactive for spikes |
| 5 | Monitor autoscale activity logs | Detect unexpected scaling events |
| 6 | Test scale-in behavior | Ensure graceful connection draining |
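The interaction between paired rules, instance limits, and cooldown is easier to see in a small simulation. The sketch below is a deliberately simplified model, not how Azure Monitor autoscale is implemented: each CPU sample stands for one evaluation interval, thresholds of 70% and 25% mirror the earlier CLI rules, and the cooldown is counted in intervals rather than minutes.

```python
def simulate(cpu_samples, min_count=2, max_count=10, start=2,
             out_above=70, in_below=25, cooldown=1):
    """Return the instance count after each sample under simple scale-out/scale-in rules."""
    count = start
    wait = 0  # evaluation intervals remaining before another scale action is allowed
    history = []
    for cpu in cpu_samples:
        if wait > 0:
            wait -= 1  # still cooling down: ignore this sample
        elif cpu > out_above and count < max_count:
            count += 1
            wait = cooldown
        elif cpu < in_below and count > min_count:
            count -= 1
            wait = cooldown
        history.append(count)
    return history


samples = [80, 85, 90, 20, 20, 20, 20]
print(simulate(samples, cooldown=1))  # with cooldown
print(simulate(samples, cooldown=0))  # without cooldown
```

Without a cooldown the count reacts to every sample, which is the flapping behavior practice 2 is meant to avoid; the min and max bounds cap how far either direction can run.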

📖 VMSS Autoscale
📖 Autoscale best practices
📖 Predictive autoscale


5.10 Azure Automation: Start/Stop VMs​

Start/Stop VMs v2 schedules VM start and stop operations to reduce compute costs during non-business hours. Unlike the original Azure Automation-based solution, v2 runs on Azure Functions and Logic Apps.
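The potential saving from a start/stop schedule follows directly from the share of the week a VM stays deallocated. A minimal sketch of that arithmetic, assuming compute is billed only while the VM is running and ignoring disks and other charges that continue while it is deallocated:

```python
HOURS_PER_WEEK = 24 * 7  # 168


def schedule_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of weekly compute cost avoided by running only on the given schedule."""
    running_hours = hours_per_day * days_per_week
    return round(1 - running_hours / HOURS_PER_WEEK, 3)


# Example: a dev VM that runs 08:00-18:00, Monday to Friday (50 of 168 hours).
print(schedule_savings(hours_per_day=10, days_per_week=5))
```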

Option 1: Auto-Shutdown (Single VM)​

# Enable auto-shutdown for a single VM
# --time is interpreted as UTC (hhmm); 0300 UTC is 19:00 Pacific Standard Time
az vm auto-shutdown \
  --resource-group "DevTest-RG" \
  --name "dev-vm-01" \
  --time "0300"

# Enable with an email notification 30 minutes before shutdown
az vm auto-shutdown \
  --resource-group "DevTest-RG" \
  --name "dev-vm-01" \
  --time "0300" \
  --email "team@contoso.com"

Option 2: Start/Stop VMs v2 (Fleet-Level)​

The Start/Stop VMs v2 solution deploys to your subscription and provides:

| Feature | Description |
|---|---|
| Scheduled stop/start | Cron-based schedules for groups of VMs |
| Tag-based targeting | Target VMs by tag (e.g., AutoShutdown: true) |
| Sequenced operations | Stop/start VMs in a specific order (multi-tier apps) |
| Notifications | Email alerts on action completion |
| Exclusion list | Exclude specific VMs from automation |
| Scope | Subscriptions, resource groups, or individual VMs |

Deployment:

  1. Search for Start/Stop VMs v2 in the Azure Marketplace
  2. Deploy to a resource group
  3. Configure schedules, targets, and notifications in the deployed Logic Apps

Option 3: Custom Azure Automation Runbook​

# Example: stop all running VMs in the subscription that are tagged for auto-shutdown
workflow Stop-TaggedVMs
{
    # Authenticate with the Automation account's managed identity
    Disable-AzContextAutosave -Scope Process
    $AzureContext = (Connect-AzAccount -Identity).context
    $AzureContext = Set-AzContext -SubscriptionName $AzureContext.Subscription -DefaultProfile $AzureContext

    # Get running VMs that carry the AutoShutdown tag
    $VMs = Get-AzVM -Status | Where-Object {
        $_.Tags["AutoShutdown"] -eq "true" -and
        $_.PowerState -eq "VM running"
    }

    foreach -parallel ($VM in $VMs) {
        Write-Output "Stopping VM: $($VM.Name)"
        # Stop-AzVM deallocates by default, so compute billing stops
        Stop-AzVM -Name $VM.Name -ResourceGroupName $VM.ResourceGroupName -Force
    }
}

📖 VM Auto-Shutdown
📖 Start/Stop VMs v2
📖 Azure Automation runbooks


5.11 Storage Optimization​

Storage Tiering Strategy​

Blob Tier Cost Comparison (Approximate)​

Using Hot tier as the 100% baseline for storage costs (per GB/month). Access costs scale inversely:

| Tier | Storage Cost (vs Hot) | Read Access Cost (vs Hot) | Min Retention | Rehydration |
|---|---|---|---|---|
| Hot | 100% (baseline) | 100% (baseline) | None | Instant |
| Cool | ~50% of Hot | ~10x Hot reads | 30 days | Instant |
| Cold | ~35% of Hot | ~14x Hot reads | 90 days | Instant |
| Archive | ~5–10% of Hot | ~500x+ Hot reads | 180 days | Up to 15 hours (standard) or under 1 hour (high priority) |

Rule of Thumb: If data is accessed less than once per month, Cool tier saves money. If accessed less than once per quarter, Cold tier. If accessed less than once per year, Archive tier.
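The rule of thumb comes from a trade-off: colder tiers cut the storage price but raise the read price. The sketch below compares the online tiers using the relative multipliers from the table. The two base prices are hypothetical placeholders, read cost is modeled as a flat per-GB charge, and early-deletion penalties are ignored, so the output shows the shape of the trade-off rather than a bill.

```python
# Relative multipliers (storage, read) versus Hot, taken from the table above.
TIER_MULTIPLIERS = {
    "Hot": (1.0, 1.0),
    "Cool": (0.5, 10.0),
    "Cold": (0.35, 14.0),
}

# Hypothetical Hot-tier base prices in USD, for illustration only.
HOT_STORAGE_PER_GB_MONTH = 0.02
HOT_READ_PER_GB = 0.001


def monthly_cost(tier: str, stored_gb: float, read_gb: float) -> float:
    """Storage plus read cost for one month in the given tier."""
    storage_mult, read_mult = TIER_MULTIPLIERS[tier]
    return (stored_gb * HOT_STORAGE_PER_GB_MONTH * storage_mult
            + read_gb * HOT_READ_PER_GB * read_mult)


def cheapest_tier(stored_gb: float, read_gb: float) -> str:
    """Tier with the lowest modeled monthly cost."""
    return min(TIER_MULTIPLIERS, key=lambda tier: monthly_cost(tier, stored_gb, read_gb))


# 1,000 GB stored, with increasing monthly read volume.
for read_gb in (0, 1000, 5000):
    print(read_gb, cheapest_tier(1000, read_gb))
```

Archive is left out because its rehydration delay and 180-day minimum retention make it a different kind of decision than a pure price comparison.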

Storage Lifecycle Management Policy (JSON Example)​

{
  "rules": [
    {
      "enabled": true,
      "name": "MoveOldBlobsToCool",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["data/", "logs/", "reports/"]
        }
      }
    },
    {
      "enabled": true,
      "name": "MoveOldBlobsToCold",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToCold": { "daysAfterModificationGreaterThan": 90 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["data/", "logs/", "reports/"]
        }
      }
    },
    {
      "enabled": true,
      "name": "ArchiveOldBlobs",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 180 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["data/", "logs/"]
        }
      }
    },
    {
      "enabled": true,
      "name": "DeleteOldSnapshots",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "snapshot": {
            "delete": { "daysAfterCreationGreaterThan": 90 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"]
        }
      }
    },
    {
      "enabled": true,
      "name": "DeleteOldVersions",
      "type": "Lifecycle",
      "definition": {
        "actions": {
          "version": {
            "delete": { "daysAfterCreationGreaterThan": 60 }
          }
        },
        "filters": {
          "blobTypes": ["blockBlob"]
        }
      }
    }
  ]
}

Applying lifecycle policy via CLI:

az storage account management-policy create \
--account-name <storage-account> \
--resource-group <resource-group> \
--policy @lifecycle-policy.json
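Before applying a policy, it can be worth checking that the tiering thresholds progress in the intended order (Cool before Cold before Archive). The sketch below is a simple pre-flight check under two assumptions: the policy follows the rule structure shown above, and prefix filters are ignored, so thresholds are compared across all enabled rules. The inline policy is a trimmed, hypothetical stand-in for `lifecycle-policy.json`.

```python
TIER_ORDER = ["tierToCool", "tierToCold", "tierToArchive"]


def tier_thresholds(policy: dict) -> dict[str, int]:
    """Collect daysAfterModificationGreaterThan for each tiering action in enabled rules."""
    found: dict[str, int] = {}
    for rule in policy.get("rules", []):
        if not rule.get("enabled", True):
            continue
        base_blob = rule["definition"]["actions"].get("baseBlob", {})
        for action in TIER_ORDER:
            if action in base_blob:
                found[action] = base_blob[action]["daysAfterModificationGreaterThan"]
    return found


def thresholds_are_ordered(policy: dict) -> bool:
    """True when each colder tier has a strictly larger day threshold than the warmer one."""
    thresholds = tier_thresholds(policy)
    days = [thresholds[action] for action in TIER_ORDER if action in thresholds]
    return all(earlier < later for earlier, later in zip(days, days[1:]))


def make_rule(action: str, days: int) -> dict:
    """Helper for building a minimal rule in the same shape as the policy above."""
    return {
        "enabled": True,
        "definition": {"actions": {"baseBlob": {action: {"daysAfterModificationGreaterThan": days}}}},
    }


policy = {"rules": [make_rule("tierToCool", 30), make_rule("tierToCold", 90), make_rule("tierToArchive", 180)]}
print(tier_thresholds(policy), thresholds_are_ordered(policy))
```

For a real file, `json.load` on `lifecycle-policy.json` yields a dictionary that can be passed to the same functions.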

Storage Optimization Checklist​

| # | Action | Savings Potential |
|---|---|---|
| 1 | Upgrade Storage v1 to v2 (GPv2) | Enables tiering |
| 2 | Enable lifecycle management policies | Automate tiering |
| 3 | Delete unattached managed disks | Immediate savings |
| 4 | Remove old snapshots (30+ days) | Storage reduction |
| 5 | Move snapshots from Premium to Standard tier | ~60% savings |
| 6 | Use reserved capacity for stable storage workloads | Up to 38% savings |
| 7 | Enable soft delete with reasonable retention (7–14 days) | Avoid over-retention |
| 8 | Review backup redundancy (GRS vs LRS for dev/test) | ~50% backup savings |
| 9 | Delete empty storage containers | Cleanup |
| 10 | Review and reduce blob versioning retention | Reduce version sprawl |

📖 Storage Lifecycle Management
📖 Access tiers overview
📖 Storage pricing


5.12 Data Lifecycle Management: Log Analytics Retention

Log Analytics workspaces can accumulate significant data volumes. Azure charges for both ingestion and retention beyond the default free period.

Retention Strategies​

| Strategy | Detail | Cost Impact |
|---|---|---|
| Default retention | 30 days included free (interactive). Up to 730 days configurable per table. | Free first 30 days, then ~$0.10/GB/month |
| Archive tier | Data older than interactive retention moves to archive tier. Up to 7 years total. Query via search jobs (slower). | ~$0.02/GB/month (vs $0.10 interactive) |
| Data collection rules | Filter or transform data at ingestion time to reduce volume | Reduce ingestion costs by 30–80% |
| Table-level retention | Set different retention per table. Keep security logs longer, performance logs shorter. | Optimize cost vs compliance |
| Basic logs tier | Lower-cost ingestion for high-volume, low-query tables (e.g., verbose traces) | ~60–70% cheaper ingestion |
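To compare retention settings, the approximate rates from the table can be plugged into a steady-state estimate. This sketch assumes a constant daily ingestion volume, 30 free days of interactive retention, and the rounded rates of $0.10 and $0.02 per GB per month quoted above; ingestion charges are not included, and actual prices vary by region.

```python
def monthly_retention_cost(daily_gb: float, interactive_days: int, total_days: int,
                           free_days: int = 30,
                           interactive_rate: float = 0.10,
                           archive_rate: float = 0.02) -> float:
    """Steady-state monthly retention cost in USD for one table or workspace."""
    # Data held in interactive retention beyond the free period.
    paid_interactive_gb = daily_gb * max(interactive_days - free_days, 0)
    # Data held in the archive tier after interactive retention ends.
    archived_gb = daily_gb * max(total_days - interactive_days, 0)
    return round(paid_interactive_gb * interactive_rate + archived_gb * archive_rate, 2)


# 10 GB/day kept for 730 days in total, with two different interactive windows.
print(monthly_retention_cost(10, interactive_days=365, total_days=730))
print(monthly_retention_cost(10, interactive_days=90, total_days=730))
```

Shortening the interactive window while keeping the same total retention moves data to the cheaper tier, at the cost of slower queries through search jobs.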

Configuration​

# Set workspace-level retention to 90 days
az monitor log-analytics workspace update \
--resource-group myRG \
--workspace-name myWorkspace \
--retention-time 90

# Set table-level retention (SecurityEvent table: 365 days interactive, 730 total)
az monitor log-analytics workspace table update \
--resource-group myRG \
--workspace-name myWorkspace \
--name SecurityEvent \
--retention-time 365 \
--total-retention-time 730

# Set a table to Basic logs tier (cheaper ingestion)
az monitor log-analytics workspace table update \
--resource-group myRG \
--workspace-name myWorkspace \
--name ContainerLogV2 \
--plan Basic

📖 Log Analytics data retention and archive
📖 Basic logs


5.13 Environment Optimization (CO:08)​

| Environment | Optimization Strategy |
|---|---|
| Production | Right-size, reserved instances, autoscaling, monitoring |
| Staging/UAT | Use smaller SKUs, schedule off-hours shutdown, consider B-series |
| Dev/Test | Dev/Test pricing, Spot VMs, auto-shutdown, B-series VMs, KEDA scale-to-zero |
| DR | Active-active only where the business funds it, minimal active-passive otherwise, On-Demand Capacity Reservations |
| Sandbox | Strict budgets, auto-delete after N days, resource locks on essential infra only |

Sandbox Cleanup Policies​

Sandbox environments often grow uncontrolled. Implement these governance controls:

1. Auto-Delete Resource Groups After N Days:

Use Azure Policy with a modify effect to apply an expiration tag, then use Azure Automation to regularly scan for and delete expired resource groups:

# Tag resource groups with an expiration date on creation
az group update \
--name sandbox-rg-01 \
--tags ExpirationDate=2026-03-25 Environment=Sandbox Owner=user@contoso.com

# Azure Automation runbook snippet (PowerShell): delete expired sandbox resource groups
# (Schedule this runbook to run daily)
$today = Get-Date
$resourceGroups = Get-AzResourceGroup | Where-Object {
    # Skip untagged groups so the tag lookups below do not fail
    $_.Tags -and
    $_.Tags["Environment"] -eq "Sandbox" -and
    $_.Tags["ExpirationDate"] -and
    [datetime]$_.Tags["ExpirationDate"] -lt $today
}

foreach ($rg in $resourceGroups) {
    Write-Output "Deleting expired sandbox RG: $($rg.ResourceGroupName)"
    Remove-AzResourceGroup -Name $rg.ResourceGroupName -Force
}
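Since the runbook deletes whole resource groups, the expiry test deserves defensive handling of missing or malformed tags. The sketch below shows that decision logic in isolation, in Python rather than PowerShell. It assumes `ExpirationDate` is written in ISO format (YYYY-MM-DD) as in the tagging example, and it treats anything unparseable as not expired so that a typo never triggers a deletion.

```python
from datetime import date
from typing import Optional


def is_expired(tags: Optional[dict], today: date) -> bool:
    """True only for Sandbox groups whose ExpirationDate tag is a valid past date."""
    if not tags or tags.get("Environment") != "Sandbox":
        return False
    raw = tags.get("ExpirationDate")
    if not raw:
        return False
    try:
        expires = date.fromisoformat(raw)
    except ValueError:
        return False  # malformed date: skip rather than delete
    return expires < today


today = date(2026, 4, 1)
print(is_expired({"Environment": "Sandbox", "ExpirationDate": "2026-03-25"}, today))
print(is_expired({"Environment": "Sandbox", "ExpirationDate": "next week"}, today))
```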

2. Budget Controls for Sandbox Subscriptions:

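Set aggressive budgets ($50–$200/month). Budgets only send alerts on their own, so attach an action group at the 100% threshold that triggers automation (for example, a runbook that assigns a deny policy) to block new resource creation.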

Resource Locks for Production Resources​

Prevent accidental deletion of critical production resources using Azure Resource Locks:

| Lock Type | Effect |
|---|---|
| ReadOnly | Resource can be read but not modified or deleted (very restrictive) |
| CanNotDelete | Resource can be modified but not deleted |

# Apply a CanNotDelete lock to a production resource group
az lock create \
--name "ProtectProdRG" \
--lock-type CanNotDelete \
--resource-group production-rg \
--notes "Prevents accidental deletion of production resources"

# Apply a lock to a specific resource (here, a SQL logical server)
az lock create \
  --name "ProtectProdDB" \
  --lock-type CanNotDelete \
  --resource-group production-rg \
  --resource-name prod-sqlserver \
  --resource-type Microsoft.Sql/servers

# List all locks
az lock list --resource-group production-rg --output table

# Remove a lock (when intentional changes are needed)
az lock delete \
--name "ProtectProdRG" \
--resource-group production-rg

Best Practice: Apply CanNotDelete locks on production databases, storage accounts, key vaults, and networking infrastructure. Use Azure Policy to require locks on production resource groups.

📖 Azure Resource Locks
📖 WAF CO:08 – Environment costs


5.14 Usage Optimization Implementation Workflow​


Key Takeaways​

  1. Idle resources are the biggest waste: use the Advisor Workbook, Orphaned Resources Workbook, and WACO scripts to find and remediate them
  2. Right-sizing is continuous: review monthly with Advisor recommendations and Azure Monitor metrics (for example, P95 CPU below 30% for 14+ days marks a right-sizing candidate)
  3. Autoscaling prevents over-provisioning: implement at both application and infrastructure level; combine metric-based and schedule-based rules
  4. Storage tiering is free money: lifecycle policies automate the process; archive tier reduces storage costs by 90–95%
  5. Environment optimization: not all environments need production-grade resources; use B-series, Spot VMs, auto-shutdown, and Dev/Test pricing
  6. Log Analytics costs add up: use Basic logs, archive tiers, and data collection rules to control ingestion volume
  7. Automate waste prevention: Azure Policy, resource locks, expiration tags, and scheduled Automation runbooks prevent waste from recurring
  8. Resource Graph is your search engine: use it to find idle resources across hundreds of subscriptions in seconds


