Azure Cosmos DB Cost Optimization: 8 Levers to Cut Your RU/s Bill
Cosmos DB has earned its reputation as "great technology, painful invoice". After two years of running CostSentry.AI on Cosmos, I have learned eight concrete levers that, pulled together, cut the total bill by 40–60% with no impact on performance or availability.
This article is the distilled version of what I run on every Cosmos DB FinOps audit.
Step 0: Understand What You Are Actually Paying For
Cosmos DB bills for two things:
- Provisioned/Consumed RU/s (request units per second) – throughput for database operations
- Storage (GB-month) – fixed price, marginal on the total bill
Storage is negligible in 90% of cases. RU/s is the lever that matters. Before optimizing, run a cost breakdown:
```shell
az consumption usage list \
  --start-date 2026-05-01 \
  --end-date 2026-06-01 \
  --include-meter-details \
  --query "[?contains(consumedService, 'Microsoft.DocumentDB')].{Resource:instanceName, RU:meterDetails.meterName, Cost:pretaxCost}" \
  --output table
```
The `--include-meter-details` flag is needed for `meterDetails.meterName` to be populated. You should see the cost per Cosmos DB resource and meter. That is the starting point for optimization.
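To rank the rows by spend, the same query can be run with `--output json` and summed per resource. A minimal sketch, assuming the `Resource`, `RU` and `Cost` keys from the query above; the sample rows and account names are made up for illustration.

```python
import json
from collections import defaultdict


def cost_by_resource(rows: list[dict]) -> list[tuple[str, float]]:
    """Sum the Cost column per Resource, most expensive first."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        # Cost may arrive as a string in the CLI output, so convert defensively.
        totals[row["Resource"]] += float(row["Cost"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample of what the JSON output could look like.
sample_rows = json.loads("""
[
  {"Resource": "prod-account", "RU": "100 RU/s", "Cost": "10.0"},
  {"Resource": "dev-account",  "RU": "100 RU/s", "Cost": "4.0"},
  {"Resource": "prod-account", "RU": "Data Stored", "Cost": "2.5"}
]
""")

for resource, cost in cost_by_resource(sample_rows):
    print(f"{resource}: {cost:.2f}")
```

In practice you would replace `sample_rows` with the parsed output of the CLI command.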
Lever 1: Provisioned vs Serverless Decision
| Scenario | Recommendation | Reason |
|---|---|---|
| Dev/test, <5 000 RU/s burst | Serverless | Pay per RU consumed, no minimum |
| Production, >5 000 RU/s sustained | Provisioned (autoscale) | Cheaper per-RU at higher throughput |
| Spike-heavy (5× nominal) | Provisioned autoscale | No throttling, accommodates spikes |
| Steady predictable load | Provisioned manual | Lowest per-RU price |
For the CostSentry.AI dev/test environment, switching from Provisioned (4 000 RU/s) to Serverless dropped the bill from USD 180 to USD 40 per month. For production we stayed on Provisioned autoscale.
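The decision table can be written down as a small rule function. This is only one possible reading of the table: the parameter names are hypothetical, and the order in which the rules are checked is my assumption, not something the table prescribes.

```python
def recommend_capacity_mode(
    is_production: bool,
    peak_ru_per_s: float,
    nominal_ru_per_s: float,
    steady: bool,
) -> str:
    """Map a workload profile to a capacity mode, following the table above."""
    # Dev/test with bursts below 5 000 RU/s: pay per RU consumed.
    if not is_production and peak_ru_per_s < 5000:
        return "Serverless"
    # Spike-heavy: peak is at least 5x the nominal load.
    if nominal_ru_per_s > 0 and peak_ru_per_s / nominal_ru_per_s >= 5:
        return "Provisioned autoscale"
    # Steady, predictable load: manual throughput has the lowest per-RU price.
    if steady:
        return "Provisioned manual"
    # Everything else, including sustained production load above 5 000 RU/s.
    return "Provisioned autoscale"


print(recommend_capacity_mode(False, 4000, 500, False))
print(recommend_capacity_mode(True, 8000, 7500, True))
```

Treat the output as a first guess to verify against your own metrics, not as a verdict.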
Lever 2: Autoscale Tuning
The default autoscale setting is max 4 000 RU/s. Most dev containers do not need it. Always:
- Check the actual peak over the last 30 days in Metrics → Total Request Units
- Set max to 1.3× peak (not 4 000, not 10 000 – a concrete number from your data)
```bicep
resource container 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers@2024-08-15' = {
  name: 'users'
  parent: database
  properties: {
    resource: {
      id: 'users'
      partitionKey: { paths: ['/tenantId'], kind: 'Hash' }
    }
    options: {
      autoscaleSettings: {
        maxThroughput: 2000 // not the 4000 default!
      }
    }
  }
}
```
Autoscale price: ~$0.00012 per RU/s per hour (vs ~$0.00008 for manual throughput). Autoscale bills each hour for the highest RU/s it scaled to, with a floor of 10% of the maximum. Choosing max 2 000 instead of 4 000 therefore halves the cost of an idle container: the billed floor drops from 400 RU/s to 200 RU/s.
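The sizing rule and the idle floor are easy to check with a few lines of arithmetic. A rough sketch, assuming the autoscale price quoted above, a 730-hour month, and autoscale maximums set in steps of 1 000 RU/s; the function names are mine.

```python
import math

AUTOSCALE_PRICE_PER_RU_HOUR = 0.00012  # USD per RU/s per hour, figure quoted above
HOURS_PER_MONTH = 730                  # assumption: average month length
AUTOSCALE_STEP = 1000                  # assumption: max is configured in 1 000 RU/s steps


def recommended_autoscale_max(peak_ru_per_s: float, headroom: float = 1.3) -> int:
    """Round peak * headroom up to the next multiple of AUTOSCALE_STEP."""
    target = peak_ru_per_s * headroom
    steps = math.ceil(target / AUTOSCALE_STEP)
    return max(1, steps) * AUTOSCALE_STEP


def idle_monthly_cost(max_ru_per_s: int) -> float:
    """Monthly cost of a container that never leaves the 10% floor."""
    floor_ru_per_s = max_ru_per_s / 10
    return round(floor_ru_per_s * AUTOSCALE_PRICE_PER_RU_HOUR * HOURS_PER_MONTH, 2)


print(recommended_autoscale_max(1400))
print(idle_monthly_cost(4000), idle_monthly_cost(2000))
```

Feed `recommended_autoscale_max` the 30-day peak from Metrics, not a guess.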
Lever 3: Indexing Policy Optimization
The default indexing policy in Cosmos is /* – it indexes everything. That raises write cost by 20–40% versus an explicit policy. If you only query specific properties:
```json
{
  "indexingMode": "consistent",
  "includedPaths": [
    { "path": "/tenantId/?" },
    { "path": "/createdAt/?" },
    { "path": "/email/?" },
    { "path": "/status/?" }
  ],
  "excludedPaths": [
    { "path": "/*" },
    { "path": "/_etag/?" }
  ]
}
```
This indexes only four properties. Write cost drops from ~10 RU to ~6 RU per document, a 40% saving on writes.
At CostSentry.AI this change on a single container (events, 500k writes/day) dropped cost from USD 80/month to USD 52.
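If you maintain many containers, generating the policy from the list of properties you actually filter or sort on keeps it honest. A minimal sketch with a hypothetical helper name; it only handles top-level scalar properties, not nested paths or composite indexes.

```python
import json


def build_indexing_policy(queried_properties: list[str]) -> dict:
    """Index only the listed top-level properties and exclude everything else."""
    return {
        "indexingMode": "consistent",
        "includedPaths": [{"path": f"/{name}/?"} for name in queried_properties],
        "excludedPaths": [{"path": "/*"}],
    }


policy = build_indexing_policy(["tenantId", "createdAt", "email", "status"])
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the container's indexing policy or fed into your IaC template.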
Heads up: Cosmos applies indexing policy changes as a background reindex that can take hours to days on large containers. Monitor via:
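```js
// Cosmos SDK – check reindex progress
// The progress comes back as a response header, and only when quota info is requested.
const response = await client
  .database('mydb')
  .container('users')
  .read({ populateQuotaInfo: true });
const progress = response.headers['x-ms-documentdb-collection-index-transformation-progress'];
console.log(`Indexing transformation: ${progress}%`);
```
Lever 4: TTL for Ephemeral Data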
If you have data with an expiration (sessions, audit logs, cached query results, expired notifications), set a container-level TTL instead of manual deletion:
```bicep
resource sessionContainer 'Microsoft.DocumentDB/.../containers@2024-08-15' = {
  name: 'sessions'
  parent: database
  properties: {
    resource: {
      id: 'sessions'
      partitionKey: { paths: ['/userId'], kind: 'Hash' }
      defaultTtl: 86400 // 24 hours in seconds
    }
  }
}
```
TTL deletion adds nothing to the bill: Cosmos runs it as a background task on leftover RU/s that user requests did not consume. By contrast, a manual DELETE call costs ~5 RU per document plus the application overhead.
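To get a feel for what manual cleanup costs, multiply the ~5 RU per delete quoted above by your daily volume. A back-of-the-envelope sketch; the 500 000 documents per day is an illustrative figure, and the function name is mine.

```python
SECONDS_PER_DAY = 86_400


def manual_delete_load(docs_per_day: int, ru_per_delete: float = 5.0) -> dict:
    """RU spent on deletes per day and the average RU/s that represents."""
    daily_ru = docs_per_day * ru_per_delete
    return {
        "ru_per_day": daily_ru,
        "avg_ru_per_second": round(daily_ru / SECONDS_PER_DAY, 1),
    }


load = manual_delete_load(500_000)
print(load)
```

The average hides the real problem: cleanup jobs usually run in bursts, so the peak RU/s they demand is far above this number.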
For variable per-document TTLs, set a ttl property on the document:
```js
await container.items.create({
  id: 'session-123',
  userId: 'user-456',
  ttl: 3600, // this document expires in an hour, not in 24
  data: { ... }
});
```
Lever 5: Partition Key Design (Cost Driver #1)
A bad partition key is not just a performance problem – it is the single biggest cost driver. A hot partition means:
- Cosmos has to split the partition (RU/s overhead)
- Throttling even with high provisioned throughput
- Cross-partition queries (5–10× more expensive than single-partition)
Rules:
- Cardinality: at minimum thousands of distinct values
- Uniform distribution: no value should hold more than 5% of all documents
- Query-aligned: queries frequently filter by that value
At CostSentry.AI we originally used /customerId as the PK for events. For large customers that meant a hot partition. Refactoring to /customerId-${YYYYMM} (customer + month composite) spread the load evenly and cut cross-partition query cost by 60%.
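Both the composite key and the 5% distribution rule are simple to prototype before touching a container. A sketch in Python with hypothetical function names; it assumes you can export a sample of partition key values from your data.

```python
from collections import Counter
from datetime import datetime, timezone


def composite_partition_key(customer_id: str, created_at: datetime) -> str:
    """Build the customer + month key, for example 'acme-202605'."""
    return f"{customer_id}-{created_at:%Y%m}"


def hottest_share(partition_keys: list[str]) -> float:
    """Fraction of documents held by the most common partition key value."""
    if not partition_keys:
        return 0.0
    counts = Counter(partition_keys)
    return max(counts.values()) / len(partition_keys)


key = composite_partition_key("acme", datetime(2026, 5, 17, tzinfo=timezone.utc))
print(key)
print(hottest_share(["a", "a", "b", "c"]))
```

A `hottest_share` above 0.05 on a representative sample is the signal to rethink the key.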
Lever 6: Bulk Operations for Write-Heavy Workloads
If you write 100+ documents at a time, single-document inserts burn RU. Use bulk:
```js
// Instead of a loop over inserts
const operations = items.map(item => ({
  operationType: 'Create',
  resourceBody: item
}));
const response = await container.items.bulk(operations);
```
Bulk operations share networking overhead and Cosmos batches them – the result is 30–40% lower RU cost per document compared to the loop variant.
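Large write sets are usually submitted in batches rather than as one giant call. A minimal sketch of the chunking step, shown in Python for brevity; the batch size of 100 is an assumption, so adjust it to whatever limit your SDK version documents.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int = 100) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


batches = list(chunked(list(range(250)), batch_size=100))
print([len(batch) for batch in batches])
```

Each batch would then be mapped to operations and passed to the bulk call.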
Lever 7: Reserved Capacity for Provisioned
Like VM Reserved Instances, Cosmos DB has Reserved Capacity – you pay for a 1Y or 3Y commitment and get a 20–65% discount:
| Term | Discount |
|---|---|
| 1-year | 20% |
| 3-year | 65% |
For a production workload with a stable 10 000 RU/s baseline this means:
```
Pay-as-you-go:    ~USD 580/month
1-year Reserved:  ~USD 465/month (-20%)
3-year Reserved:  ~USD 205/month (-65%)
```
Heads up: since summer 2024 Microsoft has disabled exchanges for Cosmos DB Reserved Capacity (refund only). If you plan re-platforming inside a 2-year horizon, buy 1-year instead of 3-year.
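The monthly figures above can be reproduced from the manual throughput price in Lever 2. A sketch, assuming ~$0.00008 per RU/s per hour, a 730-hour month, and the discount percentages from the table; the results land within a few dollars of the rounded numbers.

```python
MANUAL_PRICE_PER_RU_HOUR = 0.00008  # USD per RU/s per hour, manual throughput figure from Lever 2
HOURS_PER_MONTH = 730               # assumption: average month length
DISCOUNTS = {"pay-as-you-go": 0.0, "1-year": 0.20, "3-year": 0.65}  # from the table above


def monthly_cost(ru_per_s: int, term: str) -> int:
    """Monthly USD cost for a given throughput and commitment term, rounded to whole dollars."""
    base = ru_per_s * MANUAL_PRICE_PER_RU_HOUR * HOURS_PER_MONTH
    return round(base * (1 - DISCOUNTS[term]))


for term in DISCOUNTS:
    print(f"{term}: ~USD {monthly_cost(10_000, term)}/month")
```

Run it against your own stable baseline, not the autoscale maximum, before committing.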
Lever 8: Multi-Region Replication Audit
The default architect mindset is "for production we turn on multi-region for HA". Cosmos DB bills each region as its own provisioned capacity:
| Setup | RU/s cost multiplier |
|---|---|
| Single region (zone-redundant) | 1.25× |
| 2 regions (read replica) | 2× |
| 3 regions | 3× |
| 5 regions | 5× |
For most workloads, a zone-redundant single region (99.995% SLA) is enough. Turn on multi-region only when:
- You have an explicit business requirement for geographic redundancy
- You have read traffic from multiple continents and a latency requirement <100 ms p99
- Compliance requires data residency across multiple regions
At Christie's we found four production Cosmos accounts with a multi-region setup from 2022 nobody questioned anymore. Switching to a zone-redundant single region cut Cosmos invoices by ~EUR 1 800/month combined.
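The multiplier table turns into a quick estimate of what dropping replicas saves. A simplified sketch with hypothetical function names: it uses the table's multipliers as-is and ignores zone redundancy for multi-region setups.

```python
def region_multiplier(regions: int, zone_redundant: bool = False) -> float:
    """RU/s cost multiplier from the table above."""
    if regions < 1:
        raise ValueError("regions must be at least 1")
    if regions == 1:
        return 1.25 if zone_redundant else 1.0
    # Simplification: each additional region bills the full provisioned capacity.
    return float(regions)


def saving_fraction(current_regions: int) -> float:
    """Share of the RU/s bill saved by moving to a zone-redundant single region."""
    return 1 - region_multiplier(1, zone_redundant=True) / region_multiplier(current_regions)


print(saving_fraction(2))
print(saving_fraction(3))
```

Going from two regions to one zone-redundant region saves less than half, because zone redundancy carries its own premium.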
FinOps Audit Checklist for Cosmos DB
| Step | Expected saving | Difficulty |
|---|---|---|
| Switch dev/test to Serverless | 50–80% on non-prod | Low |
| Tune autoscale max to 1.3× peak | 20–40% on overprovisioned | Low |
| Optimize indexing policy | 20–40% on writes | Medium |
| Set container TTL for ephemeral data | 100% on manual deletes | Low |
| Audit partition key cardinality/distribution | 30–60% on cross-partition queries | High |
| Bulk operations for write-heavy workloads | 30–40% on bulk write paths | Medium |
| Reserved Capacity 1Y on stable workload | 20% on baseline | Low |
| Audit multi-region necessity | up to 50% on unnecessary replicas | Low |
Conclusion
Cosmos DB cost optimization is a craft – no silver bullet, but eight concrete levers with a cumulative effect of 40–60% reduction. At CostSentry.AI, after applying all of them, we dropped the Cosmos bill from USD 380/month to USD 145 with no loss of functionality or SLA.
The key principle: Cosmos is not expensive, it is misconfigured. Microsoft tuned the defaults for "works out of the box", not for "cheapest". Three hours of FinOps audit returns thousands of EUR per year.
Need help with a Cosmos DB FinOps audit? Check out our cloud architecture services or reach out for a NoSQL cost review.
About the author

Martin Rylko
Senior Cloud Architect & DevOps Engineer
14+ years in IT – from on-premises datacenters and Hyper-V clustering to cloud infrastructure on Microsoft Azure. I specialize in Landing Zones, IaC automation, Kubernetes and security compliance.