Right-sizing
Matching compute capacity to actual workload demand to eliminate over-provisioning.
Updated 2026-04-22 · 3 min read
Definition
Right-sizing is the continuous practice of matching allocated cloud capacity — CPU, memory, storage — to what a workload actually uses. The goal is to reclaim the gap between provisioned and consumed resources without creating a new risk to performance or availability.
Why it matters
Over-provisioning rarely shows up as a single painful event; it accumulates. Teams pick safe defaults early, traffic profiles change, and nobody goes back to re-check. Unused headroom of 30–50% across a fleet can quietly become one of the largest controllable line items on the cloud bill.
Example
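A batch ETL job runs nightly on an m6i.8xlarge (32 vCPUs) because that's what it needed during the first data backfill two years ago. Utilization today peaks at 18% CPU. Moving it to an m6i.2xlarge (8 vCPUs) cuts cost by about 75% and raises projected peak CPU to roughly 72%. That is acceptable for a batch job, provided memory use also fits the smaller instance and the nightly run still finishes within its window.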
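The arithmetic behind this example can be sketched in a few lines. This is a rough illustration, not a sizing tool: it assumes the job needs the same absolute amount of CPU on either size, and that price scales in proportion to vCPU count within an instance family. The function names are hypothetical.

```python
def projected_peak_utilization(current_peak_pct, current_vcpus, target_vcpus):
    """Peak CPU % on the target size, assuming the same absolute CPU demand."""
    return current_peak_pct * current_vcpus / target_vcpus


def cost_reduction_pct(current_vcpus, target_vcpus):
    """Cost cut in percent, assuming price is proportional to vCPU count."""
    return (1 - target_vcpus / current_vcpus) * 100


# m6i.8xlarge has 32 vCPUs, m6i.2xlarge has 8; the 18% peak comes from the example.
peak = projected_peak_utilization(18.0, current_vcpus=32, target_vcpus=8)
saving = cost_reduction_pct(current_vcpus=32, target_vcpus=8)

print(f"projected peak CPU: {peak:.0f}%")  # 72%
print(f"cost reduction: {saving:.0f}%")    # 75%
```

CPU is only one dimension. The same check should be repeated for memory, disk, and network before committing to the smaller size.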
How to do it
- Enable a native recommender (AWS Compute Optimizer, Azure Advisor, GCP Recommender).
- Review utilization over a representative window — at least 14 days, longer if the workload is weekly or seasonal.
- Preserve headroom appropriate to the workload (tight for batch, generous for user-facing).
- Apply changes in non-production first, then stagger production rollout.
- Re-run the exercise on a standing cadence — efficiency drifts back.
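The review and headroom steps above can be sketched as a small function. This is a minimal sketch under stated assumptions, not how any vendor's recommender works: the size catalog covers only a few sizes of one family, the headroom targets in `MAX_PEAK_UTILIZATION` are hypothetical values chosen for illustration, and the input is assumed to be one peak CPU reading per day.

```python
import math

# vCPU counts for a few sizes in one instance family, smallest first.
SIZES = [("large", 2), ("xlarge", 4), ("2xlarge", 8), ("4xlarge", 16), ("8xlarge", 32)]

# Hypothetical headroom policy: highest acceptable peak utilization per workload type.
MAX_PEAK_UTILIZATION = {"batch": 0.80, "user_facing": 0.50}


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_size(daily_peak_cpu_pct, current_vcpus, workload, min_days=14):
    """Return the smallest size that keeps peak CPU under the headroom target.

    Raises ValueError if the observation window is too short.
    Returns None if no size in the catalog is large enough.
    """
    if len(daily_peak_cpu_pct) < min_days:
        raise ValueError(f"need at least {min_days} days of data")
    peak_pct = percentile(daily_peak_cpu_pct, 99)
    # vCPUs actually consumed at peak, scaled up to leave the required headroom.
    needed_vcpus = peak_pct / 100 * current_vcpus / MAX_PEAK_UTILIZATION[workload]
    for name, vcpus in SIZES:
        if vcpus >= needed_vcpus:
            return name
    return None


# Fourteen days of daily peak CPU % on a 32-vCPU instance (made-up sample data).
samples = [12, 15, 18, 14, 11, 16, 17, 13, 12, 18, 15, 14, 16, 17]
print(recommend_size(samples, current_vcpus=32, workload="batch"))        # 2xlarge
print(recommend_size(samples, current_vcpus=32, workload="user_facing"))  # 4xlarge
```

The same utilization data yields a larger recommendation for the user-facing case because its headroom target is more generous. A real review would also look at memory and run the window longer for weekly or seasonal workloads.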
Related terms
Savings Plans
A flexible commitment discount covering a steady compute spend rate (USD/hr) across instance families, regions, and sometimes services.
Spot instances
Discounted cloud compute that can be reclaimed at short notice — ideal for fault-tolerant, interruptible workloads.
Reserved Instance
A commitment-based discount on cloud compute in exchange for a 1- or 3-year term — typically 30–60% off on-demand pricing.