Cost Optimization Mindset
FinOps culture, cloud spend governance, cost as architectural quality attribute
Cost optimization is a first-class architectural quality attribute, as important as performance or reliability, especially in the variable-cost cloud model, where architectural decisions translate directly into monthly invoices. The FinOps Foundation describes an iterative lifecycle of three phases: Inform (visibility into spend), Optimize (reduce waste), and Operate (cost governance embedded in engineering workflows). Its maturity model is a separate scale (Crawl, Walk, Run) that describes how well an organization performs each capability. Cost decisions must be made by engineers, not just finance teams: the "cost-aware architect" treats cost as a design dimension of every feature, alongside latency and availability. Cloud spend governance requires tagging standards, budget alerts, and regular review cadences embedded into team rituals.
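The tagging standards and budget alerts mentioned above can be sketched in a few lines. This is a minimal illustration, not a provider integration: the `Resource` type, the required tag keys (`team`, `environment`, `cost-center`), and the 80% alert threshold are all assumptions chosen for the example, and the inventory is assumed to come from whatever billing export the organization already has.

```python
from dataclasses import dataclass, field

# Hypothetical tagging standard; the real required keys are an organizational choice.
REQUIRED_TAGS = ("team", "environment", "cost-center")


@dataclass
class Resource:
    """One billable resource from an inventory or billing export (assumed shape)."""
    resource_id: str
    monthly_cost: float
    tags: dict = field(default_factory=dict)


def missing_tags(resource, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or empty on the resource."""
    return [key for key in required if not resource.tags.get(key)]


def tagged_spend_ratio(resources, required=REQUIRED_TAGS):
    """Fraction of total spend carried by fully tagged resources."""
    total = sum(r.monthly_cost for r in resources)
    if total == 0:
        return 1.0  # nothing is billed, so nothing is unattributed
    tagged = sum(r.monthly_cost for r in resources if not missing_tags(r, required))
    return tagged / total


def budget_alerts(spend_by_team, budgets, threshold=0.8):
    """Return teams whose month-to-date spend has reached threshold * budget."""
    return sorted(
        team
        for team, spend in spend_by_team.items()
        if team in budgets and spend >= threshold * budgets[team]
    )
```

Measuring tag coverage by spend rather than by resource count keeps attention on the untagged resources that matter most to the invoice.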
Key Points
- Cost as an NFR: include cost constraints in architecture reviews ("this solution must cost < $X/month at Y scale") alongside performance and availability requirements
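- FinOps Inform phase: apply cost allocation tags to 100% of taggable cloud resources and set up per-team cost dashboards that refresh as often as billing data allows (provider billing exports typically lag by hours, so true real time is rarely possible); this visibility is the first step and enables all subsequent optimization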
- FinOps Optimize phase: rightsize EC2/RDS instances (AWS Compute Optimizer), purchase Reserved Instances/Savings Plans for stable workloads, delete zombie resources (idle load balancers, unattached EBS volumes), optimize data transfer costs
- FinOps Operate phase: cost anomaly alerts in Slack within 15 minutes of threshold breach; engineers make cost-aware design choices daily; cost is a metric in architecture reviews and sprint retrospectives
- Cost-per-unit metrics: rather than tracking only total cloud spend, track cost per API request, cost per transaction, and cost per active user; these unit metrics are meaningful to product and finance, normalize for growth, and are directly actionable by engineers
- Data transfer costs: often the largest hidden cost; at typical AWS list prices, data moving between AZs (about $0.01/GB in each direction), between regions ($0.02–0.08/GB), and out to the internet ($0.05–0.09/GB, depending on volume tier) can exceed compute costs for data-heavy architectures; architect data flows with egress in mind
- Kubernetes cost optimization: right-size pod resource requests (VPA), use Spot/Preemptible nodes for non-critical workloads, implement Cluster Autoscaler, and use Kubecost/OpenCost for namespace-level cost attribution
- Cost review cadence: weekly automated cost anomaly report to team leads; monthly cost review in team retrospective; quarterly rightsizing analysis; annual architecture review with cost optimization as explicit agenda item
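The cost-per-unit and data transfer points above are simple arithmetic, and a short sketch makes them concrete. The function names are hypothetical. The per-GB rates are illustrative values picked from the ranges quoted in the list, and real rates vary by provider, region, and volume tier.

```python
# Illustrative per-GB rates picked from the ranges quoted above (assumptions,
# not a price list); real rates vary by provider, region, and volume tier.
TRANSFER_RATE_PER_GB = {
    "cross_az": 0.01,
    "cross_region": 0.02,
    "internet_egress": 0.09,
}


def monthly_transfer_cost(gb_by_path, rates=TRANSFER_RATE_PER_GB):
    """Estimate monthly data transfer cost from GB moved on each path type."""
    return sum(gb * rates[path] for path, gb in gb_by_path.items())


def cost_per_unit(total_cost, units):
    """Unit cost, e.g. dollars per million requests or per active user."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


if __name__ == "__main__":
    # Spend grew from $40,000 to $50,000 while traffic grew from
    # 1,000 to 1,500 million requests: total cost rose, unit cost fell.
    before = cost_per_unit(40_000.0, 1_000)
    after = cost_per_unit(50_000.0, 1_500)
    print(f"cost per million requests: {before:.2f} -> {after:.2f}")

    # 50 TB of cross-AZ traffic plus 10 TB of internet egress per month.
    transfer = monthly_transfer_cost({"cross_az": 50_000, "internet_egress": 10_000})
    print(f"estimated transfer cost: {transfer:.2f}")
```

In the first scenario the bill rises by 25% while the cost per million requests drops from $40.00 to about $33.33, which is why the unit metric is the fairer signal of engineering efficiency during growth.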
Real-World Example
Lyft reportedly reduced its cloud spend by $30M annually by embedding a "Cost Engineer" role within its platform team. The role works directly with product engineering teams to identify waste, teach cost-efficient patterns, and implement shared platform cost optimizations, treating cost reduction as a product feature with quarterly OKRs.