Ryan de Melo

FinOps Is Just Capacity Planning With a Better Hat

I once watched a team cut their cloud bill by a third in a quarter without buying a single tool, renaming a single line item, or attending a single FinOps ceremony. They just got an alert that said which of their services was the most expensive, and they were embarrassed.

That is the whole thing. That is the part the vendors will not put on a slide.

For a stretch I owned an infrastructure P&L large enough that a single percent of waste was a real number, the kind that shows up in a board deck. So I learned the discipline the way you learn anything that has your name on the invoice: by being wrong in public a few times. And the lesson that stuck is that FinOps, the practice that got a logo and a foundation and a certification track somewhere around 2020, is capacity planning. We did capacity planning in data centers for decades. You forecast demand, you provision for it, you account for what you used, and you tell the people who used it what it cost. Same loop. The cloud just made the meter spin faster and detached it from anyone who could feel it.

What the tooling actually buys you

I am not anti-tool. The platforms genuinely do two things that you cannot do with a spreadsheet at scale, and both are about visibility, not control.

The first is allocation. In a multi-account, multi-team org, the raw billing export is a swamp. Untagged resources, shared clusters, a NAT gateway that quietly bills more than the workload behind it. Getting from that swamp to “this team spent this much on this product” is real engineering work, and the tools that do it well save you from building it yourself. Worth paying for.
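To make the allocation problem concrete, here is a minimal sketch, assuming billing rows have already been reduced to a resource id, an optional team tag, and a cost. The row shape, the resource names, and the `UNALLOCATED` bucket are all hypothetical; a real billing export is far messier, which is the point of paying for the tool.

```python
from collections import defaultdict

# Hypothetical, already-simplified billing rows:
# (resource_id, team_tag or None, cost_usd)
ROWS = [
    ("i-web-1", "checkout", 120.0),
    ("i-web-2", "checkout", 80.0),
    ("nat-gw-1", None, 310.0),   # untagged shared resource
    ("db-1", "search", 200.0),
]

def allocate_by_team(rows):
    """Sum cost per team tag; anything untagged lands in UNALLOCATED."""
    totals = defaultdict(float)
    for _resource_id, team, cost in rows:
        totals[team or "UNALLOCATED"] += cost
    return dict(totals)

by_team = allocate_by_team(ROWS)

# Share of the bill nobody owns yet: a rough measure of how deep the swamp is.
unallocated_share = by_team.get("UNALLOCATED", 0.0) / sum(by_team.values())
```

The unallocated share is the number to watch first: until it is small, every per-team figure downstream is a guess.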

The second is unit economics. Total spend going up is not a problem if you are selling more. Total spend going up while cost-per-order goes up is a fire. The number that matters is never the bill. It is the bill divided by the thing your business actually counts: per order, per merchant, per thousand requests, per ride. A tool that puts that ratio in front of an engineer who can move it is doing the one useful thing. (Most dashboards show the numerator and call it a day, because the numerator is easy and the denominator lives in a different team’s database.)
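The ratio is simple enough to sketch. The following assumes you can get spend and order counts for two comparable periods; the `Period` type, the labels, and the sample figures are illustrative, not taken from any real system.

```python
from dataclasses import dataclass

@dataclass
class Period:
    spend_usd: float  # numerator: the bill for the period
    orders: int       # denominator: whatever the business actually counts

def cost_per_order(p: Period) -> float:
    if p.orders <= 0:
        raise ValueError("need at least one order to compute a unit cost")
    return p.spend_usd / p.orders

def classify(prev: Period, curr: Period) -> str:
    """Compare two periods on total spend and on unit cost."""
    spend_up = curr.spend_usd > prev.spend_usd
    unit_up = cost_per_order(curr) > cost_per_order(prev)
    if spend_up and unit_up:
        return "fire"    # paying more in total AND more per order
    if spend_up:
        return "growth"  # paying more in total because you are selling more
    if unit_up:
        return "watch"   # bill is flat or down, but each order costs more
    return "ok"

last_month = Period(spend_usd=1000.0, orders=1000)
healthy = classify(last_month, Period(spend_usd=1500.0, orders=2000))
burning = classify(last_month, Period(spend_usd=1500.0, orders=1200))
```

Both comparison months show the same 50 percent jump in the bill. Only the denominator tells them apart.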

Where it becomes theater

Here is the part nobody tells you. You can buy every tool, hire the FinOps lead, run the cadence, and change nothing.

The classic failure is renaming showback. Showback is when you show a team what they spent. Chargeback is when it actually hits their budget. The difference is enormous and almost nobody crosses it, because crossing it means a finance conversation and an org-chart fight, and the tool cannot have that fight for you. So teams build a beautiful dashboard, point at it in the monthly review, everyone says “we should really look at that,” and the bill keeps climbing at exactly the rate it was climbing before. The dashboard is not the intervention. It is a photograph of the problem in a nicer frame.

The tell is easy to spot. Ask who at the table feels the number. If the answer is “the platform team gets yelled at,” you have built a cost-allocation system that allocates cost to the one team least able to change what the product teams provision. That was my mistake, the first year. I made my own group the cost sink for everyone else’s choices, then wondered why nobody else optimized anything. Why would they? It was free to them.

The actual fix is cultural

The thing that worked was making teams feel the cost of their own decisions, in a unit they already cared about, close to the moment they made the choice.

Not a quarterly true-up. Not a dashboard they had to remember to open. A number in the place they already looked: the cost delta in the deploy pipeline, the per-request cost on the same page as the latency graph, the alert when a service crossed a threshold its own team had set. Once an engineer can see that the cache they skipped is costing more every month than the afternoon it would have taken to write, you do not have to mandate anything. They fix it. Pride does more than policy.
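The threshold alert is the easiest of these to sketch. This assumes each service already has a monthly cost figure and a limit its own team chose; the service names and numbers are made up, and where the alert gets delivered is left out entirely.

```python
def over_threshold(services):
    """Return (service, overage_usd) for each service above its own
    team-set limit, worst offender first.

    `services` maps a service name to (monthly_cost_usd, team_threshold_usd).
    """
    breaches = [
        (name, cost - limit)
        for name, (cost, limit) in services.items()
        if cost > limit
    ]
    return sorted(breaches, key=lambda b: b[1], reverse=True)

# Hypothetical figures for illustration only.
services = {
    "search": (900.0, 1000.0),     # under its limit, no alert
    "checkout": (1400.0, 1200.0),  # 200 over
    "reports": (700.0, 400.0),     # 300 over
}

alerts = over_threshold(services)
```

Sorting by overage rather than by total cost keeps the alert about the gap between what a team said it would spend and what it did spend, which is the number that team can actually move.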

None of that requires a FinOps platform. It requires an engineering culture where cost is a property of a system you own, like latency or error rate, and not a bill that arrives from somewhere else with your name nowhere near a decision that drove it.

So buy the visibility tools if the swamp is real. Compute your unit economics, religiously. But if your cost problem persists after all of that, the tool was never going to fix it. You have a team that does not feel its own spend. Go fix that. The dashboard will get a lot more interesting once someone in the room is afraid of it.

