FinOps in Practice: How to Stop Burning Money in the Cloud
You wouldn’t believe how much money is leaking into your cloud bill right now for resources nobody’s using.
I build startup infrastructure. Before that, I spent over 18 years on enterprise platforms across Kazakhstan, where every tenge counts. And here’s the paradox: in a corporation, with all its bureaucracy, financial control is tight. In the cloud, it’s a total free-for-all. A developer writes Terraform, deploys, forgets. The bill arrives at the end of the month. Everyone’s surprised.
This isn’t a money problem. It’s a visibility and culture problem.

Why Money Leaks Without Anyone Noticing
Here’s a real case from Adobe: one pull request with temporary scaling for load testing cost $12,000 over three weeks. No hack, no bug in the code. The PR just sat there, nobody was watching the spend, and the meter kept running.
According to SquareOps data for 2026, organizations waste 20–35% of their cloud budget. Average CPU utilization is just 15–30%, meaning you’re paying for a server that sits there idle three-quarters of the time.
Where this waste comes from:
- Orphaned resources: snapshots, disks, and load balancers that haven’t been needed for ages, but nobody deleted them
- Dev environments running 24/7: the cluster spins overnight and on weekends while developers sleep
- Inflated resource requests in Kubernetes: teams request 4 CPU but use 0.3
- Poor bin packing: pods don’t pack efficiently, so nodes sit half-empty
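The last two points compound each other, and a toy example makes the mechanism obvious. This is not a real scheduler, just a first-fit sketch with hypothetical numbers: the scheduler packs pods onto nodes by their *requests*, so inflated requests fill nodes on paper while real CPU sits idle.

```python
# Toy illustration (not a real scheduler): inflated CPU requests
# lead to nodes that are "full" on paper and idle in practice.
# All numbers are hypothetical.

NODE_CPU = 4.0  # schedulable CPU per node
pods = [{"request": 2.0, "actual": 0.3} for _ in range(8)]

# First-fit packing by *requests*, which is what the scheduler sees.
nodes = []
for pod in pods:
    for node in nodes:
        if sum(p["request"] for p in node) + pod["request"] <= NODE_CPU:
            node.append(pod)
            break
    else:
        nodes.append([pod])

requested = sum(p["request"] for p in pods)
used = sum(p["actual"] for p in pods)
capacity = len(nodes) * NODE_CPU

print(f"nodes reserved: {len(nodes)}")          # 4
print(f"requested: {requested / capacity:.0%}") # 100% full by requests
print(f"actually used: {used / capacity:.0%}")  # 15% real utilization
```

Four nodes look fully booked, yet real utilization lands right at the 15% figure from the industry stats above.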
Dashboards don’t help with this. A dashboard shows you what already happened. The $12K is already spent, the PR is already merged (or not), but you find out three weeks later when the bill arrives.
You need a different approach.
6 Patterns That Actually Work
1. Tagging as the foundation of everything, no exceptions
If a resource isn’t tagged, it’s invisible. Nobody optimizes invisible resources.
Minimum set of tags on every resource:
```
team: backend
service: payments-api
environment: production
owner: ilyas@company.com
```
In my experience, the most common problem is you started tagging today, but half your resources were created a year ago with no tags. Don’t try to do it all at once. The iterative approach works:
- Pull up a cost dashboard
- Find untagged resources
- Tag them
- Repeat next week
Tedious, but this is the foundation. Without tags you don’t know which team is burning money.
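The “find untagged resources” step of that loop is easy to script. A minimal sketch, assuming your inventory is already pulled into a list of dicts (in practice it would come from the cloud API; here it’s hardcoded sample data, and the required tag set matches the minimum above):

```python
# Sketch: flag resources missing the minimum tag set.
# The inventory below is hardcoded sample data for illustration;
# in practice you'd pull it from your cloud provider's API.

REQUIRED_TAGS = {"team", "service", "environment", "owner"}

resources = [
    {"id": "i-0aa1", "tags": {"team": "backend", "service": "payments-api",
                              "environment": "production",
                              "owner": "ilyas@company.com"}},
    {"id": "vol-9bc2", "tags": {"environment": "dev"}},
    {"id": "snap-77f3", "tags": {}},
]

def missing_tags(resource):
    """Return the required tags this resource lacks."""
    return REQUIRED_TAGS - set(resource["tags"])

untagged = {r["id"]: sorted(missing_tags(r))
            for r in resources if missing_tags(r)}

for rid, missing in untagged.items():
    print(f"{rid}: missing {', '.join(missing)}")
```

Run it weekly, dump the output into the team channel, and the backlog shrinks on its own.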
2. Cost-as-Code, or cost in code review
Back to Adobe. The problem wasn’t that someone was greedy or clueless. The problem was that the cost of the change wasn’t visible at the moment the decision was made. You write terraform, open a PR, the reviewer looks at the code logic, and nobody’s looking at the fact that this is going to cost $4,000 a month.
The solution is Infracost. It integrates into CI/CD and shows the cost delta right in a PR comment.
Looks something like this:
```
Estimated monthly cost change: +$3,847/mo

+ aws_instance.load_test   r5.4xlarge      +$876/mo
+ aws_rds.replica          db.r5.2xlarge   +$2,971/mo
```
Now the reviewer sees this before merging. This isn’t a blocker but a signal. Legitimate expensive decisions will still go through, just consciously.
In my opinion, this has the highest effort-to-result ratio of anything I’ve ever implemented. Takes a day to set up, runs forever.
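If you want the “signal, not blocker” behavior in CI, you can read the cost delta out of Infracost’s JSON output and decide whether the PR needs an extra look. A sketch under assumptions: the `diffTotalMonthlyCost` field follows Infracost’s JSON format as I understand it, but verify against your Infracost version, and the $500 threshold is entirely made up:

```python
# Sketch of a CI step that reads the monthly cost delta from
# `infracost diff --format json` and flags (not blocks) big changes.
# Field name and threshold are assumptions; check your Infracost version.

import json

THRESHOLD = 500.0  # flag PRs adding more than $500/mo (pick your own)

# Stand-in for the real Infracost JSON output in CI.
sample_output = '{"diffTotalMonthlyCost": "3847.00"}'

delta = float(json.loads(sample_output)["diffTotalMonthlyCost"])
needs_review = delta > THRESHOLD  # post a label/comment, don't fail the build

print(f"cost delta: +${delta:,.2f}/mo, extra review required: {needs_review}")
```

In real CI the flag would add a label or ping a reviewer rather than fail the pipeline, keeping it a signal rather than a gate.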
3. Automated Cleanup
Manual doesn’t work. People forget, feel awkward deleting “someone else’s” stuff, keep putting it off.
cloud-nuke (from Gruntwork) is a tool for automatically cleaning up AWS resources by age. You run it on a schedule in dev/staging environments and it deletes everything older than N days that isn’t explicitly excluded (cloud-nuke supports exclusion rules, so you can protect what must stay).
Simplest scenario to start with:
```shell
# Delete all resources older than 7 days (168h);
# run this against your dev account/region only
cloud-nuke aws --region eu-west-1 --older-than 168h
```
In my startup experience, dev environments are the main source of waste. Someone was testing, created an RDS, Redis, three EC2s, forgot about it. All of it just keeps running.
Separately, snapshots and disks. Go right now to AWS Console → EC2 → Snapshots. Sort by creation date. I’d bet you’ll find snapshots from two years ago from instances that no longer exist.
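The same snapshot audit can be done programmatically. A minimal sketch with a hardcoded inventory (with real data you’d list snapshots via the cloud API; the fixed “today” is just to keep the example deterministic):

```python
# Sketch: find snapshots older than a cutoff.
# Inventory is hardcoded sample data; in practice it comes from the
# cloud API. The fixed "now" keeps the example deterministic.

from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=180)
now = datetime(2025, 6, 1, tzinfo=timezone.utc)

snapshots = [
    {"id": "snap-001", "start_time": datetime(2023, 3, 10, tzinfo=timezone.utc)},
    {"id": "snap-002", "start_time": datetime(2025, 5, 20, tzinfo=timezone.utc)},
]

stale = [s["id"] for s in snapshots if now - s["start_time"] > MAX_AGE]
print(stale)  # ['snap-001']
```

Anything this script prints is a candidate for the same question: does the source instance even exist anymore?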
4. Rightsizing, or pay for what you actually use
CPU utilization of 15–30% on average across the industry. That means most instances can be cut in half with no noticeable performance impact.
How to find candidates:
- AWS Compute Optimizer is free and shows instance type recommendations right in the console
- AWS Cost Explorer → Rightsizing Recommendations is also free, with concrete numbers
The algorithm is simple:
- Take the list of recommendations
- Look at metrics for the last 2 weeks
- If CPU < 20% and memory is sufficient, downsize the instance type
- Watch for a week
In my experience, engineers are afraid to downsize production. That’s normal. Start with staging where there’s zero risk, but real data.
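The decision rule from the algorithm above fits in a few lines. A sketch with illustrative thresholds and made-up instance metrics; in practice the numbers come from CloudWatch or your monitoring stack over the last couple of weeks:

```python
# Sketch of the rightsizing rule: average CPU below 20% AND enough
# memory headroom -> downsize candidate. Thresholds and instance
# metrics below are illustrative, not from a real account.

CPU_THRESHOLD = 0.20   # avg CPU below 20% over the window
MEM_HEADROOM = 0.50    # peak memory must stay under 50%

instances = [
    {"id": "i-api-1", "avg_cpu": 0.12, "peak_mem": 0.35},
    {"id": "i-db-1",  "avg_cpu": 0.62, "peak_mem": 0.70},
    {"id": "i-etl-1", "avg_cpu": 0.08, "peak_mem": 0.80},  # CPU-idle but memory-bound
]

candidates = [
    i["id"] for i in instances
    if i["avg_cpu"] < CPU_THRESHOLD and i["peak_mem"] < MEM_HEADROOM
]
print(candidates)  # ['i-api-1']
```

Note that `i-etl-1` is correctly skipped: its CPU is idle, but it’s memory-bound, which is exactly the case where naive CPU-only downsizing burns you.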
5. Kubernetes: resource limits and bin packing
K8s is its own headache. Developers set requests: cpu: 2 because “just to be safe.” The scheduler reserves node capacity for those requests, while the pod actually uses 10% of what was requested. The node is half-empty but counted as occupied.
What to do:
Kubecost or OpenCost (the open-source version) show cost broken down by namespace, deployment, even individual pod. You immediately see who’s overpaying.
Minimum practices:
```yaml
resources:
  requests:
    cpu: "100m"      # actual consumption, not "just to be safe"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```
Also, shut down dev clusters at night and on weekends. If the cluster is only needed 9 to 6 on weekdays, that’s 45 of the week’s 168 hours; the remaining 123 hours, roughly 73%, are pure waste on dev.
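The back-of-the-envelope arithmetic for a working-hours schedule, spelled out (pure arithmetic, no cloud calls):

```python
# Savings from running a dev cluster only during working hours
# (9:00-18:00, Mon-Fri). Pure arithmetic, no cloud calls.

hours_per_week = 24 * 7        # 168
business_hours = 9 * 5         # 45 (nine hours a day, five days)
off_hours = hours_per_week - business_hours

savings = off_hours / hours_per_week
print(f"off {off_hours}h of {hours_per_week}h -> ~{savings:.0%} saved on dev")
```

A cron job or a simple scheduled scaling rule is all it takes to collect that.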
6. On-demand vs provisioned: think about it at design time
This is about culture, but with a concrete example. Take DynamoDB: in on-demand mode, you pay for actual requests. With provisioned capacity, you pay for what’s reserved, whether you use it or not.
For MVPs and unpredictable load, go with on-demand. For stable high load, provisioned is cheaper.
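The break-even logic is worth sketching once so the trade-off stops being abstract. The per-unit prices below are placeholders, not current AWS list prices (plug in the numbers for your region), and the traffic profiles are invented:

```python
# Break-even sketch: DynamoDB on-demand vs provisioned capacity.
# Prices are PLACEHOLDERS, not current AWS list prices.

ON_DEMAND_PER_M_WRITES = 1.25   # $ per million write requests (placeholder)
PROVISIONED_WCU_HOUR = 0.00065  # $ per WCU-hour (placeholder)
HOURS_PER_MONTH = 730

def on_demand_cost(writes_per_month):
    """On-demand bills per actual request."""
    return writes_per_month / 1_000_000 * ON_DEMAND_PER_M_WRITES

def provisioned_cost(peak_writes_per_sec):
    """Provisioned bills for reserved capacity, used or not."""
    return peak_writes_per_sec * PROVISIONED_WCU_HOUR * HOURS_PER_MONTH

# Spiky MVP traffic: 5M writes/month, but capacity must cover a 100 wps peak.
spiky = (on_demand_cost(5_000_000), provisioned_cost(100))
# Stable high load: ~500M writes/month at a steady ~200 wps.
stable = (on_demand_cost(500_000_000), provisioned_cost(200))

print(f"spiky:  on-demand ${spiky[0]:.2f} vs provisioned ${spiky[1]:.2f}")
print(f"stable: on-demand ${stable[0]:.2f} vs provisioned ${stable[1]:.2f}")
```

With these placeholder numbers, on-demand wins by a wide margin for the spiky MVP, and provisioned wins for the stable high-load case, which is exactly the rule of thumb above.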
Or another example: you’re running an ML model. You use a GPU instance because “ML = GPU.” But the model is simple classification that runs perfectly fine on CPU. A GPU instance costs 5–10x more.
An engineer should be asking themselves about cost at design time, not when the bill has already arrived.
Tools: what to use
| Tool | What for | Price |
|---|---|---|
| Infracost | Cost of changes in PR/CI | Free (open-source) |
| AWS Cost Explorer | Spend analysis, rightsizing | Free |
| AWS Compute Optimizer | Instance recommendations | Free |
| OpenCost | K8s cost by namespace/pod | Free (open-source) |
| Kubecost | Same + nice UI + alerting | Freemium |
| cloud-nuke | Auto-delete resources in dev/staging | Free (open-source) |
In my opinion, Cost Explorer + Infracost + cloud-nuke is enough to start. That’s zero dollars and one day of setup.
How to get started in one day: checklist
No need to implement everything at once. Here’s the minimum you can do today:
Morning (2-3 hours):
- Open AWS Cost Explorer, look at the top 5 services by cost over the last month
- Enable Rightsizing Recommendations, write down the candidates
- Go to EC2 → Snapshots, find snapshots older than 6 months
Afternoon (3-4 hours):
- Install Infracost, integrate into one repository with terraform
- Agree with the team that on infrastructure PRs we look at cost
- Set a minimum set of tags for new resources (team, environment, owner)
Evening (1-2 hours):
- Find dev/staging resources without an environment tag
- Make a list of what can be turned off at night/on weekends
- Set up cloud-nuke on the dev environment with a 7-day period
Structured optimization delivers 15–25% savings in the first 60 days, according to SquareOps data. It’s not magic, just consistent work.
Why this isn’t finance’s job
Finance sees the bill. The engineer sees the cause.
Finance doesn’t know that a specific EKS cluster can be scaled down from 10 nodes to 6 with no consequences. They don’t know that this particular snapshot is from an instance that was torn down a year ago. They can’t assess whether it’s worth paying for reserved instances or if on-demand is better.
FinOps only works if engineers own the cost of their decisions. These are the four components described by Noam Levy from ActiveFence: visibility, short feedback loops, utilization optimization, and, most importantly, cost-aware thinking.
That last one doesn’t come on its own. It has to be built into your processes like cost-as-code in PRs, dashboards by team, regular spend reviews in standups.
In my experience, when a developer sees that their PR adds $2,000 a month to the bill, they start thinking differently. Not out of fear. Just because the information became visible.
The cloud isn’t magic and it isn’t free. It’s a meter running 24/7. Your job is to know what it’s counting.
Start small. Install Infracost today. Look at Cost Explorer. Find one garbage snapshot and delete it.
That’s already better than nothing.


