By: Thom Bogers
Updated on: May 5, 2026
Every cloud team has a cost report. It arrives monthly, it’s 40 pages long, and it tells you that costs went up. Again.
Someone skims the summary. “EC2 up 30%.” “Lambda up 47%.” The team agrees they should look into it. Nobody does, because there are pipelines to fix and developers to unblock. Next month, same story, higher number.
This is not a technology problem. It’s a process problem. And it’s costing European companies billions.
Dutch companies alone spend an estimated €8.4 billion annually on cloud services. Industry research consistently shows that 30 to 32% of that is waste. Not “could be optimised.” Waste. Resources that are idle, oversized, or misconfigured.
That’s roughly €2.5 billion per year, just in the Netherlands, going to infrastructure that nobody is using.
Even when someone does sit down with the report, here’s the thing nobody talks about: most automated reports show you your top 10 biggest cost items. So you look at EC2, RDS, Lambda. The usual suspects. Meanwhile, item number 11 starts climbing. Maybe it’s S3 storage that’s been quietly growing for months. Maybe it’s CloudWatch logging that someone set to debug level during an incident and never turned back down. Maybe it’s a NAT gateway that was supposed to be temporary two years ago and is now costing €35/month in every VPC of every account.
These costs don’t make the top 10 until they’re already a serious problem. By the time they surface, you’ve been overspending for weeks or months.
And even for the costs that do show up in the report, the report tells you the total, not what to actually change. “EC2 costs went up 30%.” Great. But which instances? What are they running? What’s the current instance type, and should it be different? What does the Terraform configuration look like, and what would the change be?
That’s where the real work starts. And it’s the kind of manual digging through Cost Explorer, CloudWatch metrics, and infrastructure code that nobody has time for.
After running hundreds of cloud scans for startups and scale-ups across the Netherlands and the EU, we see the same patterns over and over:
Oversized instances running at low utilisation.
This is the most common finding. Teams size instances based on worst-case traffic projections. The worst case never happens. The instance runs at 15% CPU, 24 hours a day, 365 days a year. Right-sizing can cut compute costs by 30 to 50%, but it requires someone to check utilisation data against the current configuration and propose a specific change. That’s 60 to 90 minutes of manual work per finding.
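The check itself is mechanical, which is what makes it worth automating. A minimal sketch in Python, assuming the CPU datapoints have already been pulled from CloudWatch (for example via boto3); the size map and the 20% threshold are illustrative, not a recommendation:

```python
# Illustrative one-step-down size map; a real tool would cover whole families.
DOWNSIZE = {"m5.xlarge": "m5.large", "m5.2xlarge": "m5.xlarge"}

def rightsizing_proposal(instance_type, cpu_datapoints, threshold=20.0):
    """Propose a smaller instance type if average CPU stays under threshold."""
    if not cpu_datapoints:
        return None  # no data, no recommendation
    avg = sum(cpu_datapoints) / len(cpu_datapoints)
    if avg < threshold and instance_type in DOWNSIZE:
        return {"current": instance_type,
                "proposed": DOWNSIZE[instance_type],
                "avg_cpu_pct": round(avg, 1)}
    return None

# Example: an m5.xlarge idling around 15% CPU.
print(rightsizing_proposal("m5.xlarge", [15.0, 14.2, 16.1, 15.5]))
```

The hard part in practice isn’t this logic; it’s pulling the datapoints, mapping them back to Terraform, and writing the change, which is where the 60 to 90 minutes go.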
Database connection pooling that prevents scaling.
Serverless databases like Aurora Serverless v2 or Cloud SQL are designed to scale down when idle. But if your application holds connections open aggressively, the database can never scale down. You’re paying for compute capacity that exists only because the app won’t let go of idle connections. This one is particularly insidious because it doesn’t show up in the infrastructure configuration. It’s in the application code.
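The usual fix is an idle TTL in the connection pool, so connections are actually released and the database can drop to its minimum capacity. A simplified sketch of that behaviour — the pool class, fake connection, and TTL are illustrative; real pools (HikariCP, SQLAlchemy, pgbouncer) expose the same knob as a max-idle-time or minimum-pool-size setting:

```python
import time

class IdleAwarePool:
    def __init__(self, max_idle_seconds=300):
        self.max_idle_seconds = max_idle_seconds
        self._idle = []  # list of (connection, released_at) pairs

    def release(self, conn):
        self._idle.append((conn, time.monotonic()))

    def reap(self, now=None):
        """Close connections idle past the TTL; return how many were closed."""
        now = time.monotonic() if now is None else now
        keep, closed = [], 0
        for conn, released_at in self._idle:
            if now - released_at > self.max_idle_seconds:
                conn.close()  # lets Aurora Serverless / Cloud SQL scale down
                closed += 1
            else:
                keep.append((conn, released_at))
        self._idle = keep
        return closed

class _FakeConn:  # stand-in for a real DB connection in this sketch
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

pool = IdleAwarePool(max_idle_seconds=300)
conn = _FakeConn()
pool.release(conn)
print(pool.reap(now=time.monotonic() + 600))  # connection closed after the TTL
```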
Log retention with no policy.
CloudWatch logs growing indefinitely because nobody configured a retention period. Not expensive per service, but it adds up silently across dozens of services over months and years.
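The cost of a missing retention policy is easy to model: without one, stored log volume grows linearly forever; with one, it plateaus. A back-of-the-envelope sketch with illustrative numbers (a flat ingest rate, no compression):

```python
def stored_log_gb(monthly_ingest_gb, months, retention_months=None):
    """GB held in storage after `months`, with or without a retention policy."""
    if retention_months is None:
        return monthly_ingest_gb * months  # grows without bound
    return monthly_ingest_gb * min(months, retention_months)  # plateaus

# 50 GB/month of logs across all services, checked after two years:
print(stored_log_gb(50, 24))                      # no policy: 1200 GB and climbing
print(stored_log_gb(50, 24, retention_months=1))  # ~30-day policy: steady 50 GB
```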
Forgotten environments.
Dev and staging environments from three sprints ago, still running at full capacity. Each one is €90/month buried in a €15,000 bill, so nobody notices. Six months later, you’re paying €600/month for environments nobody has touched.
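A stale-environment sweep is a few lines once you have a “last deployed” signal per environment, whether from tags, CI metadata, or deploy logs. The names, costs, and 30-day cutoff below are illustrative:

```python
from datetime import date, timedelta

def stale_environments(envs, today, max_idle_days=30):
    """Return (name, monthly_cost) for environments untouched past the cutoff."""
    cutoff = today - timedelta(days=max_idle_days)
    return [(e["name"], e["monthly_cost_eur"])
            for e in envs if e["last_deployed"] < cutoff]

envs = [
    {"name": "staging-payments", "last_deployed": date(2026, 4, 30), "monthly_cost_eur": 90},
    {"name": "dev-sprint-41",    "last_deployed": date(2025, 11, 2), "monthly_cost_eur": 90},
]
stale = stale_environments(envs, today=date(2026, 5, 5))
print(stale)                           # the sprint environment nobody touched
print(sum(cost for _, cost in stale))  # monthly euros going to idle environments
```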
Duplicate infrastructure.
A load balancer per environment. A NAT gateway per VPC. Monitoring stacks per team. When a shared resource would do.
The pattern is always the same: each one is small enough to ignore individually, but collectively they represent 20 to 40% of the bill.
The FinOps Foundation has done excellent work establishing cloud financial management as a discipline. 58% of large Dutch companies now have dedicated FinOps functions. That’s real progress. But the standard toolkit has a gap. Cost Explorer shows you what happened. Budget alerts tell you when a threshold is exceeded. Tagging helps you allocate costs. Dashboards visualise the data.
None of these investigate.
An alert that says “EC2 costs in eu-west-1 are up 30%” is the start of a question, not an answer. Someone still needs to trace it to specific resources, check utilisation, review the Terraform, and figure out whether it’s a legitimate feature launch or a misconfiguration.
That investigation takes 60 to 90 minutes per alert done properly. Most teams can afford that once or twice a week. So the rest get skipped. The anomaly becomes the new baseline. The waste compounds.
A mid-size cloud setup can easily produce dozens of cost and performance alerts per week. Budget threshold exceeded. Anomaly detected. Utilisation warning. Scaling event. Each one technically meaningful. But when you get 30 of these a week, they stop being signals and start being noise.
The instinct is to add more monitoring. More granular budgets. Tighter thresholds. This makes the problem worse. Every additional alert rule adds to the noise. The ratio of actionable to ignorable stays the same. The only thing that changes is that the Slack channel scrolls faster.
By 2026, 76% of DevOps teams have integrated AI into their workflows. But most of what’s labelled “AI for DevOps” is still chatbots. You ask a question, it answers. Useful, but it doesn’t solve the core problem: the work keeps coming whether or not someone asks for help.
AI agents are different. They’re triggered by real events in your infrastructure: a cost anomaly, a scheduled audit, a pipeline failure. They investigate autonomously, following the same path a senior engineer would, but faster and without context switching.
For FinOps specifically, the agent monitors your cloud spend continuously, not monthly. It analyses the full picture, including the trends that standard tools miss: item number 11 that’s quietly becoming item 5, the S3 storage bloat, the logging costs that crept up after the last incident.
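One way to catch those climbers before they crack the top 10 is to compare each service’s rank in the bill month over month, not just its absolute cost. A sketch with illustrative figures:

```python
def rank_climbers(last_month, this_month, min_rank_gain=3):
    """Services whose position in the bill rose by at least min_rank_gain."""
    def ranks(costs):
        ordered = sorted(costs, key=costs.get, reverse=True)
        return {svc: i + 1 for i, svc in enumerate(ordered)}
    before, after = ranks(last_month), ranks(this_month)
    return sorted(
        (svc, before[svc], after[svc])
        for svc in after
        if svc in before and before[svc] - after[svc] >= min_rank_gain
    )

# Illustrative monthly cost totals per service, in euros:
last = {"ec2": 100, "rds": 80, "lambda": 40, "nat": 30, "elb": 25, "s3": 10}
this = {"ec2": 105, "rds": 82, "lambda": 41, "nat": 31, "elb": 26, "s3": 90}
print(rank_climbers(last, this))  # s3 jumps from rank 6 to rank 2
```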
When it detects an anomaly or identifies an optimisation, it traces the cost to specific resources, checks how they’re actually being used, reviews the current configuration, and generates the Terraform code to fix it. Not “you should look into EC2.” Instead: “These 12 m5.xlarge instances in eu-west-1 are running at 15% CPU. Switch to m5.large. Here’s the updated Terraform. Estimated savings: €4,200/month.”
The cost report becomes a pull request.
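The savings arithmetic behind a finding like that is straightforward. A sketch with placeholder prices — these are not real AWS list prices; a real agent would pull region-specific rates from the pricing API:

```python
# Illustrative hourly on-demand prices in euros, NOT actual AWS pricing.
HOURLY_EUR = {"m5.xlarge": 0.20, "m5.large": 0.10}
HOURS_PER_MONTH = 730

def monthly_savings(count, current, proposed):
    """Estimated monthly saving from switching `count` instances down a size."""
    delta = HOURLY_EUR[current] - HOURLY_EUR[proposed]
    return round(count * delta * HOURS_PER_MONTH, 2)

# 12 oversized instances, one size step down:
print(monthly_savings(12, "m5.xlarge", "m5.large"))
```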
For European companies, security is often the first blocker. Most AI tools are SaaS. Your infrastructure data leaves your environment. The model runs somewhere in us-east-1.
Under GDPR and Schrems II, sending cloud metadata to US based AI providers is a compliance conversation you don’t want to have with your DPO.
The right architecture for European cloud operations is single tenant, deployed inside your own environment, with the model running in your own region. Data never leaves. Access is least privilege and read only. Full audit trail. Designed from day one to pass SOC 2, ISO 27001, GDPR, and DORA reviews.
If your cloud bill is higher than it should be (or you suspect it might be), here are concrete steps: pull utilisation data for your biggest instances and right-size them, set retention policies on your logs, sweep for dev and staging environments nobody has deployed to in months, and question every duplicated load balancer, NAT gateway, and monitoring stack.
And if you want to see what an AI agent would find: we offer a free cloud scan. A 30-minute review call, then our FinOps Agent scans your entire environment in hours. You get a full report with findings and savings estimates. If you decide to implement the agents for ongoing cost management, we work on a no-cure-no-pay basis: we take a percentage of the savings we actually deliver. No savings, no cost.
Book a free cloud review: https://www.blackbird.cloud/free-cloud-scan
We are the all-rounder for complex cloud applications, with a specific focus on cloud development. We build reliable cloud solutions and integrations so that your cloud is always in order. We love AWS, but also work with Google Cloud and Azure.