infrastructure · architecture · software-engineering · 5 min read

Leaving the cloud: why 2026 is the year of the bare metal comeback

Cloud repatriation stopped being a Twitter take and became a real movement. The rationale behind moving back to bare metal, the numbers that matter, and when it actually makes sense for your team.

For more than a decade, "go to the cloud" was the default answer to any infrastructure question. In 2026, that reflex has become the exception among serious engineering teams.

This isn't anti-cloud. It's course correction.

The question stopped being "AWS, Azure or GCP?" and became "what should actually run where?". And for a lot of things that seemed obvious in the cloud, the answer is now bare metal in colocation.

What's happening

Flexera's 2025 report already showed 21% of workloads had been repatriated. More recent data points to 86% of CIOs planning to move at least some workloads back, the highest rate ever recorded. Gartner projects 40% of enterprises running hybrid architectures on mission-critical workloads by 2026, up from 8% before.

The movement has a name: cloud repatriation.

And it's not coming from conservative laggards. It's coming from teams that were already cloud-native and ran the math carefully.

The case that unlocked the conversation: 37signals

DHH and 37signals became the poster child for the thesis by publishing honest numbers:

  • AWS bill: $3.2M/year → Dell + colocation operation: $1.3M/year (and dropping).
  • Hardware investment (~$600K) paid off in 6 months.
  • Database queries 3–5x faster with no noisy neighbors.
  • Projection: $10M+ saved over 5 years.
  • Zero new hires for the transition — colocation handled the physical layer.

They also open-sourced tooling along the way (Kamal), and that matters: leaving the cloud forced simplicity, not extra complexity.
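
The figures above are easy to sanity-check. The sketch below is only arithmetic on the numbers quoted in the list; the function names are made up for illustration, and it assumes both run rates stay flat, which the "and dropping" note suggests is conservative.

```python
# Figures as quoted in the list above (USD).
CLOUD_PER_YEAR = 3_200_000       # AWS bill
BARE_METAL_PER_YEAR = 1_300_000  # Dell hardware + colocation operation
HARDWARE_CAPEX = 600_000         # approximate one-time purchase


def annual_savings(cloud: float, bare_metal: float) -> float:
    """Yearly difference between the two run rates."""
    return cloud - bare_metal


def payback_months(capex: float, savings_per_year: float) -> float:
    """Months of savings needed to cover the upfront hardware spend."""
    return capex / (savings_per_year / 12)


def reduction_pct(cloud: float, bare_metal: float) -> float:
    """Run-rate reduction as a percentage of the original cloud bill."""
    return 100 * (cloud - bare_metal) / cloud


savings = annual_savings(CLOUD_PER_YEAR, BARE_METAL_PER_YEAR)
print(f"annual savings: ${savings:,.0f}")
print(f"payback: {payback_months(HARDWARE_CAPEX, savings):.1f} months")
print(f"run-rate reduction: {reduction_pct(CLOUD_PER_YEAR, BARE_METAL_PER_YEAR):.0f}%")
print(f"5-year savings net of hardware: ${5 * savings - HARDWARE_CAPEX:,.0f}")
```

At a flat run rate this gives $1.9M saved per year, a payback of under four months, and about $8.9M over five years net of hardware. The $10M+ projection presumably relies on the bare metal cost continuing to fall.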

Why the math flipped

The cloud charges for three things that look small and devastate budgets when stacked:

  1. Continuous compute — you pay for elasticity you don't use when load is steady.
  2. Egress — the silent killer, especially for apps with heavy media or storage traffic.
  3. Premium managed services — RDS, ElastiCache, OpenSearch and friends carry a huge spread over the self-hosted equivalent.

When a workload matures and becomes predictable, that model turns into overpaying for what it delivers. 43% of IT leaders report the cloud ended up more expensive than expected.

The AI tailwind

The 2025–2026 AI explosion accelerated the thesis.

Cloud GPU is absurdly expensive for continuous training and inference. And the data gravity problem (moving terabytes near the GPU) pushes naturally toward colocation and dedicated clusters. For teams running their own models in production, bare metal doesn't just win — it wins by a landslide.

What leaves the cloud well

  • Large, hot databases
  • Steady-state workloads (non-burst)
  • Model training and inference on GPU
  • Heavy storage and analytics
  • Apps with strong regulatory requirements (DORA, data sovereignty)

What still belongs in the cloud

  • Frontend, CDN, global edge
  • Workloads with unpredictable peaks
  • Low-volume serverless and event-driven loads
  • Small teams in early stage: low upfront cost and low ops burden matter more than long-run TCO

The winning architecture: hybrid

Almost nobody serious is doing a "full cloud exit". The pattern that's emerging is smarter:

  • Frontend and edge in the cloud — CDN, light functions, global distribution.
  • Core and data on bare metal — databases, heavy queues, GPU, storage.
  • Burst capacity in the cloud — for seasonal spikes or events.

The cloud stops being the house and becomes the porch.
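
The three-way split above can be written down as a small placement rule. This is a minimal sketch, not a real sizing tool: the `Workload` fields and the `place` function are hypothetical names, and a real decision would weigh cost and team capacity, not four booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    CLOUD = "cloud"
    BARE_METAL = "bare metal"
    BARE_METAL_WITH_CLOUD_BURST = "bare metal + cloud burst"


@dataclass
class Workload:
    name: str
    edge_facing: bool = False      # frontend, CDN, global distribution
    steady_load: bool = False      # predictable, non-burst
    data_heavy: bool = False       # databases, heavy queues, GPU, storage
    seasonal_spikes: bool = False  # occasional peaks above the baseline


def place(w: Workload) -> Placement:
    """Apply the hybrid pattern: edge in the cloud, core on bare metal."""
    if w.edge_facing:
        return Placement.CLOUD
    if w.steady_load or w.data_heavy:
        # Core stays on owned hardware; only the overflow goes to the cloud.
        if w.seasonal_spikes:
            return Placement.BARE_METAL_WITH_CLOUD_BURST
        return Placement.BARE_METAL
    # Anything unpredictable and light defaults to the cloud.
    return Placement.CLOUD


for w in (
    Workload("cdn", edge_facing=True),
    Workload("primary-db", steady_load=True, data_heavy=True),
    Workload("checkout-api", steady_load=True, seasonal_spikes=True),
    Workload("webhook-handler"),
):
    print(f"{w.name}: {place(w).value}")
```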

Trade-offs the marketing won't show you

Bare metal                                        | Cloud
High upfront CapEx                                | Scalable OpEx
Manual capacity planning                          | Automatic elasticity
Team needs hardware fluency (or colo partnership) | Ops abstracted away
Best price/perf at steady load                    | Best for spikes and experiments
Full data sovereignty                             | Heavy vendor dependency

Bare metal is not free. You trade a predictable, growing bill for a model with higher CapEx, hardware lifecycle management, and more operational responsibility. The right question isn't "which is better", it's "what does this specific workload need for the next 3 years?".
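
One way to make the 3-year question concrete is to compare total cost over that horizon. The sketch below is an assumption-heavy toy model: the function names, the compounding growth rate, the flat bare metal OpEx, and the 60-month hardware life are all illustrative choices, and every input in the example is hypothetical.

```python
def cloud_tco(monthly_bill: float, monthly_growth: float, months: int) -> float:
    """Total cloud spend when the bill compounds by monthly_growth each month."""
    total = 0.0
    bill = monthly_bill
    for _ in range(months):
        total += bill
        bill *= 1 + monthly_growth
    return total


def bare_metal_tco(capex: float, monthly_opex: float, months: int,
                   hardware_life_months: int = 60) -> float:
    """Upfront hardware plus a flat monthly colocation and ops cost.

    Hardware is bought again every hardware_life_months to model the lifecycle.
    """
    purchases = 1 + (months - 1) // hardware_life_months
    return capex * purchases + monthly_opex * months


# Hypothetical inputs, not taken from the article.
HORIZON_MONTHS = 36
cloud = cloud_tco(monthly_bill=50_000, monthly_growth=0.01, months=HORIZON_MONTHS)
metal = bare_metal_tco(capex=400_000, monthly_opex=20_000, months=HORIZON_MONTHS)

print(f"cloud, 3 years:      ${cloud:,.0f}")
print(f"bare metal, 3 years: ${metal:,.0f}")
print("bare metal wins" if metal < cloud else "cloud wins")
```

The model leaves out staff time, migration cost, and the resale value of hardware, so treat the result as a first filter rather than a business case.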

When it makes sense to start the conversation

Signs it's worth running the numbers:

  • Your cloud bill is past $30–50K/month and keeps climbing.
  • More than 60% of the cost is predictable compute + egress, not burst.
  • You run GPU in production with continuous inference.
  • You have regulatory requirements that data sovereignty solves better than contractual compliance.
  • Your team has the ops maturity to run Kubernetes/Postgres/Redis outside managed services — or is willing to pay the learning tax.

If none of that is true, stay in the cloud. Seriously.
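
The checklist above translates into a short screening function. This is a sketch under stated assumptions: the `Signals` fields are hypothetical, the bill threshold uses the low end of the $30–50K range, and the "keeps climbing" trend is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    monthly_bill_usd: float
    predictable_share: float         # fraction of cost that is steady compute + egress
    gpu_continuous_inference: bool
    sovereignty_requirement: bool
    ops_maturity: bool               # can run Kubernetes/Postgres/Redis unmanaged


def signs_present(s: Signals,
                  bill_threshold_usd: float = 30_000,
                  predictable_threshold: float = 0.60) -> list[str]:
    """Return the checklist items that apply to this team."""
    checks = [
        (s.monthly_bill_usd > bill_threshold_usd, "cloud bill past threshold"),
        (s.predictable_share > predictable_threshold, "mostly predictable compute + egress"),
        (s.gpu_continuous_inference, "GPU in production with continuous inference"),
        (s.sovereignty_requirement, "data sovereignty requirement"),
        (s.ops_maturity, "ops maturity to self-host"),
    ]
    return [label for hit, label in checks if hit]


def worth_running_the_numbers(s: Signals) -> bool:
    """True if at least one sign applies; otherwise stay in the cloud."""
    return len(signs_present(s)) > 0


# Hypothetical team, for illustration only.
team = Signals(monthly_bill_usd=80_000, predictable_share=0.7,
               gpu_continuous_inference=True, sovereignty_requirement=False,
               ops_maturity=True)
print(signs_present(team))
print(worth_running_the_numbers(team))
```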

My take

The cloud was treated as a destination. It's actually a tool.

For a new product, pre-PMF, with unpredictable traffic and a small team, the cloud is still the right answer almost 100% of the time. At that stage, the cost of running your own ops is far higher than the provider's margin.

For a mature product, with predictable load and tight margins, the math flips. And when it flips, it flips hard: not 10–20%, but more than half. The 37signals figures above ($3.2M/year down to $1.3M/year) work out to roughly a 60% cut.

2026 is reminding us of something we'd quietly forgotten: infrastructure is a strategic decision, not a default.

The question I'll leave you with: do you actually know how much of your cloud bill is real elasticity, and how much is just inertia?

Thiago Marinho

May 15, 2026 · Brazil