Cloud Repatriation: Why Search Infrastructure Exposes the Limits of Cloud-First Thinking

Cloud repatriation isn’t a step backward—it’s a response to reality. Search and observability workloads, with their always-on nature and relentless data growth, are exposing the economic and operational limits of cloud-first thinking earlier than most systems. This article examines why search infrastructure is leading that shift.

The Great Awakening: When “Cloud First” Met Reality

For over a decade, the enterprise technology narrative followed a single trajectory: migrate everything to the public cloud. But in 2026, a strategic reversal is underway. In a 2024 Barclays CIO survey, 86% of Chief Information Officers said they planned to move at least some public cloud workloads back to private cloud or on-premises environments.

This shift—cloud repatriation—isn’t a retreat. It’s a calculated advance toward a “cloud-appropriate” strategy. And nowhere is this more visible than in search and observability infrastructure. Elasticsearch and OpenSearch deployments are becoming the canary in the coal mine, exposing the fundamental economics that make certain workloads unsuitable for public cloud at scale.

Search systems don’t fail quietly.

They surface economic and operational mismatches early — because they run continuously, grow relentlessly, and touch every part of the stack.

The Cost Structure That Breaks: Why Search Amplifies Cloud Pain

37signals' high-profile cloud exit has become the paradigmatic repatriation story. The company projects saving over $10 million over five years by moving workloads on-premises. When they departed AWS's S3 storage, AWS waived $250,000 in data egress fees—revealing the scale of "exit penalties" that trap customers.

But here's what makes search and observability workloads uniquely punishing in the cloud: they combine three cost-amplifying characteristics simultaneously.

Continuous high-volume ingestion. Search indices and log aggregation systems consume data constantly. Unlike application workloads with variable traffic, search infrastructure runs hot 24/7, so the baseline is the peak: you pay elasticity-oriented pricing for capacity that never scales down.

Storage that never shrinks. Observability data accumulates. Even with retention policies, search clusters grow inexorably. Cloud storage pricing—optimized for variable workloads—penalizes this predictable, persistent growth pattern.

Data transfer as a tax on your own data. Query results leaving the provider's network, replica syncs across availability zones, and backups copied across regions all incur transfer fees. Moving one petabyte out of AWS S3 can cost roughly $90,000 to $120,000 in egress fees. For search workloads querying terabytes daily, this becomes an architectural tax on your own infrastructure.
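The egress range quoted above can be reproduced with simple arithmetic. The sketch below assumes a decimal petabyte and flat per-GB rates of $0.09 and $0.12, which are back-derived from that range rather than taken from a price list; real pricing is tiered and varies by region. The 2 TB/day workload is hypothetical.

```python
from decimal import Decimal

GB_PER_PB = 1_000_000  # decimal petabyte (assumption)

# Assumed flat per-GB rates implied by the range quoted above; not a price list.
RATE_LOW = Decimal("0.09")
RATE_HIGH = Decimal("0.12")


def egress_cost(gigabytes: int, rate_per_gb: Decimal) -> Decimal:
    """Flat-rate transfer cost in USD, ignoring volume tiers and free allowances."""
    return gigabytes * rate_per_gb


def monthly_query_transfer(tb_per_day: int, rate_per_gb: Decimal, days: int = 30) -> Decimal:
    """Monthly cost of moving `tb_per_day` terabytes out of the provider every day."""
    return egress_cost(tb_per_day * 1_000 * days, rate_per_gb)


one_pb_low = egress_cost(GB_PER_PB, RATE_LOW)
one_pb_high = egress_cost(GB_PER_PB, RATE_HIGH)
print(f"1 PB: ${one_pb_low:,.0f} to ${one_pb_high:,.0f}")

# Hypothetical workload: 2 TB/day leaving the provider's network.
print(f"2 TB/day: ${monthly_query_transfer(2, RATE_LOW):,.0f}/month")
```

Under these assumptions, a one-time petabyte move and a steady daily query stream are priced by the same per-GB meter, which is why the recurring case matters more for search.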

Flexera's 2025 State of the Cloud Report confirms managing cloud spend is the top challenge for 84% of organizations. For search and observability specifically, the cost structure isn't a bug—it's the fundamental mismatch between workload characteristics and cloud pricing models.

The Control Imperative: Data Sovereignty as Forcing Function

Beyond economics, regulatory compliance is accelerating repatriation. 68% of Indian CIOs cite data security as their main cloud concern, driven by data residency requirements.

For organizations running search clusters handling sensitive logs, customer data, or proprietary intelligence, granular control over data location isn't optional. Post-repatriation, 92% of organizations report improved security posture—a remarkable validation of the control thesis.

The AI Paradox: Machine Learning Makes the Case for Self-Hosted

Ironically, AI—the technology many assumed would cement cloud dominance—is accelerating repatriation. 30% of high AI maturity firms plan to move nearly 12% of workloads back on-premises.

Why? Because AI-enhanced search exposes a triple bind:

GPU predictability destroys cloud economics. Machine learning workloads for semantic search, anomaly detection, and relevance tuning run continuously. They don't burst—they sustain. Cloud GPU pricing optimized for variable compute becomes prohibitively expensive for always-on inference.

Data gravity becomes absolute. Training and inference require data proximity. When your search corpus lives on-premises, shipping it to cloud GPUs for processing inverts the entire value proposition. The data doesn't move—the compute must.

Model serving costs compound with query volume. Both Elasticsearch and OpenSearch now embed sophisticated AI capabilities: native semantic search in Elastic, FAISS-powered vector search in OpenSearch. At enterprise query volumes, serving these models in the cloud can consume more budget than the underlying search infrastructure itself.

The 30-60% cost savings from self-hosted AI workloads aren't marginal optimizations. They represent the difference between AI features being economically viable or strategically unaffordable.

The Elasticsearch-OpenSearch Split: How Licensing Shapes Repatriation

The cloud repatriation trend unfolds against a fundamental fork in the search infrastructure world. In 2021, Elastic dropped the Apache 2.0 open-source license in favor of dual licensing under the source-available SSPL and Elastic License v2, a move aimed at preventing AWS from offering managed Elasticsearch services. Amazon responded by forking Elasticsearch into OpenSearch, which remains under Apache 2.0. Elastic added AGPLv3 as a further licensing option in 2024.

Five years later, this split creates divergent repatriation paths:

OpenSearch: Built for Self-Hosted Economics

OpenSearch positions itself as the natural repatriation choice through structural advantages:

Apache 2.0 eliminates licensing friction for organizations embedding search in products or building SaaS offerings
Enterprise security at zero cost: SSO, audit logging, and RBAC included in the standard distribution, whereas Elasticsearch reserves capabilities such as SSO and audit logging for paid subscriptions
Infrastructure-only pricing model: Organizations pay for hardware and operations only, with no software fees regardless of scale

For large, predictable search workloads, OpenSearch's free enterprise features represent substantial cost avoidance.

Elasticsearch: The Performance and Integration Trade

Elasticsearch maintains compelling advantages despite licensing complexity:

Superior raw performance: reported in published benchmarks as 2x to 12x faster than OpenSearch for vector search, with 40-140% better complex query performance
Multi-cloud consistency: Elastic Cloud delivers uniform control planes across AWS, Azure, and GCP—critical for organizations maintaining hybrid strategies across providers
Integrated AI toolkit: Built-in semantic search versus OpenSearch's plugin architecture creates operational simplicity

The strategic choice: OpenSearch wins on licensing freedom and zero software costs. Elasticsearch wins on performance and multi-cloud portability.

The Break-Even Math: When Self-Hosted Infrastructure Wins

The repatriation decision ultimately reduces to total cost of ownership at scale:

Elastic Cloud (128GB cluster): ~$3,000/month
Self-Hosted Infrastructure (128GB equivalent): ~$1,800/month + operations

The break-even point arrives when predictability, scale, and team capability converge. For organizations with dedicated operations teams and search expertise, self-hosted economics improve dramatically as cluster size grows. The cloud elasticity premium, valuable for variable workloads, becomes dead weight for always-on, continuously growing search infrastructure.
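The two monthly figures above leave the operations cost open, so the break-even can be framed as the operations budget that self-hosting can absorb before it stops being cheaper. The sketch below uses only the article's $3,000 and $1,800 figures; the two sample operations costs are hypothetical and chosen only to show both outcomes.

```python
# Monthly figures quoted in the article for a 128GB cluster.
ELASTIC_CLOUD_USD = 3_000
SELF_HOSTED_INFRA_USD = 1_800


def ops_headroom(cloud: int, infra: int) -> int:
    """Monthly operations spend at which self-hosting exactly matches cloud."""
    return cloud - infra


def infra_savings_pct(cloud: int, infra: int) -> float:
    """Infrastructure-only savings in percent, before any operations cost."""
    return 100 * (cloud - infra) / cloud


def self_hosted_wins(cloud: int, infra: int, ops: int) -> bool:
    """True when infrastructure plus operations undercuts the cloud bill."""
    return infra + ops < cloud


headroom = ops_headroom(ELASTIC_CLOUD_USD, SELF_HOSTED_INFRA_USD)
savings = infra_savings_pct(ELASTIC_CLOUD_USD, SELF_HOSTED_INFRA_USD)
print(f"Ops headroom: ${headroom:,}/month, infra savings: {savings:.0f}%")

# Hypothetical operations costs, not measurements.
for ops in (800, 1_500):
    wins = self_hosted_wins(ELASTIC_CLOUD_USD, SELF_HOSTED_INFRA_USD, ops)
    print(f"ops=${ops:,}/month -> self-hosted cheaper: {wins}")
```

With the article's figures, the headroom is $1,200 per month and the infrastructure-only saving is 40%. At a single 128GB cluster, that headroom covers only a fraction of an engineer's time, which is why the argument depends on scale: the headroom grows with the infrastructure bill, while an existing operations team's cost need not grow at the same rate.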

The operational overhead is real: daily health monitoring, backup orchestration, security patching. But for medium-to-large deployments, the 30-60% infrastructure savings dwarf operational costs—especially when egress fees vanish entirely.

The Hybrid Pattern: Strategic Workload Placement

Cloud repatriation doesn't mean abandonment. The emerging model is workload-aware architecture. For Elasticsearch and OpenSearch, the pattern crystallizes:

Self-Hosted/Private Cloud:

Production search indices with stable query patterns
Large-scale log aggregation and observability data
AI/ML workloads requiring dedicated GPU compute
Compliance-sensitive data requiring geographic control

Public Cloud:

Development and testing environments
Customer-facing search with unpredictable traffic spikes
Rapid prototyping and proof-of-concept projects
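The two lists above amount to a simple placement rule. A minimal sketch follows, assuming each workload can be described by four yes/no traits drawn from the self-hosted list; the `Workload` type, its field names, and the sample workloads are all hypothetical, and a real decision would also weigh cost and team capability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    stable_always_on: bool       # steady query or ingestion pattern
    large_persistent_data: bool  # e.g. log aggregation, observability data
    needs_dedicated_gpu: bool    # sustained AI/ML compute
    residency_constrained: bool  # data must stay in a given geography


def place(w: Workload) -> str:
    """Return 'self-hosted' if any trait from the self-hosted list applies."""
    self_hosted_traits = (
        w.stable_always_on,
        w.large_persistent_data,
        w.needs_dedicated_gpu,
        w.residency_constrained,
    )
    return "self-hosted" if any(self_hosted_traits) else "public-cloud"


# Hypothetical workloads for illustration.
workloads = [
    Workload("prod-log-aggregation", True, True, False, False),
    Workload("dev-test-cluster", False, False, False, False),
    Workload("storefront-search-spiky", False, False, False, False),
    Workload("regulated-audit-index", False, False, False, True),
]

for w in workloads:
    print(f"{w.name}: {place(w)}")
```

Treating every trait as a hard trigger is deliberately blunt; its purpose is to make the default explicit, so that public cloud is chosen for what it does well rather than by habit.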

Both platforms support cross-cluster search, enabling unified experiences while optimizing infrastructure placement. Today, 70% of businesses embrace this balanced approach—treating infrastructure as a portfolio, not a religion.
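In both Elasticsearch and OpenSearch, a cross-cluster search addresses remote indices as `alias:index` inside the usual comma-separated index expression. The helper below only builds that expression and the request path; the cluster alias and index names are hypothetical, and it assumes the remote cluster has already been registered under that alias in the local cluster's settings.

```python
from typing import Dict, List


def ccs_index_expression(local: List[str], remote: Dict[str, List[str]]) -> str:
    """Join local index patterns with 'alias:pattern' entries for remote clusters."""
    parts = list(local)
    for alias, patterns in remote.items():
        parts.extend(f"{alias}:{pattern}" for pattern in patterns)
    return ",".join(parts)


# Hypothetical layout: production logs on-premises, dev logs in a cloud cluster.
expression = ccs_index_expression(
    local=["logs-onprem-*"],
    remote={"cloud_dev": ["logs-dev-*"]},
)
search_path = f"/{expression}/_search"
print(search_path)
```

A single query sent to that path on the self-hosted cluster would fan out to both locations, so where an index lives stays an infrastructure detail rather than an application concern.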

Conclusion: Search Workloads Expose the Truth First

Cloud repatriation doesn't signal the death of public cloud. It marks the end of cloud dogmatism.

Search and observability infrastructure—with their continuous ingestion, persistent storage growth, and always-on AI inference—expose the economic realities faster than almost any other workload category. They serve as the leading indicator for a broader maturation: the recognition that infrastructure placement is not theology, but engineering.

The organizations thriving in 2026 have moved beyond "cloud first" to something more sophisticated: workload literacy. They understand that some workloads bloom in the cloud's elastic environment, while others—predictable, data-intensive, continuously operating—deliver superior economics and control when brought home.

The pendulum isn't swinging back to on-premises dogma. It's finding equilibrium in strategic placement.

And search infrastructure is teaching us where that equilibrium lives.