Elastic Cloud is a managed service, but understanding what runs under the hood helps you size deployments, troubleshoot performance, and evaluate the platform against self-managed alternatives.
This guide explains how Elastic Cloud (and Elastic Cloud Enterprise) containerizes and orchestrates Elasticsearch, Kibana, and supporting services.
Deployment Architecture Overview
When you create an Elastic Cloud deployment, the platform provisions a set of containers (not Kubernetes pods — Elastic uses its own orchestration layer) spread across allocator hosts. A deployment typically includes:
- Elasticsearch nodes: One or more containers running Elasticsearch, each assigned a role (data, master, ML, etc.)
- Kibana instance: A container running Kibana, connected to the Elasticsearch cluster
- APM Server (if enabled): Application Performance Monitoring ingest container
- Enterprise Search (if enabled): App Search and Workplace Search containers
- Integrations Server (if enabled): Fleet and Elastic Agent management
Each component runs as an isolated container with dedicated CPU, memory, and storage allocations.
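You can inspect this component layout for your own deployments through the Elastic Cloud REST API. Below is a minimal Python sketch; the API key is a placeholder, and the exact response fields should be verified against the current API documentation.

```python
import requests

API_KEY = "your-elastic-cloud-api-key"  # placeholder, created in the console
BASE = "https://api.elastic.co/api/v1"

resp = requests.get(
    f"{BASE}/deployments",
    headers={"Authorization": f"ApiKey {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()

for deployment in resp.json().get("deployments", []):
    # Each deployment bundles Elasticsearch, Kibana, and optional
    # components (APM, Integrations Server) as separate containers.
    print(deployment["id"], deployment["name"])
```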
The Allocator
Allocators are the core of Elastic Cloud's infrastructure: the physical or virtual hosts that run deployment containers.
How Allocation Works
When you create or resize a deployment:
- The constructor service receives your deployment configuration
- It calculates the required resources (memory, storage, CPU) for each component
- The allocator scheduler places containers on hosts with available capacity
- Containers are started with resource limits enforced by cgroups (sketched below)
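To make that last step concrete, here is an illustrative Python sketch of the Linux cgroup v2 interface that enforces such limits. This is not Elastic's orchestration code (which is proprietary); the cgroup name and values are invented, and the script assumes root access on a cgroup v2 host.

```python
import os
from pathlib import Path

# Create a cgroup and cap its resources: the same kernel mechanism the
# platform uses to fence deployment containers. NOT Elastic's actual code.
cg = Path("/sys/fs/cgroup/demo-es-node")  # hypothetical cgroup name
cg.mkdir(exist_ok=True)

(cg / "memory.max").write_text(str(4 * 1024**3))  # hard cap: 4 GiB
(cg / "cpu.weight").write_text("400")  # proportional share, not a core count

# Move the current process into the cgroup; children inherit the limits.
(cg / "cgroup.procs").write_text(str(os.getpid()))
```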
Resource Isolation
Each container gets:
- Dedicated memory: Your selected RAM allocation is split between JVM heap and the OS filesystem cache (roughly half each)
- CPU shares: Proportional CPU access based on memory allocation (not dedicated cores in most configurations)
- Dedicated storage: Allocated disk space for indices, translog, and temporary files
CPU is the key shared resource. A "4 GB" Elasticsearch node doesn't get dedicated CPU cores — it gets proportional CPU time. During periods of heavy load on the allocator host, CPU contention between co-located containers can cause latency spikes.
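A back-of-the-envelope sketch of this resource model, assuming the standard Elasticsearch guidance of roughly half of RAM for heap (capped near the ~31 GB compressed-oops threshold); the CPU figures are purely illustrative, not Elastic's actual scheduler weights:

```python
def node_resources(ram_gb: float, host_ram_gb: float = 256) -> dict:
    """Approximate how a node's RAM label maps to heap, cache, and CPU."""
    heap_gb = min(ram_gb / 2, 31.0)   # the rest serves the OS page cache
    cpu_share = ram_gb / host_ram_gb  # proportional time, not dedicated cores
    return {"heap_gb": heap_gb, "page_cache_gb": ram_gb - heap_gb, "cpu_share": cpu_share}

for size_gb in (1, 4, 8, 64):
    print(f"{size_gb:>3} GB node -> {node_resources(size_gb)}")
```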
The Proxy Layer
All traffic to Elastic Cloud deployments routes through a proxy layer:
Client → Elastic Cloud Proxy → Elasticsearch Container
The proxy handles:
- TLS termination: All connections are encrypted in transit
- Request routing: Directs traffic to the correct deployment based on the deployment ID in the hostname
- Load balancing: Distributes requests across Elasticsearch nodes in the deployment
- Authentication: Validates deployment credentials before forwarding requests
- Rate limiting: Protects against runaway clients
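Clients never interact with this machinery directly; they simply address the deployment endpoint or Cloud ID and the proxy does the rest. A minimal sketch with the official Python client, using placeholder credentials:

```python
from elasticsearch import Elasticsearch

# The Cloud ID encodes the proxy hostname for the deployment; TLS
# terminates at the proxy, which then routes and load-balances the request.
es = Elasticsearch(
    cloud_id="my-deployment:dXMtZWFzdC0x...",  # placeholder
    api_key="your-api-key",                    # placeholder
)

# Every call traverses: client -> proxy (TLS, routing, LB) -> ES node.
print(es.info()["version"]["number"])
```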
Proxy Impact on Performance
The proxy layer adds a small amount of latency (typically 1–5 ms) to every request. For latency-sensitive applications, this overhead is usually negligible compared to Elasticsearch processing time. However, for very high-throughput, low-latency workloads, it's worth benchmarking against self-managed deployments where clients connect directly to Elasticsearch.
Storage Architecture
Deployment Storage
Elastic Cloud uses network-attached storage (AWS EBS, GCP Persistent Disks, or Azure Managed Disks) for Elasticsearch data. This enables:
- Snapshotting: Transparent integration with object storage for backups
- Resizing: Storage can be increased without data migration
- Persistence: Data survives container restarts and host failures
Storage performance depends on the cloud provider's volume type. Elastic Cloud typically uses SSD-backed volumes for hot data and HDD-backed volumes for warm/cold tiers.
Data Tiers in Elastic Cloud
| Tier | Storage Type | Use Case |
|---|---|---|
| Hot | High-performance SSD | Active indexing and frequent queries |
| Warm | Standard SSD | Less frequent queries, older data |
| Cold | HDD or standard storage | Rare queries, long retention |
| Frozen | Object storage (S3/GCS/Azure Blob) | Archival, searchable snapshots |
The frozen tier uses searchable snapshots — data lives in object storage and is cached locally on demand. This dramatically reduces storage costs for archival data while keeping it searchable.
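In practice you move data down these tiers with an ILM policy. The sketch below ends in the frozen tier via a searchable snapshot; the policy name and timings are examples, while "found-snapshots" is the repository Elastic Cloud provisions for each deployment:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

es.ilm.put_lifecycle(
    name="logs-tiered",  # hypothetical policy name
    policy={
        "phases": {
            # Roll over actively written indices on the hot tier.
            "hot": {"actions": {"rollover": {"max_primary_shard_size": "50gb"}}},
            # Empty actions still migrate the index to the tier's nodes.
            "warm": {"min_age": "7d", "actions": {}},
            "cold": {"min_age": "30d", "actions": {}},
            # Frozen: mount a searchable snapshot backed by object storage.
            "frozen": {
                "min_age": "90d",
                "actions": {
                    "searchable_snapshot": {"snapshot_repository": "found-snapshots"}
                },
            },
        }
    },
)
```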
Services Behind a Deployment
Constructor
The constructor manages deployment lifecycle — creation, configuration changes, plan changes (scaling), and deletion. When you resize a deployment in the console, the constructor:
- Plans the new configuration
- Coordinates container migrations if needed
- Ensures zero downtime by using rolling restarts
ZooKeeper / Coordination
Elastic Cloud uses ZooKeeper (or equivalent) for internal coordination — tracking allocator health, container placement, and deployment state. This is separate from any Elasticsearch internal coordination.
Monitoring Cluster
Each Elastic Cloud region runs a dedicated monitoring cluster that collects metrics and logs from all deployments. This powers the deployment health dashboards and alerts in the Elastic Cloud console.
Snapshot Repository
Elastic Cloud automatically configures a snapshot repository (S3, GCS, or Azure Blob) for every deployment. Snapshots are taken automatically and retained based on your configuration. You don't need to manage snapshot infrastructure.
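You can verify this from any client. A short sketch that lists the managed repository (typically named "found-snapshots" on Elastic Cloud) and the automatic snapshots in it:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

# The managed repository appears alongside any custom ones you add.
print(es.snapshot.get_repository())

# List automatic snapshots and their states.
resp = es.snapshot.get(repository="found-snapshots", snapshot="_all")
for snap in resp["snapshots"]:
    print(snap["snapshot"], snap["state"], snap["end_time"])
```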
Elastic Cloud Enterprise (ECE): The Same Architecture, Your Infrastructure
ECE runs the same orchestration layer (allocators, constructors, proxy) on your own servers. This gives you:
- The Elastic Cloud management experience on-premises or in your own cloud account
- Multi-tenant deployment orchestration on shared infrastructure
- The same rolling upgrade and resize capabilities
ECE adds operational burden (you manage the allocator hosts, ZooKeeper, and proxy infrastructure) but provides data sovereignty and network control that hosted Elastic Cloud doesn't offer.
How Deployment Sizing Maps to Infrastructure
When you choose "8 GB RAM, 2 availability zones" in the Elastic Cloud console:
| Setting | What Actually Happens |
|---|---|
| 8 GB RAM | Two Elasticsearch containers, each with 4 GB heap + ~4 GB OS buffer |
| 2 availability zones | Containers placed on allocators in different AZs |
| 240 GB storage | A network-attached SSD volume for each zone's container |
| High availability | Primary and replica shards distributed across AZs |
The "8 GB" label refers to total memory across the deployment. With 2 AZs, each node gets half. This is important for JVM heap sizing — an "8 GB, 2 AZ" deployment has 4 GB heap per node, not 8 GB.
Performance Characteristics
Advantages of the Managed Infrastructure
- Automatic rolling upgrades: Zero-downtime version upgrades with container orchestration
- Self-healing: Failed containers are automatically restarted or relocated
- Snapshot management: Automated backups without configuration
- Integrated monitoring: Built-in dashboards without Prometheus/Grafana setup
Limitations to Understand
- CPU sharing: Containers share CPU with neighbors. Noisy neighbor effects are possible during peak loads.
- Network-attached storage: Lower IOPS than local NVMe SSDs. For extremely I/O-intensive workloads, self-managed with local SSDs may perform better.
- Proxy overhead: Every request traverses the proxy layer, adding small but non-zero latency.
- Configuration constraints: You can't tune OS-level settings (vm.max_map_count, I/O schedulers) or Elasticsearch settings that Elastic manages internally.
- Instance sizes are fixed: You choose from predefined RAM tiers rather than arbitrary CPU/memory combinations.
Monitoring Your Deployment's Infrastructure
The Elastic Cloud console provides:
- CPU utilization: Shows how much of your allocated CPU share is being used
- Memory pressure: JVM heap usage and garbage collection metrics
- Disk usage: Current utilization and growth rate
- I/O metrics: Read/write throughput and queue depth
For deeper analysis, use the Stack Monitoring features in Kibana, which provide node-level, index-level, and shard-level metrics.
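For ad-hoc checks outside the console, the same signals are reachable through the standard stats APIs. A minimal sketch with the Python client:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

# Pull JVM, OS, and filesystem stats for every node in the deployment.
stats = es.nodes.stats(metric=["jvm", "os", "fs"])
for node_id, node in stats["nodes"].items():
    heap_pct = node["jvm"]["mem"]["heap_used_percent"]
    disk_free_gib = node["fs"]["total"]["free_in_bytes"] / 1024**3
    print(f"{node['name']}: heap {heap_pct}%, disk free {disk_free_gib:.1f} GiB")
```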
For production deployments where you need proactive analysis rather than just dashboards, AI-powered monitoring platforms like Pulse provide continuous health assessment, root-cause analysis, and optimization recommendations that go beyond what the built-in monitoring surfaces.
Frequently Asked Questions
Q: Are Elastic Cloud containers Docker containers?
Not exactly. The containers are standard Linux containers (ECE runs them on Docker, or Podman in recent versions), isolated with cgroups and namespaces, but the orchestration layer is Elastic's own rather than Kubernetes. Elastic Cloud on Kubernetes (ECK) is a different product that uses Kubernetes-native orchestration.
Q: Can I SSH into my Elastic Cloud instances?
No. Elastic Cloud is a fully managed service — you don't have shell access to the underlying infrastructure. For OS-level debugging, you'll need to work with Elastic support.
Q: How does Elastic Cloud handle hardware failures?
Containers on failed allocators are automatically rescheduled to healthy hosts. With multi-AZ deployments, replica shards in the surviving AZ keep the cluster operational during failover. Recovery is automatic but may take minutes depending on shard sizes.
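If you want to watch a failover from the client side, a minimal sketch that blocks until the cluster is green again (placeholder credentials):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

# Wait for recovery to finish, then report any shards still in flight.
health = es.cluster.health(wait_for_status="green", timeout="120s")
print(health["status"], health["initializing_shards"], health["unassigned_shards"])
```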
Q: Is there a performance difference between Elastic Cloud regions?
Performance varies by cloud provider and instance generation in each region. Newer regions may have access to more recent hardware. If query latency matters, choose the region closest to your application servers rather than your end users: it's the application, not the user, that connects to Elasticsearch.
Q: How do I get the best performance from Elastic Cloud?
Right-size your deployment based on actual usage (not theoretical peak), use data tiers to move old data to cheaper storage, optimize your index mappings and queries before scaling up compute, and use dedicated ML nodes if running ML jobs to avoid impacting search performance.