Elastic Cloud is a managed service, but understanding what runs under the hood helps you size deployments, troubleshoot performance, and evaluate the platform against self-managed alternatives.
This guide explains how Elastic Cloud (and Elastic Cloud Enterprise) containerizes and orchestrates Elasticsearch, Kibana, and supporting services.
Deployment Architecture Overview
When you create an Elastic Cloud deployment, the platform provisions a set of containers (not Kubernetes pods — Elastic uses its own orchestration layer) spread across allocator hosts. A deployment typically includes:
- Elasticsearch nodes: One or more containers running Elasticsearch, each assigned a role (data, master, ML, etc.)
- Kibana instance: A container running Kibana, connected to the Elasticsearch cluster
- APM Server (if enabled): Application Performance Monitoring ingest container
- Enterprise Search (if enabled): App Search and Workplace Search containers
- Integrations Server (if enabled): Fleet and Elastic Agent management
Each component runs as an isolated container with dedicated CPU, memory, and storage allocations.
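You can inspect this component layout for your own deployments through the Elastic Cloud REST API. Below is a minimal Python sketch; the API key is a placeholder, and the exact response fields should be verified against the current API documentation.

```python
import requests

API_KEY = "your-elastic-cloud-api-key"  # placeholder, created in the console
BASE = "https://api.elastic.co/api/v1"

resp = requests.get(
    f"{BASE}/deployments",
    headers={"Authorization": f"ApiKey {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()

for deployment in resp.json().get("deployments", []):
    # Each deployment bundles Elasticsearch, Kibana, and optional
    # components (APM, Integrations Server) as separate containers.
    print(deployment["id"], deployment["name"])
```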
The Allocator
Allocators are the core of Elastic Cloud's infrastructure: the physical or virtual hosts that run deployment containers.
How Allocation Works
When you create or resize a deployment:
- The constructor service receives your deployment configuration
- It calculates the required resources (memory, storage, CPU) for each component
- The allocator scheduler places containers on hosts with available capacity
- Containers are started with resource limits enforced by cgroups (sketched below)
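To make that last step concrete, here is an illustrative Python sketch of the Linux cgroup v2 interface that enforces such limits. This is not Elastic's orchestration code (which is proprietary); the cgroup name and values are invented, and the script assumes root access on a cgroup v2 host.

```python
import os
from pathlib import Path

# Create a cgroup and cap its resources: the same kernel mechanism the
# platform uses to fence deployment containers. NOT Elastic's actual code.
cg = Path("/sys/fs/cgroup/demo-es-node")  # hypothetical cgroup name
cg.mkdir(exist_ok=True)

(cg / "memory.max").write_text(str(4 * 1024**3))  # hard cap: 4 GiB
(cg / "cpu.weight").write_text("400")  # proportional share, not a core count

# Move the current process into the cgroup; children inherit the limits.
(cg / "cgroup.procs").write_text(str(os.getpid()))
```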
Resource Isolation
Each container gets:
- Dedicated memory: Your selected RAM allocation is split between JVM heap and the OS filesystem cache (roughly half each)
- CPU shares: Proportional CPU access based on memory allocation (not dedicated cores in most configurations)
- Dedicated storage: Allocated disk space for indices, translog, and temporary files
CPU is the key shared resource. A "4 GB" Elasticsearch node doesn't get dedicated CPU cores — it gets proportional CPU time. During periods of heavy load on the allocator host, CPU contention between co-located containers can cause latency spikes.
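A back-of-the-envelope sketch of this resource model, assuming the standard Elasticsearch guidance of roughly half of RAM for heap (capped near the ~31 GB compressed-oops threshold); the CPU figures are purely illustrative, not Elastic's actual scheduler weights:

```python
def node_resources(ram_gb: float, host_ram_gb: float = 256) -> dict:
    """Approximate how a node's RAM label maps to heap, cache, and CPU."""
    heap_gb = min(ram_gb / 2, 31.0)   # the rest serves the OS page cache
    cpu_share = ram_gb / host_ram_gb  # proportional time, not dedicated cores
    return {"heap_gb": heap_gb, "page_cache_gb": ram_gb - heap_gb, "cpu_share": cpu_share}

for size_gb in (1, 4, 8, 64):
    print(f"{size_gb:>3} GB node -> {node_resources(size_gb)}")
```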
The Proxy Layer
All traffic to Elastic Cloud deployments routes through a proxy layer:
Client → Elastic Cloud Proxy → Elasticsearch Container
The proxy handles:
- TLS termination: All connections are encrypted in transit
- Request routing: Directs traffic to the correct deployment based on the deployment ID in the hostname
- Load balancing: Distributes requests across Elasticsearch nodes in the deployment
- Authentication: Validates deployment credentials before forwarding requests
- Rate limiting: Protects against runaway clients
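Clients never interact with this machinery directly; they simply address the deployment endpoint or Cloud ID and the proxy does the rest. A minimal sketch with the official Python client, using placeholder credentials:

```python
from elasticsearch import Elasticsearch

# The Cloud ID encodes the proxy hostname for the deployment; TLS
# terminates at the proxy, which then routes and load-balances the request.
es = Elasticsearch(
    cloud_id="my-deployment:dXMtZWFzdC0x...",  # placeholder
    api_key="your-api-key",                    # placeholder
)

# Every call traverses: client -> proxy (TLS, routing, LB) -> ES node.
print(es.info()["version"]["number"])
```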
Proxy Impact on Performance
The proxy layer adds a small amount of latency (typically 1–5 ms) to every request. For latency-sensitive applications, this overhead is usually negligible compared to Elasticsearch processing time. However, for very high-throughput, low-latency workloads, it's worth benchmarking against self-managed deployments where clients connect directly to Elasticsearch.
Storage Architecture
Deployment Storage
Elastic Cloud uses network-attached storage (AWS EBS, GCP Persistent Disks, or Azure Managed Disks) for Elasticsearch data. This enables:
- Snapshotting: Transparent integration with object storage for backups
- Resizing: Storage can be increased without data migration
- Persistence: Data survives container restarts and host failures
Storage performance depends on the cloud provider's volume type. Elastic Cloud typically uses SSD-backed volumes for hot data and HDD-backed volumes for warm/cold tiers.
Data Tiers in Elastic Cloud
| Tier | Storage Type | Use Case |
|---|---|---|
| Hot | High-performance SSD | Active indexing and frequent queries |
| Warm | Standard SSD | Less frequent queries, older data |
| Cold | HDD or standard storage | Rare queries, long retention |
| Frozen | Object storage (S3/GCS/Azure Blob) | Archival, searchable snapshots |
The frozen tier uses searchable snapshots — data lives in object storage and is cached locally on demand. This dramatically reduces storage costs for archival data while keeping it searchable.
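In practice you move data down these tiers with an ILM policy. The sketch below ends in the frozen tier via a searchable snapshot; the policy name and timings are examples, while "found-snapshots" is the repository Elastic Cloud provisions for each deployment:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

es.ilm.put_lifecycle(
    name="logs-tiered",  # hypothetical policy name
    policy={
        "phases": {
            # Roll over actively written indices on the hot tier.
            "hot": {"actions": {"rollover": {"max_primary_shard_size": "50gb"}}},
            # Empty actions still migrate the index to the tier's nodes.
            "warm": {"min_age": "7d", "actions": {}},
            "cold": {"min_age": "30d", "actions": {}},
            # Frozen: mount a searchable snapshot backed by object storage.
            "frozen": {
                "min_age": "90d",
                "actions": {
                    "searchable_snapshot": {"snapshot_repository": "found-snapshots"}
                },
            },
        }
    },
)
```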
Services Behind a Deployment
Constructor
The constructor manages deployment lifecycle — creation, configuration changes, plan changes (scaling), and deletion. When you resize a deployment in the console, the constructor:
- Plans the new configuration
- Coordinates container migrations if needed
- Ensures zero downtime by using rolling restarts
ZooKeeper / Coordination
Elastic Cloud uses ZooKeeper (or equivalent) for internal coordination — tracking allocator health, container placement, and deployment state. This is separate from any Elasticsearch internal coordination.
Monitoring Cluster
Each Elastic Cloud region runs a dedicated monitoring cluster that collects metrics and logs from all deployments. This powers the deployment health dashboards and alerts in the Elastic Cloud console.
Snapshot Repository
Elastic Cloud automatically configures a snapshot repository (S3, GCS, or Azure Blob) for every deployment. Snapshots are taken automatically and retained based on your configuration. You don't need to manage snapshot infrastructure.
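You can verify this from any client. A short sketch that lists the managed repository (typically named "found-snapshots" on Elastic Cloud) and the automatic snapshots in it:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

# The managed repository appears alongside any custom ones you add.
print(es.snapshot.get_repository())

# List automatic snapshots and their states.
resp = es.snapshot.get(repository="found-snapshots", snapshot="_all")
for snap in resp["snapshots"]:
    print(snap["snapshot"], snap["state"], snap["end_time"])
```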
Elastic Cloud Enterprise (ECE): The Same Architecture, Your Infrastructure
ECE runs the same orchestration layer (allocators, constructors, proxy) on your own servers. This gives you:
- The Elastic Cloud management experience on-premises or in your own cloud account
- Multi-tenant deployment orchestration on shared infrastructure
- The same rolling upgrade and resize capabilities
ECE adds operational burden (you manage the allocator hosts, ZooKeeper, and proxy infrastructure) but provides data sovereignty and network control that hosted Elastic Cloud doesn't offer.
How Deployment Sizing Maps to Infrastructure
When you choose "8 GB RAM, 2 availability zones" in the Elastic Cloud console:
| Setting | What Actually Happens |
|---|---|
| 8 GB RAM | Two Elasticsearch containers, each with 4 GB heap + ~4 GB OS buffer |
| 2 availability zones | Containers placed on allocators in different AZs |
| 240 GB storage | A network-attached SSD volume for each zone's container |
| High availability | Primary and replica shards distributed across AZs |
The "8 GB" label refers to total memory across the deployment. With 2 AZs, each node gets half. This is important for JVM heap sizing — an "8 GB, 2 AZ" deployment has 4 GB heap per node, not 8 GB.
Performance Characteristics
Advantages of the Managed Infrastructure
- Automatic rolling upgrades: Zero-downtime version upgrades with container orchestration
- Self-healing: Failed containers are automatically restarted or relocated
- Snapshot management: Automated backups without configuration
- Integrated monitoring: Built-in dashboards without Prometheus/Grafana setup
Limitations to Understand
- CPU sharing: Containers share CPU with neighbors. Noisy neighbor effects are possible during peak loads.
- Network-attached storage: Lower IOPS than local NVMe SSDs. For extremely I/O-intensive workloads, self-managed with local SSDs may perform better.
- Proxy overhead: Every request traverses the proxy layer, adding small but non-zero latency.
- Configuration constraints: You can't tune OS-level settings (vm.max_map_count, I/O schedulers) or Elasticsearch settings that Elastic manages internally.
- Instance sizes are fixed: You choose from predefined RAM tiers rather than arbitrary CPU/memory combinations.
Monitoring Your Deployment's Infrastructure
The Elastic Cloud console provides:
- CPU utilization: Shows how much of your allocated CPU share is being used
- Memory pressure: JVM heap usage and garbage collection metrics
- Disk usage: Current utilization and growth rate
- I/O metrics: Read/write throughput and queue depth
For deeper analysis, use the Stack Monitoring features in Kibana, which provide node-level, index-level, and shard-level metrics.
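For ad-hoc checks outside the console, the same signals are reachable through the standard stats APIs. A minimal sketch with the Python client:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

# Pull JVM, OS, and filesystem stats for every node in the deployment.
stats = es.nodes.stats(metric=["jvm", "os", "fs"])
for node_id, node in stats["nodes"].items():
    heap_pct = node["jvm"]["mem"]["heap_used_percent"]
    disk_free_gib = node["fs"]["total"]["free_in_bytes"] / 1024**3
    print(f"{node['name']}: heap {heap_pct}%, disk free {disk_free_gib:.1f} GiB")
```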
For production deployments where you need proactive analysis rather than just dashboards, AI-powered monitoring platforms like Pulse provide continuous health assessment, root-cause analysis, and optimization recommendations that go beyond what the built-in monitoring surfaces.
Frequently Asked Questions
Q: Are Elastic Cloud containers Docker containers?
Not exactly. The containers are standard Linux containers (ECE runs them on Docker, or Podman in recent versions), isolated with cgroups and namespaces, but the orchestration layer is Elastic's own rather than Kubernetes. Elastic Cloud on Kubernetes (ECK) is a different product that uses Kubernetes-native orchestration.
Q: Can I SSH into my Elastic Cloud instances?
No. Elastic Cloud is a fully managed service — you don't have shell access to the underlying infrastructure. For OS-level debugging, you'll need to work with Elastic support.
Q: How does Elastic Cloud handle hardware failures?
Containers on failed allocators are automatically rescheduled to healthy hosts. With multi-AZ deployments, replica shards in the surviving AZ keep the cluster operational during failover. Recovery is automatic but may take minutes depending on shard sizes.
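If you want to watch a failover from the client side, a minimal sketch that blocks until the cluster is green again (placeholder credentials):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", api_key="...")  # placeholders

# Wait for recovery to finish, then report any shards still in flight.
health = es.cluster.health(wait_for_status="green", timeout="120s")
print(health["status"], health["initializing_shards"], health["unassigned_shards"])
```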
Q: Is there a performance difference between Elastic Cloud regions?
Performance varies by cloud provider and instance generation in each region. Newer regions may have access to more recent hardware. If query latency matters, choose the region closest to your application servers rather than your end users: it's the application, not the user, that connects to Elasticsearch.
Q: How do I get the best performance from Elastic Cloud?
Right-size your deployment based on actual usage (not theoretical peak), use data tiers to move old data to cheaper storage, optimize your index mappings and queries before scaling up compute, and use dedicated ML nodes if running ML jobs to avoid impacting search performance.