Elasticsearch data tiers formalize the hot-warm-cold architecture that operators previously built using custom node attributes. Each tier is a set of nodes sharing the same data role, and the system uses the _tier_preference index setting to route indices to the right nodes. Getting the configuration wrong leads to unassigned shards, yellow cluster health, and indices that never move off the hot tier.
The Four Tiers and Node Roles
Each tier maps directly to a node role set in elasticsearch.yml:
# Hot node
node.roles: [data_hot, data_content]
# Warm node
node.roles: [data_warm]
# Cold node
node.roles: [data_cold]
# Frozen node
node.roles: [data_frozen]
The data_hot tier holds newly ingested time-series data. These nodes need fast storage (NVMe or SSD) and enough CPU to handle both indexing and search. The data_content role handles non-time-series indices like lookup tables or application data - it is typically co-located with data_hot.
The data_warm tier stores data that is still queried but no longer actively indexed. You can use cheaper storage here since the write load drops to zero outside of merges and ILM operations. The data_cold tier takes this further, typically using fully mounted searchable snapshots so the data lives in both a snapshot repository and a full local cache. The data_frozen tier is the most cost-efficient: it uses partially mounted searchable snapshots that only cache recently accessed data regions, pulling the rest on demand from the snapshot repository.
A node with the generic data role acts as all tiers simultaneously - data_hot, data_warm, data_cold, data_frozen, and data_content. This works for development but defeats the purpose of tiered hardware in production.
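To check how roles are actually distributed across a running cluster, the cat nodes API can list each node with its role abbreviations:

```console
GET _cat/nodes?v&h=name,node.role
```

Each letter in the node.role column corresponds to one role - for example, h for data_hot, w for data_warm, c for data_cold, f for data_frozen, and s for data_content.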
How ILM Moves Indices Between Tiers
Index Lifecycle Management automatically injects a migrate action into the warm and cold phases; you do not need to add it explicitly. When an index enters the warm phase, ILM sets index.routing.allocation.include._tier_preference to data_warm,data_hot. The cold phase sets it to data_cold,data_warm,data_hot. The frozen phase does not use the migrate action: its searchable_snapshot action mounts the index with a tier preference of data_frozen.
The fallback list matters. If no data_warm nodes exist, an index in the warm phase falls back to data_hot nodes. This keeps the index allocated and healthy, but it also means the index stays on expensive hardware - silently undermining your cost optimization. The frozen phase has no fallback: if no data_frozen nodes exist, the index cannot be allocated.
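The setting ILM applies in the warm phase is equivalent to the following manual update, which you might run yourself for an index that is not under ILM management (my-index is a placeholder name):

```console
PUT my-index/_settings
{
  "index.routing.allocation.include._tier_preference": "data_warm,data_hot"
}
```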
You can disable automatic migration by explicitly adding the migrate action with "enabled": false in your ILM policy. This is useful when you need custom allocation rules via the allocate action, but it also means you take responsibility for setting _tier_preference yourself.
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "warm": {
        "actions": {
          "migrate": { "enabled": false },
          "allocate": {
            "require": { "box_type": "warm" }
          }
        }
      }
    }
  }
}
The _tier_preference Setting and Allocation
The _tier_preference setting is an ordered list of tier names. Elasticsearch tries the first tier in the list. If no nodes with that role exist, it tries the next. This differs from the older _tier attribute in that it is a preference chain, not a hard requirement.
When you create a data stream, Elasticsearch automatically sets _tier_preference on its backing indices to data_hot. Regular indices created outside of data streams default to data_content, which routes them to content-tier nodes rather than tying them to a time-series tier.
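You can also pin a tier preference explicitly at index creation time. A sketch, using a hypothetical index name:

```console
PUT my-lookup-index
{
  "settings": {
    "index.routing.allocation.include._tier_preference": "data_content"
  }
}
```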
You can inspect an index's current tier preference directly:
GET my-index/_settings/index.routing.allocation.include._tier_preference
If an index is stuck on the wrong tier, check whether a conflicting index.routing.allocation.require or index.routing.allocation.include setting is overriding the tier preference. Legacy allocation filters take precedence and can prevent tier-based routing entirely.
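Setting a legacy filter to null removes it, letting the tier preference take effect again. For example, assuming a leftover box_type attribute filter:

```console
PUT my-index/_settings
{
  "index.routing.allocation.require.box_type": null
}
```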
Common Issues and Troubleshooting
The most frequent problem is indices stuck in an UNASSIGNED state because no nodes with the required tier role exist. This happens when you define a cold or frozen phase in your ILM policy but never deploy nodes with data_cold or data_frozen roles. The cluster allocation explain API shows the root cause:
GET _cluster/allocation/explain
{
  "index": "my-index-000001",
  "shard": 0,
  "primary": true
}
The response will include a message like "node does not match index setting [index.routing.allocation.include] filters [_tier_preference]". The fix is either deploying nodes with the required role or adjusting the ILM policy to remove the phase that requires the missing tier.
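As a stopgap while you provision the missing nodes, you can widen a stuck index's tier preference by hand so it falls back to a tier that does exist - shown here for a hypothetical index waiting on data_cold:

```console
PUT my-index-000001/_settings
{
  "index.routing.allocation.include._tier_preference": "data_cold,data_warm,data_hot"
}
```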
Yellow cluster health from tier mismatches is another common scenario. If your warm tier has fewer nodes than the replica count of an index transitioning to warm, replicas will remain unassigned. The primary shards allocate fine, but the cluster stays yellow. Either add warm-tier nodes or reduce the replica count via the ILM allocate action before migration.
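The allocate action can drop the replica count as part of the warm transition. A sketch of such a policy, with an example min_age of 7d:

```console
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 1
          }
        }
      }
    }
  }
}
```

Because this allocate action only changes the replica count, ILM still injects the migrate action automatically.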
Indices piling up on the hot tier is a subtler problem. If ILM execution is paused, or the policy's min_age thresholds are too conservative, data never migrates. Check ILM status with GET _ilm/status and review per-index state with GET my-index/_ilm/explain.
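If ILM turns out to be stopped, restarting it and retrying failed steps usually gets migration moving again:

```console
GET _ilm/status
POST _ilm/start
POST my-index/_ilm/retry
```

Note that _ilm/retry only applies to indices whose lifecycle step is in the ERROR state.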
Legacy Node Attributes, Searchable Snapshots, and Sizing
Before Elasticsearch 7.10, the standard approach was custom node attributes. You would set node.attr.box_type: hot in elasticsearch.yml and use index.routing.allocation.require.box_type: warm in ILM policies. This still works, but it has drawbacks: ILM cannot automatically inject migration actions when custom attributes are in use, you must manually configure allocation rules for every phase, and there is no fallback chain - if a required attribute value does not exist, the index stays unassigned with no graceful degradation.
The POST _ilm/migrate_to_data_tiers API converts existing ILM policies, index templates, and indices from custom attribute-based allocation to tier-based allocation. Run it once during migration and verify the results. Mixed configurations - some indices using attributes, others using tiers - create confusion and should be avoided.
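A sketch of the migration call, assuming the legacy attribute was named box_type and an obsolete legacy template called my-legacy-template should be deleted along the way:

```console
POST _ilm/migrate_to_data_tiers
{
  "node_attribute": "box_type",
  "legacy_template_to_delete": "my-legacy-template"
}
```

ILM must be stopped (POST _ilm/stop) before running this API.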
The cold and frozen tiers interact directly with searchable snapshots. In the cold tier, ILM creates a fully mounted searchable snapshot: all data is cached locally, and the snapshot repository serves as a backup. Search performance is comparable to a regular index. In the frozen tier, ILM creates a partially mounted searchable snapshot: only recently accessed data lives in a shared local cache, with the rest fetched from the snapshot repository on demand.
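ILM performs the mounting automatically, but the underlying API call looks roughly like this (repository, snapshot, and index names are placeholders):

```console
POST _snapshot/my-repo/my-snapshot/_mount?storage=shared_cache
{
  "index": "my-index"
}
```

The storage parameter selects the mode: full_copy produces the fully mounted index used by the cold tier, while shared_cache produces the partially mounted index used by the frozen tier.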
Sizing differs sharply across tiers. Hot nodes need fast storage (NVMe or SSD) sized for your active write and search workload - plan for the primary data size multiplied by one plus the replica count, with headroom for merges. Warm nodes can use larger, slower disks since there is no write pressure. Cold nodes need enough local storage to fully cache the mounted snapshots. Frozen nodes are the most storage-efficient: the shared cache is typically 10-20% of the total data size, with the rest offloaded to object storage like S3, GCS, or Azure Blob Storage. Size the frozen cache based on your search working set, not total data volume.
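The frozen tier's cache is configured per node in elasticsearch.yml. A minimal sketch, assuming you dedicate 90% of the local disk to the cache:

```yaml
# Frozen node: reserve most of the local disk for the snapshot shared cache
node.roles: [data_frozen]
xpack.searchable.snapshot.shared_cache.size: 90%
```

The size can be given as an absolute value or as a percentage of total disk space.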