Elasticsearch index.translog.durability Setting

The index.translog.durability setting in Elasticsearch controls when the translog is fsynced to disk: after every write request, or periodically in the background. It directly affects both the durability guarantees of write operations and indexing performance.

Default value: request
Possible values: request, async
Recommendations: Use request for maximum durability. Use async for better indexing performance when you can tolerate losing recently acknowledged writes after a hardware failure.

The translog is a write-ahead log that records operations not yet persisted in a Lucene commit, so that they can be replayed when a shard recovers after a crash. When set to request, Elasticsearch reports success for an index, delete, update, or bulk request only after the translog has been fsynced on the primary and on every allocated replica. This provides the highest level of durability but can reduce indexing throughput. When set to async, Elasticsearch fsyncs the translog in the background every index.translog.sync_interval (default: 5s), which can improve indexing throughput, but writes acknowledged since the last fsync can be lost if a node fails suddenly.

Example

To change the translog durability setting for an index:

PUT /my_index/_settings
{
  "index.translog.durability": "async"
}

This change can make sense when indexing throughput matters more than the durability of the most recent writes, such as during bulk ingestion of historical data where the source data remains available for re-indexing.
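
If async was enabled only for the duration of a bulk load, a reasonable follow-up is to switch the index back once ingestion finishes. The request below is a minimal sketch of that step, assuming the same my_index as in the example above:

```console
PUT /my_index/_settings
{
  "index.translog.durability": "request"
}
```

Because request is the default, setting the value to null should have the same effect by removing the override.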

Common Issues or Misuses

Setting index.translog.durability to async without understanding the potential data loss implications
Overusing request durability in write-heavy scenarios, leading to unnecessary performance bottlenecks
Failing to adjust related settings like index.translog.sync_interval when changing durability to async
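
One way to avoid these mistakes is to check which translog settings are actually in effect before and after a change. The request below is a sketch that assumes an index named my_index. It uses the get index settings API with a setting-name filter, and include_defaults=true so that values you have not overridden are also returned:

```console
GET /my_index/_settings/index.translog.*?include_defaults=true&flat_settings=true
```

Explicitly configured values should appear under "settings" and the rest under "defaults" in the response.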

Do's and Don'ts

Do use request durability for critical data where losing even a single write is unacceptable
Do consider async durability for bulk indexing jobs or scenarios where slight data loss is tolerable
Don't change this setting without thoroughly understanding its implications on data consistency and performance
Don't forget to test the impact of changing this setting in a non-production environment first
Do monitor your cluster's performance and adjust the setting as needed based on your specific use case

Frequently Asked Questions

Q: How does changing index.translog.durability to async affect write performance?
A: Setting index.translog.durability to async can improve write throughput because the translog is fsynced once per sync interval instead of once per request. How much it helps depends on the workload and the storage. The cost is that writes acknowledged since the last fsync (up to about 5 seconds' worth by default) can be lost if a node fails suddenly.

Q: Can I change the index.translog.durability setting on a per-index basis?
A: Yes, you can configure this setting at the index level, allowing you to have different durability settings for different indices based on their specific requirements.
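
As an illustration, the update settings API accepts wildcard and comma-separated index names, so one request can change the setting for a group of indices while others keep the default. The logs-* pattern below is a hypothetical example, not a name taken from any real cluster:

```console
PUT /logs-*/_settings
{
  "index.translog.durability": "async"
}
```

Indices that do not match the pattern, for example one holding critical transactional data, would continue to use request durability.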

Q: What happens if a node crashes when index.translog.durability is set to async?
A: If a node crashes with async durability, you might lose acknowledged writes that had not yet been fsynced to disk. The loss window is roughly bounded by index.translog.sync_interval (5s by default), and the exact amount depends on when the last fsync occurred relative to the crash.

Q: Is there a way to achieve a balance between performance and durability?
A: Yes, you can set index.translog.durability to async and adjust index.translog.sync_interval to find a balance that suits your needs. For example, a shorter sync interval narrows the potential data loss window while still avoiding an fsync on every request. Values below 100ms are not allowed.
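
A minimal sketch of such a trade-off is shown below. The 1s value is only an illustrative assumption, and the request assumes a version in which index.translog.sync_interval can be updated on an existing index. If the update is rejected, the same two settings can be supplied when the index is created instead.

```console
PUT /my_index/_settings
{
  "index.translog.durability": "async",
  "index.translog.sync_interval": "1s"
}
```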

Q: How does index.translog.durability interact with replica shards?
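A: The setting applies to every shard copy, not only the primary. Writes are replicated synchronously: the primary forwards each operation to the in-sync replicas and acknowledges the client only after they respond. With request durability, the translog has been fsynced on the primary and on every allocated replica by that point. With async durability, each copy fsyncs on its own schedule, so replicas still add protection, because an acknowledged write is typically lost only if every copy holding it fails before its next fsync. For maximum data safety, use request durability.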