The data_streams.lifecycle.retention.default setting in Elasticsearch controls the default retention period for data streams managed by data stream lifecycle. It determines how long data is kept in such a data stream before it becomes eligible for automatic deletion.
- Default value: None (no default retention period)
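- Possible values: Elasticsearch time values (e.g., "30d" for 30 days, "12h" for 12 hours). There is no month unit: "m" means minutes, so "6m" is 6 minutes. Use a day count such as "180d" for roughly six months.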
- Recommendations: Set this value based on your data retention requirements and storage capacity
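This setting provides a cluster-wide default retention period for data streams managed by data stream lifecycle. When set, it applies to any such data stream that does not have data_retention configured in its own lifecycle. Data streams managed by index lifecycle management (ILM) are not affected; their retention is governed by the delete phase of the ILM policy.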
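This setting belongs to data stream lifecycle, which was introduced in the Elasticsearch 8.x series, so it is not available in 7.x releases. Check the settings reference for your version before relying on it.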
Example
To set the default retention period for all data streams to 90 days:
PUT _cluster/settings
{
  "persistent": {
    "data_streams.lifecycle.retention.default": "90d"
  }
}
Changing this setting can help manage storage costs and comply with data retention policies. It ensures that old data is automatically removed from your data streams after the specified period.
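To confirm what the cluster currently has configured, or to go back to having no default, requests along these lines may help. This is a sketch rather than a verified procedure for any particular version; flat_settings=true only changes how the response keys are displayed.

```console
# Show explicitly configured cluster settings with flattened keys
GET _cluster/settings?flat_settings=true

# Remove the persistent value so that no default retention applies
PUT _cluster/settings
{
  "persistent": {
    "data_streams.lifecycle.retention.default": null
  }
}
```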
Common Issues and Misuses
- Setting too short a retention period may result in premature data loss
- Setting too long a retention period can lead to excessive storage consumption
- Forgetting to consider this setting when planning storage capacity
Do's and Don'ts
Do's:
- Align the retention period with your business and compliance requirements
- Monitor storage usage regularly to ensure the retention period is appropriate
- Set data_retention on individual data streams when they need a retention period different from the cluster-wide default
Don'ts:
- Don't set a retention period shorter than the time range your queries or analytics jobs need to cover
- Don't rely solely on this setting for critical data retention policies; use it in conjunction with other data management practices
- Don't forget to adjust this setting when your data retention requirements change
Frequently Asked Questions
Q: How does this setting interact with index lifecycle policies?
A: The setting applies only to data streams managed by data stream lifecycle that have no data_retention configured. A data stream whose own lifecycle sets data_retention uses that value instead of the default. Data streams managed by an index lifecycle management (ILM) policy are not affected by this setting; their retention is governed by the policy's delete phase.
Q: Can I change the retention period for an existing data stream?
A: Yes. For a data stream managed by data stream lifecycle, update data_retention in the stream's lifecycle configuration. The cluster-wide default applies only to data streams that have no data_retention of their own, whether they are new or already exist.
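As a sketch of such a change, the request below sets a 7-day retention on one data stream through the data stream lifecycle API. The name my-data-stream and the 7d value are placeholders, and the request assumes the stream is managed by data stream lifecycle rather than ILM.

```console
PUT _data_stream/my-data-stream/_lifecycle
{
  "data_retention": "7d"
}
```

Once data_retention is set on the stream, the cluster-wide default should no longer be the value that applies to it.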
Q: What happens to the data when the retention period is reached?
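A: When a backing index is older than the retention period, Elasticsearch automatically deletes it, removing that data from the stream. Deletion happens per backing index rather than per document, so some data may remain slightly longer than the configured period.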
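To see where backing indices stand in their lifecycle, the data stream lifecycle explain API can be queried. A minimal example, assuming a data stream named my-data-stream (a placeholder) whose backing indices follow the usual .ds- naming:

```console
GET .ds-my-data-stream-*/_lifecycle/explain
```

The response describes each backing index, including whether it is managed by data stream lifecycle and how old it is. The exact fields vary by version.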
Q: Does this setting affect all indices or only data streams?
A: This setting only affects data streams. Regular indices are not impacted by this setting and should be managed using index lifecycle policies or other methods.
Q: How can I monitor the effects of this setting on my cluster?
A: You can use Elasticsearch's monitoring features, such as the Index Management UI in Kibana, to track the lifecycle of your data streams and observe when indices are deleted due to the retention policy.
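Alongside Kibana, the lifecycle configuration of a data stream can be read directly over the API. The stream name below is a placeholder. On recent versions the response may also report the effective retention and where it comes from, but treat that as an assumption to verify for your release.

```console
GET _data_stream/my-data-stream/_lifecycle
```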