The Moving Function Aggregation in Elasticsearch performs calculations over a sliding window of bucket values and is typically used in time series analysis. It runs a script against a window of a specified size (by default, the buckets immediately preceding the current one), enabling calculations such as moving averages, moving sums, or any custom logic you define.
Syntax and Documentation
The basic syntax for a Moving Function Aggregation is:
{
  "moving_fn": {
    "buckets_path": "the_sum",
    "window": 10,
    "script": "MovingFunctions.sum(values)"
  }
}
For detailed information and advanced usage, refer to the official Elasticsearch documentation on Moving Function Aggregation.
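To make the sliding window concrete, here is a minimal Python sketch that emulates what the snippet above computes, assuming the default shift of 0 (the window holds the buckets before the current one and excludes the current bucket). The function name moving_sum is hypothetical and this is only an illustration of the semantics, not Elasticsearch code.

```python
def moving_sum(values, window):
    """Emulate moving_fn with MovingFunctions.sum(values) and shift = 0."""
    results = []
    for i in range(len(values)):
        # The window holds up to `window` buckets before bucket i,
        # excluding bucket i itself; it is shorter near the start.
        start = max(0, i - window)
        results.append(float(sum(values[start:i])))
    return results


bucket_sums = [10, 20, 30, 40, 50]
print(moving_sum(bucket_sums, window=2))  # [0.0, 10.0, 30.0, 50.0, 70.0]
```

The first bucket has nothing before it, so its window is empty and the sum is 0.0.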
Example Usage
Here's an example that calculates a 7-day moving average of daily sales. Because moving_fn is a parent pipeline aggregation, it must be nested inside the date_histogram, and its buckets_path names the sibling daily_sales metric:
{
  "size": 0,
  "aggs": {
    "sales_per_day": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "day"
      },
      "aggs": {
        "daily_sales": {
          "sum": { "field": "sales" }
        },
        "weekly_moving_avg": {
          "moving_fn": {
            "buckets_path": "daily_sales",
            "window": 7,
            "script": "MovingFunctions.unweightedAvg(values)"
          }
        }
      }
    }
  }
}
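If you build the request in application code, a small helper keeps the nesting right. The sketch below assumes moving_fn is placed inside the date_histogram next to the metric it reads, which is what parent pipeline aggregations require; the helper name moving_avg_request is hypothetical, and sending the body to a cluster is left out.

```python
import json


def moving_avg_request(date_field, value_field, window):
    """Build a search body with a moving average over daily sums."""
    return {
        "size": 0,  # return aggregation results only, no hits
        "aggs": {
            "sales_per_day": {
                "date_histogram": {
                    "field": date_field,
                    "calendar_interval": "day",
                },
                "aggs": {
                    "daily_sales": {"sum": {"field": value_field}},
                    # Nested next to the metric it reads, so buckets_path
                    # is simply the sibling metric's name.
                    "weekly_moving_avg": {
                        "moving_fn": {
                            "buckets_path": "daily_sales",
                            "window": window,
                            "script": "MovingFunctions.unweightedAvg(values)",
                        }
                    },
                },
            }
        },
    }


body = moving_avg_request("date", "sales", window=7)
print(json.dumps(body, indent=2))
```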
Common Issues
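- Insufficient data: The window does not have to be full. Near the start of the series it shrinks to the buckets available, and the first bucket gets an empty window, for which MovingFunctions.sum() returns 0.0 and MovingFunctions.unweightedAvg() returns NaN. Expect less reliable values until a full window is available.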
- Script errors: Double-check your custom scripts for syntax errors or logical issues.
- Performance impact: Large window sizes can affect query performance, especially on large datasets.
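The partial-window behavior is easy to check outside Elasticsearch. The following sketch mirrors the documented handling of empty windows for the sum and unweighted average functions; the Python names mf_sum, mf_unweighted_avg, and trailing_windows are hypothetical stand-ins, not part of any Elasticsearch client.

```python
import math


def _clean(values):
    # null and NaN entries are ignored by the built-in functions
    return [v for v in values if v is not None and not math.isnan(v)]


def mf_sum(values):
    """Stand-in for MovingFunctions.sum(): empty window gives 0.0."""
    return float(sum(_clean(values)))


def mf_unweighted_avg(values):
    """Stand-in for MovingFunctions.unweightedAvg(): empty window gives NaN."""
    clean = _clean(values)
    return sum(clean) / len(clean) if clean else math.nan


def trailing_windows(values, window):
    """Windows of preceding buckets (shift = 0), shorter near the start."""
    return [values[max(0, i - window):i] for i in range(len(values))]


series = [4.0, 8.0, 6.0]  # only three buckets, but window = 7
wins = trailing_windows(series, 7)
sums = [mf_sum(w) for w in wins]
avgs = [mf_unweighted_avg(w) for w in wins]
print(sums)  # [0.0, 4.0, 12.0]
print(avgs)  # [nan, 4.0, 6.0]
```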
Best Practices
- Start with built-in functions like MovingFunctions.sum() or MovingFunctions.unweightedAvg() before writing custom scripts.
- Use appropriate window sizes based on your data and analysis requirements.
- Consider using the shift parameter to offset the window. With the default shift of 0 the window covers the buckets before the current one and excludes it; a shift of 1 includes the current bucket, and larger values move the window toward later buckets.
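As an illustration of how shift moves the window, the sketch below lists the values offered to the script for each bucket. It assumes the window for bucket i spans indices i - window + shift up to, but not including, i + shift, clipped to the series; that formula is my reading of the documented behavior, and the function name windows is hypothetical.

```python
def windows(values, window, shift=0):
    """Return the slice of values offered to the script for each bucket."""
    n = len(values)
    out = []
    for i in range(n):
        # Clip both edges so the window shrinks at the series borders.
        start = min(max(i - window + shift, 0), n)
        end = min(max(i + shift, 0), n)
        out.append(values[start:end])
    return out


series = [1, 2, 3, 4, 5]
print(windows(series, window=2, shift=0))  # [[], [1], [1, 2], [2, 3], [3, 4]]
print(windows(series, window=2, shift=1))  # [[1], [1, 2], [2, 3], [3, 4], [4, 5]]
```

With shift set to 1, each window ends on the current bucket instead of just before it.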
Frequently Asked Questions
Q: Can I use Moving Function Aggregation with non-numeric data?
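A: Not directly. Moving Function Aggregation works on numeric values: buckets_path must point to a numeric metric, and the script receives the window as an array of doubles. To analyze non-numeric data, first reduce it to a number per bucket, for example with a value_count or cardinality aggregation, and apply the moving function to that metric.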
Q: How does the window size affect the results?
A: The window size determines how many data points are considered in each calculation. A larger window will smooth out short-term fluctuations but may be less responsive to recent changes. A smaller window will be more sensitive to recent data but may be more volatile.
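The trade-off can be seen with a small made-up series. The sketch below compares trailing averages with a window of 2 and a window of 6, looking only at buckets where both windows are full; the data and the helper name trailing_avg are illustrative assumptions.

```python
from statistics import pstdev


def trailing_avg(values, window, start):
    """Unweighted average of the `window` values before each bucket."""
    out = []
    for i in range(start, len(values)):
        chunk = values[i - window:i]
        out.append(sum(chunk) / len(chunk))
    return out


noisy = [10, 14, 9, 15, 8, 16, 9, 15, 10, 14]
first_full = 6  # from this index on, both windows are completely filled

small = trailing_avg(noisy, window=2, start=first_full)
large = trailing_avg(noisy, window=6, start=first_full)

# A lower spread means a smoother, less reactive curve.
print("window=2 spread:", pstdev(small))
print("window=6 spread:", pstdev(large))
```

On this series the window of 6 gives the smaller spread, which matches the smoothing effect described above.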
Q: Can I combine Moving Function Aggregation with other aggregations?
A: Yes. Moving Function Aggregation must be nested inside a histogram or date_histogram aggregation, and its buckets_path can reference any numeric metric computed in the same buckets, which makes it a natural fit for time-series analysis.
Q: What's the difference between Moving Function and Moving Average aggregations?
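A: The Moving Average aggregation (moving_avg) only computed averages over a window of data points, using one of several built-in weighting models. It was deprecated in favor of Moving Function and removed in Elasticsearch 8.0. Moving Function is more flexible: it offers built-in helpers for common averages and lets you apply any custom calculation to the window of data.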
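As an example of a calculation that an average cannot express, the sketch below defines a moving range (largest minus smallest value in the window) by combining two built-in helpers in the script. The aggregation name moving_range and the daily_sales path are hypothetical; the Python function only emulates what the script would return for one window.

```python
import json
import math

# Hypothetical aggregation to nest inside a histogram or date_histogram
# that also defines a numeric metric named "daily_sales".
moving_range_agg = {
    "moving_range": {
        "moving_fn": {
            "buckets_path": "daily_sales",
            "window": 7,
            "script": "MovingFunctions.max(values) - MovingFunctions.min(values)",
        }
    }
}


def window_range(values):
    """Emulate the script for a single window of values."""
    if not values:
        # max() and min() both give NaN for an empty window
        return math.nan
    return float(max(values) - min(values))


print(json.dumps(moving_range_agg, indent=2))
print(window_range([3, 9, 5]))  # 6.0
```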
Q: How can I handle missing data points in my Moving Function Aggregation?
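A: Use the gap_policy parameter to specify how gaps in your data are handled. The default, "skip", ignores buckets with missing values; "insert_zeros" treats missing values as zero; and "keep_values", available in recent versions, behaves like "skip" except that a bucket's metric value is used whenever it is non-null and not NaN.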
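The effect of the policy on a result can be sketched in a few lines. The example below is a simplified emulation, assuming a missing bucket value is represented as None and that one window covers all four buckets; apply_gap_policy is a hypothetical helper, not an Elasticsearch API.

```python
def apply_gap_policy(bucket_values, gap_policy="skip"):
    """Return the values a window would contain under the given policy."""
    if gap_policy == "skip":
        # Buckets without a value are left out of the window entirely.
        return [v for v in bucket_values if v is not None]
    if gap_policy == "insert_zeros":
        # Buckets without a value are counted as 0.0.
        return [0.0 if v is None else v for v in bucket_values]
    raise ValueError(f"unsupported gap_policy in this sketch: {gap_policy}")


def average(values):
    return sum(values) / len(values)


daily = [10.0, None, 20.0, 30.0]  # the second day has no data
print(average(apply_gap_policy(daily, "skip")))          # 20.0
print(average(apply_gap_policy(daily, "insert_zeros")))  # 15.0
```

Inserting zeros pulls averages down, so "skip" is usually the safer choice when a gap means "no data" rather than "zero".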