Faceted Search: Implementation Guide for Elasticsearch and OpenSearch

Faceted search lets users filter search results by structured attributes — brand, price range, size, color, rating — while showing how many results match each filter value. It's the navigation pattern behind virtually every e-commerce site, job board, and content catalog.

In Elasticsearch and OpenSearch, faceted search is built on aggregations, combined with post_filter so that facet counts stay accurate while user-selected filters are applied to the hits.

The Core Pattern

A faceted search query needs to do two things simultaneously:

Return filtered results matching the user's selected facets
Return facet counts showing how many results match each possible facet value — including values the user hasn't selected yet

This creates a tension: if you filter results by color:red, the color facet should still show counts for blue, green, and black (so the user can change their selection). But the price and brand facets should reflect the filtered results (only red items).

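The solution: apply the user's active facet selections in post_filter, which narrows the hits only after aggregations have run, and compute facet counts with aggs, wrapping each facet in a filter aggregation that applies the selections made in every other facet.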

Basic Implementation

Index Setup

PUT /products
{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "brand": { "type": "keyword" },
      "color": { "type": "keyword" },
      "size": { "type": "keyword" },
      "price": { "type": "float" },
      "rating": { "type": "float" },
      "category": { "type": "keyword" },
      "in_stock": { "type": "boolean" }
    }
  }
}

Terms facets should use keyword fields, and range facets need numeric (or date) fields. Text fields are not aggregatable by default, so aggregate on a keyword sub-field rather than on the text field itself.
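As a rough illustration of this rule, the Python sketch below picks the field path to aggregate on from a mapping's properties. The helper name facet_field, the set of types it treats as aggregatable, and the description field with its raw sub-field are assumptions made for this example; they are not part of the products mapping above.

```python
from typing import Any, Dict

# Field types assumed here to be usable for terms or range facets.
AGGREGATABLE_TYPES = {"keyword", "boolean", "integer", "long", "float", "double", "date"}


def facet_field(properties: Dict[str, Any], name: str) -> str:
    """Return the field path to use in an aggregation for `name`."""
    spec = properties[name]
    if spec.get("type") in AGGREGATABLE_TYPES:
        return name
    # Text fields: look for a keyword sub-field declared under "fields".
    for sub_name, sub_spec in spec.get("fields", {}).items():
        if sub_spec.get("type") == "keyword":
            return f"{name}.{sub_name}"
    raise ValueError(f"{name!r} has no aggregatable representation")


properties = {
    "brand": {"type": "keyword"},
    "price": {"type": "float"},
    # Hypothetical text field with a keyword sub-field named "raw".
    "description": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
    "name": {"type": "text"},
}

print(facet_field(properties, "brand"))        # brand
print(facet_field(properties, "description"))  # description.raw
```

Calling facet_field(properties, "name") raises ValueError, which is a cheap way to catch a facet configured against a plain text field before the query is ever sent.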

Query with Facets (No Filters Applied)

When no facets are selected, return all results with facet counts:

POST /products/_search
{
  "query": { "match": { "name": "running shoes" } },
  "aggs": {
    "brands": {
      "terms": { "field": "brand", "size": 20 }
    },
    "colors": {
      "terms": { "field": "color", "size": 20 }
    },
    "sizes": {
      "terms": { "field": "size", "size": 20 }
    },
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 50, "key": "Under $50" },
          { "from": 50, "to": 100, "key": "$50-$100" },
          { "from": 100, "to": 200, "key": "$100-$200" },
          { "from": 200, "key": "$200+" }
        ]
      }
    },
    "rating_ranges": {
      "range": {
        "field": "rating",
        "ranges": [
          { "from": 4, "key": "4+ stars" },
          { "from": 3, "key": "3+ stars" }
        ]
      }
    }
  }
}
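On the application side, the aggregations section of the response has to be turned into something a UI can render. The sketch below is one possible way to do that in Python; extract_facets is a hypothetical helper, and sample_response is hand-written in the general shape of a terms/range aggregation response rather than captured from a real cluster.

```python
from typing import Any, Dict, List, Tuple


def extract_facets(response: Dict[str, Any]) -> Dict[str, List[Tuple[str, int]]]:
    """Map each top-level bucket aggregation to [(bucket_key, doc_count), ...]."""
    facets: Dict[str, List[Tuple[str, int]]] = {}
    for name, agg in response.get("aggregations", {}).items():
        buckets = agg.get("buckets", [])
        facets[name] = [(str(b["key"]), b["doc_count"]) for b in buckets]
    return facets


# Hand-written sample; counts are made up for illustration.
sample_response = {
    "aggregations": {
        "brands": {
            "buckets": [
                {"key": "Nike", "doc_count": 42},
                {"key": "Adidas", "doc_count": 31},
            ]
        },
        "price_ranges": {
            "buckets": [
                {"key": "Under $50", "to": 50.0, "doc_count": 12},
                {"key": "$50-$100", "from": 50.0, "to": 100.0, "doc_count": 40},
            ]
        },
    }
}

facets = extract_facets(sample_response)
for name, values in facets.items():
    print(name, values)
```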

Query with Active Facet Filters

When the user selects color: red and brand: Nike, you need:

Results: only red Nike running shoes
Color facet: counts for all colors (among Nike running shoes)
Brand facet: counts for all brands (among red running shoes)
Price/rating facets: counts filtered to red Nike running shoes

This requires the post_filter + filtered aggregations pattern:

POST /products/_search
{
  "query": {
    "match": { "name": "running shoes" }
  },
  "post_filter": {
    "bool": {
      "filter": [
        { "term": { "color": "red" } },
        { "term": { "brand": "Nike" } }
      ]
    }
  },
  "aggs": {
    "all_colors": {
      "aggs": {
        "colors": {
          "terms": { "field": "color", "size": 20 }
        }
      },
      "filter": {
        "bool": {
          "filter": [
            { "term": { "brand": "Nike" } }
          ]
        }
      }
    },
    "all_brands": {
      "aggs": {
        "brands": {
          "terms": { "field": "brand", "size": 20 }
        }
      },
      "filter": {
        "bool": {
          "filter": [
            { "term": { "color": "red" } }
          ]
        }
      }
    },
    "filtered_price_ranges": {
      "aggs": {
        "price_ranges": {
          "range": {
            "field": "price",
            "ranges": [
              { "to": 50, "key": "Under $50" },
              { "from": 50, "to": 100, "key": "$50-$100" },
              { "from": 100, "to": 200, "key": "$100-$200" },
              { "from": 200, "key": "$200+" }
            ]
          }
        }
      },
      "filter": {
        "bool": {
          "filter": [
            { "term": { "color": "red" } },
            { "term": { "brand": "Nike" } }
          ]
        }
      }
    }
  }
}

The pattern: each facet's aggregation is wrapped in a filter that includes all active filters except the one for that facet. This ensures the color facet shows counts unaffected by the color filter (so the user can see other color options), while still reflecting the brand filter.
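Writing these wrappers by hand gets error-prone as the number of facets grows, so the request body is usually generated. The following Python sketch shows one way to do that for single-value selections on terms facets. FACET_FIELDS, term_filters, and build_faceted_request are hypothetical names, and the code only builds the request body; sending it to the cluster is left to whatever client you use.

```python
from typing import Any, Dict, List, Optional

# Assumed mapping from facet name to index field (keyword fields from the mapping above).
FACET_FIELDS = {"color": "color", "brand": "brand", "size": "size"}


def term_filters(selected: Dict[str, str], skip: Optional[str] = None) -> List[Dict[str, Any]]:
    """Term clauses for every selected facet except `skip`."""
    return [
        {"term": {FACET_FIELDS[facet]: value}}
        for facet, value in selected.items()
        if facet != skip
    ]


def build_faceted_request(text: str, selected: Dict[str, str]) -> Dict[str, Any]:
    aggs: Dict[str, Any] = {}
    for facet, field in FACET_FIELDS.items():
        # Apply all active selections except this facet's own.
        others = term_filters(selected, skip=facet)
        facet_filter = {"bool": {"filter": others}} if others else {"match_all": {}}
        aggs[f"all_{facet}"] = {
            "filter": facet_filter,
            "aggs": {facet: {"terms": {"field": field, "size": 20}}},
        }
    body: Dict[str, Any] = {"query": {"match": {"name": text}}, "aggs": aggs}
    if selected:
        # The hits themselves are narrowed by every active selection.
        body["post_filter"] = {"bool": {"filter": term_filters(selected)}}
    return body


request = build_faceted_request("running shoes", {"color": "red", "brand": "Nike"})
```

With color: red and brand: Nike selected, the all_color wrapper carries only the brand clause, all_brand carries only the color clause, and all_size carries both, which matches the hand-written request above.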

Multi-Select Facets

When users can select multiple values within a facet (e.g., color: red OR blue), use a terms filter:

"post_filter": {
  "bool": {
    "filter": [
      { "terms": { "color": ["red", "blue"] } },
      { "term": { "brand": "Nike" } }
    ]
  }
}

The aggregation filter for the color facet still excludes the color filter entirely, so users see accurate counts for all colors.
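A small Python sketch of how the single-value and multi-value cases might be handled together: a list of values becomes a terms clause, a single value becomes a term clause, and a facet's own selection is left out of the filter that wraps its aggregation. The names selection_clause and filters_excluding are hypothetical.

```python
from typing import Any, Dict, List, Optional, Union

Selection = Dict[str, Union[str, List[str]]]


def selection_clause(field: str, value: Union[str, List[str]]) -> Dict[str, Any]:
    """Use `terms` for multi-select values and `term` for a single value."""
    if isinstance(value, list):
        return {"terms": {field: value}}
    return {"term": {field: value}}


def filters_excluding(selected: Selection, own_field: Optional[str] = None) -> List[Dict[str, Any]]:
    """All active clauses, optionally leaving out one facet's own selection."""
    return [
        selection_clause(field, value)
        for field, value in selected.items()
        if field != own_field
    ]


selected: Selection = {"color": ["red", "blue"], "brand": "Nike"}

# Hits: every active selection applies.
post_filter = {"bool": {"filter": filters_excluding(selected)}}

# Color facet: only the brand selection applies, so all colors keep their counts.
color_facet_filter = {"bool": {"filter": filters_excluding(selected, own_field="color")}}
```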

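Hierarchical Facets

For category hierarchies (e.g., Clothing > Shoes > Running Shoes), store each level in its own field. These fields should be mapped as keyword; category_l1, category_l2, and category_l3 are not in the mapping shown earlier, and if left to dynamic mapping they would be indexed as text and could not be aggregated directly.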

POST /products/_doc/1
{
  "name": "Nike Air Zoom",
  "category_l1": "Clothing",
  "category_l2": "Shoes",
  "category_l3": "Running Shoes"
}

Build aggregations at each level:

"aggs": {
  "categories_l1": {
    "terms": { "field": "category_l1", "size": 20 },
    "aggs": {
      "categories_l2": {
        "terms": { "field": "category_l2", "size": 20 },
        "aggs": {
          "categories_l3": {
            "terms": { "field": "category_l3", "size": 20 }
          }
        }
      }
    }
  }
}

This returns nested counts: Clothing (500) → Shoes (200) → Running Shoes (75).
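To render a category tree, the nested buckets usually need to be walked recursively. The Python sketch below does this for the three aggregation names used above; flatten_category_buckets is a hypothetical helper, and sample_aggs is a hand-written stand-in for the aggregations section of a response, reusing the counts from the example.

```python
from typing import Any, Dict, Iterator, List, Tuple

LEVELS = ["categories_l1", "categories_l2", "categories_l3"]


def flatten_category_buckets(
    container: Dict[str, Any],
    levels: List[str],
    prefix: Tuple[str, ...] = (),
) -> Iterator[Tuple[Tuple[str, ...], int]]:
    """Yield (category_path, doc_count) for every bucket, parents before children."""
    name, rest = levels[0], levels[1:]
    for bucket in container.get(name, {}).get("buckets", []):
        path = prefix + (bucket["key"],)
        yield path, bucket["doc_count"]
        if rest:
            # Sub-aggregation results sit inside the parent bucket under their name.
            yield from flatten_category_buckets(bucket, rest, path)


sample_aggs = {
    "categories_l1": {"buckets": [{
        "key": "Clothing", "doc_count": 500,
        "categories_l2": {"buckets": [{
            "key": "Shoes", "doc_count": 200,
            "categories_l3": {"buckets": [
                {"key": "Running Shoes", "doc_count": 75},
            ]},
        }]},
    }]},
}

rows = list(flatten_category_buckets(sample_aggs, LEVELS))
for path, count in rows:
    print(" > ".join(path), count)
```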

Performance Optimization

Faceted search queries are expensive because they combine a search query, multiple aggregations, and post-filtering. Here's how to keep them fast:

1. Use keyword Fields, Not text

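Aggregations on keyword fields read doc values (on-disk, column-oriented storage). Aggregating a text field instead requires fielddata, which is disabled by default and has to be built in heap memory at query time, so it is both much slower and far more memory-hungry.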

2. Limit Aggregation Size

Only request the number of buckets you'll display:

{ "terms": { "field": "brand", "size": 20 } }

Don't use "size": 10000 if you're only showing the top 20 values.

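3. Set an Execution Hint Only When It Helps

{
  "terms": {
    "field": "brand",
    "size": 20,
    "execution_hint": "map"
  }
}

The map execution hint aggregates directly on field values instead of ordinals, and it tends to pay off only when the query matches a small number of documents. In most other cases, and especially for high-cardinality fields (1000+ unique values), the default global_ordinals is usually better, so measure before overriding it.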

4. Cache Aggressively

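Faceted search queries with the same filters are highly cacheable. The shard request cache is enabled by default, but by default it only stores responses for requests with size 0 (aggregations and hit totals, not hits), so facet counts benefit most when they are fetched in a separate size: 0 request. If the cache has been disabled on the index, re-enable it: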

PUT /products/_settings
{
  "index.requests.cache.enable": true
}

5. Reduce Shard Count for Small Indices

Each shard adds aggregation overhead. If your product catalog is under 10 GB, a single shard per index is often optimal for aggregation performance.

6. Pre-Compute Common Facets

For facets that rarely change (top-level categories, brand list), cache the aggregation results in your application layer with a short TTL rather than re-querying Elasticsearch on every request.
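A minimal sketch of such an application-layer cache in Python, assuming an in-process dictionary is acceptable (a shared cache such as Redis would follow the same idea). FacetCache and its methods are hypothetical names; the compute callable stands in for whatever function actually sends the request to the cluster.

```python
import json
import time
from typing import Any, Callable, Dict, Tuple


class FacetCache:
    """Cache aggregation results keyed by request body, with a fixed TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, request_body: Dict[str, Any],
                       compute: Callable[[Dict[str, Any]], Any]) -> Any:
        # sort_keys makes logically equal bodies produce the same cache key.
        key = json.dumps(request_body, sort_keys=True)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute(request_body)
        self._entries[key] = (now, value)
        return value


# Example request body for a facet that rarely changes.
brand_list_request = {
    "size": 0,
    "aggs": {"brands": {"terms": {"field": "brand", "size": 20}}},
}
```

Keeping the TTL short (seconds to a few minutes) bounds how stale a count can be; this sketch never evicts expired entries, so a real deployment would also want a size limit.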

Frequently Asked Questions

Q: Why are my facet counts wrong after applying a filter?

You're likely applying filters in the main query instead of post_filter, which reduces the aggregation scope. Use post_filter for user-selected facets, and wrap each aggregation in a filter that excludes its own facet's filter.

Q: How do I show a "selected" state for active facet values?

Track selected values in your application state. When rendering facets from aggregation results, check if each bucket key exists in the user's active selections and render it as checked/selected.
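In Python, that check might look like the sketch below. render_facet is a hypothetical helper, and the bucket list is hand-written sample data in the shape of a terms aggregation's buckets.

```python
from typing import Any, Dict, Iterable, List


def render_facet(buckets: List[Dict[str, Any]], active: Iterable[str]) -> List[Dict[str, Any]]:
    """Attach a `selected` flag to each bucket based on the user's active values."""
    active_set = set(active)
    return [
        {"value": b["key"], "count": b["doc_count"], "selected": b["key"] in active_set}
        for b in buckets
    ]


color_buckets = [
    {"key": "red", "doc_count": 18},
    {"key": "blue", "doc_count": 11},
    {"key": "black", "doc_count": 7},
]
options = render_facet(color_buckets, active=["red"])
```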

Q: Can I sort facet values by something other than count?

Yes. Use "order" in the terms aggregation:

{ "terms": { "field": "brand", "size": 20, "order": { "_key": "asc" } } }

Sort by _key for alphabetical order, by _count for document count (the default, descending), or by a sub-aggregation metric.
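For the sub-aggregation case, the order key refers to a metric aggregation nested under the terms aggregation. The Python sketch below builds such an aggregation body; the function name and the mean_rating label are hypothetical, while brand and rating are fields from the mapping above.

```python
from typing import Any, Dict


def brand_facet_by_avg_rating(size: int = 20) -> Dict[str, Any]:
    """Terms aggregation on brand, ordered by each brand's average rating."""
    return {
        "terms": {
            "field": "brand",
            "size": size,
            # Refers to the sub-aggregation defined under "aggs" below.
            "order": {"mean_rating": "desc"},
        },
        "aggs": {"mean_rating": {"avg": {"field": "rating"}}},
    }


request_body = {"size": 0, "aggs": {"brands": brand_facet_by_avg_rating()}}
```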

Q: How do I handle facets with thousands of unique values (e.g., brands)?

Show only the top N values with a "Show more" option. Use the include/exclude parameters to filter aggregation buckets server-side, or use a composite aggregation for paginating through all values.
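The composite approach can be sketched in Python as follows. The search argument is an assumed callable that takes a request body and returns the parsed response as a dict (for example a thin wrapper around your client); composite_page_body, iter_all_values, and the all_values and value labels are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Tuple


def composite_page_body(field: str, page_size: int,
                        after_key: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Request body for one page of a composite aggregation over `field`."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [{"value": {"terms": {"field": field}}}],
    }
    if after_key is not None:
        composite["after"] = after_key  # resume after the last bucket of the previous page
    return {"size": 0, "aggs": {"all_values": {"composite": composite}}}


def iter_all_values(search: Callable[[Dict[str, Any]], Dict[str, Any]],
                    field: str, page_size: int = 100) -> Iterator[Tuple[str, int]]:
    """Yield (value, doc_count) for every distinct value, one page at a time."""
    after_key = None
    while True:
        response = search(composite_page_body(field, page_size, after_key))
        agg = response["aggregations"]["all_values"]
        for bucket in agg["buckets"]:
            yield bucket["key"]["value"], bucket["doc_count"]
        after_key = agg.get("after_key")
        if after_key is None or not agg["buckets"]:
            return
```

Composite buckets come back ordered by key rather than by count, so this suits a "browse all brands" view better than a top-N list.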

Q: What's the impact of faceted search on cluster resources?

Aggregations are the most resource-intensive part. Each terms aggregation collects its own set of buckets on every shard, and the coordinating node then merges the per-shard results. With 5 facets across 10 shards, that's 50 per-shard bucket sets to build and merge per query. Monitor heap usage and consider reducing shard count or caching results for high-traffic faceted search pages.