The synonym filter in Elasticsearch transforms query terms or indexed tokens into related terms, enabling search to match on words that mean the same thing. A search for "automobile" can match documents containing "car", "vehicle", or "auto".
Elasticsearch provides two synonym filters: synonym for simple token replacements and synonym_graph for multi-word expressions. This guide covers both, along with reloadable synonyms, scoring behavior, and production patterns.
The Two Synonym Filters
synonym
The synonym filter replaces or expands tokens in a flat token stream. It works correctly for single-word synonyms but produces incorrect token positions for multi-word synonyms (which can break phrase queries).
{
"settings": {
"analysis": {
"filter": {
"my_synonyms": {
"type": "synonym",
"synonyms": [
"laptop, notebook",
"phone, mobile, cellphone"
]
}
},
"analyzer": {
"my_analyzer": {
"tokenizer": "standard",
"filter": ["lowercase", "my_synonyms"]
}
}
}
}
}
synonym_graph (Recommended)
The synonym_graph filter produces a proper token graph with correct positions for multi-word synonyms. This means phrase queries and proximity queries work correctly even when synonyms map between different word counts.
{
"filter": {
"my_synonyms": {
"type": "synonym_graph",
"synonyms": [
"ny, new york",
"ai, artificial intelligence",
"usa, united states of america"
]
}
}
}
Always use synonym_graph for query-time synonyms. The synonym filter exists for backward compatibility and index-time use cases where the graph structure isn't needed.
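To see how the two filters differ in token positions, a rule can be tried ad hoc with the _analyze API, which accepts inline filter definitions. The request below is a minimal sketch: the sample text is made up for illustration, and no index is required.

```json
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    {
      "type": "synonym_graph",
      "synonyms": ["ny, new york"]
    }
  ],
  "text": "flights to NY"
}
```

In the response, compare the position values (and positionLength, where it is reported) of the ny, new, and york tokens. Changing the type to synonym and rerunning the request shows the flat token stream for the same rule.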
Configuration Methods
Inline Synonyms
For small, static lists, define synonyms directly in the index settings:
{
"filter": {
"my_synonyms": {
"type": "synonym_graph",
"synonyms": [
"quick, fast, speedy",
"big, large, huge",
"happy, glad, joyful"
]
}
}
}
File-Based Synonyms
For larger or frequently updated lists, reference an external file:
{
"filter": {
"my_synonyms": {
"type": "synonym_graph",
"synonyms_path": "analysis/synonyms.txt",
"updateable": true
}
}
}
The file path is relative to the Elasticsearch config directory. The file must be present on every node in the cluster.
File format (one rule per line, # for comments):
# Equivalent synonyms
laptop, notebook, portable computer
phone, mobile, cellphone, smartphone
# Explicit mappings (directional)
ny => new york
sf => san francisco
ml => machine learning
Synonyms API (Elasticsearch 8.10+)
Elasticsearch 8.10 introduced the Synonyms API, which manages synonym sets centrally without file distribution:
# Create a synonym set
PUT /_synonyms/my-synonyms
{
"synonyms_set": [
{ "id": "1", "synonyms": "laptop, notebook, portable computer" },
{ "id": "2", "synonyms": "ny => new york" },
{ "id": "3", "synonyms": "phone, mobile, cellphone" }
]
}
# Reference in analyzer
{
"filter": {
"my_synonyms": {
"type": "synonym_graph",
"synonyms_set": "my-synonyms",
"updateable": true
}
}
}
Updates to the synonym set via the API are automatically applied to all search analyzers referencing it — no file distribution or reload needed.
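Individual rules can also be inspected and changed by id, which avoids resending the whole set. The requests below are a sketch against the my-synonyms set created above; the added nyc term is only an illustrative change.

```json
# Inspect the rules currently stored in the set
GET /_synonyms/my-synonyms

# Create or replace a single rule by its id
PUT /_synonyms/my-synonyms/2
{
  "synonyms": "ny, nyc => new york"
}

# Delete a single rule by its id
DELETE /_synonyms/my-synonyms/3
```

Note that PUT /_synonyms/my-synonyms without a rule id replaces the entire set, so rule-level requests are the safer choice for small edits.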
Reloadable Synonyms
For query-time synonym filters with "updateable": true, you can reload synonyms without restarting Elasticsearch:
# After updating the synonyms file on all nodes:
POST /my-index/_reload_search_analyzers
Requirements:
- The synonym filter must have "updateable": true
- The filter must be used in a search_analyzer (query-time), not the index analyzer
- The synonyms file must be updated on all nodes before reloading
This is the recommended approach for production synonym management when not using the Synonyms API.
Index-Time vs. Query-Time Synonyms
Query-Time (Recommended)
Apply synonyms only during search:
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "standard",
"search_analyzer": "my_synonym_analyzer"
}
}
}
}
Advantages:
- Update synonyms without re-indexing
- Smaller index size
- Easier to manage and test
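Putting the pieces together, a query-time setup needs the filter, an analyzer that uses it, and a mapping that references that analyzer as search_analyzer. The following is a minimal sketch that reuses the names from the earlier fragments (products, my_synonyms, my_synonym_analyzer); the inline rules are placeholders for your own list.

```json
PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym_graph",
          "synonyms": [
            "laptop, notebook",
            "phone, mobile, cellphone"
          ]
        }
      },
      "analyzer": {
        "my_synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard",
        "search_analyzer": "my_synonym_analyzer"
      }
    }
  }
}
```

Documents are indexed with the standard analyzer, so only the query side sees the synonym rules.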
Index-Time
Apply synonyms during indexing:
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "my_synonym_analyzer"
}
}
}
}
Advantages:
- No query-time expansion overhead
- More predictable scoring (IDF is computed on expanded terms)
Disadvantages:
- Changing synonyms requires full re-index
- Larger index size
Scoring Behavior with Synonyms
The IDF Problem
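Query-time synonym expansion can produce unexpected relevance scoring. When "laptop" expands to "laptop OR notebook OR portable computer", the expanded clauses can carry different IDF (Inverse Document Frequency) values. The effect is most visible for multi-word synonyms, which are scored as separate phrase clauses. Rare synonym terms then score disproportionately high.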
Example: If "portable computer" appears in only 5 documents but "laptop" appears in 5,000, documents matching "portable computer" get much higher scores even though the user searched for "laptop".
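Whether this effect shows up for a particular query can be checked rather than assumed. As a sketch, setting explain to true on a search request returns a per-hit score breakdown, so the contribution of each expanded clause can be compared. The index and field names follow the examples in this guide.

```json
POST /products/_search
{
  "explain": true,
  "query": {
    "match": { "title": "laptop" }
  }
}
```

Each hit then carries an _explanation object; look for the idf and tf entries under each clause to see which synonym dominates the score.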
Mitigations
- Use auto_generate_synonyms_phrase_query: true (default in match queries): This generates phrase queries for multi-word synonyms, improving scoring for phrase-level matches.
- Boost the original term: In your application layer, boost the user's original query term relative to expanded synonyms:
POST /products/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": { "query": "laptop", "boost": 2 } } },
        { "match": { "title": { "query": "notebook portable computer", "boost": 1 } } }
      ]
    }
  }
}
- Use synonym format with explicit mappings: Map the less-common terms to the more common one:
notebook => laptop
portable computer => laptop
This replaces rather than expands, avoiding the IDF discrepancy. Because the replaced terms are no longer searched, the same mapping generally also has to be applied at index time, so that documents containing "notebook" are indexed as "laptop" and keep matching.
Testing Synonyms
Analyze API
Always test synonym configurations before deploying:
POST /products/_analyze
{
"analyzer": "my_synonym_analyzer",
"text": "I need a new laptop for work"
}
This returns the token stream with synonym expansions, showing exactly what terms will be searched.
Validate with Search
# Create a test document (refresh=true makes it searchable immediately)
POST /products/_doc/1?refresh=true
{
"title": "Best notebook computers for professionals"
}
# Search with synonym expansion
POST /products/_search
{
"query": { "match": { "title": "laptop" } }
}
If synonyms are configured correctly, this should match the document containing "notebook".
Filter Chain Ordering
The order of filters in your analyzer matters:
{
"analyzer": {
"my_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"my_synonyms",
"stemmer_english"
]
}
}
}
- lowercase first: Synonym matching is case-sensitive internally. Define synonyms in lowercase and apply the lowercase filter before synonyms.
- synonyms second: Sees lowercased, unstemmed tokens.
- stemmer last: Stems both original and synonym tokens consistently.
Never place synonyms after stemming — the stemmer changes token forms and your synonym rules won't match.
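To check what each stage of such a chain emits, the _analyze API can be run with explain set to true, which reports the token stream after the tokenizer and after every filter. The request below is a sketch with an inline filter chain; the synonym rule and sample text are illustrative, and the built-in stemmer filter stands in for whatever stemmer your analyzer uses.

```json
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    {
      "type": "synonym_graph",
      "synonyms": ["laptop, notebook"]
    },
    "stemmer"
  ],
  "text": "Laptop bags",
  "explain": true
}
```

Reordering the entries in the filter array and rerunning the request is a quick way to see how the output changes when synonyms run before or after other filters.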
Frequently Asked Questions
Q: Can I use the same synonyms.txt file across multiple indices?
Yes. Reference the same file path in each index's analyzer settings. When updating, remember to reload every index that references the file with _reload_search_analyzers.
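The reload endpoint accepts a comma-separated list of targets or a wildcard pattern, so several indices can be covered by one request. The index names below are hypothetical.

```json
# Reload two indices in a single request
POST /products,articles/_reload_search_analyzers

# Or reload every index matching a pattern
POST /catalog-*/_reload_search_analyzers
```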
Q: Why do my phrase queries break with synonyms?
If you're using the synonym filter (not synonym_graph) with multi-word synonyms, token positions are incorrect and phrase queries fail. Switch to synonym_graph for query-time analysis.
Q: How do synonyms interact with fuzzy matching?
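Fuzziness (fuzziness: "AUTO" in match queries) is applied after analysis, not before it. The query text is tokenized and passed through the synonym filter first, and fuzzy expansion is then applied to the resulting terms. A misspelling such as "lapto" does not match a synonym rule for "laptop", so it can be fuzzy-matched against the indexed term "laptop" but is not expanded to that term's synonyms. In addition, the match query documentation states that fuzzy matching is not applied to terms that have synonyms, because those terms are rewritten into a synonym query that does not support fuzzy expansion. The order is: tokenization → synonym expansion → fuzziness.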
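To observe how the two features combine on your own index, compare the analyzer output for a misspelled term with the results of an actual fuzzy search. This is a sketch that assumes the products index and my_synonym_analyzer from the earlier examples.

```json
# Step 1: see which tokens the search analyzer produces for the misspelling
POST /products/_analyze
{
  "analyzer": "my_synonym_analyzer",
  "text": "lapto"
}

# Step 2: run the fuzzy search itself
POST /products/_search
{
  "query": {
    "match": {
      "title": { "query": "lapto", "fuzziness": "AUTO" }
    }
  }
}
```

If the first request returns only the token lapto, no synonym rule fired for the misspelling, and any hits from the second request come from fuzzy matching alone.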
Q: Can I have directional synonyms (A → B but not B → A)?
Yes. Use explicit mapping syntax:
laptop => notebook, portable computer
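With explicit mapping, the left-hand side is replaced by the right-hand side. Searching for "laptop" therefore matches "notebook" and "portable computer" but no longer matches "laptop" itself; to keep the original term, repeat it on the right (laptop => laptop, notebook, portable computer). Searching for "notebook" only matches "notebook".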
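One way to confirm what an explicit mapping emits is to run it through _analyze with an inline filter. The request below is a minimal sketch; run it once with "laptop" and once with "notebook" as the text and compare the returned tokens.

```json
GET /_analyze
{
  "tokenizer": "standard",
  "filter": [
    "lowercase",
    {
      "type": "synonym_graph",
      "synonyms": ["laptop => notebook, portable computer"]
    }
  ],
  "text": "laptop"
}
```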
Q: What's the maximum synonym list size?
There's no hard limit, but very large synonym lists (10,000+ rules) increase analysis latency. For massive vocabularies, consider whether embedding-based semantic search would be more effective and maintainable than exhaustive synonym lists.
Q: Do synonyms work with the _all field or copy_to?
Synonyms apply through whatever analyzer is configured on the field being searched. Values copied with copy_to are analyzed with the target field's analyzer, not the source field's. The _all field no longer exists in recent Elasticsearch versions; use copy_to with a target field configured with your synonym analyzer.
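A mapping along these lines is one way to set that up. It is a sketch: the description and search_text field names are hypothetical, and my_synonym_analyzer is assumed to be defined in the index settings as shown earlier.

```json
{
  "mappings": {
    "properties": {
      "title": { "type": "text", "copy_to": "search_text" },
      "description": { "type": "text", "copy_to": "search_text" },
      "search_text": {
        "type": "text",
        "analyzer": "standard",
        "search_analyzer": "my_synonym_analyzer"
      }
    }
  }
}
```

Queries against search_text then get synonym expansion, while title and description keep their own analyzers.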