OpenSearch Synonyms: Configuration, Synonym Filters, and Best Practices

Synonyms in OpenSearch expand search queries to match documents that use different words for the same concept. When a user searches for "laptop", synonyms can also match documents containing "notebook computer". Without synonyms, lexical search is frustratingly literal.

OpenSearch provides two synonym token filters — synonym and synonym_graph — each suited to different use cases.

Synonym Filter Types

synonym

The basic synonym filter works with single-token replacements and equivalences. It operates on the token stream after tokenization.

PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms": [
            "laptop, notebook",
            "phone, mobile, cellphone",
            "tv, television"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonyms"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}

synonym_graph

The synonym_graph filter handles multi-word synonyms correctly by producing a token graph rather than a flat token stream. This is essential when synonyms map single words to phrases or vice versa.

PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym_graph",
          "synonyms": [
            "ny, new york",
            "usa, united states, united states of america",
            "ai, artificial intelligence"
          ]
        }
      },
      "analyzer": {
        "search_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "my_synonyms"]
        }
      }
    }
  }
}

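Use synonym_graph in search analyzers (query-time analysis) rather than at index time, because the index cannot store a token graph. It correctly handles multi-word expressions such as "new york", which the basic synonym filter flattens into a token stream that phrase matching can get wrong.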

Synonym Formats

Equivalent Synonyms

All terms are interchangeable:

laptop, notebook, portable computer

Searching for any of these terms matches documents containing any of the others.

Explicit Mappings

Map specific terms to replacements:

laptop => notebook, portable computer
ny => new york

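The left side is replaced by the right side. Searching for "laptop" matches documents containing "notebook" or "portable computer", but searching for "notebook" does NOT match "laptop". Because the left side is replaced rather than kept, add it to the right side (laptop => laptop, notebook, portable computer) if the original term should still match.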

Use explicit mappings when the relationship is directional — abbreviations, acronyms, or when one term should expand but not the reverse.
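
The difference between the two formats is easiest to see in a toy model. The Python sketch below is not OpenSearch code; it is a simplified illustration that parses rules in the formats shown above and reports what a single term is rewritten to, assuming comma-separated rules expand to the whole group. The names parse_rules and rewrite are made up for this example, and the model matches whole terms only, ignoring tokenization.

```python
def parse_rules(lines):
    """Toy parser for the two rule formats; not the OpenSearch implementation."""
    table = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: left-hand terms are replaced by right-hand terms.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Equivalent synonyms: every term maps to the whole group.
            sources = targets = [t.strip() for t in line.split(",") if t.strip()]
        for source in sources:
            bucket = table.setdefault(source, [])
            for target in targets:
                if target not in bucket:
                    bucket.append(target)
    return table


def rewrite(term, table):
    """Return the terms that replace `term`; unmatched terms pass through."""
    return table.get(term, [term])


equivalent = parse_rules(["laptop, notebook, portable computer"])
explicit = parse_rules(["laptop => notebook, portable computer"])

print(rewrite("notebook", equivalent))  # every term in the group
print(rewrite("laptop", explicit))      # right-hand side only
print(rewrite("notebook", explicit))    # unchanged: the rule is one-way
```

In this model "laptop" is absent from its own explicit-mapping output, which is why the original term has to be repeated on the right-hand side when it should keep matching.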

Synonym File

For large synonym lists, use an external file:

PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms": {
          "type": "synonym_graph",
          "synonyms_path": "analysis/synonyms.txt",
          "updateable": true
        }
      }
    }
  }
}

The file analysis/synonyms.txt is relative to the OpenSearch config directory and contains one synonym rule per line:

# synonyms.txt
laptop, notebook, portable computer
phone, mobile, cellphone, smartphone
tv, television, telly
ny => new york
sf => san francisco
ai => artificial intelligence

Important: The synonyms file must exist on every node in the cluster at the configured path.
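
One low-tech way to catch a node with a stale copy is to compare checksums. The sketch below uses only the Python standard library. How the file or its digest is collected from each node (SSH, configuration management, a shared volume) is left out, and the helper names file_digest and find_mismatches are invented for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of a synonyms file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_mismatches(digests_by_node):
    """Given {node_name: digest}, return nodes that differ from the most common digest."""
    if not digests_by_node:
        return []
    counts = {}
    for digest in digests_by_node.values():
        counts[digest] = counts.get(digest, 0) + 1
    majority = max(counts, key=counts.get)
    return sorted(node for node, digest in digests_by_node.items() if digest != majority)


# Hypothetical digests collected from three nodes; node-3 holds an older file.
current = hashlib.sha256(b"tv, television, telly\n").hexdigest()
stale = hashlib.sha256(b"tv, television\n").hexdigest()
print(find_mismatches({"node-1": current, "node-2": current, "node-3": stale}))
```

On a real node, file_digest would be called with the full path to analysis/synonyms.txt under that node's config directory.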

Index-Time vs. Query-Time Synonyms

Index-Time Synonyms

Synonyms are expanded when documents are indexed. The term "laptop" is stored as both "laptop" and "notebook" in the inverted index.

"mappings": {
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "my_synonym_analyzer"
    }
  }
}

Pros: Fast queries — no synonym expansion needed at search time.

Cons: Changing synonyms requires re-indexing all documents. Index size increases because each document stores more terms.

Query-Time Synonyms

Synonyms are expanded when the search query is analyzed. The index stores original terms only.

"mappings": {
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "standard",
      "search_analyzer": "my_synonym_analyzer"
    }
  }
}

Pros: Synonym changes do not require re-indexing; they take effect as soon as the search analyzers are reloaded. Index size stays compact.

Cons: Slightly slower queries due to expansion. Some scoring nuances with IDF (terms expanded at query time may have different IDF weights).

Recommendation: Use query-time synonyms unless you have a specific reason for index-time. The flexibility to update synonyms without re-indexing is almost always worth the minor query-time overhead.
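
The mapping fragments above reference my_synonym_analyzer without defining it. As a minimal sketch, the Python function below assembles one possible complete request body for the recommended query-time setup, reusing the filter and analyzer names from this article. The function name build_query_time_index is made up here, and sending the body with PUT /products is left to whichever client you use.

```python
import json


def build_query_time_index(synonyms, field="name"):
    """Build a create-index body: plain analysis at index time, synonyms at search time."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonyms": {
                        "type": "synonym_graph",
                        "synonyms": list(synonyms),
                    }
                },
                "analyzer": {
                    # Used only as the search analyzer in the mapping below.
                    "my_synonym_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "analyzer": "standard",
                    "search_analyzer": "my_synonym_analyzer",
                }
            }
        },
    }


body = build_query_time_index(["laptop, notebook", "ny, new york"])
print(json.dumps(body, indent=2))
```

Keeping the index analyzer free of synonyms is what lets the synonym list change later without touching the stored documents.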

Updatable Synonyms

With "updateable": true on a query-time synonym filter, you can reload synonyms without restarting OpenSearch:

# Update the synonyms.txt file on all nodes, then:
POST /_plugins/_refresh_search_analyzers/products

This reloads search analyzers that use updatable synonym filters, applying the new synonym list immediately.

Common Patterns

Domain-Specific Synonyms

Build synonym lists from your domain vocabulary:

# E-commerce
sneakers, trainers, running shoes, athletic shoes
hoodie, hooded sweatshirt, pullover
charger, charging cable, power adapter

# Tech
kubernetes, k8s
elasticsearch, es
continuous integration, ci

Acronym Expansion

aws => amazon web services
api => application programming interface
sql => structured query language
etl => extract transform load

Brand and Product Normalization

iphone, apple phone
macbook, apple laptop
pixel, google phone

Misspelling Correction

While fuzzy matching is generally better for misspellings, common misspellings can be handled as synonyms:

accomodation => accommodation
recieve => receive

Synonym Interaction with Other Analyzers

Placement in the Filter Chain

Synonyms should typically come after lowercase and before stemmer:

{
  "analyzer": {
    "my_analyzer": {
      "tokenizer": "standard",
      "filter": ["lowercase", "my_synonyms", "stemmer"]
    }
  }
}

Why after lowercase: Synonym matching is case-sensitive. If the synonym filter runs before lowercase, an input token "NY" does not match a rule written as "ny", and the expansion is silently skipped. Place synonyms after lowercase and define synonyms in lowercase.

Why before stemmer: The stemmer would turn "running" into "run", so your synonym rule running shoes, sneakers wouldn't match. Synonyms should see the lowercased but unstemmed tokens.
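
A small pre-deployment check can catch rules that break the lowercase convention. The sketch below is a hypothetical helper (lint_rules is not part of OpenSearch) that flags uppercase characters and empty terms in rules written in the format used in this article.

```python
def lint_rules(lines):
    """Return (line_number, message) pairs for rules likely to misbehave."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are fine
        if line != line.lower():
            problems.append((number, "contains uppercase; write rules in lowercase"))
        # Check each side of an explicit mapping, or the whole line otherwise.
        for side in line.split("=>"):
            if any(not term.strip() for term in side.split(",")):
                problems.append((number, "empty term (stray comma or missing side)"))
                break
    return problems


rules = [
    "# sample rules",
    "NY => new york",
    "tv, television,",
    "laptop, notebook",
]
for number, message in lint_rules(rules):
    print(f"line {number}: {message}")
```

A check like this could run in CI before the synonyms file is copied to the cluster.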

Synonyms with Custom Analyzers

If you use custom tokenizers or char filters, test that the token stream reaching the synonym filter matches your synonym rules. Use the Analyze API to debug:

POST /products/_analyze
{
  "analyzer": "my_synonym_analyzer",
  "text": "I need a new laptop"
}

This shows the tokens produced, including synonym expansions.
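
Synonyms show up in the Analyze API response as extra tokens that share a position with the original token. The following Python sketch groups a response by position to make that visible. The sample response is hand-written for illustration, not captured from a cluster (real responses also carry character offsets), and tokens_by_position is a hypothetical helper name.

```python
from collections import defaultdict


def tokens_by_position(analyze_response):
    """Group tokens from an _analyze response by their position."""
    grouped = defaultdict(list)
    for token in analyze_response.get("tokens", []):
        grouped[token["position"]].append(token["token"])
    return dict(sorted(grouped.items()))


# Hand-written sample shaped like an _analyze response (illustrative only).
sample = {
    "tokens": [
        {"token": "i", "position": 0, "type": "<ALPHANUM>"},
        {"token": "need", "position": 1, "type": "<ALPHANUM>"},
        {"token": "a", "position": 2, "type": "<ALPHANUM>"},
        {"token": "new", "position": 3, "type": "<ALPHANUM>"},
        {"token": "laptop", "position": 4, "type": "<ALPHANUM>"},
        {"token": "notebook", "position": 4, "type": "SYNONYM"},
    ]
}

for position, tokens in tokens_by_position(sample).items():
    print(position, tokens)
```

A position that lists only the original token means the synonym rule did not fire for it.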

Performance Considerations

  1. Large synonym lists slow analysis: Lists with 10,000+ rules add measurable latency to both indexing and query analysis. Consider splitting into domain-specific synonym sets applied to different fields.

  2. Index-time synonyms increase index size: Each synonym expansion adds tokens to the inverted index. With a 10-term equivalence rule, every occurrence of any one of those terms is indexed as all 10.

  3. Query-time expansion increases query complexity: A query for "laptop" that expands to 5 synonyms has to look up 5 terms instead of 1. With many expanded terms, query performance degrades.

  4. Synonym graph with multi-word synonyms is slower than flat synonyms: The graph token filter produces more complex token streams. Use synonym_graph only when you actually have multi-word synonyms.
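
As rough arithmetic for points 2 and 3, the sketch below counts how many terms a piece of text becomes once every token that belongs to an equivalence group is expanded to the whole group. It is a back-of-the-envelope model with made-up names (expanded_term_count, groups) that covers single-word synonyms only; it does not measure real index size or query latency.

```python
def expanded_term_count(tokens, groups):
    """Count terms after expanding each token to its full synonym group."""
    size_of = {}
    for group in groups:
        for term in group:
            size_of[term] = len(group)
    # Tokens outside every group count once.
    return sum(size_of.get(token, 1) for token in tokens)


groups = [
    ["phone", "mobile", "cellphone", "smartphone"],
    ["tv", "television", "telly"],
]

query = ["cheap", "phone"]
document = ["tv", "and", "phone", "bundle"]

print(expanded_term_count(query, groups))     # 1 + 4
print(expanded_term_count(document, groups))  # 3 + 1 + 4 + 1
```

The same count applies whether the expansion happens at index time (more stored tokens) or at query time (more terms to look up).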

Frequently Asked Questions

Q: Why isn't my synonym working?

Most common causes: (1) The synonym filter is in the index analyzer but you haven't re-indexed after adding synonyms. (2) Case mismatch — synonyms are lowercase but the filter is before the lowercase filter. (3) The synonym file isn't present on all nodes. Use the Analyze API to debug.

Q: Can I use synonyms with the OpenSearch neural search plugin?

Synonyms apply to the lexical (BM25) component of search, not to vector search. For hybrid search, synonyms expand the keyword query while the neural query handles semantic matching independently. They're complementary.

Q: How many synonyms can I have?

There's no hard limit, but performance degrades with very large lists. A few hundred to a few thousand rules is typical and performs well. If you need 10,000+ rules, consider whether query-time expansion or a more sophisticated approach (like embedding-based semantic search) would be more maintainable.

Q: What's the difference between synonyms and query expansion?

Synonyms are explicit, manually curated term mappings. Query expansion is a broader technique that may include synonyms, related terms (from a thesaurus or ML model), and spelling corrections. Synonyms are the simplest and most controllable form of query expansion.

Q: Can I use the same synonym file for OpenSearch and Elasticsearch?

Yes. The synonym file format is identical. The same synonyms.txt file works in both platforms.
