OpenSearch Hybrid Search: Combining Lexical and Neural Search

Hybrid search in OpenSearch combines traditional BM25 keyword search with k-NN vector search to deliver relevance that neither approach achieves alone. Keyword search excels at exact matches, identifiers, and precision. Vector search understands meaning, synonyms, and paraphrases. Hybrid search gives you both.

OpenSearch 2.10+ introduced native hybrid search support through search pipelines and the hybrid query type, making it significantly easier to implement than manual score fusion.

How Hybrid Search Works

A hybrid search query runs two sub-queries in parallel:

  1. BM25 lexical query: Standard full-text matching against the inverted index
  2. k-NN vector query: Approximate nearest-neighbor search against vector embeddings

Each sub-query returns a ranked list of results with scores. A normalization processor in the search pipeline rescales the scores to a common range, and a combination technique merges the two ranked lists into a single result set.
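
As a concrete illustration of min_max normalization with arithmetic_mean combination (the scores below are made up): suppose a document scores 8.0 on the BM25 sub-query, whose result set ranges from 2.0 to 10.0, and 0.82 on the k-NN sub-query, whose result set ranges from 0.70 to 0.90. With weights [0.4, 0.6]:

  BM25 normalized:  (8.0 - 2.0) / (10.0 - 2.0) = 0.75
  k-NN normalized:  (0.82 - 0.70) / (0.90 - 0.70) = 0.60
  Combined score:   0.4 * 0.75 + 0.6 * 0.60 = 0.66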

Step 1: Create a Search Pipeline

The search pipeline defines how scores from different query types are normalized and combined:

PUT /_search/pipeline/hybrid-search-pipeline
{
  "description": "Hybrid search with arithmetic mean combination",
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": {
          "technique": "min_max"
        },
        "combination": {
          "technique": "arithmetic_mean",
          "parameters": {
            "weights": [0.4, 0.6]
          }
        }
      }
    }
  ]
}

Normalization techniques:

  • min_max: Scales scores to [0, 1] based on min/max in each result set. Good default.
  • l2: L2 normalization. Better when score distributions differ significantly between query types.

Combination techniques:

  • arithmetic_mean: Weighted average of normalized scores. Simple and effective.
  • harmonic_mean: Penalizes results that score poorly on either sub-query. More conservative.
  • geometric_mean: Balanced between arithmetic and harmonic. Requires both sub-queries to contribute.

The weights array assigns relative importance to each sub-query, in the order the sub-queries appear in the hybrid query, and the weights must add up to 1.0. With the keyword query listed first and the vector query second (as in the examples below), [0.4, 0.6] gives 40% weight to keyword search and 60% to vector search.

Step 2: Create an Index with Both Text and Vector Fields

PUT /products
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "",
    "index.search.default_pipeline": "hybrid-search-pipeline"
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard"
      },
      "description": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "title_embedding": {
        "type": "knn_vector",
        "dimension": 384,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "nmslib",
          "parameters": {
            "ef_construction": 256,
            "m": 16
          }
        }
      }
    }
  }
}

Step 3: Index Documents with Embeddings

Generate embeddings externally (or use OpenSearch's neural search plugin with an ML model) and index alongside text:

POST /products/_doc/1
{
  "title": "Wireless Noise-Cancelling Headphones",
  "description": "Premium over-ear headphones with active noise cancellation and 30-hour battery life",
  "category": "electronics",
  "title_embedding": [0.12, -0.34, 0.56, ...]
}
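
For illustration, here is a minimal Python sketch of this external workflow using the sentence-transformers and opensearch-py libraries (the connection details are placeholders; any embedding model and client would work):

from opensearchpy import OpenSearch
from sentence_transformers import SentenceTransformer

# Placeholder connection details -- adjust for your cluster
client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# all-MiniLM-L6-v2 produces 384-dimensional vectors, matching
# the dimension declared in the index mapping above
model = SentenceTransformer("all-MiniLM-L6-v2")

doc = {
    "title": "Wireless Noise-Cancelling Headphones",
    "description": "Premium over-ear headphones with active noise cancellation and 30-hour battery life",
    "category": "electronics",
}
# Embed the title and store the vector alongside the text fields
doc["title_embedding"] = model.encode(doc["title"]).tolist()

client.index(index="products", id="1", body=doc)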

Step 4: Query with the Hybrid Query Type

POST /products/_search?search_pipeline=hybrid-search-pipeline
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "title": {
              "query": "noise cancelling headphones"
            }
          }
        },
        {
          "knn": {
            "title_embedding": {
              "vector": [0.11, -0.33, 0.55, ...],
              "k": 20
            }
          }
        }
      ]
    }
  },
  "size": 10
}

The search pipeline normalizes scores from the match and knn sub-queries and combines them using the configured technique and weights.

Using the Neural Search Plugin

OpenSearch's neural search plugin can generate embeddings at query time, removing the need to manage embedding infrastructure separately:

Register an ML Model

# Register the model (in OpenSearch 2.7+, _register replaces the deprecated _upload API)
POST /_plugins/_ml/models/_register
{
  "name": "all-MiniLM-L6-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers"
  },
  "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/all-MiniLM-L6-v2/1.0.1/torch_script/sentence-transformers_all-MiniLM-L6-v2-1.0.1-torch_script.zip"
}

# Deploy the model
POST /_plugins/_ml/models/<model_id>/_deploy
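
The plugin can also generate embeddings at ingest time via the text_embedding ingest processor, so you don't have to send vectors yourself. A minimal sketch (the pipeline name is illustrative; reference it as the index's default_pipeline or per request):

PUT /_ingest/pipeline/title-embedding-pipeline
{
  "description": "Generate title embeddings at ingest time",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model_id>",
        "field_map": {
          "title": "title_embedding"
        }
      }
    }
  ]
}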

Neural Query

Once the model is deployed, use neural queries that generate embeddings automatically:

POST /products/_search?search_pipeline=hybrid-search-pipeline
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "title": "wireless headphones for travel"
          }
        },
        {
          "neural": {
            "title_embedding": {
              "query_text": "wireless headphones for travel",
              "model_id": "<model_id>",
              "k": 20
            }
          }
        }
      ]
    }
  }
}

Tuning Hybrid Search Weights

The weights in your combination technique significantly affect result quality. There's no universal best value — it depends on your data and queries.

Approach: A/B Testing with Relevance Judgments

  1. Assemble a test set of 50–100 queries with known relevant documents
  2. Run each query with different weight configurations: [0.3, 0.7], [0.5, 0.5], [0.7, 0.3]
  3. Measure NDCG@10 or MRR for each configuration
  4. Pick the weights that maximize your relevance metric
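
A minimal Python sketch of such a sweep, assuming a hypothetical run_hybrid_query(query, weights) helper that executes the hybrid search and returns ranked document IDs, and a judgments dict mapping each query to its set of relevant IDs (both are placeholders you would implement against your own cluster and test set):

import math

WEIGHT_CONFIGS = [(0.3, 0.7), (0.5, 0.5), (0.7, 0.3)]

def ndcg_at_10(ranked_ids, relevant_ids):
    # Binary-relevance NDCG@10: discounted gain over the ideal ordering
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, doc_id in enumerate(ranked_ids[:10])
              if doc_id in relevant_ids)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(10, len(relevant_ids))))
    return dcg / ideal if ideal > 0 else 0.0

def sweep_weights(queries, judgments, run_hybrid_query):
    best = None
    for weights in WEIGHT_CONFIGS:
        mean_ndcg = sum(
            ndcg_at_10(run_hybrid_query(q, weights), judgments[q])
            for q in queries
        ) / len(queries)
        print(f"weights={weights}: mean NDCG@10 = {mean_ndcg:.3f}")
        if best is None or mean_ndcg > best[1]:
            best = (weights, mean_ndcg)
    return best  # (best weights, best mean NDCG@10)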

General Guidance

  • Content-heavy search (articles, documentation): Favor vector weights ([0.3, 0.7]). Meaning matters more than exact terms.
  • Product search: Balanced weights ([0.5, 0.5]). Users search by both product names (keyword) and descriptions (semantic).
  • Technical/identifier-heavy search: Favor keyword weights ([0.7, 0.3]). SKUs, error codes, and model numbers must match exactly.

Adding Filters to Hybrid Queries

Combine hybrid search with structured filters using a bool query wrapper:

POST /products/_search?search_pipeline=hybrid-search-pipeline
{
  "query": {
    "hybrid": {
      "queries": [
        {
          "bool": {
            "must": {
              "match": { "title": "headphones" }
            },
            "filter": {
              "term": { "category": "electronics" }
            }
          }
        },
        {
          "knn": {
            "title_embedding": {
              "vector": [0.11, -0.33, ...],
              "k": 20,
              "filter": {
                "term": { "category": "electronics" }
              }
            }
          }
        }
      ]
    }
  }
}

Apply the same filter to both sub-queries to ensure consistent result sets before score combination.

Performance Considerations

  1. k-NN index memory: HNSW graphs live in memory. Budget roughly num_vectors × dimensions × 4 bytes × 1.5 (the 1.5x accounts for graph overhead). For 10M vectors at 384 dims: ~23 GB.

  2. ef_search parameter: Controls the accuracy vs. speed trade-off for HNSW search. Higher values find better neighbors but take longer; the default is 100. Start with 100, and reduce to 64 if latency is critical and the accuracy loss is acceptable (see the settings example after this list).

  3. Embedding latency: If generating embeddings at query time (neural plugin), model inference adds 10–50ms per query depending on model size and hardware. For latency-sensitive applications, generate query embeddings in your application layer with GPU inference.

  4. k value: Set k higher than your final result size — the normalization processor needs enough candidates from each sub-query to combine effectively. A good rule: k = 2 * size or higher.
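
To adjust ef_search, update the index setting, shown here as a sketch against the products index from above:

PUT /products/_settings
{
  "index.knn.algo_param.ef_search": 100
}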

Frequently Asked Questions

Q: Do I need the neural search plugin for hybrid search?

No. You can generate embeddings externally (in your application) and use the knn query type directly. The neural search plugin is convenient because it handles embedding generation server-side, but it's optional.

Q: Can I add more than two sub-queries to a hybrid query?

Yes. You can combine multiple match, knn, and neural queries. The normalization processor handles any number of sub-queries. Adjust the weights array accordingly.
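
For example, a sketch pairing two lexical sub-queries with one vector sub-query (the pipeline would need a matching three-element weights array, e.g. [0.2, 0.3, 0.5]; the values are illustrative):

POST /products/_search?search_pipeline=hybrid-search-pipeline
{
  "query": {
    "hybrid": {
      "queries": [
        { "match": { "title": "headphones" } },
        { "match": { "description": "noise cancellation" } },
        {
          "knn": {
            "title_embedding": {
              "vector": [0.11, -0.33, ...],
              "k": 20
            }
          }
        }
      ]
    }
  }
}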

Q: How do I handle documents that don't have embeddings yet?

Documents without the vector field won't appear in k-NN results but will still appear in BM25 results. The score combination handles this gracefully — those documents get a zero vector score and are ranked based on their keyword score alone.

Q: What version of OpenSearch supports hybrid search?

The hybrid query type and the normalization processor were introduced in OpenSearch 2.10; search pipelines themselves became generally available in 2.9. For the best hybrid search experience, use OpenSearch 2.12+.

Q: How does OpenSearch hybrid search compare to Elasticsearch's RRF?

Both achieve similar goals. Elasticsearch uses Reciprocal Rank Fusion (RRF) which combines rankings rather than scores. OpenSearch's approach normalizes and combines scores directly, giving you more control via weight parameters. Both are effective — the best choice depends on your existing stack.
