Elasticsearch Query String Query: Syntax, Parameters, and Examples

The Elasticsearch query_string query parses a single input string using Lucene's query syntax and runs the resulting expression against one or more fields. It supports boolean operators (AND, OR, NOT), field selectors (title:foo), grouping with parentheses, wildcards, regular expressions, fuzziness, and proximity. Use it when callers (typically power users or admin UIs) need the full Lucene mini-language. For untrusted input prefer simple_query_string, which silently ignores syntax errors instead of throwing.

Syntax

GET /_search
{
  "query": {
    "query_string": {
      "query": "title:elasticsearch AND status:published",
      "default_field": "content",
      "default_operator": "OR"
    }
  }
}

Parameters

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| query | The query string to parse. | Yes | - |
| default_field | Field used when the query has no explicit field selector. Supports wildcards. | No | Value of the index.query.default_field index setting, which itself defaults to * (all eligible fields) |
| fields | Array of fields to search; supports wildcards and field^boost. | No | - |
| default_operator | Operator applied between terms with no explicit AND/OR. | No | OR |
| analyzer | Overrides the analyzer used for the query string. | No | Index search analyzer |
| analyze_wildcard | If true, analyzes wildcard terms. Only the * suffix is reliably analyzed; stemmers and similar token filters still skip wildcard tokens. | No | false |
| allow_leading_wildcard | Permits * or ? as the first character of a term. | No | true |
| fuzziness | Maximum edit distance for fuzzy terms (e.g. AUTO, 1, 2). | No | - |
| fuzzy_max_expansions | Maximum number of terms a fuzzy expansion can produce. | No | 50 |
| fuzzy_prefix_length | Number of characters at the start of the term left unchanged. | No | 0 |
| lenient | Ignores format errors (e.g. a text value supplied for a numeric field). | No | false |
| minimum_should_match | Minimum number of optional clauses that must match. | No | - |
| phrase_slop | Maximum number of positions allowed between tokens in a phrase. | No | 0 |
| time_zone | UTC offset or IANA zone used for date conversion. | No | - |
| boost | Score multiplier for the whole query. | No | 1.0 |
| max_determinized_states | Limit on automaton states for regexp terms. | No | 10000 |

Examples

Basic search across the default field:

GET /articles/_search
{
  "query": {
    "query_string": {
      "query": "elasticsearch",
      "default_field": "content"
    }
  }
}

Multi-field search with per-field boosts and an explicit operator:

GET /articles/_search
{
  "query": {
    "query_string": {
      "query": "elasticsearch performance",
      "fields": ["title^3", "summary^2", "content"],
      "default_operator": "AND"
    }
  }
}

Mixed Lucene syntax with field selectors, grouping, and fuzziness (~):

GET /articles/_search
{
  "query": {
    "query_string": {
      "query": "(title:\"distributed search\" OR tags:search) AND author:martin~1",
      "lenient": true
    }
  }
}

Inline regexp using /.../ delimiters:

GET /users/_search
{
  "query": {
    "query_string": {
      "query": "username:/jdoe[0-9]+/"
    }
  }
}

Performance and Use Notes

Leading wildcards are the most common performance trap. A term such as *term forces a full terms-dictionary scan on every shard; disable them in production with "allow_leading_wildcard": false unless you have reindexed the field with a reverse token filter or n-grams so that suffix matches become prefix matches. Regexps and unbounded fuzziness are similarly expensive and should run only on keyword fields with bounded cardinality.
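As a complement to the server-side setting, a client can reject leading wildcards before the request is sent. The following is a minimal sketch of such a pre-flight check, not an exhaustive parser: the function names are hypothetical, the regular expression is a heuristic that ignores quoting and inline regexps, and "content" is used as the default field only because the examples above do so.

```python
import re

# Heuristic: a term starts at the beginning of the string, after whitespace,
# after "(" or after a "field:" selector. A lone * (as in field:* or *:*)
# is not treated as a leading wildcard.
_LEADING_WILDCARD = re.compile(r'(?:^|[\s(:])[*?][^\s):]')


def has_leading_wildcard(query: str) -> bool:
    """Return True if any term appears to start with * or ?."""
    return bool(_LEADING_WILDCARD.search(query))


def guarded_query(query: str, default_field: str = "content") -> dict:
    """Build a query_string body, refusing leading wildcards up front."""
    if has_leading_wildcard(query):
        raise ValueError("leading wildcards are not allowed")
    return {
        "query": {
            "query_string": {
                "query": query,
                "default_field": default_field,
                # Also enforced by Elasticsearch itself if the check above misses a case.
                "allow_leading_wildcard": False,
            }
        }
    }
```

Keeping "allow_leading_wildcard": false in the body means Elasticsearch still rejects anything the heuristic fails to catch.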

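query_string throws a parse error on malformed input (unbalanced quotes, dangling operators) and, unless lenient is true, a format error when a value does not match the field's mapped type. If end users type the strings, use simple_query_string instead, which ignores invalid parts of the query rather than returning an error. When reserved characters appear as literals, escape them with a backslash (written as a double backslash inside a JSON string): + - = && || ! ( ) { } [ ] ^ " ~ * ? : \ /. The characters < and > are also reserved but cannot be escaped; remove them from the query string to keep them from being parsed as range operators.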
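The escaping rule can be applied in client code before the value is placed into the query. The sketch below assumes the value is meant as a single literal term; the helper names and the sku field are hypothetical. It drops < and >, which Elasticsearch does not allow to be escaped, and leaves JSON-level backslash doubling to json.dumps.

```python
import json
import re

# Reserved characters of the query_string syntax. Escaping every & and |
# individually also covers the && and || operators.
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_query_string(text: str) -> str:
    """Backslash-escape reserved characters; drop < and > entirely."""
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)


def literal_term_query(field: str, value: str) -> dict:
    """Build a query_string body that matches value literally in field."""
    return {
        "query": {
            "query_string": {"query": f"{field}:{escape_query_string(value)}"}
        }
    }


body = literal_term_query("sku", "AB-12/7:x")
# json.dumps doubles each backslash, producing the JSON form Elasticsearch expects.
print(json.dumps(body))
```

Whitespace is not escaped here, so a value containing spaces would still be split into separate terms; wrap such values in escaped double quotes if they must match as a phrase.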

Complex query_string queries are a frequent source of slow searches and cluster instability. The manual triage - reading slow logs, parsing each Lucene expression to find the leading wildcard or unbounded fuzziness inside it, then chasing down the originating service - is precisely what Pulse runs continuously.

Common Mistakes

  1. Leaving allow_leading_wildcard at its default of true on a user-facing endpoint, then watching *foo queries melt a shard.
  2. Setting analyze_wildcard: true and expecting stemming to fire - it does not; tokens containing wildcards are still treated as raw text by most filters.
  3. Forgetting to escape : or / in product IDs and URLs embedded in the query string.
  4. Using query_string for end-user input and surfacing parse exceptions to the UI instead of switching to simple_query_string.
  5. Targeting text fields with regexp or wildcard expressions when a keyword sub-field exists.

Find Slow query_string Queries with Pulse

Pulse is an AI DBA for Elasticsearch and OpenSearch that continuously profiles production query traffic. For query_string queries specifically, Pulse:

  • Parses live query_string expressions and flags those containing leading wildcards, embedded regexps with leading .*, unbounded ~ fuzziness, or field:* field-existence patterns
  • Detects clusters where allow_leading_wildcard is enabled and a user-facing endpoint is forwarding unsafe Lucene syntax
  • Identifies parse-error spikes that indicate the wrong endpoint is using query_string instead of simple_query_string
  • Traces each slow query_string back to the calling service via slow-log and APM correlation
  • Recommends concrete rewrites: disable allow_leading_wildcard, switch to simple_query_string for untrusted input, split structured predicates into a bool query with filter clauses, or scope fields to indexed text-only fields
  • Tracks the latency improvement after the rewrite ships

This converts the manual Lucene-syntax debugging loop into a continuous optimization workflow.

Try Pulse on your cluster.

Frequently Asked Questions

Q: What is the difference between query_string and simple_query_string?
A: Both parse a Lucene-like mini-language, but simple_query_string uses a relaxed grammar that silently drops invalid tokens, while query_string throws a parse exception. Use simple_query_string for untrusted input and query_string for trusted, admin-style search.
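One way to apply this rule is to choose the query type at the point where the request body is built. This is a minimal sketch under the assumption that the caller already knows whether the input is trusted; build_text_query and its trusted flag are hypothetical names, and only parameters that both query types accept (query, fields, default_operator) are used.

```python
def build_text_query(user_input: str, fields: list, trusted: bool = False) -> dict:
    """Use query_string only for trusted callers; fall back to simple_query_string."""
    query_type = "query_string" if trusted else "simple_query_string"
    return {
        "query": {
            query_type: {
                "query": user_input,
                "fields": fields,
                "default_operator": "AND",
            }
        }
    }


# Untrusted input with an unbalanced quote: routed to simple_query_string,
# which ignores the invalid part instead of failing the request.
public_body = build_text_query('elasticsearch "performance', ["title^3", "content"])

# Trusted admin input keeps the full Lucene syntax.
admin_body = build_text_query("title:elasticsearch AND status:published", ["content"], trusted=True)
```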

Q: When should I use query_string over multi_match?
A: Use query_string when the caller needs Lucene operators (field selectors, boolean grouping, regex, fuzziness in one expression). Use multi_match when the input is plain text and you want predictable scoring across known fields.

Q: How do I make query_string case-insensitive?
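A: query_string applies each field's search analyzer, so casing is controlled by the analyzer (the standard analyzer lowercases by default). query_string itself has no case_insensitive parameter; for wildcard or regexp patterns against keyword fields, either put a lowercase normalizer on the field or use a standalone wildcard or regexp query, both of which accept case_insensitive: true.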
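One way to get case-insensitive pattern matching on a keyword field is a standalone wildcard query with case_insensitive set to true, which recent Elasticsearch versions support. The sketch below only builds the request body; the helper name and the username.keyword sub-field are assumptions for illustration.

```python
def case_insensitive_wildcard(field: str, pattern: str) -> dict:
    """Build a wildcard query body that ignores case when matching pattern."""
    return {
        "query": {
            "wildcard": {
                field: {
                    "value": pattern,
                    "case_insensitive": True,
                }
            }
        }
    }


# Hypothetical keyword sub-field; intended to match jdoe1, JDoe1, JDOE1, and so on.
body = case_insensitive_wildcard("username.keyword", "jdoe*")
```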

Q: Are leading wildcards really that slow?
A: Yes. A leading * or ? prevents Lucene from using the term dictionary's sorted prefix lookup and forces a full scan of every term in the field, per shard. Disable allow_leading_wildcard or rewrite the data with reverse/n-gram analysis.

Q: Why does my date or number query fail with a format exception?
A: query_string validates input against each field's mapped type. Set lenient: true to skip incompatible fields, or scope fields to text-only fields. The cleaner fix is to route structured predicates through a range query or term query.
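Routing structured predicates around query_string might look like the following sketch: free text stays in a query_string clause, while exact values and dates move into bool filter clauses, where they are not scored and can be cached. The function name and the published_at field are hypothetical; status is borrowed from the syntax example above.

```python
def build_filtered_search(text: str, fields: list, status: str, published_after: str) -> dict:
    """Combine free-text query_string matching with structured filter clauses."""
    return {
        "query": {
            "bool": {
                # Scored full-text part, scoped to explicit text fields.
                "must": [{"query_string": {"query": text, "fields": fields}}],
                # Unscored structured predicates; no Lucene parsing involved.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"published_at": {"gte": published_after}}},
                ],
            }
        }
    }


body = build_filtered_search(
    "elasticsearch performance", ["title", "content"], "published", "2024-01-01"
)
```

Because the date never passes through the query string, a malformed value surfaces as an error on the range clause instead of as a format exception raised across every field that query_string touches.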

Q: Can I use boost inside the query string itself?
A: Yes - append ^N to a term or grouped expression, e.g. title:elasticsearch^3 OR summary:elasticsearch. The top-level boost parameter multiplies the score of the entire query.

Q: How do I find which query_string patterns are causing slow searches?
A: Pulse ingests Elasticsearch and OpenSearch slow logs, parses each query_string expression to spot leading wildcards, embedded regexps, and unbounded fuzziness, correlates each slow query to the calling service, and recommends safer rewrites (disable allow_leading_wildcard, switch to simple_query_string, or move structured predicates into a bool query filter).
