The Elasticsearch wildcard query matches indexed terms against a pattern containing * (zero or more characters) or ? (exactly one character). It operates on the term dictionary, not on raw source values, so it is almost always run against keyword (or wildcard) fields. On analyzed text fields the query matches individual tokens, which is rarely the intent. Use it for short suffix or contains patterns on bounded-cardinality keywords; for large fields prefer the dedicated wildcard field type or an n-gram analyzer.
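The difference between matching a whole keyword term and matching individual text tokens can be illustrated with a small Python emulation. This is only a sketch: the tokenizer is a crude stand-in for the standard analyzer, escape sequences in patterns are ignored, and the function names are hypothetical rather than part of any Elasticsearch client.

```python
import re

def wildcard_to_regex(pattern):
    """Translate * and ? into a regular expression (no escape handling)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")      # zero or more characters
        elif ch == "?":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def matches_keyword(pattern, value):
    # A keyword field indexes the whole value as a single term.
    return wildcard_to_regex(pattern).fullmatch(value) is not None

def matches_text(pattern, value):
    # Rough approximation of an analyzed text field:
    # lowercase, split on non-alphanumerics, match each token separately.
    tokens = [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]
    rx = wildcard_to_regex(pattern)
    return any(rx.fullmatch(t) for t in tokens)

value = "Red iPhone Case"
print(matches_keyword("Red*Case", value))  # True: the pattern spans the whole term
print(matches_text("red*case", value))     # False: no single token matches
print(matches_text("iph*", value))         # True: the token "iphone" matches
```

The second call shows why a pattern that spans several words finds nothing on an analyzed field: each token is tested in isolation.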
Syntax
GET /_search
{
  "query": {
    "wildcard": {
      "field_name": {
        "value": "pattern*",
        "case_insensitive": false,
        "boost": 1.0
      }
    }
  }
}
Parameters
| Parameter | Description | Required | Default |
|---|---|---|---|
| `value` | Wildcard pattern. * matches zero or more characters, ? matches exactly one. | Yes | - |
| `case_insensitive` | ASCII case-insensitive matching. Available since Elasticsearch 7.10. | No | false |
| `boost` | Score multiplier for the matching documents. | No | 1.0 |
| `rewrite` | Multi-term rewrite method (e.g. constant_score, top_terms_N). | No | constant_score |
The wildcard query is an "expensive" query type; it is rejected when search.allow_expensive_queries is false.
Examples
Prefix match on a keyword field:
GET /products/_search
{
  "query": {
    "wildcard": {
      "sku.keyword": {
        "value": "AX-*"
      }
    }
  }
}
Case-insensitive suffix match (note the leading wildcard):
GET /users/_search
{
  "query": {
    "wildcard": {
      "email": {
        "value": "*@example.com",
        "case_insensitive": true
      }
    }
  }
}
Multiple wildcards inside a single pattern:
GET /products/_search
{
  "query": {
    "wildcard": {
      "product_name": {
        "value": "iph*ne*"
      }
    }
  }
}
Wildcard wrapped in a bool filter to prune the candidate set first:
GET /products/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "category": "phones" } },
        { "wildcard": { "model.keyword": "iph*ne" } }
      ]
    }
  }
}
Performance and Use Notes
A leading wildcard (*foo, ?foo) forces Lucene to scan every term in the field's dictionary on every shard, since the sorted-prefix lookup that makes wildcard queries fast no longer applies. The official docs warn against this pattern explicitly. For frequent contains-style queries on large text, map the field as the dedicated `wildcard` field type (introduced in 7.9), which indexes n-grams of each value, uses them to narrow the candidate documents, and then verifies the full pattern against the stored value, so a leading wildcard no longer requires a full term scan. For autocomplete, prefer search_as_you_type or an edge n-gram analyzer.
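As a minimal sketch of the `wildcard` field type in use, the request bodies below are written as Python dictionaries that could be serialized and sent with any HTTP client. The index name `logs` and the field name `message` are hypothetical; only the `"type": "wildcard"` mapping and the wildcard query shape come from the text above.

```python
import json

# Body for a hypothetical PUT /logs request: map the field as the wildcard type.
mapping_body = {
    "mappings": {
        "properties": {
            "message": {"type": "wildcard"}
        }
    }
}

# Body for a hypothetical GET /logs/_search request: a contains-style pattern
# with a leading wildcard, which this field type is designed to handle.
query_body = {
    "query": {
        "wildcard": {
            "message": {"value": "*timeout*"}
        }
    }
}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(query_body, indent=2))
```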
Wildcard queries are flagged as expensive. When search.allow_expensive_queries is set to false at the cluster level, the query is rejected with an HTTP 400 error. The case_insensitive flag applies ASCII case folding only, and it may inflate term expansion further; combine with a narrowing filter when possible.
Wildcard queries on large indices are a common cause of slow searches and node CPU spikes. The manual diagnostic loop - reading slow logs, grepping for wildcard clauses, checking each field's mapping - is exactly what Pulse automates in production.
Common Mistakes
- Running wildcard queries against an analyzed `text` field and getting partial token matches instead of substring matches against the original string. Use the `.keyword` sub-field.
- Allowing user-supplied leading wildcards on a production endpoint. Strip or reject them in the application layer.
- Forgetting that the query is rejected when `search.allow_expensive_queries: false`. Catch the 400 and fall back to a `match` query, or to a `prefix` query on a field mapped with `index_prefixes` (a plain `prefix` query is itself classed as expensive).
- Using wildcard for autocomplete when `prefix` plus `index_prefixes`, `search_as_you_type`, or an edge-n-gram analyzer would be cheaper.
- Assuming `case_insensitive` is free - it widens the candidate term set and on large dictionaries can dominate cost.
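Rejecting user-supplied leading wildcards in the application layer might look like the following sketch. The function names and the `min_literal_prefix` threshold are assumptions made for illustration, and escaped wildcards (`\*`, `\?`) are not handled.

```python
def has_leading_wildcard(pattern):
    """Return True if the pattern starts with * or ?."""
    return pattern[:1] in ("*", "?")

def sanitize_user_pattern(pattern, min_literal_prefix=2):
    """Return the pattern unchanged, or raise ValueError if too few
    literal characters precede the first wildcard.

    min_literal_prefix is a hypothetical policy knob, not an
    Elasticsearch setting.
    """
    prefix_len = 0
    for ch in pattern:
        if ch in "*?":
            break
        prefix_len += 1
    if prefix_len < min_literal_prefix:
        raise ValueError(
            f"pattern {pattern!r} needs at least {min_literal_prefix} "
            "literal characters before the first wildcard"
        )
    return pattern

print(has_leading_wildcard("*foo"))   # True
print(sanitize_user_pattern("iph*"))  # iph*
```

Requiring a short literal prefix, rather than only banning a leading `*` or `?`, also blocks near-equivalents such as `a*` that still expand to a large share of the dictionary.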
Find Slow Wildcard Queries with Pulse
Pulse is an AI DBA for Elasticsearch and OpenSearch that continuously profiles production query traffic. For wildcard queries specifically, Pulse:
- Identifies which production wildcard queries trigger full term-dictionary scans because of a leading `*` or `?`, and which are rejected by `search.allow_expensive_queries: false`
- Flags wildcard clauses running against analyzed `text` fields where a `.keyword` sub-field or the `wildcard` data type would be correct
- Traces each slow wildcard query back to the calling service via slow-log and APM correlation, so the responsible code path is named
- Recommends concrete rewrites: switch to a prefix query with `index_prefixes`, migrate hot fields to the `wildcard` data type, add `eager_global_ordinals`, or wrap the wildcard in `bool.filter` behind a selective term filter
- Tracks the latency and CPU improvement after you apply the fix
This converts the manual slow-log plus DSL-debugging loop into a continuous optimization workflow.
Frequently Asked Questions
Q: What is the difference between a wildcard query and a regexp query?
A: A wildcard query supports only * and ? and is generally cheaper for simple patterns. A regexp query supports the full Lucene regex grammar (character classes, alternation, repetition), but its automaton is more expensive to build and execute.
Q: Why is a leading wildcard so slow?
A: The term dictionary is sorted, so prefix lookups are O(log N). A leading wildcard means every prefix is potentially a match, forcing a full scan over the dictionary per shard. On large fields this can take seconds per query.
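The contrast can be modelled with a sorted Python list standing in for the term dictionary. This is a simplified illustration only; Lucene's real term dictionary and automaton matching are more sophisticated, and the sample terms are made up.

```python
import bisect

# A tiny sorted "term dictionary" (hypothetical terms).
terms = sorted(["apple", "apricot", "banana", "bandana", "cabana", "grape"])

def prefix_lookup(terms, prefix):
    """Find all terms starting with prefix using two binary searches."""
    lo = bisect.bisect_left(terms, prefix)
    # "\uffff" acts as a high sentinel; assumes terms contain no such character.
    hi = bisect.bisect_left(terms, prefix + "\uffff")
    return terms[lo:hi]

def leading_wildcard_scan(terms, suffix):
    """Emulate *suffix: sort order does not help, so every term is checked."""
    return [t for t in terms if t.endswith(suffix)]

print(prefix_lookup(terms, "ban"))          # ['banana', 'bandana']
print(leading_wildcard_scan(terms, "ana"))  # ['banana', 'bandana', 'cabana']
```

The prefix lookup touches only the boundaries of the matching range, while the suffix scan has to visit all six terms; on a real field that gap grows with the size of the dictionary.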
Q: Can I run a wildcard query on a text field?
A: Yes, but it matches per-token after analysis, which is rarely what callers want. *ana* over a field analyzed with standard will match "banana" only because the token banana itself matches the pattern; multi-word phrases are matched token-by-token.
Q: How do I make a wildcard query case-insensitive?
A: Set case_insensitive: true on the query (Elasticsearch 7.10+). Alternatively index the field with a lowercase normalizer (for keyword) or a lowercasing analyzer (for text) and lowercase the query value in the application.
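A minimal sketch of the normalizer alternative, with the index body written as a Python dictionary. The normalizer name `lowercase_normalizer`, the `email` field, and the helper function are hypothetical choices for illustration.

```python
# Body for a hypothetical index-creation request: a keyword field whose
# values are lowercased at index time by a custom normalizer.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {
                    "type": "custom",
                    "filter": ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "email": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer"
            }
        }
    }
}

def lowercase_wildcard_query(field, pattern):
    """Build a wildcard query body with the pattern lowercased client-side."""
    return {"query": {"wildcard": {field: {"value": pattern.lower()}}}}

print(lowercase_wildcard_query("email", "*@Example.COM"))
```

Because the indexed terms are already lowercase, the query does not need `case_insensitive`, which avoids the extra term expansion that flag can cause.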
Q: When should I use the wildcard field type instead of a keyword field?
A: Use the `wildcard` field type when you need frequent substring or leading-wildcard queries over large strings (log messages, URLs, full names). It indexes n-gram-like structures so leading wildcards are no longer a full term scan.
Q: Are wildcard queries cacheable?
A: When used in a filter context they enter the per-node query cache like other filters. Their cost is in the initial run; subsequent identical filters are cheap.
Q: What is the best tool to find slow wildcard queries in production Elasticsearch?
A: Pulse profiles slow-log entries on live Elasticsearch and OpenSearch clusters, isolates the wildcard queries that are doing leading-wildcard scans or running against text fields, attributes each slow query to the calling service, and recommends mapping or query rewrites - including migration to the wildcard data type or to a prefix query with index_prefixes.
Related Reading
- Elasticsearch Query Language: overview of the query DSL.
- Regexp Query: full regex matching when wildcards are not expressive enough.
- Prefix Query: cheaper alternative for prefix-only matches.
- Query String Query: inline wildcards inside a Lucene-syntax query.
- Terms Query: exact match against a list of values.