MySQL's query optimizer chooses between available access paths — index range scans, index lookups, full table scans — based on cost estimates. When the optimizer picks a full table scan over an existing index, or picks the wrong index, queries slow dramatically. This is not a bug in the index; it is a signal about the query, the data distribution, or the statistics.
Impact
A missed index on a large table turns a millisecond lookup into a seconds-long full scan. On a busy server, full-table-scan queries accumulate open connections and I/O bandwidth, starving other queries. Writes to the table slow because InnoDB buffer pool pages are evicted by scan pages flooding the pool, destroying cache hit ratio.
Common Causes
- Type mismatch in the
WHEREpredicate: comparing a string column to an integer (or vice versa) forces a cast on every row, making the index unusable - Function applied to the indexed column:
WHERE YEAR(created_at) = 2024— the index is oncreated_atbut the optimizer cannot use it for a function result - Leading column of composite index not present in
WHERE: an index on(a, b, c)is not used forWHERE b = 1becauseais not constrained - Low cardinality estimate: if the optimizer estimates the index will return > 30% of rows, a full scan is cheaper (this estimate can be wrong if
ANALYZE TABLEhas not been run recently) LIKEwith a leading wildcard:WHERE name LIKE '%smith'cannot use a B-tree index; onlyLIKE 'smith%'canORacross different indexed columns:WHERE a = 1 OR b = 2without anindex_mergeor the right composite index!=/NOT IN/NOT LIKE: InnoDB cannot efficiently bound a range for these operators; they typically cause full scans- Index statistics are stale:
SHOW INDEX FROM tableshows lowCardinalityrelative to actual data becauseANALYZE TABLEhas not run since a large data change - Small table: the optimizer correctly determines a full scan is cheaper than a B-tree traversal for tables that fit in a handful of pages
- Index on a column with heavy
NULLvalues: a plain index is usable but the optimizer may underestimate cardinality on NULL-heavy columns
Troubleshooting and Resolution Steps
Run
EXPLAINon the slow query:EXPLAIN SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;Key columns to inspect:
type:ALL= full table scan (bad for large tables);ref,range,index= uses an indexkey: which index was actually chosen (NULL means no index used)possible_keys: indexes the optimizer consideredrows: estimated rows to be examinedExtra: look forUsing filesort,Using temporary,Using index condition,Using where
Use
EXPLAIN FORMAT=TREEorEXPLAIN ANALYZE(MySQL 8.0+) for more detail:EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;EXPLAIN ANALYZEactually runs the query and shows actual vs. estimated row counts, revealing cases where statistics are wrong.Fix type mismatch — the most common and invisible cause:
-- Column type is VARCHAR(50) but query passes an integer: -- Bad (implicit cast, index unusable): WHERE user_id = 12345 -- user_id is VARCHAR -- Good: WHERE user_id = '12345' -- Check column type DESCRIBE orders;Rewrite function-wrapped predicates to be sargable (Search ARGument ABLE):
-- Bad: function on column, index on created_at is unused WHERE YEAR(created_at) = 2024 -- Good: range on the column itself WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01' -- Bad: arithmetic on column WHERE price * 1.1 > 100 -- Good: move the arithmetic to the constant side WHERE price > 100 / 1.1Check and update table statistics:
-- Show index statistics SHOW INDEX FROM orders; -- Update statistics (online, non-blocking in MySQL 5.6+) ANALYZE TABLE orders;The
Cardinalitycolumn inSHOW INDEXshould be close to the actual number of distinct values. If it is far off (e.g., shows 100 for a million-row table), the optimizer's cost estimate is wrong.Configure more frequent automatic statistics updates:
-- InnoDB updates stats when ~10% of table rows change (default) SHOW VARIABLES LIKE 'innodb_stats_auto_recalc'; SHOW VARIABLES LIKE 'innodb_stats_persistent_sample_pages'; -- Increase sample size for better estimates on large tables SET GLOBAL innodb_stats_persistent_sample_pages = 25; SET PERSIST innodb_stats_persistent_sample_pages = 25;Add or redesign the index for the actual query pattern:
-- Query: WHERE customer_id = ? ORDER BY created_at DESC LIMIT ? -- A composite index covers both the filter and the sort: ALTER TABLE orders ADD INDEX idx_customer_created (customer_id, created_at DESC); -- Then verify with EXPLAIN EXPLAIN SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;Convert a covering query to use a covering index (avoids returning to the table):
-- Query only needs id, status, created_at from orders ALTER TABLE orders ADD INDEX idx_covering (customer_id, status, created_at, id); -- EXPLAIN will show "Using index" in Extra — no row lookup neededUse
FORCE INDEXas a temporary diagnostic tool (not a production fix):SELECT * FROM orders FORCE INDEX (idx_customer_id) WHERE customer_id = 42;If this is faster than the optimizer's choice, the statistics are wrong or the index definition needs adjusting. Fix the root cause rather than leaving
FORCE INDEXin production code — it prevents the optimizer from adapting as data changes.Handle
LIKE '%prefix%'and full-text searches correctly:-- Trailing-wildcard LIKE can use a B-tree index: WHERE name LIKE 'smith%' -- uses index on name -- Leading-wildcard LIKE cannot: WHERE name LIKE '%smith' -- always full scan -- For leading-wildcard or contains patterns, use FULLTEXT: ALTER TABLE products ADD FULLTEXT INDEX ft_name (name); SELECT * FROM products WHERE MATCH(name) AGAINST ('smith');Use
index_mergeforORconditions:-- Check if index_merge is available EXPLAIN SELECT * FROM orders WHERE customer_id = 1 OR status = 'pending'; -- Extra: "Using union(idx_customer, idx_status)" -- If index_merge is not firing, rewrite as UNION: SELECT * FROM orders WHERE customer_id = 1 UNION SELECT * FROM orders WHERE status = 'pending' AND customer_id != 1;Use generated columns to index function results (MySQL 5.7.5+):
-- Instead of WHERE YEAR(created_at) = 2024, create a virtual column: ALTER TABLE events ADD COLUMN event_year SMALLINT UNSIGNED AS (YEAR(created_at)) VIRTUAL, ADD INDEX idx_event_year (event_year); -- Query can now use the index transparently: SELECT * FROM events WHERE event_year = 2024;
Additional Information
- The optimizer's cost model compares the estimated cost of an index range scan (index I/Os + table row I/Os) against a full table scan (sequential I/Os). For small tables or high-selectivity misestimates, the optimizer may genuinely be correct to choose a full scan.
innodb_stats_persistent = ON(default) stores statistics inmysql.innodb_table_statsandmysql.innodb_index_stats. You can directly update these tables, thoughANALYZE TABLEis the supported method.- MySQL 8.0 introduced invisible indexes (
ALTER TABLE t ALTER INDEX idx_name INVISIBLE). This lets you test removing an index without actually dropping it — queries cannot use invisible indexes but the index is maintained on writes. - Descending indexes (
CREATE INDEX idx_name ON t(col DESC)) are fully supported in MySQL 8.0 and can eliminate filesort for queries withORDER BY col DESC. - The
optimizer_tracesystem variable produces a full JSON trace of the optimizer's decision process, showing exact cost estimates for each access path considered.
Frequently Asked Questions
Q: The index exists and possible_keys shows it, but key is NULL. Why?
A: The optimizer evaluated the index but decided a full scan was cheaper. Most often this means the table is small, cardinality is underestimated, or the predicate matches a large fraction of rows. Run ANALYZE TABLE and re-check; if still not used, examine whether the query predicate is truly selective.
Q: Adding an index made a query slower. How?
A: Indexes on write-heavy tables increase write overhead. If the index is rarely used, the write cost outweighs the read benefit. Also, if the optimizer switches to using the new index for a different query where the old plan was better, that query slows. Use SHOW INDEX and performance_schema.table_io_waits_summary_by_index_usage to identify unused indexes.
Q: My composite index is (a, b) — why won't the optimizer use it for WHERE b = ??
A: A B-tree composite index is sorted first by a, then by b within each a value. Without a constraint on a, every entry must be checked — no different from a full index scan. Add a separate index on b if queries filter on b alone.
Q: EXPLAIN shows rows = 1 but the query is slow. Why?
A: The rows estimate is optimistic and based on statistics, not actual data. Use EXPLAIN ANALYZE to see the actual row count. The slowness may be in lock waits (Lock_time in the slow log) rather than scan size.
Q: Is FORCE INDEX ever appropriate in production?
A: Rarely. It is a hint that bypasses the optimizer permanently, even as data distribution changes. If you use it, document exactly why and set a reminder to revisit after the next major data change or ANALYZE TABLE run.