Why MySQL Isn't Using Your Index (and How to Fix It)

Q: My composite index is `(a, b)` — why won't the optimizer use it for `WHERE b = ?`?

A B-tree composite index is sorted first by a , then by b within each a value. Without a constraint on a , every entry must be checked — no different from a full index scan. Add a separate index on b if queries filter on b alone.

Q: `EXPLAIN` shows `rows = 1` but the query is slow. Why?

The rows estimate is optimistic and based on statistics, not actual data. Use EXPLAIN ANALYZE to see the actual row count. The slowness may be in lock waits ( Lock_time in the slow log) rather than scan size.

MySQL's query optimizer chooses between available access paths — index range scans, index lookups, full table scans — based on cost estimates. When the optimizer picks a full table scan over an existing index, or picks the wrong index, queries slow dramatically. This is not a bug in the index; it is a signal about the query, the data distribution, or the statistics.

Impact

A missed index on a large table turns a millisecond lookup into a seconds-long full scan. On a busy server, full-table-scan queries accumulate open connections and I/O bandwidth, starving other queries. Writes to the table slow because InnoDB buffer pool pages are evicted by scan pages flooding the pool, destroying cache hit ratio.

Common Causes

Type mismatch in the WHERE predicate: comparing a string column to an integer (or vice versa) forces a cast on every row, making the index unusable
Function applied to the indexed column: WHERE YEAR(created_at) = 2024 — the index is on created_at but the optimizer cannot use it for a function result
Leading column of composite index not present in WHERE: an index on (a, b, c) is not used for WHERE b = 1 because a is not constrained
Low cardinality estimate: if the optimizer estimates the index will return > 30% of rows, a full scan is cheaper (this estimate can be wrong if ANALYZE TABLE has not been run recently)
LIKE with a leading wildcard: WHERE name LIKE '%smith' cannot use a B-tree index; only LIKE 'smith%' can
OR across different indexed columns: WHERE a = 1 OR b = 2 without an index_merge or the right composite index
!= / NOT IN / NOT LIKE: InnoDB cannot efficiently bound a range for these operators; they typically cause full scans
Index statistics are stale: SHOW INDEX FROM table shows low Cardinality relative to actual data because ANALYZE TABLE has not run since a large data change
Small table: the optimizer correctly determines a full scan is cheaper than a B-tree traversal for tables that fit in a handful of pages
Index on a column with heavy NULL values: a plain index is usable but the optimizer may underestimate cardinality on NULL-heavy columns

Troubleshooting and Resolution Steps

Run EXPLAIN on the slow query:
```
EXPLAIN SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;
```
Key columns to inspect:
- type: ALL = full table scan (bad for large tables); ref, range, index = uses an index
- key: which index was actually chosen (NULL means no index used)
- possible_keys: indexes the optimizer considered
- rows: estimated rows to be examined
- Extra: look for Using filesort, Using temporary, Using index condition, Using where
Use EXPLAIN FORMAT=TREE or EXPLAIN ANALYZE (MySQL 8.0+) for more detail:
```
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;
```
EXPLAIN ANALYZE actually runs the query and shows actual vs. estimated row counts, revealing cases where statistics are wrong.

Fix type mismatch — the most common and invisible cause:

-- Column type is VARCHAR(50) but query passes an integer:
-- Bad (implicit cast, index unusable):
WHERE user_id = 12345        -- user_id is VARCHAR

-- Good:
WHERE user_id = '12345'

-- Check column type
DESCRIBE orders;

Rewrite function-wrapped predicates to be sargable (Search ARGument ABLE):

-- Bad: function on column, index on created_at is unused
WHERE YEAR(created_at) = 2024

-- Good: range on the column itself
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'

-- Bad: arithmetic on column
WHERE price * 1.1 > 100

-- Good: move the arithmetic to the constant side
WHERE price > 100 / 1.1

Check and update table statistics:

-- Show index statistics
SHOW INDEX FROM orders;

-- Update statistics (online, non-blocking in MySQL 5.6+)
ANALYZE TABLE orders;

The Cardinality column in SHOW INDEX should be close to the actual number of distinct values. If it is far off (e.g., shows 100 for a million-row table), the optimizer's cost estimate is wrong.

Configure more frequent automatic statistics updates:

-- InnoDB updates stats when ~10% of table rows change (default)
SHOW VARIABLES LIKE 'innodb_stats_auto_recalc';
SHOW VARIABLES LIKE 'innodb_stats_persistent_sample_pages';

-- Increase sample size for better estimates on large tables
SET GLOBAL innodb_stats_persistent_sample_pages = 25;
SET PERSIST innodb_stats_persistent_sample_pages = 25;

Add or redesign the index for the actual query pattern:

-- Query: WHERE customer_id = ? ORDER BY created_at DESC LIMIT ?
-- A composite index covers both the filter and the sort:
ALTER TABLE orders ADD INDEX idx_customer_created (customer_id, created_at DESC);

-- Then verify with EXPLAIN
EXPLAIN SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;

Convert a covering query to use a covering index (avoids returning to the table):

-- Query only needs id, status, created_at from orders
ALTER TABLE orders ADD INDEX idx_covering (customer_id, status, created_at, id);

-- EXPLAIN will show "Using index" in Extra — no row lookup needed

Use FORCE INDEX as a temporary diagnostic tool (not a production fix):
```
SELECT * FROM orders FORCE INDEX (idx_customer_id)
WHERE customer_id = 42;
```
If this is faster than the optimizer's choice, the statistics are wrong or the index definition needs adjusting. Fix the root cause rather than leaving FORCE INDEX in production code — it prevents the optimizer from adapting as data changes.

Handle LIKE '%prefix%' and full-text searches correctly:

-- Trailing-wildcard LIKE can use a B-tree index:
WHERE name LIKE 'smith%'    -- uses index on name

-- Leading-wildcard LIKE cannot:
WHERE name LIKE '%smith'    -- always full scan

-- For leading-wildcard or contains patterns, use FULLTEXT:
ALTER TABLE products ADD FULLTEXT INDEX ft_name (name);
SELECT * FROM products WHERE MATCH(name) AGAINST ('smith');

Use index_merge for OR conditions:

-- Check if index_merge is available
EXPLAIN SELECT * FROM orders WHERE customer_id = 1 OR status = 'pending';
-- Extra: "Using union(idx_customer, idx_status)"

-- If index_merge is not firing, rewrite as UNION:
SELECT * FROM orders WHERE customer_id = 1
UNION
SELECT * FROM orders WHERE status = 'pending' AND customer_id != 1;

Use generated columns to index function results (MySQL 5.7.5+):

-- Instead of WHERE YEAR(created_at) = 2024, create a virtual column:
ALTER TABLE events
    ADD COLUMN event_year SMALLINT UNSIGNED AS (YEAR(created_at)) VIRTUAL,
    ADD INDEX idx_event_year (event_year);

-- Query can now use the index transparently:
SELECT * FROM events WHERE event_year = 2024;

Additional Information

The optimizer's cost model compares the estimated cost of an index range scan (index I/Os + table row I/Os) against a full table scan (sequential I/Os). For small tables or high-selectivity misestimates, the optimizer may genuinely be correct to choose a full scan.
innodb_stats_persistent = ON (default) stores statistics in mysql.innodb_table_stats and mysql.innodb_index_stats. You can directly update these tables, though ANALYZE TABLE is the supported method.
MySQL 8.0 introduced invisible indexes (ALTER TABLE t ALTER INDEX idx_name INVISIBLE). This lets you test removing an index without actually dropping it — queries cannot use invisible indexes but the index is maintained on writes.
Descending indexes (CREATE INDEX idx_name ON t(col DESC)) are fully supported in MySQL 8.0 and can eliminate filesort for queries with ORDER BY col DESC.
The optimizer_trace system variable produces a full JSON trace of the optimizer's decision process, showing exact cost estimates for each access path considered.

Frequently Asked Questions

Q: The index exists and possible_keys shows it, but key is NULL. Why? A: The optimizer evaluated the index but decided a full scan was cheaper. Most often this means the table is small, cardinality is underestimated, or the predicate matches a large fraction of rows. Run ANALYZE TABLE and re-check; if still not used, examine whether the query predicate is truly selective.

Q: Adding an index made a query slower. How? A: Indexes on write-heavy tables increase write overhead. If the index is rarely used, the write cost outweighs the read benefit. Also, if the optimizer switches to using the new index for a different query where the old plan was better, that query slows. Use SHOW INDEX and performance_schema.table_io_waits_summary_by_index_usage to identify unused indexes.

Q: My composite index is (a, b) — why won't the optimizer use it for WHERE b = ?? A: A B-tree composite index is sorted first by a, then by b within each a value. Without a constraint on a, every entry must be checked — no different from a full index scan. Add a separate index on b if queries filter on b alone.

Q: EXPLAIN shows rows = 1 but the query is slow. Why? A: The rows estimate is optimistic and based on statistics, not actual data. Use EXPLAIN ANALYZE to see the actual row count. The slowness may be in lock waits (Lock_time in the slow log) rather than scan size.

Q: Is FORCE INDEX ever appropriate in production? A: Rarely. It is a hint that bypasses the optimizer permanently, even as data distribution changes. If you use it, document exactly why and set a reminder to revisit after the next major data change or ANALYZE TABLE run.