The "DB::Exception: Too many bytes" error in ClickHouse is triggered when data processing exceeds a configured byte-level limit such as max_bytes_to_read, max_bytes_to_sort, or max_bytes_to_group_by. The TOO_MANY_BYTES error code acts as a resource guard, preventing individual queries from reading or processing an excessive amount of data.
Impact
When this error occurs, the query is terminated and no results are delivered. This can affect long-running analytical queries that scan large partitions, as well as batch processes attempting to aggregate or sort large datasets. Repeated occurrences may indicate that queries are not leveraging ClickHouse's partitioning and indexing capabilities effectively.
Common Causes
- The `max_bytes_to_read` setting is configured and the query scans more data than allowed
- Queries lack proper WHERE clauses to prune partitions and granules, forcing full table scans
- The `max_bytes_to_sort` or `max_bytes_to_group_by` limit is exceeded during ORDER BY or GROUP BY operations
- Tables with inefficient partition keys lead to excessive data reads even for filtered queries
- User profiles or quotas enforce byte limits that are too restrictive for the workload
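If restrictive byte limits come from a user profile, they can be raised for the workloads that genuinely need larger scans. A minimal sketch, using hypothetical profile and user names:

```sql
-- Hypothetical names: analytics_profile, analyst_user
CREATE SETTINGS PROFILE analytics_profile
SETTINGS max_bytes_to_read = 50000000000; -- 50 GB budget for heavy analytical queries

-- Attach the profile to the user that runs large scans
ALTER USER analyst_user SETTINGS PROFILE 'analytics_profile';
```

Keeping limits in profiles rather than per-query SET statements makes them auditable and easier to tune per workload.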
Troubleshooting and Resolution Steps
Identify which byte limit is being exceeded. The error message typically includes the specific setting name. Check relevant settings:
```sql
SELECT name, value
FROM system.settings
WHERE name IN ('max_bytes_to_read', 'max_bytes_to_sort', 'max_bytes_to_group_by', 'max_bytes_to_transfer', 'max_bytes_in_distinct');
```
Estimate how much data your query will read:
```sql
EXPLAIN ESTIMATE SELECT * FROM my_table WHERE date = today();
```
Add or refine filters to reduce the amount of data scanned:
```sql
-- Use partition key columns in WHERE clauses
SELECT * FROM my_table
WHERE date >= '2025-01-01' AND date < '2025-02-01';
```
Increase the limit for a specific query when a larger scan is justified:
```sql
SET max_bytes_to_read = 10000000000; -- 10 GB
SELECT * FROM my_table WHERE status = 'active';
```
Change the overflow mode to return partial results instead of throwing an error:
```sql
SET max_bytes_to_read = 5000000000;
SET read_overflow_mode = 'break';
SELECT * FROM my_table;
```
Optimize table structure to reduce bytes read per query:
- Ensure the partition key aligns with common query filters
- Verify the primary key (ORDER BY in table definition) supports your most frequent queries
- Use appropriate compression codecs to reduce on-disk and in-memory data sizes
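The three points above can be combined in a single table definition. A hypothetical sketch (table, column names, and codec choices are illustrative, not prescriptive):

```sql
-- Partition by month so date filters prune whole partitions;
-- order by the most common filter columns so granule skipping works.
CREATE TABLE my_table
(
    date    Date,
    user_id UInt64,
    status  LowCardinality(String),
    payload String CODEC(ZSTD(3)) -- heavier compression for a large text column
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(date)
ORDER BY (date, user_id);
```

With this layout, a query filtered on a date range touches only the matching monthly partitions, which directly reduces the bytes counted against `max_bytes_to_read`.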
Check user-level limits:
```sql
SELECT * FROM system.settings_profile_elements WHERE setting_name LIKE 'max_bytes%';
```
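To find which queries are actually reading the most data, the query log can be inspected (assuming query logging is enabled, as it is by default):

```sql
-- Top 10 finished queries by uncompressed bytes read
SELECT
    query,
    read_bytes,
    formatReadableSize(read_bytes) AS read_size
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY read_bytes DESC
LIMIT 10;
```

This helps distinguish a single runaway query from a workload-wide pattern before deciding whether to raise limits or rewrite queries.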
Best Practices
- Design partition keys to align with the most common query filters, enabling effective partition pruning.
- Always include partition key columns in WHERE clauses to avoid full table scans.
- Set byte limits at the user profile level, with higher limits for analytical workloads and lower limits for ad-hoc users.
- Use `EXPLAIN ESTIMATE` before running large queries to understand data volume impact.
- Consider pre-aggregating data with materialized views to reduce the amount of raw data queries need to scan.
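As a sketch of the last point, a materialized view that pre-aggregates daily counts might look like this (view and column names are hypothetical):

```sql
-- Incrementally maintains daily counts per status as rows arrive in my_table
CREATE MATERIALIZED VIEW my_table_daily_mv
ENGINE = SummingMergeTree
ORDER BY (date, status)
AS
SELECT date, status, count() AS cnt
FROM my_table
GROUP BY date, status;
```

Dashboards that only need daily totals can then query the view, scanning a tiny fraction of the bytes the raw table would require.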
Frequently Asked Questions
Q: What is the default value of max_bytes_to_read?
A: The default is 0, which means unlimited. The limit is typically set explicitly in server configuration or user profiles.
Q: Does max_bytes_to_read count compressed or uncompressed bytes?
A: It counts the uncompressed bytes that ClickHouse reads and processes, not the compressed on-disk size. This means the actual data scanned can be significantly larger than what you see on disk.
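The two sizes can be compared directly via `system.parts`; `my_table` is a placeholder name:

```sql
-- Compressed on-disk size vs. uncompressed size counted by byte limits
SELECT
    table,
    formatReadableSize(sum(data_compressed_bytes))   AS on_disk,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed
FROM system.parts
WHERE active AND table = 'my_table'
GROUP BY table;
```

The ratio between the two columns indicates how much larger a scan will be than the on-disk footprint suggests.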
Q: How does this relate to the max_rows_to_read setting?
A: They work independently. max_bytes_to_read limits total data volume in bytes, while max_rows_to_read limits the number of rows. A query is terminated if either limit is exceeded.
Q: Can I set different byte limits for reading vs. sorting vs. grouping?
A: Yes. ClickHouse provides separate settings: max_bytes_to_read, max_bytes_to_sort, max_bytes_to_group_by, and others. Each controls a different phase of query execution.
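An illustrative session combining per-phase limits (all values are examples, not recommendations):

```sql
SET max_bytes_to_read = 20000000000;    -- 20 GB scan budget
SET max_bytes_to_sort = 5000000000;     -- 5 GB for ORDER BY
SET max_bytes_to_group_by = 5000000000; -- 5 GB for GROUP BY

SELECT status, count()
FROM my_table
GROUP BY status
ORDER BY status;
```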