The "DB::Exception: Too many rows or bytes processed" error in ClickHouse is triggered when a query exceeds a limit on the number of rows or the volume of bytes processed. The TOO_MANY_ROWS_OR_BYTES error code indicates that the query has surpassed either max_rows_to_read or max_bytes_to_read (or their equivalents for other processing stages), and ClickHouse terminates the query to protect system resources.
Impact
The query fails without returning results. This error frequently affects analytical queries that scan large tables without sufficient filtering, as well as batch operations that attempt to process entire datasets. In multi-tenant environments, this limit is particularly important for preventing a single user's query from monopolizing server resources.
Common Causes
- Queries scanning large tables without partition-pruning WHERE clauses
- max_rows_to_read or max_bytes_to_read settings configured too low for the workload
- Missing or ineffective primary key filters, forcing full granule scans
- Cross joins or joins that produce a cartesian product of rows
- User profiles or quotas that impose strict combined row and byte limits
- Aggregation queries that must read large volumes of data before producing a small result set
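For example, an unfiltered aggregation over a large table reads every granule and can trip the read limits. The table and column names below are illustrative:

```sql
-- Illustrative: with max_rows_to_read set below the table's row count,
-- this full scan fails because there is no WHERE clause to prune data.
SET max_rows_to_read = 1000000;

SELECT user_id, count()
FROM events           -- hypothetical large table
GROUP BY user_id;     -- no filter, so every granule is read
```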
Troubleshooting and Resolution Steps
Identify which limit was exceeded. The error message usually specifies whether it was the row limit, byte limit, or both:
```sql
SELECT name, value
FROM system.settings
WHERE name IN ('max_rows_to_read', 'max_bytes_to_read', 'max_rows_to_group_by', 'max_bytes_to_group_by');
```

Estimate the query's data footprint before execution:
```sql
EXPLAIN ESTIMATE
SELECT count() FROM my_table WHERE date >= '2025-01-01';
```

Add partition key filters to reduce data scanned:
```sql
-- Ensure your WHERE clause includes the partition key
SELECT *
FROM events
WHERE event_date >= '2025-03-01'
  AND event_date < '2025-03-15'
  AND user_id = 12345;
```

Increase limits for specific queries when a larger scan is justified:
```sql
SET max_rows_to_read = 100000000;
SET max_bytes_to_read = 50000000000;
SELECT ...;
```

Use overflow modes to get partial results instead of an error:
```sql
SET read_overflow_mode = 'break';
SELECT * FROM large_table;
```

This returns data up to the limit and stops without throwing an error.
Optimize the table's primary key to enable better data skipping:
```sql
-- Check how many granules are scanned
SELECT *
FROM system.query_log
WHERE query_id = 'your_query_id'
FORMAT Vertical;
-- Look at read_rows and read_bytes
```

Pre-aggregate data using materialized views to reduce the volume of raw data queries need to scan.
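As a sketch of the pre-aggregation approach (the table, view, and column names are hypothetical):

```sql
-- Roll raw events up to one row per (event_date, user_id) at insert time,
-- so dashboards query the small aggregate instead of the raw table.
CREATE MATERIALIZED VIEW events_daily_mv
ENGINE = SummingMergeTree
ORDER BY (event_date, user_id)
AS SELECT
    event_date,
    user_id,
    count() AS events
FROM events
GROUP BY event_date, user_id;
```

SummingMergeTree collapses rows with the same sorting key on merge, summing the numeric columns, so the view stays small regardless of raw data volume.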
Best Practices
- Always include partition key columns in WHERE clauses for large tables.
- Configure row and byte limits at the user profile level, with appropriate values for each type of workload.
- Use EXPLAIN ESTIMATE during query development to understand data volume before running queries.
- Design primary keys to maximize data skipping for your most common query patterns.
- Implement materialized views for frequently run aggregations to reduce repeated scans of raw data.
- In multi-tenant setups, use quotas and settings profiles to enforce per-user resource limits.
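As a sketch, per-workload limits can be defined with SQL-driven access management (the profile and role names here are hypothetical):

```sql
-- Generous limits for ad-hoc analytical users
CREATE SETTINGS PROFILE IF NOT EXISTS analyst_profile
    SETTINGS max_rows_to_read = 1000000000,
             max_bytes_to_read = 100000000000
    TO analyst_role;

-- Stricter limits for interactive, multi-tenant dashboard traffic
CREATE SETTINGS PROFILE IF NOT EXISTS dashboard_profile
    SETTINGS max_rows_to_read = 50000000,
             max_bytes_to_read = 5000000000
    TO dashboard_role;
```

The same settings can also be placed in a profile in users.xml if you manage access control through configuration files.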
Frequently Asked Questions
Q: Does ClickHouse check rows and bytes independently or together?
A: They are checked independently. The query fails if either limit is exceeded. You do not need to exceed both simultaneously for the error to occur.
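As an illustration (the table name is assumed), the row limit alone is enough to fail a query even when the byte limit is unlimited:

```sql
SET max_rows_to_read = 1000;
SET max_bytes_to_read = 0;    -- 0 means unlimited
SELECT count() FROM events;   -- fails once more than 1000 rows are read
```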
Q: What is the difference between TOO_MANY_ROWS_OR_BYTES and TOO_MANY_ROWS or TOO_MANY_BYTES?
A: TOO_MANY_ROWS and TOO_MANY_BYTES apply to specific individual limits (like result rows or bytes at a particular processing stage). TOO_MANY_ROWS_OR_BYTES is a broader error code that can be raised when combined row and byte processing limits are exceeded during data reading.
Q: Do these limits apply to the compressed or uncompressed data size?
A: The byte limits apply to uncompressed data as processed by ClickHouse, not the compressed on-disk size. Actual data read from disk may be much smaller due to compression.
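To see how large that gap is for your data, you can compare on-disk and uncompressed sizes in system.parts (the table name is illustrative):

```sql
SELECT
    table,
    formatReadableSize(sum(data_compressed_bytes))   AS on_disk,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed
FROM system.parts
WHERE active AND table = 'events'
GROUP BY table;
```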
Q: Can I set different limits for different stages of query processing?
A: Yes. ClickHouse provides separate settings for different stages: max_rows_to_read / max_bytes_to_read for the reading phase, max_rows_to_group_by / max_bytes_to_group_by for aggregation, max_rows_to_sort / max_bytes_to_sort for sorting, and others.
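These stage-specific limits can also be attached to a single query with a SETTINGS clause rather than set session-wide (the table name is illustrative):

```sql
SELECT user_id, count() AS hits
FROM events
GROUP BY user_id
ORDER BY hits DESC
SETTINGS
    max_rows_to_read = 100000000,     -- reading phase
    max_rows_to_group_by = 10000000,  -- aggregation
    max_rows_to_sort = 1000000;       -- sorting
```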