NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse DB::Exception: Too many rows or bytes processed

The "DB::Exception: Too many rows or bytes processed" error in ClickHouse is triggered when a query exceeds combined limits on both the number of rows and the volume of bytes processed. The TOO_MANY_ROWS_OR_BYTES error code indicates that the query has surpassed either max_rows_to_read or max_bytes_to_read (or their equivalents for other stages), and ClickHouse terminates the operation to protect system resources.

Impact

The query fails without returning results. This error frequently affects analytical queries that scan large tables without sufficient filtering, as well as batch operations that attempt to process entire datasets. In multi-tenant environments, this limit is particularly important for preventing a single user's query from monopolizing server resources.

Common Causes

  1. Queries scanning large tables without partition-pruning WHERE clauses
  2. max_rows_to_read or max_bytes_to_read settings configured too low for the workload
  3. Missing or ineffective primary key filters, forcing full granule scans
  4. Cross joins or joins that produce a cartesian product of rows
  5. User profiles or quotas that impose strict combined row and byte limits
  6. Aggregation queries that must read large volumes of data before producing a small result set

Troubleshooting and Resolution Steps

  1. Identify which limit was exceeded. The error message usually specifies whether it was the row limit, byte limit, or both:

    SELECT name, value FROM system.settings
    WHERE name IN ('max_rows_to_read', 'max_bytes_to_read',
                    'max_rows_to_group_by', 'max_bytes_to_group_by');
    
  2. Estimate the query's data footprint before execution:

    EXPLAIN ESTIMATE SELECT count() FROM my_table WHERE date >= '2025-01-01';
    
  3. Add partition key filters to reduce data scanned:

    -- Ensure your WHERE clause includes the partition key
    SELECT * FROM events
    WHERE event_date >= '2025-03-01' AND event_date < '2025-03-15'
    AND user_id = 12345;
    
  4. Increase limits for specific queries when a larger scan is justified:

    SET max_rows_to_read = 100000000;
    SET max_bytes_to_read = 50000000000;
    SELECT ...;
    
  5. Use overflow modes to get partial results instead of an error:

    SET read_overflow_mode = 'break';
    SELECT * FROM large_table;
    

    This returns data up to the limit and stops without throwing an error.

  6. Optimize the table's primary key to enable better data skipping:

    -- Check how many granules are scanned
    SELECT * FROM system.query_log
    WHERE query_id = 'your_query_id'
    FORMAT Vertical;
    -- Look at read_rows and read_bytes
    
  7. Pre-aggregate data using materialized views to reduce the volume of raw data queries need to scan.

Best Practices

  • Always include partition key columns in WHERE clauses for large tables.
  • Configure row and byte limits at the user profile level, with appropriate values for each type of workload.
  • Use EXPLAIN ESTIMATE during query development to understand data volume before running queries.
  • Design primary keys to maximize data skipping for your most common query patterns.
  • Implement materialized views for frequently run aggregations to reduce repeated scans of raw data.
  • In multi-tenant setups, use quotas and settings profiles to enforce per-user resource limits.

Frequently Asked Questions

Q: Does ClickHouse check rows and bytes independently or together?
A: They are checked independently. The query fails if either limit is exceeded. You do not need to exceed both simultaneously for the error to occur.

Q: What is the difference between TOO_MANY_ROWS_OR_BYTES and TOO_MANY_ROWS or TOO_MANY_BYTES?
A: TOO_MANY_ROWS and TOO_MANY_BYTES apply to specific individual limits (like result rows or bytes at a particular processing stage). TOO_MANY_ROWS_OR_BYTES is a broader error code that can be raised when combined row and byte processing limits are exceeded during data reading.

Q: Do these limits apply to the compressed or uncompressed data size?
A: The byte limits apply to uncompressed data as processed by ClickHouse, not the compressed on-disk size. Actual data read from disk may be much smaller due to compression.

Q: Can I set different limits for different stages of query processing?
A: Yes. ClickHouse provides separate settings for different stages: max_rows_to_read / max_bytes_to_read for the reading phase, max_rows_to_group_by / max_bytes_to_group_by for aggregation, max_rows_to_sort / max_bytes_to_sort for sorting, and others.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.