The "DB::Exception: Received too many requests" error in ClickHouse surfaces when a remote endpoint responds with an HTTP 429 (Too Many Requests) status code, indicating that ClickHouse has exceeded the endpoint's allowed request rate. This error, identified by the code RECEIVED_ERROR_TOO_MANY_REQUESTS, is most frequently seen when querying or writing to S3-compatible storage, external HTTP endpoints, or cloud APIs that enforce rate limits.
Impact
Rate-limiting errors cause the affected query or data pipeline to fail. If ClickHouse is reading from or writing to a remote source as part of an ETL workflow, the entire pipeline can stall. In clusters where multiple nodes concurrently access the same remote endpoint, throttling on one node may indicate that all nodes are being rate-limited, compounding the problem.
Common Causes
- High query concurrency against a single remote endpoint, especially when multiple ClickHouse nodes or queries access the same S3 bucket simultaneously.
- Burst patterns in data loading or reading that exceed the remote service's rate limits.
- Cloud provider throttling due to exceeding per-account or per-bucket request quotas (e.g., AWS S3 prefix-level rate limits).
- Misconfigured parallelism settings, such as too many threads reading from the same source at once.
- External API endpoints with strict per-second or per-minute rate limits.
Troubleshooting and Resolution Steps
Review the ClickHouse server logs to confirm the HTTP 429 status and identify which remote endpoint is throttling requests:
grep "RECEIVED_ERROR_TOO_MANY_REQUESTS\|429" /var/log/clickhouse-server/clickhouse-server.err.log
Reduce the number of concurrent requests ClickHouse sends to the remote endpoint. For S3, you can limit the number of threads used for reading:
SET max_threads = 4;
SET max_download_threads = 4;
Enable and configure retry logic with backoff. ClickHouse supports retry settings for S3:
<s3>
    <retry_attempts>10</retry_attempts>
</s3>
If you are reading many small files from S3, consider consolidating them into fewer, larger files to reduce the total number of requests.
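As an illustration of the consolidation step, the batching logic can be sketched in application code. This is a minimal local sketch: it concatenates small files into larger ones up to a target size, the same idea you would apply before uploading objects to S3 (the function name and size threshold are illustrative, not part of ClickHouse):

```python
import os

def consolidate(paths, out_dir, target_bytes=64 * 1024 * 1024):
    """Concatenate many small files into fewer large ones.

    Input files are buffered until the accumulated size reaches
    target_bytes, then flushed to a single output file. Returns the
    list of output paths. Illustrative sketch, not a ClickHouse API.
    """
    outputs = []
    buf, size = [], 0

    def flush():
        nonlocal buf, size
        if not buf:
            return
        out_path = os.path.join(out_dir, "part-%05d.bin" % len(outputs))
        with open(out_path, "wb") as out:
            for chunk in buf:
                out.write(chunk)
        outputs.append(out_path)
        buf, size = [], 0

    for path in sorted(paths):
        with open(path, "rb") as src:
            data = src.read()
        buf.append(data)
        size += len(data)
        if size >= target_bytes:
            flush()
    flush()
    return outputs
```

Fewer, larger objects mean fewer GET requests per query, which directly lowers the request rate the remote endpoint sees.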
Distribute the load by spreading data across multiple S3 prefixes. AWS S3, for instance, partitions rate limits by prefix, so organizing data under different prefixes can increase your effective throughput.
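One common way to spread objects across prefixes is to derive the prefix from a hash of the key, so the distribution is stable and roughly uniform. A minimal sketch (the prefix scheme and count are assumptions for illustration):

```python
import hashlib

def prefixed_key(key, num_prefixes=16):
    """Prepend a stable, hash-derived prefix to an object key.

    Objects end up spread across num_prefixes distinct S3 prefixes,
    each of which gets its own per-prefix rate limit. Illustrative
    naming scheme; adapt to your bucket layout.
    """
    digest = hashlib.md5(key.encode()).hexdigest()
    index = int(digest[:8], 16) % num_prefixes
    return f"{index:02d}/{key}"
```

Because the prefix is derived deterministically from the key, readers can reconstruct the full path without a lookup table.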
For cloud-hosted ClickHouse accessing cloud storage, check whether the cloud provider offers a way to request a rate limit increase for your account or bucket.
Add delays between batch operations in your application layer to smooth out burst traffic patterns.
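Application-side pacing can be as simple as a loop that never yields batches faster than a fixed rate. A sketch, assuming the rate value and generator shape fit your pipeline:

```python
import time

def paced_batches(batches, max_batches_per_second=2.0):
    """Yield batches no faster than max_batches_per_second.

    Sleeps between yields so bursts are smoothed into a steady rate
    before they reach the remote endpoint. Illustrative helper, not
    a ClickHouse setting.
    """
    interval = 1.0 / max_batches_per_second
    next_time = time.monotonic()
    for batch in batches:
        now = time.monotonic()
        if now < next_time:
            time.sleep(next_time - now)
        next_time = max(now, next_time) + interval
        yield batch
```

Wrapping an insert loop in a pacer like this keeps sustained throughput below the remote service's limit instead of oscillating between bursts and 429 responses.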
Best Practices
- Monitor request rates to remote endpoints and set alerts before you approach known rate limits.
- Use exponential backoff in retry logic, both in ClickHouse configuration and in any application code that triggers queries.
- Distribute data across multiple prefixes or buckets when working with cloud object storage at scale.
- Prefer reading larger, consolidated files over many small ones to minimize the total number of HTTP requests.
- Schedule heavy data loading jobs during off-peak hours when other workloads are less likely to compete for the same rate limits.
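The exponential-backoff practice above can be sketched as a retry wrapper in application code. RateLimitError here is a stand-in for whatever exception your client library raises on HTTP 429; the delay parameters are examples:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response from the remote endpoint."""

def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0):
    """Retry fn() on RateLimitError with exponential backoff plus jitter.

    The delay doubles on each attempt, capped at max_delay, and is
    randomized so concurrent clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))
```

The jitter factor matters in clusters: without it, every node that was throttled at the same moment retries at the same moment, recreating the burst that triggered the throttling.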
Frequently Asked Questions
Q: Is this error specific to S3?
A: No. While it is most commonly seen with S3 and S3-compatible storage, it can occur with any remote HTTP endpoint that enforces rate limits, including custom APIs and other cloud services.
Q: Will ClickHouse automatically retry when it hits a rate limit?
A: ClickHouse has built-in retry support for S3 operations. However, the default retry count may not be sufficient for heavy rate-limiting scenarios. Increasing the retry_attempts setting in the S3 configuration helps handle transient throttling.
Q: How many requests per second can S3 handle?
A: AWS S3 supports at least 5,500 GET/HEAD requests per second and 3,500 PUT/COPY/POST/DELETE requests per second per prefix. If you need higher throughput, distributing objects across multiple prefixes is the recommended approach.
Q: Can I use a caching layer to reduce requests to the remote endpoint?
A: Yes. Placing a caching proxy between ClickHouse and the remote endpoint, or using ClickHouse's local file cache for remote storage, can significantly reduce the number of outbound requests and help you stay within rate limits.