The "DB::Exception: Distributed table too many pending bytes" error in ClickHouse occurs when the volume of data queued for asynchronous delivery to remote shards exceeds the configured limit. The error code is DISTRIBUTED_TOO_MANY_PENDING_BYTES. When you INSERT into a distributed table, ClickHouse writes the data to a local buffer directory and sends it to the target shards asynchronously. If the sending process falls behind, the pending data accumulates until this threshold is reached.
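As a concrete sketch of that insert path (all table, database, and cluster names here are hypothetical), data written to the Distributed wrapper is queued locally and shipped in the background:

```sql
-- Hypothetical schema; the local table must exist on every shard.
CREATE TABLE your_db.events_local
(
    event_time DateTime,
    user_id    UInt64,
    payload    String
)
ENGINE = MergeTree
ORDER BY (user_id, event_time);

-- The Distributed wrapper on the initiator node. An INSERT into it is
-- written to an on-disk queue locally and sent to the shards asynchronously.
CREATE TABLE your_db.events_dist AS your_db.events_local
ENGINE = Distributed('your_cluster', 'your_db', 'events_local', rand());

-- Returns as soon as the data is buffered locally, not when shards receive it.
INSERT INTO your_db.events_dist VALUES (now(), 42, 'example');
```

It is this local queue that the pending-bytes limit guards: when its size crosses the threshold, further INSERTs fail with this error.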
Impact
This error blocks further INSERT operations into the affected distributed table, which can cause:
- Data ingestion pipeline failures and backpressure
- Growing data loss risk if upstream systems discard data they cannot deliver
- Disk space consumption on the initiator node from accumulated pending files
- A clear signal that the cluster's write throughput is insufficient for the current load
Common Causes
- Target shards are unreachable or slow -- Network issues or overloaded shards prevent timely delivery of buffered data.
- High insert throughput exceeding shard capacity -- The rate of incoming data surpasses what the distributed sending mechanism can flush to remote shards.
- Distributed directory monitor is too slow -- The background process that ships pending data is not running frequently enough or is bottlenecked.
- Disk I/O contention on the initiator node -- The node buffering the data has slow disk performance, slowing both writes and reads of pending files.
- The bytes_to_throw_insert limit is set too low -- The configured threshold does not accommodate normal burst patterns.
- Remote shards rejecting inserts -- Tables in read-only mode, schema mismatches, or quota limits on the shard side prevent data from being accepted.
Troubleshooting and Resolution Steps
Check the current pending bytes for distributed tables:
```sql
SELECT database, table, data_compressed_bytes, data_files
FROM system.distribution_queue
ORDER BY data_compressed_bytes DESC;
```

This shows how much data is queued for each distributed table.
Verify connectivity to target shards:
```sql
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'your_cluster';
```

Then test connectivity to each shard:
```bash
clickhouse-client --host <shard_host> --port 9000 --query "SELECT 1"
```

Check for errors in the distribution queue:
```sql
SELECT database, table, error_count, last_exception
FROM system.distribution_queue
WHERE error_count > 0;
```

Increase the pending bytes limit if the current value is too restrictive:
```sql
-- Per-table setting on the Distributed table:
ALTER TABLE your_distributed_table MODIFY SETTING bytes_to_throw_insert = 5000000000; -- 5 GB
```

Speed up the distribution monitor:
```xml
<!-- In a user settings profile (e.g. users.xml) -->
<distributed_directory_monitor_sleep_time_ms>100</distributed_directory_monitor_sleep_time_ms>
<distributed_directory_monitor_max_sleep_time_ms>10000</distributed_directory_monitor_max_sleep_time_ms>
```

Lower sleep times cause the monitor to check for pending data more frequently.
Manually flush pending data:
```sql
SYSTEM FLUSH DISTRIBUTED your_db.your_distributed_table;
```

This forces an immediate attempt to send all pending data.
If shards are rejecting data, investigate the root cause on the shard side -- check for read-only mode, schema mismatches, or full disks:
```sql
-- On the shard node:
SELECT *
FROM system.replicas
WHERE is_readonly = 1;
```
Best Practices
- Monitor system.distribution_queue regularly and alert on rising data_compressed_bytes values before they hit the threshold.
- Size the bytes_to_throw_insert limit to accommodate expected burst patterns while still protecting against unbounded growth.
- Consider inserting directly into local tables on each shard (using a load balancer or client-side routing) for high-throughput workloads, bypassing the distributed table's buffering mechanism entirely.
- Ensure the initiator node has fast disk I/O for the distributed table's data directory, as pending files are written and read from disk.
- Keep shard nodes healthy and responsive -- slow or unavailable shards are the most common root cause of pending data accumulation.
- Use the fsync_after_insert and fsync_directories settings appropriately to balance durability against write performance.
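Several of these practices can be combined directly in the Distributed table definition. A sketch with hypothetical names and illustrative values (tune the thresholds to your own burst patterns):

```sql
CREATE TABLE your_db.events_dist AS your_db.events_local
ENGINE = Distributed('your_cluster', 'your_db', 'events_local', rand())
SETTINGS
    bytes_to_delay_insert = 1000000000,  -- soft limit: start delaying INSERTs above ~1 GB pending
    bytes_to_throw_insert = 5000000000,  -- hard limit: reject INSERTs above ~5 GB pending
    fsync_after_insert    = 0,           -- skip fsync of queued files in favor of write throughput
    fsync_directories     = 0;           -- likewise for the queue directories
```

Setting the soft limit well below the hard limit gives the background sender a chance to catch up before inserts start failing outright.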
Frequently Asked Questions
Q: Where are the pending bytes stored on disk?
A: ClickHouse stores pending data in subdirectories under the distributed table's data path, typically under /var/lib/clickhouse/data/<database>/<distributed_table>/. Each shard has its own subdirectory containing files waiting to be sent.
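To find the exact location rather than guessing from the default layout, system.distribution_queue exposes a data_path column; a quick check (database and table names are hypothetical):

```sql
SELECT database, table, data_path
FROM system.distribution_queue
WHERE database = 'your_db' AND table = 'your_distributed_table';
```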
Q: Will I lose data if I restart ClickHouse while there are pending bytes?
A: No. Pending data is persisted to disk. When ClickHouse restarts, the distribution monitor will resume sending the queued files to their target shards.
Q: What happens if I drop the distributed table while pending data exists?
A: The pending data files will be removed along with the table. Any unsent data will be lost. If you need to preserve the data, flush it first with SYSTEM FLUSH DISTRIBUTED before dropping the table.
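A minimal flush-then-drop sequence, assuming hypothetical names and that all target shards are reachable so the flush can actually complete:

```sql
-- Push all queued data to the shards first; fails if a shard is unreachable.
SYSTEM FLUSH DISTRIBUTED your_db.your_distributed_table;

-- Only then remove the table (and its now-empty queue directories).
DROP TABLE your_db.your_distributed_table;
```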
Q: Can I set different limits for different distributed tables?
A: Yes. The bytes_to_throw_insert and bytes_to_delay_insert settings can be specified per table in the distributed table engine parameters, allowing different thresholds for different tables.
Q: Is there a way to delay inserts instead of rejecting them outright?
A: Yes. The bytes_to_delay_insert setting introduces an artificial delay on inserts when the pending bytes exceed a certain threshold, applying backpressure before hitting the hard rejection limit. This gives the distribution monitor time to catch up.
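A sketch of such a staged configuration on an existing table (values are illustrative; max_delay_to_insert caps how long a single INSERT may be delayed, in seconds):

```sql
ALTER TABLE your_distributed_table
    MODIFY SETTING bytes_to_delay_insert = 1000000000,  -- start delaying above ~1 GB pending
                   bytes_to_throw_insert = 5000000000,  -- reject outright above ~5 GB pending
                   max_delay_to_insert   = 60;          -- delay each INSERT at most 60 seconds
```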