The "DB::Exception: No remote shard available for distributed query" error in ClickHouse occurs when a distributed query cannot reach any replica for one or more shards. The NO_REMOTE_SHARD_AVAILABLE error code indicates that all replicas within a shard are unreachable, making it impossible to execute the distributed query completely.
Impact
The distributed query fails entirely. Even if some shards are healthy, ClickHouse cannot return a partial result by default (unless configured to do so). This means no results are returned to the client. If this condition persists, all queries against the distributed table that need data from the affected shard will fail.
Common Causes
- All replicas in the affected shard are down or unreachable due to network issues
- Firewall rules blocking inter-node communication on the ClickHouse native protocol port (default 9000)
- The cluster configuration in ClickHouse references hostnames or IPs that are incorrect or no longer valid
- DNS resolution failures preventing ClickHouse from connecting to remote nodes
- The remote nodes are overloaded and not accepting new connections
- TLS/SSL certificate issues between cluster nodes preventing secure connections
- ZooKeeper-based cluster discovery returning outdated or incorrect node information
Troubleshooting and Resolution Steps
Check the cluster configuration and health of all shards:
SELECT cluster, shard_num, replica_num, host_name, host_address, port, is_local, errors_count FROM system.clusters WHERE cluster = 'your_cluster';Test connectivity to the unreachable shard:
# Test network connectivity: nc -zv remote-shard-host 9000 # Test ClickHouse connectivity: clickhouse-client -h remote-shard-host --query "SELECT 1"Check the replica status on the remote shard (if accessible via other means):
SELECT database, table, is_readonly, is_session_expired FROM system.replicas;Review ClickHouse logs for connection errors:
grep -i 'connection\|shard\|replica\|refused\|timeout' /var/log/clickhouse-server/clickhouse-server.log | tail -30If partial results are acceptable, enable the skip_unavailable_shards setting:
SET skip_unavailable_shards = 1; SELECT * FROM your_distributed_table;Note: this returns incomplete results without warning.
Verify DNS resolution from the ClickHouse server:
dig remote-shard-hostname nslookup remote-shard-hostnameCheck if the remote ClickHouse service is running:
ssh remote-shard-host "systemctl status clickhouse-server"
Best Practices
- Configure multiple replicas per shard so that a single node failure does not make the shard unavailable.
- Monitor cluster connectivity proactively using the
errors_countcolumn insystem.clusters. - Use
skip_unavailable_shardscautiously and only when partial results are explicitly acceptable for the use case. - Set up health check monitoring for all cluster nodes with alerts for unreachable shards.
- Use connection pooling and appropriate timeout settings (
connect_timeout_with_failover_ms) to handle transient network issues. - Keep the cluster configuration up to date when nodes are added, removed, or have their addresses changed.
- Consider using
distributed_replica_error_half_lifeanddistributed_replica_error_capsettings to control how ClickHouse handles flaky replicas.
Frequently Asked Questions
Q: Can I get partial results from the available shards?
A: Yes, by setting skip_unavailable_shards = 1. However, the results will be incomplete and may be misleading for aggregate queries (e.g., counts and sums will be lower than actual). Use this option carefully and ensure the application understands the results may be partial.
Q: How does ClickHouse choose which replica to query within a shard?
A: ClickHouse uses a configurable load balancing strategy set by the load_balancing setting. Options include random, nearest_hostname, round_robin, in_order, and first_or_random. If the chosen replica is unavailable, ClickHouse tries the next one.
Q: How long does ClickHouse wait before declaring a shard unavailable?
A: The connect_timeout_with_failover_ms setting (default 1000ms per attempt) controls how long ClickHouse waits when trying to connect to each replica. The connections_with_failover_max_tries setting (default 3) controls the total number of retry attempts across replicas.