ClickHouse DB::Exception: DNS resolution failed

The "DB::Exception: DNS resolution failed" error in ClickHouse occurs when the server cannot resolve a hostname to an IP address. The DNS_ERROR error code surfaces in several contexts: connecting to cluster nodes defined by hostname, accessing remote tables, resolving dictionary source addresses, or connecting to ZooKeeper/ClickHouse Keeper. DNS resolution is a prerequisite for any network communication that uses hostnames rather than IP addresses.

Impact

DNS failures affect ClickHouse operations broadly:

Distributed queries across shards will fail if shard hostnames cannot be resolved
Replication stops if the ZooKeeper or ClickHouse Keeper hostname is unresolvable
Remote table functions and dictionaries sourced from external systems will fail
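Clients that connect by hostname cannot reach the server if that name does not resolve, and users restricted by host or host_regexp in their network settings may be rejected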
Cluster health checks may report nodes as down even when they are running

Common Causes

DNS server is unreachable or unresponsive
The hostname is misspelled in the ClickHouse configuration
DNS records for the target host have been deleted or not yet propagated
/etc/resolv.conf is misconfigured or points to a dead resolver
Network partition isolating the ClickHouse server from the DNS infrastructure
DNS cache poisoning or corruption returning incorrect results
Container or pod DNS configuration issues in Kubernetes environments
Firewall blocking UDP/TCP port 53 to the DNS server

Troubleshooting and Resolution Steps

Test DNS resolution from the ClickHouse server:
```
nslookup problematic-hostname
dig problematic-hostname
host problematic-hostname
```
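If all three fail, the problem lies with DNS rather than ClickHouse. Keep in mind that these tools query the DNS servers directly and do not consult /etc/hosts.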
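Since the tools above talk to DNS servers directly, it can also help to test the lookup path that ordinary applications use. The following is a minimal sketch, assuming a Linux host where `getent` is installed; `getent hosts` goes through the system resolver (NSS), so it also honours /etc/hosts. The function names `resolve_one` and `check_hosts` are illustrative and not part of ClickHouse.

```shell
# Resolve one name through the system resolver (NSS; typically /etc/hosts
# and DNS, as configured in /etc/nsswitch.conf). Non-zero on failure.
resolve_one() {
  getent hosts "$1" >/dev/null 2>&1
}

# Check every hostname passed as an argument, print one line per name,
# and return non-zero if any of them failed to resolve.
check_hosts() {
  local failed=0 name
  for name in "$@"; do
    if resolve_one "$name"; then
      echo "OK   $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done
  return "$failed"
}
```

Called as `check_hosts problematic-hostname`, a FAIL here combined with a working `dig` suggests the local resolver configuration rather than the DNS servers.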
Check the DNS configuration:
```
cat /etc/resolv.conf
```
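Verify that the nameserver entries point to valid, reachable DNS servers. A successful ping only shows that the host is up, and ICMP may be filtered even when DNS itself works: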
```
ping -c 2 $(awk '/^nameserver/{print $2; exit}' /etc/resolv.conf)
```
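To go one step beyond reachability, a real query can be sent to each configured resolver. This is a rough sketch, assuming `dig` is installed; `list_nameservers` and `query_each_nameserver` are hypothetical helper names introduced here for illustration.

```shell
# Print every nameserver address found in a resolv.conf-style file
# (default: /etc/resolv.conf). Only lines whose first field is the
# keyword "nameserver" are considered, so comments are skipped.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Send one short query for a name to each configured nameserver.
# +time and +tries keep a dead resolver from stalling the loop.
# An empty answer without an error means the resolver replied but
# returned no record for the name.
query_each_nameserver() {
  local name="$1" file="${2:-/etc/resolv.conf}" ns
  for ns in $(list_nameservers "$file"); do
    echo "== $ns"
    dig +short +time=2 +tries=1 "$name" @"$ns" || echo "query to $ns failed"
  done
}
```

For example, `query_each_nameserver problematic-hostname` shows at a glance whether only one of several resolvers is misbehaving.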
Verify the hostname is correct in ClickHouse config:
```
grep -r "problematic-hostname" /etc/clickhouse-server/
```
Fix any typos in cluster definitions, remote server configurations, or dictionary sources.
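When the failing hostname is not known in advance, it may be easier to list every host the configuration refers to and check them all. The sketch below is a plain text match rather than an XML parser, so it assumes each `<host>` element sits on a single line; `list_config_hosts` is a hypothetical helper name.

```shell
# Print the unique values of all <host>...</host> elements found in
# files under a config directory (default: /etc/clickhouse-server).
list_config_hosts() {
  grep -rhoE '<host>[^<]+</host>' "${1:-/etc/clickhouse-server}" \
    | sed -E 's#</?host>##g' \
    | sort -u
}
```

Each printed value can then be passed to `dig` or `getent hosts`; values that are already IP addresses need no lookup.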
Check if the DNS record exists:
```
dig problematic-hostname @8.8.8.8
```
Querying a public DNS server shows whether the record exists globally or only in your private DNS. Names that exist only in an internal zone are expected to fail here.
Flush the ClickHouse DNS cache:
```
SYSTEM DROP DNS CACHE;
```
ClickHouse caches DNS results internally. If a record recently changed, the cache may hold stale data.
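The statement clears the cache only on the server that executes it, so on a cluster it usually has to be repeated on every node. A minimal sketch, assuming `clickhouse-client` is installed and can connect with default credentials; `drop_dns_cache_on` and the `CH_CLIENT` override are hypothetical names, and the servers are addressed by IP so that the loop does not itself depend on DNS.

```shell
# Run SYSTEM DROP DNS CACHE on each server address given as an argument.
# CH_CLIENT (an assumption of this sketch) overrides the client binary,
# which is handy for dry runs.
drop_dns_cache_on() {
  local client="${CH_CLIENT:-clickhouse-client}" addr
  for addr in "$@"; do
    echo "dropping DNS cache on $addr"
    "$client" --host "$addr" --query "SYSTEM DROP DNS CACHE" \
      || echo "failed on $addr" >&2
  done
}
```

For example: `drop_dns_cache_on 10.0.1.5 10.0.1.6`.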
For Kubernetes environments, verify CoreDNS/kube-dns is running:
```
kubectl get pods -n kube-system | grep dns
kubectl logs -n kube-system <dns-pod-name>
```
Check that the ClickHouse pod's DNS policy allows it to resolve both cluster-internal and external names.
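The pod-level check can be sketched as follows. With the default `ClusterFirst` policy, lookups go to the cluster DNS service, while `Default` inherits the node's resolver configuration and typically cannot resolve cluster-internal service names. The function name `pod_dns_report` and the `KUBECTL` override are hypothetical, and the sketch assumes the container image ships `cat` and `getent`.

```shell
# Print a pod's DNS policy, its resolv.conf, and the result of resolving
# one name from inside the pod.
# Arguments: namespace, pod name, hostname to resolve.
pod_dns_report() {
  local kubectl="${KUBECTL:-kubectl}" ns="$1" pod="$2" name="$3"
  echo "dnsPolicy: $("$kubectl" get pod -n "$ns" "$pod" -o jsonpath='{.spec.dnsPolicy}')"
  "$kubectl" exec -n "$ns" "$pod" -- cat /etc/resolv.conf
  "$kubectl" exec -n "$ns" "$pod" -- getent hosts "$name" \
    || echo "cannot resolve $name from pod $pod"
}
```

Running it once with a cluster-internal service name and once with an external name covers both cases mentioned above.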
Use IP addresses as a workaround while debugging DNS issues. Temporarily replace hostnames with IP addresses in the cluster configuration to restore functionality:
```
<shard>
  <replica>
    <host>10.0.1.5</host>
    <port>9000</port>
  </replica>
</shard>
```

Best Practices

Use stable, reliable DNS infrastructure with redundant nameservers
Consider using IP addresses in cluster configurations for critical infrastructure to eliminate DNS as a failure point
Set dns_cache_update_period (in seconds, 15 by default) in the ClickHouse server config to control how often cached DNS entries are refreshed
In Kubernetes, ensure DNS policies are correctly set and CoreDNS has sufficient resources
Monitor DNS resolution latency and failures as part of your infrastructure monitoring
Maintain /etc/hosts entries as a fallback for critical cluster nodes
Test DNS resolution during infrastructure changes before they reach production
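As an illustration of the dns_cache_update_period practice, the sketch below writes a small override file. The target directory, the file name `dns_cache.xml`, and the function name are assumptions; recent releases use `<clickhouse>` as the root element, while older ones used `<yandex>`, and a server restart may be needed for the change to take effect.

```shell
# Write a config override that sets dns_cache_update_period (in seconds).
# Arguments: target directory (default: /etc/clickhouse-server/config.d)
# and period in seconds (default: 15).
write_dns_cache_override() {
  local dir="${1:-/etc/clickhouse-server/config.d}" period="${2:-15}"
  cat > "$dir/dns_cache.xml" <<EOF
<clickhouse>
    <dns_cache_update_period>$period</dns_cache_update_period>
</clickhouse>
EOF
}
```

For example, `write_dns_cache_override /etc/clickhouse-server/config.d 30` asks for a refresh every 30 seconds.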

Frequently Asked Questions

Q: Does ClickHouse cache DNS lookups?
A: Yes. ClickHouse maintains an internal DNS cache to avoid repeated lookups. You can clear it with SYSTEM DROP DNS CACHE and control the refresh interval with the dns_cache_update_period setting.

Q: Can I use /etc/hosts instead of DNS?
A: Yes. Entries in /etc/hosts are resolved locally without DNS. This can serve as a reliable fallback for cluster nodes, though it requires manual updates across all servers when IP addresses change.
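To keep the manual updates manageable, a small helper can report which cluster nodes are still missing from a hosts file. This is only a sketch: `missing_hosts_entries` is a hypothetical name, the input format of one "IP hostname" pair per line is an assumption, and nothing is written to /etc/hosts by the function itself.

```shell
# Read "IP hostname" pairs from stdin and print, in hosts-file format,
# only those whose hostname does not already appear as a name in the
# given hosts file (default: /etc/hosts). Comment lines are ignored.
missing_hosts_entries() {
  local hosts_file="${1:-/etc/hosts}" ip name
  while read -r ip name; do
    [ -n "$name" ] || continue
    if ! awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
      ' "$hosts_file"; then
      printf '%s\t%s\n' "$ip" "$name"
    fi
  done
}
```

After reviewing the output, it can be appended on each server, for example with `missing_hosts_entries < nodes.txt | sudo tee -a /etc/hosts`, where `nodes.txt` is a placeholder for your own node list.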

Q: Why does this error appear intermittently?
A: Intermittent DNS failures often point to an overloaded DNS server, network packet loss, or DNS round-robin returning unreachable IPs. Check DNS server health and network stability.
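One way to confirm an intermittent problem is to repeat the same lookup and count the failures. The sketch below assumes `dig` is installed; `lookup_once` and `count_dns_failures` are hypothetical names, and the timeout and retry values are arbitrary starting points.

```shell
# One lookup with a short timeout. Succeeds only if dig exits cleanly and
# returns at least one record. The optional second argument is the DNS
# server to query (default: the resolvers from /etc/resolv.conf).
lookup_once() {
  local answer
  answer=$(dig +short +time=2 +tries=1 "$1" ${2:+"@$2"} 2>/dev/null) \
    && [ -n "$answer" ]
}

# Repeat the lookup and print how many attempts failed.
# Arguments: hostname, attempts (default 20), pause in seconds
# between attempts (default 1), optional DNS server.
count_dns_failures() {
  local name="$1" attempts="${2:-20}" pause="${3:-1}" server="${4:-}"
  local failures=0 i=0
  while [ "$i" -lt "$attempts" ]; do
    lookup_once "$name" "$server" || failures=$((failures + 1))
    i=$((i + 1))
    if [ "$i" -lt "$attempts" ]; then sleep "$pause"; fi
  done
  echo "$failures/$attempts lookups failed for $name"
}
```

Running it against each nameserver in turn, for example `count_dns_failures problematic-hostname 50 1 10.0.0.2` with your own resolver address, can show whether a single resolver is responsible.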

Q: Does this error affect ClickHouse Keeper connections?
A: Yes. If ClickHouse Keeper or ZooKeeper hostnames cannot be resolved, replicated tables lose their coordination layer. Replication, DDL on replicated tables, and distributed (ON CLUSTER) DDL are all affected until DNS is restored.