ClickHouse Replication: Cannot Resolve Host of Another Server

ClickHouse replicas fetch data parts from one another over HTTP. Each replica registers its own FQDN in ZooKeeper or ClickHouse Keeper when it starts, and other replicas use that string to fetch parts. When the registered name does not resolve from peer nodes, you see errors such as "Cannot resolve host (xxxxx), error 0: DNS error" and "Not found address of host: xxxx. (DNS_ERROR)". Replication stalls because parts cannot be fetched. The fix is either to make the FQDN resolvable cluster-wide, or to override what gets registered with interserver_http_host.

How ClickHouse picks the hostname

On startup, the server determines its own FQDN from the system's hostname configuration (the same value that hostname -f reports) and writes it into the replica znode in Keeper, typically under /clickhouse/tables/.../replicas/<replica_name>/host. Other replicas read this value and connect back over HTTP on the interserver port, 9009 by default, to fetch parts. If that hostname resolves only on the local machine, the rest of the cluster cannot connect.
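
To check whether a registered name is usable from a peer, a small resolver probe can help. The following is a minimal sketch, assuming a Linux host where getent is available; check_resolves and report_resolution are hypothetical helper names, not ClickHouse tools.

```shell
#!/bin/sh
# check_resolves NAME: succeed if NAME resolves through the system resolver
# (DNS or /etc/hosts, in the order nsswitch.conf defines).
# Hypothetical helper for this article, not part of ClickHouse.
check_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# report_resolution NAME: print a one-line verdict for NAME.
report_resolution() {
    if check_resolves "$1"; then
        echo "$1: resolves"
    else
        echo "$1: does NOT resolve"
    fi
}
```

Run hostname -f on the replica to see the name it would register by default, then call report_resolution with that name on every other node. A name that resolves on the replica but not on a peer is the situation described above.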

Common scenarios that trigger this:

Containers where the hostname is the container ID and not in DNS.
Cloud VMs where the internal hostname is set to something like ip-10-0-0-5 but DNS is not configured for that name.
Hostnames that resolve internally only via /etc/hosts on one node.
DNS records that exist for one replica but not the new replica being added.

Solution 1: Fix the hostname so DNS resolves it

The cleanest fix is to ensure the registered FQDN is actually resolvable across the cluster. Set the hostname:

sudo vim /etc/hostname
# or
sudo hostnamectl set-hostname ch1.prod.company.com

Then update DNS or /etc/hosts on every other node so the new name resolves to the right IP. Restart ClickHouse so it re-registers under the new hostname.

This is the recommended approach. It keeps your environment self-describing and avoids surprises when new replicas join. Tools that introspect the cluster will report meaningful names.
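
If you rely on /etc/hosts rather than DNS, it is easy to miss one peer. As a rough sketch (hosts_ip_for is a hypothetical helper name, and the match is case-sensitive for simplicity), the following prints the IP a hosts file maps a name to, so you can confirm on each peer that the entry exists and points to the right address:

```shell
#!/bin/sh
# hosts_ip_for NAME [FILE]: print the first IP that FILE (default /etc/hosts)
# maps NAME to. Exits non-zero if NAME is not listed.
# Hypothetical helper for this article; comparison is case-sensitive.
hosts_ip_for() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                 # drop comments
        { for (i = 2; i <= NF; i++)        # fields 2..NF are names and aliases
              if ($i == name) { print $1; found = 1; exit } }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}
```

For example, hosts_ip_for ch1.prod.company.com run on a peer prints the mapped address, or prints nothing and returns a non-zero status if the peer has no entry for that name.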

Solution 2: Override with interserver_http_host

If you cannot change the hostname (locked-down image, container limitations), explicitly tell ClickHouse what to register:

<clickhouse>
    <interserver_http_host>ch1.prod.company.com</interserver_http_host>
</clickhouse>

You can also set an IP if DNS is unavailable:

<clickhouse>
    <interserver_http_host>10.0.0.5</interserver_http_host>
</clickhouse>

Restart the server. The replica znode in Keeper now records this value, and peers will use it. This setting works together with interserver_http_port and, when TLS is used, with interserver_https_host and interserver_https_port. There are known edge cases where interserver_http_host interacts unexpectedly with other server configuration, so prefer fixing the hostname when feasible.
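
Rather than editing the main config file, the override can live in its own drop-in file. The sketch below assumes a standard package install where /etc/clickhouse-server/config.d is read for overrides; the function name write_interserver_override and the file name interserver_host.xml are arbitrary choices for illustration.

```shell
#!/bin/sh
# write_interserver_override HOST [DIR]: write a drop-in file that sets
# interserver_http_host to HOST.
# DIR defaults to /etc/clickhouse-server/config.d (assumed install layout).
# The file name interserver_host.xml is an arbitrary choice for this sketch.
write_interserver_override() {
    host=$1
    dir=${2:-/etc/clickhouse-server/config.d}
    cat > "$dir/interserver_host.xml" <<EOF
<clickhouse>
    <interserver_http_host>$host</interserver_http_host>
</clickhouse>
EOF
}
```

Calling write_interserver_override ch1.prod.company.com (with sufficient privileges) produces the same XML as shown above. The server still has to be restarted for the new value to be registered.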

Verifying the registered hostname

Check what was actually written to Keeper:

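-- system.zookeeper requires an exact path filter (path = or path IN), not LIKE.
-- The replica's path is available as replica_path in system.replicas.
SELECT name, value
FROM system.zookeeper
WHERE path = '/clickhouse/tables/.../replicas/<replica_name>'
  AND name = 'host';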

Then compare it with the cluster's host mapping:

SELECT cluster, shard_num, replica_num, host_name, host_address, port
FROM system.clusters;

system.clusters reflects the <remote_servers> config, while the per-replica HTTP endpoint registered in Keeper is what other replicas actually use to fetch parts. If a config change has not been picked up, the replica has not re-registered: restart it.

Confirming connectivity

From any other replica, test the HTTP interserver port:

curl -v http://ch1.prod.company.com:9009/

A 200 OK response with the body "Ok." confirms reachability. If curl fails with a DNS error, the underlying issue is still present.
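
When checking several peers, it helps to separate DNS failures from plain connectivity failures. The following sketch relies on curl's documented exit codes (6 for an unresolvable host, 7 for a failed connection, 28 for a timeout); classify_fetch and probe_interserver are hypothetical helper names, and the 5-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# classify_fetch RC BODY: map a curl exit code and response body to a verdict.
# Hypothetical helper for this article.
classify_fetch() {
    case $1 in
        0)  if [ "$2" = "Ok." ]; then echo "reachable"; else echo "unexpected response"; fi ;;
        6)  echo "DNS failure" ;;          # curl could not resolve the host
        7)  echo "connection failed" ;;    # resolved, but could not connect
        28) echo "timed out" ;;            # no answer within --max-time
        *)  echo "curl exit code $1" ;;
    esac
}

# probe_interserver HOST [PORT]: query the interserver port (default 9009)
# and print the verdict from classify_fetch.
probe_interserver() {
    body=$(curl -sS --max-time 5 "http://$1:${2:-9009}/" 2>/dev/null) && rc=0 || rc=$?
    classify_fetch "$rc" "$body"
}
```

Running probe_interserver ch1.prod.company.com from each peer shows at a glance whether the problem is name resolution ("DNS failure") or something past it, such as a firewall or a stopped server.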

Common Pitfalls

Setting interserver_http_host to localhost or 127.0.0.1. The value is handed to other replicas, so each peer would try to fetch parts from itself.
Changing the hostname without restarting ClickHouse. The Keeper registration is written at startup.
Updating one node's DNS but not others. The error appears asymmetrically and looks intermittent.
Forgetting that container restarts can change the hostname. Pin it explicitly via hostname: in the container spec.
Mixing TLS and plain HTTP across replicas. Mismatched scheme makes fetches fail in ways that look like DNS errors.
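
The first pitfall is easy to guard against in provisioning scripts. As an illustrative sketch (is_loopback_value is a hypothetical helper, and the pattern match is deliberately coarse), a candidate value can be rejected before it is written to the config:

```shell
#!/bin/sh
# is_loopback_value VALUE: succeed if VALUE points back at the local machine
# and is therefore unusable as interserver_http_host.
# Hypothetical helper; the 127.* pattern is a coarse string match, not an IP parser.
is_loopback_value() {
    case $1 in
        localhost|localhost.localdomain|127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}
```

A provisioning script could call is_loopback_value "$candidate" and abort when it succeeds, before the value ever reaches Keeper.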

Frequently Asked Questions

Q: Why does the error mention a hostname I never configured? A: ClickHouse reads the OS hostname at startup. If your system was provisioned with a generic name, that name gets registered. Either rename the host or override with interserver_http_host.

Q: Can I use an IP address instead of a hostname? A: Yes, set interserver_http_host to the IP. This is brittle if the IP changes, so prefer a stable DNS name when possible.

Q: How do I make existing replicas pick up the new hostname? A: Restart ClickHouse on the affected node after changing /etc/hostname or the config. The replica re-registers its host znode on startup.

Q: Does this affect distributed query execution? A: Distributed queries use the <remote_servers> cluster definition, not the registered FQDN, so they may keep working while replication fails. Always check both paths.

Q: I see the error only for one new node. Why? A: That node's hostname does not resolve from the others. Add it to DNS or /etc/hosts on the peers, or set interserver_http_host on the new node to a name they already know.