ClickHouse DB::Exception: Unknown status of replicated INSERT

The "DB::Exception: Unknown status of replicated INSERT" error in ClickHouse occurs when the server cannot confirm whether a replicated INSERT operation was successfully committed. The UNKNOWN_STATUS_OF_INSERT error code indicates an ambiguous outcome: the data may or may not have been written successfully, and ClickHouse cannot determine the result due to a failure in the replication confirmation process.

Impact

This error creates uncertainty about data state. The INSERT may have succeeded (data is present), partially succeeded, or failed entirely. The client cannot distinguish between these outcomes from the error alone. This is a critical situation for data integrity because naive retry logic could either create duplicates (if the original succeeded) or lose data (if it did not and the retry is not attempted).

Common Causes

  1. ZooKeeper session timeout during the INSERT confirmation phase, after data was written to disk but before the replication log entry was confirmed
  2. Network partition between the ClickHouse server and ZooKeeper during the commit phase
  3. ZooKeeper cluster instability (leader election, quorum loss) at the moment of insert confirmation
  4. The ClickHouse server crashed or was killed during the insert finalization
  5. High ZooKeeper latency causing the confirmation to exceed the client timeout
  6. Overloaded ZooKeeper cluster unable to process the commit in time

Troubleshooting and Resolution Steps

  1. Check if the data was actually inserted by querying the table:

    SELECT count() FROM your_table
    WHERE <conditions matching the inserted data>;
    
  2. Check the query log for the insert's status:

    SELECT query_id, type, exception_code, written_rows, result_rows
    FROM system.query_log
    WHERE query_id = 'your_query_id'
    ORDER BY event_time DESC;
    
  3. Check ZooKeeper health and latency:

    echo mntr | nc zookeeper-host 2181 | grep latency
    
  4. Verify replica synchronization status:

    SELECT database, table, is_readonly, future_parts, queue_size, inserts_in_queue
    FROM system.replicas
    WHERE table = 'your_table';
    
  5. If using deduplication (the default for ReplicatedMergeTree), it is safe to retry the INSERT:

    -- With insert_deduplicate = 1 (default), retrying is safe:
    -- If the original succeeded, the retry will be deduplicated
    -- If the original failed, the retry will insert the data
    INSERT INTO your_table VALUES (...);
    
  6. Check ZooKeeper session status in the ClickHouse logs:

    grep -i 'zookeeper\|session\|unknown.*status' /var/log/clickhouse-server/clickhouse-server.log | tail -30
    
  7. Monitor ZooKeeper request latency to prevent future occurrences:

    SELECT * FROM system.zookeeper WHERE path = '/';
    

Best Practices

  • Always keep insert_deduplicate = 1 (the default) for ReplicatedMergeTree tables. This makes INSERT retries safe by automatically detecting and rejecting duplicate blocks.
  • Use consistent insert block sizes across retries so the deduplication hash matches the original attempt.
  • Include a query ID with each INSERT so you can trace the outcome in system.query_log. The query ID is set by the client interface, not via a SETTINGS clause:
    clickhouse-client --query_id 'unique-insert-id-123' \
      --query "INSERT INTO your_table VALUES (...)"
    # Over HTTP: pass query_id as a URL parameter, e.g. ?query_id=unique-insert-id-123
    
  • Monitor ZooKeeper latency and session health. High latency is often a precursor to this error.
  • Implement application-level retry with deduplication reliance rather than trying to verify the insert status manually.
  • Ensure your ZooKeeper cluster is properly sized for the ClickHouse workload (sufficient nodes, memory, and disk I/O).
  • Consider using insert_quorum to get stronger confirmation guarantees at the cost of higher latency.

Frequently Asked Questions

Q: Is it safe to retry the INSERT after this error?
A: Yes, if insert_deduplicate is enabled (the default). ClickHouse computes a hash of the inserted data block and stores it in ZooKeeper. If the original insert succeeded, the retry will be detected as a duplicate and silently ignored. If it failed, the retry will insert the data normally.

Q: How is this different from a regular INSERT failure?
A: A regular INSERT failure (e.g., timeout, network error) clearly indicates the operation did not succeed. UNKNOWN_STATUS_OF_INSERT specifically means ClickHouse does not know whether it succeeded or failed. The data may or may not be in the table.

Q: Can I use insert_quorum to prevent this error?
A: insert_quorum requires confirmation from multiple replicas before acknowledging the insert. While it provides stronger guarantees, it does not completely eliminate the possibility of unknown status in cases of severe network partitions. It does reduce the window of uncertainty.

Q: How long does ClickHouse remember insert hashes for deduplication?
A: The replicated_deduplication_window (default 10000 blocks) and replicated_deduplication_window_seconds (default 3600 seconds / 1 hour) settings control how long insert hashes are retained. Retries within this window will be deduplicated. (These defaults have changed across versions, so confirm them on your deployment via system.merge_tree_settings.)

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.