ClickHouse DB::Exception: Unexpected end of file

The "DB::Exception: Unexpected end of file" error in ClickHouse signals that the server reached the end of an input stream before it finished reading a complete data block. In other words, ClickHouse was expecting more data but the source ran dry. This is the UNEXPECTED_END_OF_FILE error, and it shows up most often during bulk imports of CSV, TSV, or other text formats -- though it can also appear when reading from remote sources or replicating data between nodes.

Impact

When this error fires, the INSERT that was in flight fails. For inserts that fit in a single block, the operation is atomic and no partial data reaches the target table; for very large streamed inserts, blocks that arrived intact before the failure may already be committed (see the FAQ below). Either way:

  • Data pipelines stall until the root cause is addressed.
  • Downstream consumers may see gaps or stale data.
  • If retries are not handled carefully, you risk duplicated work or repeated failures that pile up in logs.

Common Causes

  1. Truncated source files -- a file was only partially written to disk (e.g., an upstream ETL job crashed mid-export) and ClickHouse tries to read past what exists.
  2. Network interruption -- when streaming data over HTTP or the native TCP protocol, a dropped connection leaves the server with an incomplete payload.
  3. Incorrect Content-Length header -- an HTTP client advertises more bytes than it actually sends, so ClickHouse keeps reading until the socket closes unexpectedly.
  4. Compressed data corruption -- gzip or lz4 streams that are truncated will decompress partially and then signal an abrupt EOF.
  5. Client timeout or crash -- the sending application (clickhouse-client, a Python driver, etc.) terminates before flushing all data.
  6. Disk full where the file was written -- the file can appear complete to the OS but was silently truncated because the filesystem filled up during creation, so ClickHouse hits EOF mid-record when reading it.

Troubleshooting and Resolution Steps

  1. Verify the source file is complete. Check the file size and try parsing it outside of ClickHouse:

    wc -l /path/to/data.csv
    tail -n 5 /path/to/data.csv
    

    If the last line is incomplete or the file ends mid-record, the file is truncated.
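
A quick first check for a cut-off final record is whether the file ends with a newline. The following is a self-contained sketch; the sample file in /tmp is made up for illustration:

```shell
# Simulate a truncated export: the last record is cut off mid-field,
# so the file does not end with a newline.
printf 'id,name\n1,alice\n2,bo' > /tmp/sample.csv

# tail -c 1 prints the final byte; for a well-terminated text file that
# byte is a newline, which command substitution strips to an empty string.
if [ -n "$(tail -c 1 /tmp/sample.csv)" ]; then
  echo "suspect: no trailing newline, file may be truncated"
else
  echo "last record is terminated"
fi
```

Note that this only catches files cut off mid-record; a file truncated exactly after a newline needs a row-count or checksum comparison instead.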

  2. Re-download or regenerate the file. If the source came from an object store or remote system, fetch it again and compare checksums:

    md5sum /path/to/data.csv
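
As an illustration of why a checksum comparison catches truncation, here is a self-contained sketch using throwaway files in /tmp (all names are placeholders):

```shell
# "orig.csv" stands in for the source system's copy, "partial.csv"
# for a download that was cut off mid-transfer.
printf 'id,value\n1,10\n2,20\n' > /tmp/orig.csv
head -c 12 /tmp/orig.csv > /tmp/partial.csv   # keep only the first 12 bytes

# Differing checksums mean the transfer must be repeated.
md5sum /tmp/orig.csv /tmp/partial.csv
```

(On macOS, `md5` plays the role of `md5sum`.)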
    
  3. Test with a small subset first. Extract the first thousand lines and try the import to confirm the format itself is valid:

    head -n 1000 /path/to/data.csv | clickhouse-client --query="INSERT INTO my_table FORMAT CSV"
    
  4. Inspect compressed files. If you are importing a gzip file, verify its integrity:

    gzip -t /path/to/data.csv.gz
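
To see what a truncated archive looks like to gzip, here is a small self-contained demonstration (the files in /tmp are throwaway examples):

```shell
# Compress a tiny dataset, then chop it off to mimic an interrupted upload.
printf '1,a\n2,b\n3,c\n' | gzip > /tmp/full.csv.gz
head -c 20 /tmp/full.csv.gz > /tmp/cut.csv.gz

gzip -t /tmp/full.csv.gz && echo "full archive: intact"
gzip -t /tmp/cut.csv.gz 2>/dev/null || echo "cut archive: unexpected end of file"
```

The truncated copy decompresses partially and then fails, which is exactly what ClickHouse sees server-side when a compressed upload is cut off.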
    
  5. Check network stability. When inserting over HTTP, look at proxy and load balancer timeout settings. Increase send_timeout and receive_timeout on the ClickHouse server if needed:

    <send_timeout>300</send_timeout>
    <receive_timeout>300</receive_timeout>
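
These are session-level settings, so they can also be scoped to a user profile in users.xml rather than applied globally. A sketch, assuming the default profile:

```xml
<profiles>
    <default>
        <send_timeout>300</send_timeout>
        <receive_timeout>300</receive_timeout>
    </default>
</profiles>
```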
    
  6. Review client-side logs. The sending application may have logged an error (out-of-memory, segfault, timeout) that explains why it stopped sending data mid-stream.

  7. Use input_format_allow_errors_num cautiously. This setting lets ClickHouse skip malformed rows, but it will not help if the stream itself is cut off -- the parser still hits an unexpected EOF. It is useful mainly for row-level issues.

Best Practices

  • Always validate source files before importing. A quick checksum or line-count comparison catches truncation early.
  • Use chunked uploads for large datasets so that a failure affects only one chunk rather than the entire load.
  • Implement retry logic with idempotent inserts (deduplication tokens or ReplacingMergeTree) so that a failed partial load can be safely retried.
  • Monitor network health between clients and ClickHouse nodes, especially when transferring data across regions.
  • When using HTTP inserts, set appropriate timeouts on both the client and server side to avoid silent connection drops.
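
The chunked-upload practice above can be sketched with standard tools. This is a hedged example: the table name, chunk size, and file names are placeholders, and the clickhouse-client line is commented out so the loop only shows the shape of a retry-friendly load:

```shell
# Stand-in dataset of 10 rows; in practice this is your large export.
seq 1 10 | sed 's/^/row_/' > /tmp/big.csv

# Split into fixed-size chunks so a failure costs one chunk, not the batch.
split -l 4 /tmp/big.csv /tmp/chunk_

for f in /tmp/chunk_*; do
  # Real load (placeholder table name):
  # clickhouse-client --query "INSERT INTO my_table FORMAT CSV" < "$f"
  echo "would load $f: $(wc -l < "$f") rows"
done
```

Combined with a deduplication token or a ReplacingMergeTree target, each chunk can be retried safely on its own.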

Frequently Asked Questions

Q: Does ClickHouse insert any rows if this error occurs, or is the whole batch lost?
A: For an INSERT that fits in a single block (up to max_insert_block_size rows, roughly one million by default), the operation is atomic: if the server hits UNEXPECTED_END_OF_FILE before the stream completes, the batch is rejected and no rows are written. For larger streamed inserts, blocks that arrived intact before the failure may already be committed, so verify row counts and deduplicate before retrying.

Q: Can I use input_format_allow_errors_num to work around this error?
A: No. That setting handles malformed individual rows within an otherwise complete stream. An unexpected EOF means the stream itself is incomplete, so the parser cannot continue regardless of error tolerance settings.

Q: I am inserting via HTTP and getting this error intermittently. What should I check?
A: Look at load balancers, reverse proxies, and firewalls between your client and ClickHouse. Many of these have idle or request-body timeouts that can terminate long-running uploads. Also verify that the client is not using chunked transfer encoding incorrectly.

Q: Does this error appear when reading from S3 or other object storage?
A: Yes. If the object in S3 is corrupt or if the connection to S3 drops during a read, ClickHouse will report this same error. Retrying the query usually resolves transient network issues.

Q: How can I tell whether the problem is with the file or the network?
A: Import the file locally on the ClickHouse server itself (bypassing the network). If the local import succeeds, the issue is network-related. If it fails, the file is the problem.
