ClickHouse DB::Exception: Received empty data

The "DB::Exception: Received empty data" error in ClickHouse is raised when the server or a component within it receives an empty data block in a context where actual data was expected. The error code is RECEIVED_EMPTY_DATA. This can surface during data ingestion, inter-node communication, or when reading from external sources that return nothing.

Impact

The effect of this error depends on where it occurs:

An INSERT operation will fail and write no rows
A query reading from an external table function (e.g., url(), s3(), remote()) will abort
Data pipelines streaming data into ClickHouse will need to handle the error and potentially retry
Existing data in the target table is not affected

Common Causes

Empty HTTP response body -- When using url() or HTTP-based table functions, the remote server returns a 200 status with no body content.
Empty file on object storage -- An S3 or GCS file referenced in a query contains zero bytes.
Client sends an empty data block -- The client library closes the data stream without sending any blocks, or sends an explicit empty block prematurely.
Inter-node communication issue -- In distributed queries, a remote node sends an empty block where data was expected, possibly due to an error on the remote side.
Corrupted or truncated input -- A file or stream that was expected to contain data is truncated to zero length.
Compression layer returning empty output -- A compressed data stream decompresses to nothing, often because the stream is corrupted.

Troubleshooting and Resolution Steps

Check the data source directly to confirm it actually contains data:

# For a file
wc -c /path/to/data/file

# For an S3 object
aws s3 ls s3://bucket/path/to/file

# For a URL (match case-insensitively: HTTP/2 servers send header names in lowercase)
curl -sI http://remote-host/data-endpoint | grep -i '^content-length'
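
A header check cannot always confirm an empty body, because responses sent with chunked transfer encoding carry no Content-Length header and some endpoints do not answer HEAD requests. The following is a minimal bash sketch that looks at the body itself instead. The helper name has_body is hypothetical and not part of any ClickHouse or curl tooling.

```shell
# has_body: succeed (exit 0) when stdin carries at least one byte.
# Hypothetical helper; it only inspects the stream it is given.
has_body() {
    # head -c 1 stops after the first byte, so a large body is not read in full.
    [ "$(head -c 1 | wc -c)" -gt 0 ]
}

# Example (not run here; the URL is the placeholder used above):
#   curl -s http://remote-host/data-endpoint | has_body \
#       && echo "body present" || echo "empty body"
```

Because the helper reads from stdin, the same check works for a local file or for the output of any other download command.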

Test with a minimal query to isolate the issue:

-- If reading from S3
SELECT count() FROM s3('https://bucket.s3.amazonaws.com/file.csv', 'CSVWithNames');

-- If reading from a URL
SELECT * FROM url('http://example.com/data', 'JSONEachRow', 'col1 String') LIMIT 5;

Inspect the full error message for clues about which stage failed. The error often includes context about the data source or the operation that was in progress.

If the error occurs during an INSERT, verify the data stream:

# Send one known-good row. If this succeeds, the INSERT path works and the original input was probably empty
echo '{"col1": "test"}' | clickhouse-client --query "INSERT INTO my_table FORMAT JSONEachRow"
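
In a pipeline, the same idea can be turned into a guard that refuses to start the INSERT when nothing arrives on stdin. This is a sketch under stated assumptions: the function name run_if_input_nonempty and the exit code 3 for empty input are choices made for this illustration, and generate_rows in the example stands in for whatever produces the data.

```shell
# run_if_input_nonempty COMMAND [ARGS...]
# Buffers stdin to a temporary file and runs COMMAND with that file as its
# stdin only when at least one byte arrived. Hypothetical helper; returns 3
# for empty input (an arbitrary code chosen for this sketch).
run_if_input_nonempty() {
    local buf status
    buf="$(mktemp)" || return 1
    cat > "$buf"
    if [ ! -s "$buf" ]; then
        echo "input is empty; skipping: $*" >&2
        rm -f "$buf"
        return 3
    fi
    "$@" < "$buf"
    status=$?
    rm -f "$buf"
    return "$status"
}

# Example (not run here; generate_rows is a placeholder for the real producer):
#   generate_rows | run_if_input_nonempty \
#       clickhouse-client --query "INSERT INTO my_table FORMAT JSONEachRow"
```

The guard buffers the whole input on disk before forwarding it, which is fine for small batches but costs temporary space for large streams.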

For inter-node issues, check the remote node's logs:
```
grep -i "error\|exception" /var/log/clickhouse-server/clickhouse-server.log | tail -30
```
The remote node may have encountered an error that caused it to send an empty response.

Verify the file is not a zero-byte placeholder. Some data pipelines create marker files or empty files that ClickHouse may attempt to read:

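-- Filter on the _size virtual column (available in newer ClickHouse versions) to skip zero-byte objects.
-- If the error persists on your version, exclude empty files at the application level instead.
SELECT * FROM s3('s3://bucket/data/*.parquet') WHERE _size > 0;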
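
Zero-byte placeholders can also be spotted in a bucket listing before the query runs. The sketch below assumes the usual `aws s3 ls` output layout of date, time, size, and key. The function name zero_byte_keys is hypothetical.

```shell
# zero_byte_keys: read `aws s3 ls` style lines (date, time, size, key) on
# stdin and print the keys whose size is 0. Hypothetical helper.
# Limitation: a key containing spaces is cut at the first space.
zero_byte_keys() {
    # "PRE <prefix>/" lines have fewer than four fields and are skipped.
    awk 'NF >= 4 && $3 ~ /^[0-9]+$/ && $3 + 0 == 0 { print $4 }'
}

# Example (not run here; the bucket and prefix are placeholders):
#   aws s3 ls s3://bucket/data/ --recursive | zero_byte_keys
```

Any key printed by this check is a candidate for exclusion from the glob pattern or for cleanup by the pipeline that created it.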

Best Practices

Validate data source availability and content before issuing queries that depend on external data.
In ETL pipelines, add checks for empty files or responses before feeding data into ClickHouse.
When using glob patterns with s3() or file(), be aware that empty files matching the pattern can trigger this error.
Implement retry logic with appropriate backoff for transient data availability issues from external sources.
Monitor external data sources for availability and content to detect problems before they affect ClickHouse queries.
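
The retry recommendation above could be sketched in bash as follows. The function name retry_with_backoff and its argument order are assumptions made for this illustration, and the helper retries on any non-zero exit status, not only on RECEIVED_EMPTY_DATA.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the
# delay after every failure. BASE_DELAY_SECONDS must be a whole number.
# Hypothetical helper, not part of ClickHouse.
retry_with_backoff() {
    local max_attempts="$1" delay="$2" attempt=1 status=0
    shift 2
    while true; do
        "$@" && return 0
        status=$?   # exit status of the failed command
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return "$status"
        fi
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Example (not run here; the URL is the placeholder used earlier):
#   retry_with_backoff 5 2 clickhouse-client --query \
#       "SELECT count() FROM url('http://example.com/data', 'JSONEachRow', 'col1 String')"
```

A production pipeline would likely inspect the error text as well, so that permanent failures such as a wrong URL or a schema mismatch are not retried.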

Frequently Asked Questions

Q: How does RECEIVED_EMPTY_DATA differ from NO_DATA_TO_INSERT?
A: NO_DATA_TO_INSERT is specific to INSERT operations where the query was received but no data payload followed. RECEIVED_EMPTY_DATA is broader -- it can occur in any context where a data block was expected but an empty one was received, including during reads from external sources and inter-node communication.

Q: Can this error occur intermittently?
A: Yes. If the data source is temporarily unavailable or returns empty responses sporadically (e.g., a flaky HTTP endpoint), the error will appear intermittently. Implementing retry logic is the appropriate mitigation.

Q: Does this error affect data already stored in the table?
A: No. The error prevents the current operation from completing, but it does not corrupt or modify existing data.

Q: Is there a setting to skip empty data blocks instead of failing?
A: There is no universal setting, but for specific table functions like s3(), you can use the SETTINGS clause with s3_skip_empty_files = 1 (available in newer ClickHouse versions) to skip empty files in glob patterns.