NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse DB::Exception: Cannot parse input assertion failed

The "DB::Exception: Cannot parse input: expected ..." error in ClickHouse fires when the parser encounters bytes that violate a structural expectation of the chosen format. The CANNOT_PARSE_INPUT_ASSERTION_FAILED error code means the parser tried to match a specific token -- a delimiter, a bracket, a keyword -- and found something else instead. It is essentially ClickHouse saying, "I expected to see X at this position, but I found Y."

Impact

This error rejects the entire batch, leaving the table unchanged:

  • INSERT operations fail completely, and no rows are committed.
  • Because the error is format-structural (not a single-value type mismatch), it often indicates a fundamental mismatch between the data and the declared format.
  • Retrying without fixing the root cause will produce the same failure every time.

Common Causes

  1. Wrong delimiter -- the data uses commas but ClickHouse is parsing as TSV (which expects tabs), or vice versa.
  2. Format mismatch -- declaring FORMAT JSONEachRow when the data is actually CSV, or using VALUES when the data is in another format.
  3. Missing or extra fields -- the parser expects a delimiter between fields but finds a newline (too few fields) or expects a newline but finds another delimiter (too many fields).
  4. Incorrect line endings -- the data uses \r\n (Windows) line endings, and the parser does not handle the extra carriage return correctly in some edge cases.
  5. BOM (byte order mark) -- a UTF-8 BOM at the start of the file is interpreted as unexpected bytes before the first valid token.
  6. Template or custom format issues -- when using CustomSeparated or Template formats, the format string does not match the actual data layout.

Troubleshooting and Resolution Steps

  1. Read the full error message. It typically says something like expected '\t' but found ',', which immediately tells you the delimiter is wrong:

    DB::Exception: Cannot parse input: expected '\t' at position 42 ...
    
  2. Verify the format matches the data. Open the file and inspect the actual delimiters:

    head -3 /path/to/data.tsv | cat -A
    

    The cat -A flag will show tabs as ^I and line endings as $ or ^M$.

  3. Switch to the correct format. If the data is comma-separated, use CSV:

    INSERT INTO my_table FORMAT CSV
    

    If it is tab-separated, use TSV:

    INSERT INTO my_table FORMAT TSV
    
  4. Remove BOM if present. Strip the BOM from UTF-8 files:

    sed -i '1s/^\xEF\xBB\xBF//' /path/to/data.csv
    
  5. Handle custom delimiters. If your data uses a pipe (|) or semicolon (;) as a delimiter, use CustomSeparated format with the appropriate settings:

    SET format_custom_field_delimiter = '|';
    INSERT INTO my_table FORMAT CustomSeparated
    
  6. Normalize line endings if you suspect Windows-style line endings are causing issues:

    dos2unix /path/to/data.csv
    
  7. Test with clickhouse-local to experiment with format settings without affecting a running server:

    clickhouse-local --query="SELECT * FROM file('data.csv', CSV, 'a String, b Int32') LIMIT 5"
    

Best Practices

  • Always inspect the first few lines of your data with visible whitespace characters (cat -A or hexdump -C) before choosing a format.
  • Document the expected format, delimiter, and line ending convention for each data source in your pipeline.
  • Use clickhouse-local for rapid format experimentation during development.
  • When generating data programmatically, use a library that handles formatting and escaping correctly rather than building delimited strings manually.
  • Strip BOMs from files as a standard preprocessing step, since most tools that write BOMs do so unnecessarily.

Frequently Asked Questions

Q: The error says "expected '\t' but found ','". What does this mean?
A: ClickHouse is parsing your data as TSV (tab-separated) but found a comma instead of a tab. Switch to FORMAT CSV or change the data to use tab delimiters.

Q: I am using CustomSeparated format and still getting this error. What should I check?
A: Verify that format_custom_field_delimiter, format_custom_row_before_delimiter, format_custom_row_after_delimiter, and format_custom_escaping_rule all match the actual structure of your data. Even a single mismatched character will trigger this error.

Q: Can line ending differences between Windows and Linux cause this error?
A: Yes. If your data has \r\n line endings but ClickHouse expects \n, the carriage return (\r) can be interpreted as part of the last field's value, which then fails to parse correctly. Converting to Unix line endings with dos2unix resolves this.

Q: Does this error ever happen with JSON formats?
A: It can, though it is less common. For example, if the parser expects a { at the start of a JSONEachRow line and finds something else (like a [ from a JSON array format), this error will fire.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.