NEW

Pulse 2025 Product Roundup: From Monitoring to AI-Native Control Plane

ClickHouse DB::Exception: Cannot parse text input

The "DB::Exception: Cannot parse text input" error is one of the more common parsing errors you will encounter in ClickHouse. Identified by the error code CANNOT_PARSE_TEXT, it fires when the server tries to interpret a text value and fails to convert it into the target column's data type. Think of it as a catch-all for type conversion problems -- the input simply does not look like what the column expects.

Impact

When ClickHouse raises CANNOT_PARSE_TEXT, the INSERT or query that triggered it fails and no rows from that batch are committed. The practical consequences include:

  • Data ingestion stops until the offending data is corrected or skipped.
  • Pipelines relying on strict error-free inserts will back up.
  • In large batch imports, a single bad value can block millions of otherwise valid rows.

Common Causes

  1. Type mismatch between source data and column definition -- a string like "N/A" landing in an Int32 column, or a decimal value going into an integer column.
  2. Locale-specific formatting -- numbers with commas as thousands separators (e.g., 1,000) or decimal commas (e.g., 3,14) that ClickHouse does not expect.
  3. Empty strings for non-Nullable columns -- an empty field in CSV that maps to a non-Nullable numeric or date column.
  4. Encoding issues -- non-UTF-8 characters or invisible control characters embedded in otherwise normal-looking text.
  5. Misaligned columns -- the data has shifted (a missing delimiter earlier in the row) so a text value ends up being parsed as a number or date.

Troubleshooting and Resolution Steps

  1. Read the full error message carefully. ClickHouse usually includes the column name, the expected type, and the offending value. This narrows the search immediately:

    DB::Exception: Cannot parse text 'abc' as Int32: ...
    
  2. Isolate the bad rows. Use input_format_allow_errors_num to let ClickHouse skip a limited number of bad rows and ingest the rest:

    SET input_format_allow_errors_num = 10;
    INSERT INTO my_table FORMAT CSV ...
    

    The skipped rows are logged so you can inspect them afterward.

  3. Check for empty fields. If your source data uses empty strings for missing values and the target column is not Nullable, either make the column Nullable or provide a default:

    ALTER TABLE my_table MODIFY COLUMN my_col Nullable(Int32);
    
  4. Handle locale-specific number formats. Strip thousands separators or convert decimal commas before import:

    sed 's/,//g' data.csv > data_clean.csv
    
  5. Use clickhouse-local to test parsing. This lets you validate data without touching a running server:

    clickhouse-local --query="SELECT * FROM file('data.csv', CSV, 'a Int32, b String') LIMIT 10"
    
  6. Cast explicitly in your query. If the data sometimes contains values that need coercion, use toInt32OrNull(), toFloat64OrZero(), or similar safe-cast functions to handle edge cases gracefully.

  7. Check encoding. Run file on your data to confirm it is UTF-8:

    file /path/to/data.csv
    

Best Practices

  • Define Nullable columns for any field that might legitimately be empty or missing in source data.
  • Use ClickHouse's OrNull and OrZero function variants (e.g., toInt32OrNull) when building transformation queries to avoid hard failures.
  • Validate data types at the source before they reach ClickHouse, especially when dealing with user-generated or third-party data.
  • Set input_format_allow_errors_num or input_format_allow_errors_ratio in production pipelines to prevent a single bad row from blocking an entire batch.
  • Log and alert on skipped rows so that data quality issues are visible rather than silently ignored.

Frequently Asked Questions

Q: How is CANNOT_PARSE_TEXT different from more specific errors like CANNOT_PARSE_NUMBER?
A: CANNOT_PARSE_TEXT is a general-purpose parsing error. ClickHouse may raise more specific error codes (like CANNOT_PARSE_NUMBER or CANNOT_PARSE_DATE) depending on the target type. When it does not have a more specific code, it falls back to CANNOT_PARSE_TEXT.

Q: Can I skip bad rows instead of failing the entire insert?
A: Yes. Set input_format_allow_errors_num to the maximum number of rows you are willing to skip, or use input_format_allow_errors_ratio to specify a percentage. Skipped rows are discarded from the insert but the rest go through.

Q: My CSV has empty fields for integer columns. What is the cleanest fix?
A: Either change the column to Nullable(Int32) so that empty values become NULL, or set input_format_csv_empty_as_default = 1 so that ClickHouse uses the column's default value for empty fields.

Q: I see this error only with certain rows in a large file. How do I find them?
A: Use clickhouse-local with a permissive schema (all String columns) to load the file, then query for rows where casting fails: SELECT * FROM file('data.csv', CSV, 'a String, b String') WHERE toInt32OrNull(a) IS NULL AND a != ''.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.