ClickHouse DB::Exception: Unexpected data after parsed value (Code: 632)

The "DB::Exception: Unexpected data after parsed value" error in ClickHouse occurs when the parser successfully reads a complete value but finds additional unexpected characters after it. The UNEXPECTED_DATA_AFTER_PARSED_VALUE error code (code 632) is raised during data ingestion or type casting when a field contains trailing content that does not belong to the parsed type.

Impact

The INSERT or query that triggered the parsing fails, and no rows from that batch are committed. This commonly blocks data pipelines and ETL jobs. The error message usually includes the position and surrounding characters where the unexpected data was found, which is helpful for diagnosing the issue.

Common Causes

  1. A numeric column contains values with trailing text, such as "123abc" being parsed as an integer
  2. A date or datetime string has extra characters after the valid timestamp, like "2024-01-15 extra"
  3. A JSON field contains content after the closing brace or bracket
  4. The input format delimiter does not match the actual data, causing field boundaries to be misidentified
  5. A CSV or TSV file has extra whitespace, BOM characters, or trailing commas that confuse the parser
  6. Type casting with CAST() or toInt64() on a string value that contains non-numeric suffixes

Troubleshooting and Resolution Steps

  1. Read the error message carefully. ClickHouse shows the leftover characters, the type it parsed, and the value it had parsed so far:

    DB::Exception: Unexpected data 'abc' after parsed Int64 value '123'
    
  2. If importing from a file, inspect the problematic rows. Look for trailing characters:

    SELECT my_column
    FROM file('data.csv', CSV, 'my_column String')
    WHERE NOT match(my_column, '^[0-9]+$')
    LIMIT 20;
    
  3. Use a more permissive type (String) to import the data first, then clean it:

    CREATE TABLE staging (raw_value String) ENGINE = Memory;
    INSERT INTO staging SELECT my_column FROM file('data.csv', CSV, 'my_column String');
    
    -- Inspect problematic values
    SELECT raw_value
    FROM staging
    WHERE toInt64OrNull(raw_value) IS NULL;
    
  4. Use ClickHouse's safe conversion functions to handle dirty data:

    SELECT
        toInt64OrDefault(raw_value, 0) AS cleaned_value
    FROM staging;
    
  5. If the issue is with date/datetime parsing, check for trailing timezone or extra text:

    SELECT
        parseDateTimeBestEffortOrNull(raw_timestamp) AS parsed
    FROM your_source
    WHERE parseDateTimeBestEffortOrNull(raw_timestamp) IS NULL;
    
  6. For JSON ingestion, validate that each line or record is well-formed JSON:

    python3 -c "import json, sys; [json.loads(line) for line in open('data.json')]"
    
  7. Check the input format settings. For CSV, ensure the delimiter and quoting rules match the data:

    SET format_csv_delimiter = ',';
    SET input_format_csv_allow_whitespace_or_tab_as_delimiter = 0;
    

Best Practices

  • Validate and clean data before ingesting it into ClickHouse, especially when the source is external or user-generated.
  • Use OrNull and OrDefault variants of conversion functions (e.g., toInt64OrNull, toDateOrNull) to handle dirty data gracefully.
  • When dealing with mixed-quality data, ingest into String columns first, then transform and move to typed tables.
  • Set input_format_allow_errors_num or input_format_allow_errors_ratio to skip a limited number of bad rows instead of failing the entire batch.
  • Include sample data validation in your ETL pipeline to catch format issues early.

Frequently Asked Questions

Q: Can I skip rows that fail to parse instead of failing the entire insert?
A: Yes. Set input_format_allow_errors_num to allow a specific number of malformed rows, or use input_format_allow_errors_ratio to allow a percentage. Skipped rows are discarded silently.

Q: Why does this error happen with JSON data?
A: This typically means the JSON has trailing characters after a complete JSON object, such as extra whitespace, duplicate closing braces, or concatenated objects without proper line separation.

Q: How is this different from a TYPE_MISMATCH error?
A: A type mismatch means the value cannot be interpreted as the target type at all (e.g., a word in an integer column). UNEXPECTED_DATA_AFTER_PARSED_VALUE means the beginning of the value parsed correctly, but there is leftover content that should not be there.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.