The "DB::Exception: Incorrect number of columns in INSERT" error means that ClickHouse found a row with a different number of fields than the table (or the explicit column list in the INSERT statement) expects. The INCORRECT_NUMBER_OF_COLUMNS error code is particularly common when working with delimited text formats like CSV and TSV, where an extra or missing delimiter can throw off the column count for the rest of the row.
Impact
A column-count mismatch causes the INSERT to fail entirely:
- No rows from the batch are written, even if only one row has the wrong number of fields.
- Automated pipelines will stall and require manual intervention or error-handling logic.
- In streaming scenarios, a single malformed message can block consumption from the source (e.g., Kafka topic).
Common Causes
- Extra or missing delimiters in the data -- a comma inside an unquoted CSV field splits one column into two, or a missing tab in TSV merges two columns into one.
- Schema evolution without updating the data source -- the table gained or lost columns but the export process still produces the old column count.
- Header row included in data -- CSV files with a header row that ClickHouse tries to parse as a data row, creating a mismatch if `input_format_csv_skip_first_lines` is not set.
- Partial column list in INSERT without matching data -- `INSERT INTO t (a, b)` with data containing three fields per row.
- Trailing delimiters -- some export tools add a trailing comma or tab at the end of each row, creating a phantom empty column.
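The unquoted-delimiter cause above is easy to reproduce with a toy file (the file name and contents here are illustrative, not from any real dataset):

```shell
# Two clean rows plus one row whose name field contains an unquoted comma
printf 'id,name,city\n1,Acme Inc,Berlin\n2,Smith, John,Paris\n' > /tmp/sample.csv

# awk's NF reports the field count per line; the unquoted comma in
# "Smith, John" makes line 3 parse as 4 columns instead of 3
awk -F',' '{ print NR ": " NF " fields" }' /tmp/sample.csv
# → 1: 3 fields
#   2: 3 fields
#   3: 4 fields
```

Properly quoting the field as `"Smith, John"` (per RFC 4180) is what keeps the comma inside a single column.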
Troubleshooting and Resolution Steps
Check the error message for the expected vs. actual count. ClickHouse typically reports something like "expected 5 columns, got 6", which tells you whether you have extra or missing fields.
Inspect the offending row. If you know the approximate row number, extract it and count the delimiters:
```shell
# For CSV, count commas in the problematic line
sed -n '1042p' data.csv | tr -cd ',' | wc -c
```

Check for unquoted delimiters within fields. This is the number one cause of column-count issues in CSV. Look for fields that contain commas but are not wrapped in double quotes:
```shell
# Show lines where the comma count differs from the header
header_commas=$(head -1 data.csv | tr -cd ',' | wc -c)
awk -F',' "NF != $((header_commas + 1))" data.csv | head -5
```

Skip the header row if present:
```sql
SET input_format_csv_skip_first_lines = 1;
INSERT INTO my_table FORMAT CSV ...
```

Specify an explicit column list. If your data contains only a subset of the table's columns, list them explicitly:
```sql
INSERT INTO my_table (col_a, col_b, col_c) FORMAT CSV ...
```

Handle trailing delimiters. If your data has a trailing delimiter on every row, you can preprocess it:
```shell
sed 's/,$//' data.csv > data_fixed.csv
```

Or add a dummy column to your table to absorb the extra empty field.
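To see why the trailing delimiter matters, here is a quick before/after check on a toy file (file names and contents are illustrative):

```shell
# Toy file where every row ends with a trailing comma
printf 'a,b,c,\n1,2,3,\n' > /tmp/trail.csv
awk -F',' '{ print NF }' /tmp/trail.csv          # → 4 on both lines (phantom empty column)

# Strip the trailing delimiter, then re-count
sed 's/,$//' /tmp/trail.csv > /tmp/trail_fixed.csv
awk -F',' '{ print NF }' /tmp/trail_fixed.csv    # → 3 on both lines
```

The phantom fourth column disappears once the trailing comma is removed, so the row width matches the table again.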
Use `input_format_allow_errors_num` to skip bad rows if only a few rows are affected and you want to ingest the rest:

```sql
SET input_format_allow_errors_num = 100;
```
Best Practices
- Always validate the column count of your source data before importing, especially when the data comes from external or third-party systems.
- Use `input_format_csv_skip_first_lines = 1` whenever your CSV files include a header row.
- Prefer quoted CSV formats (RFC 4180 compliant) where fields containing delimiters are wrapped in double quotes.
- When table schemas change, update all data producers and transformation logic at the same time.
- Test imports with a small sample before running full loads to catch structural mismatches early.
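The first bullet (validating column counts before importing) can be sketched as a small pre-flight script. This is a minimal sketch, not part of ClickHouse: the default file name `data.csv` is an assumption, and the naive comma split will miscount RFC 4180 quoted fields that contain embedded commas, so use a real CSV parser for quoted data:

```shell
#!/bin/sh
# Pre-flight check: compare every row's field count to the header's
# and print the line numbers that deviate. Exits non-zero on mismatch,
# so it can gate a pipeline step before the actual INSERT.
file=${1:-data.csv}    # assumed default file name
want=$(head -1 "$file" | awk -F',' '{ print NF }')
awk -F',' -v want="$want" \
    'NF != want { printf "line %d: %d fields (expected %d)\n", NR, NF, want; bad = 1 }
     END { exit bad }' "$file"
```

Run as `./check_columns.sh data.csv` before the import; a non-zero exit status means at least one row will trigger the column-count error.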
Frequently Asked Questions
Q: Can I tell ClickHouse to ignore extra columns in the data?
A: For JSON formats you can set `input_format_skip_unknown_fields = 1`. For CSV/TSV there is no direct equivalent -- you need to either fix the data or define a table/column list that matches the actual field count.
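One way to "fix the data" when every row carries known extra trailing columns is to trim rows to the expected width with `cut` before loading. This is an illustrative workaround, only safe when fields contain no quoted embedded commas; the file name and column count are made up:

```shell
# Rows have a fourth column the table does not want; keep only columns 1-3
printf '1,2,3,extra\n4,5,6,extra\n' > /tmp/wide.csv
cut -d',' -f1-3 /tmp/wide.csv
# → 1,2,3
#   4,5,6
```

The trimmed stream can then be piped straight into `clickhouse-client` for the INSERT.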
Q: What if only some rows have the wrong number of columns?
A: Use `input_format_allow_errors_num` or `input_format_allow_errors_ratio` to skip those rows. The valid rows will be inserted and the bad ones will be reported in the server log.
Q: My data has a trailing comma on every line. Is there a ClickHouse setting for that?
A: There is no dedicated setting for trailing delimiters. The simplest approach is to preprocess the file to remove trailing commas, or to add an extra dummy column (e.g., `_dummy String DEFAULT ''`) to your table to absorb the empty field.
Q: Does this error apply to JSON formats too?
A: Not directly. JSON formats like JSONEachRow use key-value pairs, so the column count is implicit. However, you may see a related error if the JSON object has missing required keys that don't have defaults.