The "DB::Exception: Avro exception" error in ClickHouse occurs when processing data in the Apache Avro format fails. The AVRO_EXCEPTION error code covers a range of issues including schema mismatches, corrupted files, unsupported Avro types, and encoding errors that arise during reading or writing Avro data.
Impact
The query or INSERT operation that triggered the Avro processing fails entirely. For SELECT queries reading Avro files (such as from S3 or file()), no results are returned. For INSERT operations writing Avro output, the output is not produced. The underlying data source is not modified by the failure.
Common Causes
- The Avro file is corrupted or truncated
- A schema mismatch between the Avro file's embedded schema and the ClickHouse table's column types
- The Avro file uses a type or logical type that ClickHouse does not support
- An incompatible Avro schema evolution — for example, a required field was removed
- The Avro file uses a codec (compression) that ClickHouse cannot decode
- Nested or complex union types that ClickHouse cannot map to its type system
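The first cause above can often be screened for without any Avro tooling: every Avro Object Container File begins with the four-byte magic `Obj\x01` (0x4F 0x62 0x6A 0x01). A minimal sketch in Python, with the file path as a placeholder:

```python
# Minimal sketch: screen a file for the Avro container magic bytes before
# pointing ClickHouse at it. A wrong prefix means the file is not an Avro
# Object Container File at all, or its beginning is corrupted.
AVRO_MAGIC = b"Obj\x01"

def looks_like_avro(path: str) -> bool:
    with open(path, "rb") as f:
        return f.read(4) == AVRO_MAGIC

# Usage (path is a placeholder): looks_like_avro("data.avro")
```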
Troubleshooting and Resolution Steps
Read the full error message carefully, as it typically includes details from the Avro library about what went wrong. Then reproduce the failure with a minimal query:

```sql
SELECT * FROM file('data.avro', 'Avro') LIMIT 1;
```

Validate the Avro file independently using the avro-tools utility:

```shell
java -jar avro-tools.jar getschema data.avro
java -jar avro-tools.jar tojson data.avro | head -5
```

Check schema compatibility by comparing the Avro schema to your ClickHouse table definition:

```sql
DESCRIBE TABLE my_table;
```

Pay attention to nullable fields (Avro unions with null), logical types (dates, timestamps), and nested records.
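If avro-tools is not available, the embedded schema and codec can be read straight out of the container file header. A minimal sketch (pure Python, no Avro library, following the standard Object Container File layout: magic, then a metadata map, then a 16-byte sync marker):

```python
# Sketch: decode the metadata map in an Avro Object Container File header.
# The map contains avro.schema (JSON) and usually avro.codec.
import io

AVRO_MAGIC = b"Obj\x01"

def _read_long(f) -> int:
    """Decode one Avro variable-length zig-zag long."""
    shift = acc = 0
    while True:
        byte = f.read(1)
        if not byte:
            raise ValueError("truncated file while reading a long")
        b = byte[0]
        acc |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)   # undo zig-zag encoding

def _read_bytes(f) -> bytes:
    n = _read_long(f)
    data = f.read(n)
    if len(data) != n:
        raise ValueError("truncated file while reading bytes")
    return data

def read_avro_header(path: str) -> dict:
    """Return the header metadata map, e.g. keys avro.schema and avro.codec."""
    with open(path, "rb") as f:
        if f.read(4) != AVRO_MAGIC:
            raise ValueError("not an Avro container file (bad magic)")
        meta = {}
        count = _read_long(f)
        while count != 0:
            if count < 0:            # negative count: a block byte size follows
                _read_long(f)
                count = -count
            for _ in range(count):
                key = _read_bytes(f).decode("utf-8")
                meta[key] = _read_bytes(f)
            count = _read_long(f)
        return meta
```

For example, `read_avro_header("data.avro").get("avro.codec", b"null")` shows which codec the file uses, and `avro.schema` holds the embedded schema as JSON, ready to compare against the table definition.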
If the Avro file uses an unsupported codec, re-encode it with a supported one (null, deflate, or snappy):

```shell
java -jar avro-tools.jar recodec --codec deflate input.avro output.avro
```

For schema evolution issues where the file lacks fields that the table expects, let ClickHouse fill the missing columns with default values instead of failing:

```sql
SELECT * FROM file('data.avro', 'Avro')
SETTINGS input_format_avro_allow_missing_fields = 1;
```

When schemas are managed in a registry (for example, Avro-encoded Kafka messages), use the AvroConfluent format and resolve schemas through the registry:

```sql
SET format_avro_schema_registry_url = 'http://schema-registry:8081';
```

If complex union types are the issue, flatten them before ingesting into ClickHouse, or use a preprocessing step to convert the Avro file to a simpler format.
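The flattening idea can be sketched as a preprocessing step. The field and column names below are illustrative, and the records are assumed to be already decoded into Python dicts (for example by an upstream job): a complex union field is split into one column per branch, each with a single simple type that maps cleanly to a Nullable ClickHouse column.

```python
# Illustrative preprocessing sketch (names are assumptions, not ClickHouse API):
# split a complex Avro union field into one column per branch, so each output
# column carries a single simple type.
def flatten_union(record: dict, field: str, branches: dict) -> dict:
    """branches maps a Python type to an output column suffix."""
    out = {k: v for k, v in record.items() if k != field}
    value = record.get(field)
    for py_type, suffix in branches.items():
        out[f"{field}_{suffix}"] = value if isinstance(value, py_type) else None
    return out

rows = [
    {"id": 1, "payload": "hello"},   # union branch: string
    {"id": 2, "payload": 42},        # union branch: long
]
flat = [flatten_union(r, "payload", {str: "str", int: "int"}) for r in rows]
# Each output row has payload_str and payload_int; exactly one is non-None.
```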
Best Practices
- Validate Avro files before ingesting them into ClickHouse, especially when they come from external sources.
- Keep Avro schemas aligned with ClickHouse table definitions, and update both when schema changes occur.
- Use a schema registry to manage Avro schema evolution and ensure backward compatibility.
- Prefer simple Avro types and avoid deeply nested unions when the data is destined for ClickHouse.
- Test Avro ingestion with a small sample before running large batch imports.
Frequently Asked Questions
Q: Which Avro types does ClickHouse support?
A: ClickHouse supports boolean, int, long, float, double, bytes, string, enum, array, fixed, and various logical types like date, timestamp-millis, and timestamp-micros. Complex union types beyond [null, T] may not be supported.
Q: Can I use AvroConfluent format with a schema registry?
A: Yes. Use the AvroConfluent format and set format_avro_schema_registry_url to point to your Confluent Schema Registry. This is common when consuming Avro-encoded Kafka messages.
Q: Why does my nullable field cause an error?
A: In Avro, nullable fields are represented as a union of null and another type, e.g., ["null", "string"]. ClickHouse maps this to a Nullable column. If your ClickHouse column is not Nullable but the Avro field includes null values, the conversion will fail.
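This union-to-Nullable rule can be sketched programmatically. The helper below is illustrative, not ClickHouse's actual conversion code, and its type table is a deliberately partial assumption based on the common mappings (int to Int32, long to Int64, and so on); it also shows why unions beyond [null, T] need flattening first.

```python
# Illustrative sketch (not ClickHouse source): suggest a ClickHouse column
# type for a parsed Avro field type. The mapping table is partial.
AVRO_TO_CH = {
    "boolean": "Bool", "int": "Int32", "long": "Int64",
    "float": "Float32", "double": "Float64",
    "bytes": "String", "string": "String",
}

def ch_type(avro_type):
    if isinstance(avro_type, list):                       # Avro union
        non_null = [t for t in avro_type if t != "null"]
        if "null" in avro_type and len(non_null) == 1:
            return f"Nullable({ch_type(non_null[0])})"    # [null, T] -> Nullable
        raise ValueError(f"complex union {avro_type}: flatten before ingesting")
    if isinstance(avro_type, dict):
        if avro_type.get("type") == "array":
            return f"Array({ch_type(avro_type['items'])})"
        return ch_type(avro_type["type"])                 # logical types ignored here
    return AVRO_TO_CH[avro_type]

print(ch_type(["null", "string"]))  # Nullable(String)
```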