The "DB::Exception: Cannot parse IPv4 address" error occurs when ClickHouse encounters a string that cannot be interpreted as a valid IPv4 address. The CANNOT_PARSE_IPV4 error code is raised specifically when the input fails the strict validation rules for the IPv4 data type: four decimal octets separated by dots, each in the range 0-255.
Impact
This error prevents the INSERT or query from completing:
- The entire batch fails and no rows are committed.
- Network monitoring, security, and log analytics pipelines that heavily use IP address columns are commonly affected.
- Even one malformed address in a large import will block the whole operation unless error tolerance is configured.
Common Causes
- Octet out of range -- a value like
192.168.1.256where one octet exceeds 255. - Too few or too many octets -- strings like
10.0.1(three octets) or10.0.0.1.5(five octets). - Non-numeric characters -- letters, spaces, or special characters within the IP string (e.g.,
10.0.0.x). - Leading zeros -- while
010.000.001.001is sometimes accepted, inconsistently formatted octets can cause issues depending on the context. - Empty or placeholder values -- empty strings,
0,unknown, orN/Ain an IPv4 column. - Appended port or suffix -- values like
10.0.0.1:80or10.0.0.1/24. - IPv6 addresses -- inserting an IPv6 address into an IPv4 column.
Troubleshooting and Resolution Steps
Check the error message. ClickHouse reports the specific string that could not be parsed as IPv4.
Scan for invalid addresses before import:
SELECT raw_ip FROM file('data.csv', CSV, 'raw_ip String') WHERE toIPv4OrNull(raw_ip) IS NULL AND raw_ip != '' LIMIT 20;Use safe conversion functions in INSERT...SELECT patterns:
INSERT INTO my_table SELECT toIPv4OrDefault(raw_ip) AS client_ip FROM staging;Strip port numbers and CIDR suffixes:
# Remove :port sed 's/:[0-9]\{1,5\}$//' data.csv > data_clean.csv # Remove /mask sed 's|/[0-9]\{1,2\}||' data_clean.csv > data_final.csvMake the column Nullable when IPs may be absent:
ALTER TABLE my_table MODIFY COLUMN src_ip Nullable(IPv4);Use error tolerance to skip bad rows:
SET input_format_allow_errors_num = 50; INSERT INTO my_table FORMAT CSVIf you have mixed IPv4/IPv6 data, switch to the
IPv6column type, which can store both:ALTER TABLE my_table MODIFY COLUMN src_ip IPv6;
Best Practices
- Validate IPv4 addresses at the source or in your ETL layer before they reach ClickHouse.
- Use
Nullable(IPv4)for any column that may contain missing or unknown IP values. - If your data contains a mix of IPv4 and IPv6 addresses, use the
IPv6type to accommodate both. - Separate IP addresses from port numbers, CIDR prefixes, and other metadata before storage.
- Use
toIPv4OrNullin ad-hoc queries and transformations to gracefully handle bad data.
Frequently Asked Questions
Q: What exactly constitutes a valid IPv4 string for ClickHouse?
A: Four decimal integers separated by dots, where each integer is in the range 0-255. For example: 192.168.1.1, 0.0.0.0, 255.255.255.255. No spaces, ports, masks, or non-numeric characters.
Q: Does ClickHouse accept leading zeros in IPv4 octets like 010.001.001.001?
A: ClickHouse generally accepts leading zeros in IPv4 parsing, treating them as decimal (not octal). However, it is best practice to avoid leading zeros to prevent ambiguity.
Q: Can I convert an integer to an IPv4 address?
A: Yes. Use toIPv4(number) where number is a UInt32 value. For example, toIPv4(3232235777) returns 192.168.1.1.
Q: My data sometimes has the string 0 instead of a proper IP. What happens?
A: The string 0 is not a valid IPv4 format. Use Nullable(IPv4) and map 0 to NULL in preprocessing, or use toIPv4OrDefault('0') which will return 0.0.0.0.