The substring function in ClickHouse extracts a portion of a string, given a start position and an optional length. It is particularly useful for string manipulation and data extraction within queries.
Syntax
substring(str, start[, length])
Example usage
SELECT substring('Hello, World!', 1, 5) AS result;
-- Output: Hello
SELECT substring('ClickHouse', -5) AS result;
-- Output: House
Common issues
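- Positions are 1-based: the first character of the string is at position 1.
- If the start position is beyond the string length, an empty string is returned.
- A negative start position counts back from the end of the string, so -1 refers to the last character.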
- If length is omitted, the function returns the substring from the start position to the end of the string.
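The behaviors listed above can be checked with a short query. This is a minimal sketch using literal strings only; the column aliases are illustrative.

```sql
SELECT
    substring('ClickHouse', 20)    AS past_end,      -- '' (start is beyond the 10-character string)
    substring('ClickHouse', -5, 3) AS from_the_end,  -- 'Hou' (start 5 characters from the end, take 3)
    substring('ClickHouse', 6)     AS to_the_end;    -- 'House' (length omitted, runs to the end)
```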
Best practices
- Use substring in combination with other string functions for complex text processing.
- Be cautious with large datasets as string operations can be resource-intensive.
- Consider using materialized columns for frequently used substring operations to improve query performance.
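As a rough illustration of the first and last points, the sketch below combines substring with position and stores the result in a materialized column. The table name users and the columns email and email_domain are hypothetical, chosen only for this example.

```sql
-- Hypothetical table: `users`, `email`, and `email_domain` are illustrative names.
CREATE TABLE users
(
    email String,
    -- Computed once at insert time and stored with the row.
    -- If '@' is absent, position() returns 0 and the whole string is kept.
    email_domain String MATERIALIZED substring(email, position(email, '@') + 1)
)
ENGINE = MergeTree
ORDER BY email;

INSERT INTO users (email) VALUES ('alice@example.com'), ('bob@example.org');

-- MATERIALIZED columns are not returned by SELECT *; name them explicitly.
SELECT email_domain, count() AS n
FROM users
GROUP BY email_domain;
```

Queries that group or filter on email_domain then read the stored value instead of recomputing the substring for every row.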
Frequently Asked Questions
Q: Can I use substring with non-ASCII characters?
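A: Partly. substring counts positions and lengths in bytes, so a multi-byte UTF-8 character can be cut in the middle. For text that contains non-ASCII characters, use substringUTF8, which counts Unicode code points instead.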
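One caveat is worth seeing in a query: substring measures in bytes, while substringUTF8 measures in Unicode code points. A small sketch, using a Cyrillic word in which every letter occupies two bytes in UTF-8:

```sql
SELECT
    substring('Привет', 1, 4)     AS by_bytes,        -- 'Пр' (4 bytes cover 2 letters)
    substringUTF8('Привет', 1, 4) AS by_code_points;  -- 'Прив' (4 code points)
```

An odd byte length here would split a letter and produce invalid UTF-8, which is why substringUTF8 is the safer choice for such text.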
Q: How does substring handle NULL values?
A: If any of the arguments are NULL, substring returns NULL.
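A minimal sketch of the NULL case, assuming a Nullable(String) input; ifNull is shown as one way to substitute a default value when NULL is not wanted downstream.

```sql
SELECT
    substring(CAST(NULL AS Nullable(String)), 1, 3)             AS null_input,     -- NULL
    ifNull(substring(CAST(NULL AS Nullable(String)), 1, 3), '') AS with_fallback;  -- ''
```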
Q: Is there a performance difference between substring and left/right functions?
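A: For simple extractions from the beginning or end of a string, left and right state the intent more directly and might be slightly more efficient. Any difference is likely to be small, so measure on your own data before relying on it.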
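For reference, the equivalent forms look like this (a sketch with literal strings; aliases are illustrative):

```sql
SELECT
    left('ClickHouse', 5)         AS left_5,      -- 'Click'
    substring('ClickHouse', 1, 5) AS substr_1_5,  -- 'Click'
    right('ClickHouse', 5)        AS right_5,     -- 'House'
    substring('ClickHouse', -5)   AS substr_neg5; -- 'House'
```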
Q: Can substring be used in WHERE clauses?
A: Yes, substring can be used in WHERE clauses for filtering based on parts of strings.
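A short sketch of such a filter. The orders table and its order_id column are hypothetical and stand in for any String column whose leading characters encode a category.

```sql
-- Hypothetical table and column names, used only for illustration.
SELECT order_id
FROM orders
WHERE substring(order_id, 1, 2) = 'EU';

-- For a plain prefix test, startsWith expresses the same condition.
SELECT order_id
FROM orders
WHERE startsWith(order_id, 'EU');
```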
Q: How does substring behave with empty strings?
A: When applied to an empty string, substring returns an empty string regardless of the start and length parameters.
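This can be confirmed with literal inputs; the empty function is used here only to make the empty result visible.

```sql
SELECT
    substring('', 1, 5)        AS with_length,    -- ''
    substring('', -3)          AS negative_start, -- ''
    empty(substring('', 1, 5)) AS is_empty;       -- 1
```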