The SIMILAR TO operator in PostgreSQL provides SQL standard regular expression pattern matching. It combines features of LIKE wildcards with regex character classes and quantifiers, offering more powerful pattern matching than LIKE but with SQL-standard syntax.
Syntax
-- Pattern matching with SIMILAR TO
column SIMILAR TO 'pattern'
column NOT SIMILAR TO 'pattern'
-- Common pattern elements:
-- % matches zero or more characters (like LIKE)
-- _ matches exactly one character (like LIKE)
-- | alternation (OR)
-- * zero or more repetitions
-- + one or more repetitions
-- ? zero or one occurrence
-- {m} exactly m repetitions
-- {m,} m or more repetitions
-- {m,n} between m and n repetitions
-- [chars] character class
-- [^chars] negated character class
-- () grouping
Example usage
-- Character class matching
SELECT * FROM products
WHERE sku SIMILAR TO '[A-Z]{3}-[0-9]{3}';
-- Matches: ABC-123, XYZ-789, etc.
-- Alternation (OR)
SELECT * FROM files
WHERE filename SIMILAR TO '%(\.pdf|\.doc|\.docx)';
-- Matches files ending in .pdf, .doc, or .docx
-- Quantifiers
SELECT * FROM phone_numbers
WHERE number SIMILAR TO '[0-9]{3}-[0-9]{3}-[0-9]{4}';
-- Matches: 555-123-4567
-- One or more (+)
SELECT * FROM codes
WHERE code SIMILAR TO '[A-Z]+[0-9]+';
-- Matches: ABC123, X1, HELLO99, etc.
-- Zero or more (*)
SELECT * FROM usernames
WHERE username SIMILAR TO '[a-z][a-z0-9]*';
-- Matches: user, user123, abc, x, etc.
-- Optional character (?)
SELECT * FROM words
WHERE word SIMILAR TO 'colou?r';
-- Matches: color, colour
-- Grouping with parentheses
SELECT * FROM emails
WHERE email SIMILAR TO '%@(gmail|yahoo|hotmail)\.com';
-- Character class ranges
SELECT * FROM passwords
WHERE password SIMILAR TO '%[A-Z]%' -- Contains uppercase
AND password SIMILAR TO '%[a-z]%' -- Contains lowercase
AND password SIMILAR TO '%[0-9]%'; -- Contains digit
-- Negated character class
SELECT * FROM usernames
WHERE username SIMILAR TO '[^0-9]%';
-- Starts with non-digit
-- Complex pattern: email validation
SELECT * FROM contacts
WHERE email SIMILAR TO '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}';
-- IP address pattern
SELECT * FROM servers
WHERE ip_address SIMILAR TO '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}';
-- NOT SIMILAR TO
SELECT * FROM products
WHERE name NOT SIMILAR TO '%[0-9]%';
-- Products without numbers in name
-- Multiple constraints
SELECT * FROM products
WHERE code SIMILAR TO '[A-Z]{2}[0-9]{4}'
AND name SIMILAR TO '%(Pro|Plus|Premium)%';
-- Date format matching (YYYY-MM-DD)
SELECT * FROM dates
WHERE date_string SIMILAR TO '[0-9]{4}-[0-9]{2}-[0-9]{2}';
-- Hexadecimal color codes
SELECT * FROM colors
WHERE hex_code SIMILAR TO '#[0-9A-Fa-f]{6}';
-- Postal codes (US ZIP)
SELECT * FROM addresses
WHERE zip_code SIMILAR TO '[0-9]{5}(-[0-9]{4})?';
-- Matches: 12345 or 12345-6789
-- Product codes with optional suffix
SELECT * FROM inventory
WHERE item_code SIMILAR TO '[A-Z]{3}-[0-9]{3}(-[A-Z])?';
-- Matches: ABC-123 or ABC-123-X
Common issues
- Escaping: Special characters like
.need escaping with\to match literally. - Performance: SIMILAR TO is generally slower than simple LIKE patterns.
- Not full regex: Less powerful than POSIX regex (
~operator). No lookaheads, backreferences, etc. - SQL standard: SIMILAR TO is SQL standard but less commonly used than POSIX regex.
Best practices
- For simple wildcards, use LIKE (faster and simpler).
- For full regex power, use
~or~*POSIX regex operators. - Use SIMILAR TO when you need SQL-standard portable queries.
- Test patterns thoroughly, as regex can be tricky to get right.
Frequently Asked Questions
Q: What's the difference between SIMILAR TO and LIKE?
A: LIKE uses only % (any characters) and _ (single character) wildcards. SIMILAR TO supports character classes [A-Z], quantifiers {n}, alternation |, and more regex features.
Q: Should I use SIMILAR TO or the ~ operator for regex?
A: The ~ operator provides full POSIX regular expressions with more features (lookaheads, backreferences, etc.). SIMILAR TO is SQL-standard but less powerful. Use ~ for PostgreSQL-specific applications.
Q: How do I match a literal % or _ with SIMILAR TO?
A: Escape with backslash: SIMILAR TO '%\%%' matches strings containing literal %. Unlike LIKE, SIMILAR TO requires escaping these characters.
Q: Can SIMILAR TO be case-insensitive?
A: Not directly. Use character classes with both cases: [Cc]olor matches "color" or "Color". Or use POSIX ~* operator for case-insensitive regex.
Q: Is SIMILAR TO slower than LIKE?
A: Yes, SIMILAR TO involves regex compilation and matching, which is slower than simple LIKE wildcards. Use LIKE when simple patterns suffice.