PostgreSQL Invalid Regular Expression (SQLSTATE 2201B)

When PostgreSQL cannot parse a regular expression pattern, it raises ERROR: invalid regular expression: <reason> with SQLSTATE 2201B and condition name invalid_regular_expression. This occurs at query execution time, not at parse time, so the error surfaces when the function or operator is actually evaluated against a pattern that violates POSIX ERE syntax.

What This Error Means

SQLSTATE 2201B belongs to error class 22 — Data Exception. These are runtime errors that occur because a value supplied to an operation is inappropriate for that operation, as opposed to syntax or connection errors. The condition name invalid_regular_expression is defined in the SQL standard and PostgreSQL maps it directly to class 22.

PostgreSQL's regular expression engine is based on Henry Spencer's ARE (Advanced Regular Expressions) library, which supports POSIX ERE syntax with extensions. When a pattern string is syntactically malformed — unbalanced brackets, invalid escape sequences, a quantifier with no preceding atom, and so on — the engine rejects it before attempting any match, and the error is raised immediately.

The transaction remains open after this error; the statement that triggered it is rolled back, but no implicit connection-level state is changed. You can catch it in PL/pgSQL with an EXCEPTION block using the condition name invalid_regular_expression or the SQLSTATE literal.

Common Causes

  1. Unbalanced bracket expressions. A character class like [a-z without a closing ] is invalid.
  2. Unmatched parentheses. An opening ( or (?: without a corresponding ) breaks the pattern.
  3. Quantifier with no preceding atom. Starting a pattern with *, +, ?, or {n,m} — for example *foo — leaves the quantifier nothing to apply to.
  4. Invalid back-reference or escape sequence. Using \d or \w as if PostgreSQL's default POSIX engine recognised Perl-style shorthand — it does not unless you use the regexp_* functions with the flags argument set appropriately, or switch to SIMILAR TO semantics. Under the default POSIX ERE engine, \d is not a digit shorthand and may trigger this error.
  5. Unterminated or malformed interval quantifier. A pattern like a{3, without the closing } is invalid.
  6. Dynamic patterns built from user input. Concatenating user-supplied strings into a pattern without sanitisation can produce syntactically broken patterns at runtime.

How to Fix invalid_regular_expression

  1. Validate the pattern before using it. The simplest check is to test the pattern in psql or against a known string:

    SELECT 'test' ~ 'your_pattern_here';
    

    If this raises 2201B, the pattern itself is broken.

  2. Fix unbalanced brackets and parentheses. Audit the pattern character by character:

    -- Bad: unbalanced bracket
    SELECT 'hello' ~ '[a-z';
    -- ERROR:  invalid regular expression: brackets [] not balanced
    
    -- Fixed:
    SELECT 'hello' ~ '[a-z]';
    
  3. Remove leading quantifiers. A quantifier must follow an atom (a literal character, ., a bracket class, or a group):

    -- Bad: quantifier with no atom
    SELECT 'hello' ~ '*ello';
    -- ERROR:  invalid regular expression: quantifier operand invalid
    
    -- Fixed: anchor or start with a character
    SELECT 'hello' ~ 'h*ello';
    
  4. Avoid Perl-style shorthands under POSIX ERE. PostgreSQL's ~ operator and regexp_match / regexp_replace use POSIX ERE by default. Shorthand classes like \d, \w, \s are not valid in that mode. Use POSIX bracket expressions instead:

    -- Bad: \d not valid in POSIX ERE
    SELECT '123' ~ '\d+';
    -- ERROR:  invalid regular expression: invalid escape \ sequence
    
    -- Fixed: use POSIX bracket expression
    SELECT '123' ~ '[0-9]+';
    
  5. Escape dynamic input before embedding it in a pattern. If the pattern comes from user input or an application variable, escape metacharacters with regexp_replace before embedding:

    -- Escape a user-supplied literal to use as a fixed string
    SELECT regexp_replace(user_input, '([.^$*+?|()\[\]{}\\])', '\\\1', 'g');
    

    Better yet, restructure the query so user input is a plain string operand rather than part of the pattern.

  6. Handle the error in PL/pgSQL if the pattern is truly dynamic:

    DO $$
    DECLARE
      pat text := '[bad';
      result text;
    BEGIN
      SELECT 'hello' ~ pat INTO result;
    EXCEPTION
      WHEN invalid_regular_expression THEN
        RAISE NOTICE 'Pattern is not a valid regular expression: %', pat;
    END;
    $$;
    

Additional Information

  • PostgreSQL has used the same Spencer ARE engine since very early versions; the POSIX ERE behaviour has been consistent across all modern releases (9.x through 17.x).
  • Related SQLSTATE codes in class 22: 22025 (invalid_escape_sequence), 2201W (invalid_row_count_in_limit_clause), 22023 (invalid_parameter_value). If the pattern is structurally valid but uses a flag or option the engine does not support, you may see 22023 instead.
  • Most PostgreSQL client drivers (libpq, psycopg2, asyncpg, JDBC) surface this as a server-side ProgrammingError or ServerError with the SQLSTATE 2201B in the exception metadata. Check e.pgcode (Python) or getSQLState() (Java) to distinguish this from other runtime errors.
  • ORMs that build dynamic LIKE or regex clauses — such as Django's regex field lookup or ActiveRecord's matches_regexp — will propagate this as an unhandled database error if user input is passed through without escaping. Treat any user-controlled regex as untrusted input.

Frequently Asked Questions

Why does \d work in some tools but fail in PostgreSQL? Most other regex flavours (Python re, JavaScript, Perl) support \d as a digit shorthand. PostgreSQL's default regex engine is POSIX ERE, which does not define \d. Use [0-9] or [[:digit:]] instead. The SIMILAR TO operator uses a different (SQL-standard) regex dialect that also does not support \d.

Can I use Perl-compatible regex (PCRE) in PostgreSQL? Not natively. PostgreSQL ships with its own ARE engine, not PCRE. If you need PCRE features, the pg_pcre extension (third-party) adds PCRE support, but it is not bundled with the standard distribution. For most practical needs, POSIX ERE with bracket expressions covers the same ground.

Does this error abort my entire transaction? No — only the statement that raised the error is rolled back. The surrounding transaction remains open (in an error state requiring a ROLLBACK or SAVEPOINT rollback if you are in an explicit transaction block). In autocommit mode (the default in most clients), only the single statement is affected.

How can I test whether a string is a valid regex before executing a query? The cleanest approach is a small helper function that wraps the pattern test in an exception handler:

CREATE OR REPLACE FUNCTION is_valid_regexp(pat text) RETURNS boolean
LANGUAGE plpgsql AS $$
BEGIN
  PERFORM '' ~ pat;
  RETURN true;
EXCEPTION
  WHEN invalid_regular_expression THEN
    RETURN false;
END;
$$;

SELECT is_valid_regexp('[0-9]+');  -- true
SELECT is_valid_regexp('[0-9');    -- false

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.