PostgreSQL Untranslatable Character (SQLSTATE 22P05)

When PostgreSQL cannot translate a character stored in the database encoding into the encoding requested by the client, it raises:

ERROR:  character with byte sequence 0x... in encoding "UTF8" has no equivalent in encoding "LATIN1"
SQLSTATE: 22P05

The condition name is untranslatable_character, and it belongs to SQLSTATE class 22 (Data Exception).

What This Error Means

SQLSTATE class 22 covers data exceptions — situations where the data itself violates a constraint or encoding rule rather than a problem with SQL syntax or schema. The untranslatable_character error (22P05) is specifically about encoding translation: PostgreSQL has a character in its internal storage that it cannot represent in the encoding the client has requested.

PostgreSQL stores data in the database's server encoding (typically UTF8 for modern databases, but possibly LATIN1, WIN1252, EUC_JP, or others for older databases). When a client connects, it negotiates an encoding via the client_encoding setting. Before sending query results to the client, PostgreSQL attempts to transcode the data from the server encoding to the client encoding. If a character exists in the server encoding that has no equivalent in the client encoding, PostgreSQL raises 22P05.

This is distinct from SQLSTATE 22021 (character_not_in_repertoire), which is raised during writes — when inserting or converting data whose source encoding cannot represent the character. Error 22P05 typically surfaces during reads or on output, when a row already stored in the database contains a character that the client's encoding cannot handle.

Common Causes

  1. Mismatch between database encoding and client encoding. A database stored as UTF8 contains characters (e.g., emoji, CJK characters, certain diacritics) that have no representation in the client encoding (LATIN1, WIN1252, etc.). This is by far the most common cause.

  2. Legacy database with a narrow server encoding. Applications connecting to an older database using LATIN1 or WIN1252 as the server encoding may encounter this if data was loaded by bypassing encoding checks and now contains byte sequences the encoding does not support.

  3. Explicit SET client_encoding to a restrictive encoding. A session or application layer issues SET client_encoding = 'LATIN1' (or equivalent JDBC/libpq parameter), and then queries a table that contains Unicode-only characters such as emoji or non-Latin scripts.

  4. COPY or \copy with an incompatible encoding target. Exporting data with COPY TO ... WITH ENCODING 'LATIN1' from a UTF8 database that holds multi-byte characters outside the LATIN1 range.

How to Fix untranslatable_character

  1. Set client encoding to UTF8 (recommended). The simplest and safest fix for most applications is to ensure the client requests UTF8:

    SET client_encoding = 'UTF8';
    

    For libpq-based connections (psql, most drivers), you can also set it in the connection string or via the PGCLIENTENCODING environment variable:

    PGCLIENTENCODING=UTF8 psql -d mydb
    

    JDBC drivers accept it as a connection property: ?options=-c%20client_encoding%3DUTF8.

  2. Identify which rows contain the problematic characters. Before changing the encoding, you may want to locate the offending data:

    SELECT id, col
    FROM my_table
    WHERE col != convert_from(
        convert_to(col, 'LATIN1'), 'LATIN1'
    );
    

    A simpler check using a TRY approach is possible with a helper function if you need to avoid exceptions in a loop.

  3. Use convert or encode to handle untranslatable characters explicitly. If you must export to a narrow encoding, replace or remove characters that cannot be translated:

    SELECT encode(
        convert(col::bytea, 'UTF8', 'LATIN1'),
        'escape'
    ) FROM my_table;
    

    Alternatively, strip non-LATIN1 characters using a regex before the conversion:

    SELECT regexp_replace(col, '[^ -ÿ]', '', 'g')
    FROM my_table;
    
  4. Migrate the database to UTF8. For databases with a narrow server encoding that was always a poor fit, a proper fix is migrating to UTF8. This requires a dump-and-restore with encoding conversion:

    pg_dump -d olddb | iconv -f LATIN1 -t UTF8 | psql -d newdb_utf8
    

    Verify the target database was created with ENCODING = 'UTF8' first.

Additional Information

  • SQLSTATE class 22 includes related encoding errors: 22021 (character_not_in_repertoire) occurs on input/writes; 22P05 occurs on output/reads. Both relate to encoding mismatches but at different stages.
  • PostgreSQL's encoding conversion system uses conversion functions stored in pg_conversion. You can inspect available conversions with SELECT * FROM pg_conversion;.
  • The pg_client_encoding() function returns the current session's client encoding, which is useful for debugging: SELECT pg_client_encoding();
  • Most modern PostgreSQL deployments (PostgreSQL 9.x and later) default to UTF8 for both server and client encoding, making this error much less common than it was with PostgreSQL 7.x/8.x era databases.
  • ORMs and drivers vary in how they surface this error. Python's psycopg2 and asyncpg will raise a DataError wrapping the original PostgreSQL exception. Java's JDBC driver raises SQLException with SQLState 22P05. Django will surface it as django.db.DataError.
  • If you see this error in logs but cannot reproduce it interactively, check whether your application explicitly sets client_encoding in the connection pool configuration.

Frequently Asked Questions

Why does this error only appear for some rows, not all? The error is triggered by specific characters that exist in the server encoding but not in the client encoding. Most of your data may consist of basic ASCII or characters that happen to exist in both encodings. The error only fires when PostgreSQL tries to transcode a character that falls outside the target encoding's repertoire — for example, an emoji or a Chinese character in a column that is otherwise mostly ASCII text.

Will changing client_encoding affect performance? The transcoding step adds minimal CPU overhead. For most workloads this is imperceptible. If you are streaming very large text blobs at high throughput, UTF8-to-UTF8 passthrough (no conversion needed) is slightly faster than cross-encoding conversion, which is another reason to align server and client encodings.

Can I suppress the error and return a replacement character instead? Not natively via a single setting. PostgreSQL does not have a built-in ON ERROR REPLACE option for encoding conversion. You need to either use regexp_replace to strip problematic characters before conversion, or handle the error in application code and retry with a different approach (e.g., requesting the column as a bytea and decoding in the application).

How do I check the server encoding of my database?

SELECT pg_encoding_to_char(encoding) AS server_encoding
FROM pg_database
WHERE datname = current_database();

Or simply run \l in psql to see the encoding column for all databases.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.