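The "Character not in repertoire" error (SQLSTATE 22021, character_not_in_repertoire, in PostgreSQL) occurs when you try to insert text containing bytes or characters that are not valid in the database's character encoding. It is closely related to "invalid byte sequence" errors, which are reported under the same condition.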
Impact
The error aborts any insert that contains incompatible characters, which can block imports from systems that use a different encoding.
Common Causes
- Character encoding mismatch
- Invalid characters for database encoding
- Binary data in text fields
- Control characters in text
- Encoding conversion failures
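
An encoding mismatch, the first cause listed above, is easy to reproduce outside the database. The Python sketch below encodes a string as Latin-1 and then checks whether the resulting bytes are valid UTF-8. The helper name `first_invalid_utf8_offset` is hypothetical and used only for illustration.

```python
from typing import Optional


def first_invalid_utf8_offset(raw: bytes) -> Optional[int]:
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# "café" encoded as Latin-1 stores é as the single byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
latin1_bytes = "café".encode("latin-1")
utf8_bytes = "café".encode("utf-8")

print(first_invalid_utf8_offset(latin1_bytes))  # 3
print(first_invalid_utf8_offset(utf8_bytes))    # None
```

Sending the Latin-1 bytes to a UTF8 database while the client claims to be sending UTF-8 is the kind of input that would be expected to trigger the error.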
Troubleshooting and Resolution Steps
Check and set encoding:

    -- Check database encoding
    SHOW server_encoding;
    -- Set client encoding
    SET client_encoding = 'UTF8';

Clean invalid characters:
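
    # Python: remove invalid characters
    import unicodedata

    def clean_text(text):
        # Remove null bytes, which cannot be stored in text values
        text = text.replace('\x00', '')
        # Drop control characters (Unicode category Cc), keeping newline, carriage return and tab
        return ''.join(
            ch for ch in text
            if ch in '\n\r\t' or unicodedata.category(ch) != 'Cc'
        )

Use convert functions: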
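
    -- Decode bytes known to be LATIN1 into the database encoding
    SELECT convert_from('\x636166e9'::bytea, 'LATIN1');  -- café
    -- Encode text from the database encoding into LATIN1 bytes
    SELECT convert_to('café', 'LATIN1');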
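
When the source encoding of incoming bytes is uncertain, one option is to decode them on the client before they reach the database. The following Python sketch tries UTF-8 first and falls back to a second encoding. The function name `decode_with_fallback` and the choice of Latin-1 as the fallback are assumptions made for illustration, not something the database prescribes.

```python
def decode_with_fallback(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode raw bytes as UTF-8, or as the fallback encoding if that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # With the default Latin-1 fallback every byte maps to a character,
        # so this does not raise, but the result is only correct if the
        # data really was Latin-1.
        return raw.decode(fallback)


print(decode_with_fallback("café".encode("utf-8")))    # café
print(decode_with_fallback("café".encode("latin-1")))  # café
```

A fallback like this hides the mismatch rather than diagnosing it, so it is probably best reserved for data whose origin is known well enough to pick the fallback encoding with confidence.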
Additional Information
- UTF-8 can represent every Unicode character, although text values still cannot contain the null character (U+0000)
- Use UTF-8 for new databases
- Clean data before insertion
- Test imports with sample data first
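
Testing an import against sample data can be partly automated before any row reaches the database. The sketch below is one possible pre-flight check in Python: it flags rows that contain a null character or that cannot be encoded in the target encoding. The function name `find_problem_rows` is hypothetical, and the encoding names are Python codec names (for example `latin-1`), which are assumed here to correspond to the database encoding in use.

```python
from typing import Iterable, List, Tuple


def find_problem_rows(rows: Iterable[str], encoding: str = "utf-8") -> List[Tuple[int, str]]:
    """Return (row index, reason) for each row the target encoding would reject."""
    problems = []
    for index, text in enumerate(rows):
        if "\x00" in text:
            problems.append((index, "contains a null character"))
            continue
        try:
            text.encode(encoding)
        except UnicodeEncodeError as exc:
            # exc.start is the position of the first character that failed
            problems.append((index, f"cannot encode {text[exc.start]!r}"))
    return problems


sample = ["plain ascii", "caf\u00e9", "price: 5\u20ac", "bad\x00value"]
print(find_problem_rows(sample, encoding="latin-1"))
# [(2, "cannot encode '€'"), (3, 'contains a null character')]
```

Here the euro sign is rejected because it has no representation in Latin-1, while the same sample with `encoding="utf-8"` would only flag the row containing the null character.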
Frequently Asked Questions
Q: What encoding should I use?
A: UTF-8 is recommended for maximum compatibility and Unicode support.
Q: Can I change database encoding?
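A: Not in place. The encoding is fixed when the database is created, so you must create a new database with the desired encoding and then dump and restore the data into it.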