PostgreSQL CREATE TABLE: Syntax, Constraints, and Practical Decisions

CREATE TABLE is the entry point for every schema in PostgreSQL, and the decisions you make here - column types, constraint placement, naming, storage options - compound over time. A table defined carelessly in development often requires disruptive migrations in production, particularly once foreign keys, partitioning, or replication come into play. This guide covers the full syntax with the trade-offs that actually matter.

Basic Syntax and Column Definitions

The minimal form is straightforward:

CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id bigint      NOT NULL,
    status      text        NOT NULL DEFAULT 'pending',
    total       numeric(12,2),
    created_at  timestamptz NOT NULL DEFAULT now()
);

Every column definition follows the pattern name type [constraints]. The type choice deserves deliberate thought. For integers, integer (32-bit, range up to ~2.1 billion) is fine for most lookup tables or low-cardinality foreign keys, but primary keys on any table that could grow large should use bigint. The cost difference is 4 bytes per row - rarely worth optimizing for - whereas exhausting a 32-bit sequence is an emergency migration under load.

For string columns, PostgreSQL makes text and varchar(n) functionally equivalent in storage and performance; the varchar(n) form only adds a length check at write time. Use text when you have no real business constraint on length, and varchar(n) when a hard ceiling is part of the domain model (e.g., a 2-character country code). Avoid char(n) - it pads with spaces and is almost never what you want.

Auto-incrementing primary keys deserve specific attention. The serial and bigserial pseudo-types still work but are a legacy shorthand that creates an implicit sequence and couples it to the column in a fragile way. The SQL-standard alternative, GENERATED ALWAYS AS IDENTITY (available since PostgreSQL 10), is explicit and cleaner to manage. GENERATED BY DEFAULT AS IDENTITY allows overriding the generated value during insert, which is useful for data migrations. Default to bigint GENERATED ALWAYS AS IDENTITY for new tables.

Constraints

PostgreSQL supports column-level constraints (declared inline) and table-level constraints (declared separately, capable of spanning multiple columns). Both are equivalent for single-column cases, but table-level syntax is required for composite keys and multi-column uniqueness.

CREATE TABLE shipments (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    order_id    bigint NOT NULL,
    carrier     text   NOT NULL,
    tracking_no text,
    shipped_at  timestamptz,
    CONSTRAINT shipments_pkey PRIMARY KEY (id),
    CONSTRAINT shipments_order_fk FOREIGN KEY (order_id)
        REFERENCES orders (id) ON DELETE CASCADE,
    CONSTRAINT shipments_tracking_unique UNIQUE (carrier, tracking_no),
    CONSTRAINT shipments_carrier_nonempty CHECK (carrier <> '')
);

Naming constraints explicitly - using CONSTRAINT name - pays off when you hit a violation in application logs or during pg_restore. Anonymous constraint names like orders_order_id_fkey are auto-generated but inconsistent across environments, which creates noise when comparing schemas.

Foreign key behavior on parent row deletion is controlled by ON DELETE. The options are NO ACTION (default - raises an error at statement end), RESTRICT (raises immediately, before deferred constraints fire), CASCADE (deletes child rows), SET NULL, and SET DEFAULT. CASCADE is convenient but dangerous in wide schemas where you may not anticipate the downstream deletions. RESTRICT or NO ACTION forces callers to handle the relationship explicitly.

CHECK constraints accept any boolean expression referencing the row's own columns. They fire on insert and update but not on reads, and they're stored in the catalog so they're visible to query planners for constraint exclusion. One subtlety: CHECK (col IS NOT NULL) is not equivalent to NOT NULL; the latter is stored separately and is faster to evaluate. A CHECK that allows NULL to pass (because NULL expressions evaluate to NULL, not FALSE) can surprise you - CHECK (price > 0) will not reject a NULL price.

Table Creation Variants

IF NOT EXISTS suppresses the error when a table already exists, emitting a notice instead. It's safe for idempotent migration scripts but gives no guarantee that the existing table matches your definition - schema drift remains silent. Pair it with proper migration tooling rather than relying on it as a schema management strategy.

LIKE copies column structure from a source table without creating an ongoing relationship:

CREATE TABLE orders_archive (LIKE orders INCLUDING ALL);

INCLUDING ALL pulls in defaults, constraints, indexes, comments, and storage settings. Without it, you get bare column names and types only. This is useful for staging tables, archive tables, or test fixtures where you want structural parity. Unlike inheritance, there's no ongoing coupling - changes to orders won't propagate to orders_archive.

CREATE TABLE AS materializes a query result into a new table:

CREATE TABLE high_value_orders AS
SELECT * FROM orders WHERE total > 10000;

The resulting table inherits column names and types from the query output but carries no constraints, indexes, or defaults. It's a snapshot, not a view. WITH NO DATA creates the structure without populating rows, useful when you want the schema of a derived query without running it. Prefer CREATE TABLE AS over SELECT INTO - the latter is non-standard in PL/pgSQL and ECPG contexts and offers fewer options.

UNLOGGED tables skip write-ahead logging, making writes faster at the cost of data loss on crash. They're appropriate for ephemeral staging data, session caches, or intermediate ETL state where durability is not required. Temporary tables (CREATE TEMP TABLE) go further - they exist only for the current session and are dropped automatically when the connection closes.

Table Inheritance

PostgreSQL has supported table inheritance since its early versions. A child table inherits all columns from one or more parent tables and can add its own:

CREATE TABLE events (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    occurred_at timestamptz NOT NULL,
    payload     jsonb
);

CREATE TABLE error_events (
    error_code  text NOT NULL,
    stack_trace text
) INHERITS (events);

Querying events returns rows from both events and error_events by default. The ONLY keyword restricts a query to just the named table: SELECT * FROM ONLY events. You can identify which physical table a row came from using the tableoid system column cast to regclass.

The limitations are real and often underestimated. UNIQUE constraints and PRIMARY KEY on the parent do not propagate to children, so duplicates can exist across the hierarchy. Foreign keys from other tables cannot reference the inheritance hierarchy as a whole - they can only point to one specific table in the tree. NOT NULL and CHECK constraints do propagate.

For most partitioning use cases, PostgreSQL's declarative partitioning (PARTITION BY RANGE | LIST | HASH) is the better path. It has proper constraint propagation, supports partition-wise joins and aggregates, and handles partition pruning in the planner. Table inheritance is most useful when the tables genuinely represent different entity subtypes with different column sets, not when they represent shards of the same entity.

Schema Organization and Naming Conventions

PostgreSQL schemas (namespaces, not table structures) are an underused organizational tool. Grouping related tables under a schema - billing.invoices, billing.line_items rather than public.billing_invoices - keeps the public schema uncluttered and makes permission management more granular. The search path controls which schema is resolved first for unqualified names.

Naming conventions compound in readability. Snake case (order_line_items) is the PostgreSQL convention; mixed case forces quoting everywhere ("OrderLineItems"). Table names in plural form (orders, not order) map cleanly to REST resources and read naturally in joins. Foreign key columns should reference their target: customer_id rather than cust or foreign_key_1. Index names benefit from encoding their purpose: idx_orders_customer_id or uq_shipments_carrier_tracking.

Storage parameters like fillfactor are worth setting on tables with heavy update workloads. The default fillfactor=100 fills each page completely; setting fillfactor=70 leaves headroom for in-place updates (HOT updates), reducing dead tuple bloat. This is a table-level tuning decision that belongs in the CREATE TABLE statement. ALTER TABLE ... SET (fillfactor = N) changes the stored setting, but existing pages retain their original density - you must run VACUUM FULL or use pg_repack to apply the new fillfactor to existing data, both of which rewrite the table. Setting fillfactor at CREATE TABLE time avoids that disruptive rewrite later.