PostgreSQL Vector Search: Complete Guide to pgvector and Similarity Search

Vector search in PostgreSQL enables AI applications such as semantic search, recommendation systems, image similarity, and Retrieval-Augmented Generation (RAG). Using the pgvector extension, PostgreSQL can efficiently store and query high-dimensional vector embeddings alongside your relational data.

What is Vector Search?

Vector search, also known as similarity search, finds items that are semantically or conceptually similar rather than exact matches. Instead of matching keywords, vector search compares mathematical representations (embeddings) of data to find similar items.

Key Concepts:

  • Embeddings: Numerical representations of data (text, images, audio) as high-dimensional vectors
  • Similarity: Measured using distance metrics (Euclidean, cosine, inner product)
  • Semantic search: Finding results based on meaning rather than exact text matches
  • Nearest neighbor search: Finding the closest vectors to a query vector

Use Cases

Semantic Search:

  • Find documents with similar meaning, not just matching keywords
  • Build "More like this" features
  • Intelligent document retrieval

Recommendation Systems:

  • Product recommendations based on user behavior
  • Content recommendations
  • Similar user matching

RAG Applications:

  • Retrieval-Augmented Generation for LLMs
  • Context-aware chatbots
  • Knowledge base question answering

Image and Media Search:

  • Find similar images
  • Reverse image search
  • Audio similarity matching

Anomaly Detection:

  • Identify outliers in datasets
  • Fraud detection
  • Quality control

Installing pgvector

Installation Methods

From Package Manager

# Ubuntu/Debian (packages are available from the PGDG apt repository;
# match the version suffix to your PostgreSQL server)
sudo apt install postgresql-18-pgvector

# macOS with Homebrew
brew install pgvector

# RHEL/CentOS (PGDG yum repository)
sudo dnf install pgvector_18

Docker with pgvector

# Pull the official pgvector image (the older ankane/pgvector image is deprecated)
docker pull pgvector/pgvector:pg18

# Run container
docker run --name postgres-vector \
  -e POSTGRES_PASSWORD=mypassword \
  -p 5432:5432 \
  -d pgvector/pgvector:pg18

Build from Source

# Clone repository
git clone --branch v0.7.0 https://github.com/pgvector/pgvector.git
cd pgvector

# Build and install
make
sudo make install

Enable pgvector Extension

-- Create extension in your database
CREATE EXTENSION vector;

-- Verify installation
SELECT * FROM pg_extension WHERE extname = 'vector';

-- Check version
SELECT extversion FROM pg_extension WHERE extname = 'vector';
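Note that upgrading the pgvector binaries does not automatically upgrade the extension objects inside each database; that takes one more statement:

-- Upgrade the extension after installing a newer pgvector release
ALTER EXTENSION vector UPDATE;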

Working with Vector Data

Creating Tables with Vector Columns

-- Create table with vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT,
    embedding vector(1536),  -- OpenAI ada-002 embedding size
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create table for product embeddings
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT,
    category TEXT,
    price DECIMAL(10,2),
    image_embedding vector(512),  -- CLIP embedding
    text_embedding vector(384),   -- MiniLM embedding
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create table for user preferences
CREATE TABLE user_preferences (
    user_id INTEGER PRIMARY KEY,
    preference_vector vector(128),
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
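Vector columns can also be added to existing tables. A minimal sketch, assuming a hypothetical articles table and a 384-dimensional embedding model:

-- Add an embedding column to an existing table
ALTER TABLE articles ADD COLUMN embedding vector(384);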

Inserting Vector Data

-- Insert document with embedding
INSERT INTO documents (title, content, embedding)
VALUES (
    'Introduction to PostgreSQL',
    'PostgreSQL is a powerful open-source database...',
    '[0.1, 0.2, 0.3, ...]'::vector  -- 1536-dimensional vector
);

-- Insert from array
INSERT INTO documents (title, content, embedding)
VALUES (
    'Vector Search Guide',
    'Vector search enables semantic similarity...',
    ARRAY[0.1, 0.2, 0.3, ...]::vector
);

-- Insert multiple vectors
INSERT INTO products (name, description, image_embedding, text_embedding)
VALUES
    ('Laptop', 'High-performance laptop', '[0.5, 0.2, ...]'::vector, '[0.3, 0.4, ...]'::vector),
    ('Mouse', 'Wireless mouse', '[0.1, 0.8, ...]'::vector, '[0.6, 0.2, ...]'::vector);
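For large initial loads, row-by-row INSERTs are slow; COPY is usually much faster. A sketch, assuming a hypothetical CSV file whose embedding column already uses pgvector's '[x, y, ...]' text format:

-- Bulk load pre-computed embeddings from CSV
COPY documents (title, content, embedding)
FROM '/path/to/documents.csv'
WITH (FORMAT csv, HEADER true);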

Distance Metrics and Similarity

Understanding Distance Metrics

pgvector supports three primary distance metrics, compared side by side in the sketch after this list:

1. L2 Distance (Euclidean)

  • Measures straight-line distance between vectors
  • Good for general similarity
  • Operator: <->

2. Cosine Distance

  • Measures angle between vectors (1 - cosine similarity)
  • Good for text embeddings
  • Operator: <=>

3. Inner Product (Negative)

  • Measures dot product (negated for nearest neighbor)
  • Good for recommendation systems
  • Operator: <#>
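A quick comparison of the three operators on the same pair of literal vectors (results rounded):

-- L2: sqrt(27) ≈ 5.196; cosine distance ≈ 0.0254; inner product 32, negated
SELECT
    '[1, 2, 3]'::vector <-> '[4, 5, 6]'::vector AS l2_distance,
    '[1, 2, 3]'::vector <=> '[4, 5, 6]'::vector AS cosine_distance,
    '[1, 2, 3]'::vector <#> '[4, 5, 6]'::vector AS neg_inner_product;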

Querying Similar Vectors

-- Find most similar documents using cosine distance
SELECT
    id,
    title,
    1 - (embedding <=> '[0.1, 0.2, 0.3, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3, ...]'::vector
LIMIT 10;

-- Find similar products using L2 distance
SELECT
    id,
    name,
    embedding <-> '[0.5, 0.2, ...]'::vector AS distance
FROM products
ORDER BY embedding <-> '[0.5, 0.2, ...]'::vector
LIMIT 5;

-- Find similar items using inner product
SELECT
    id,
    name,
    (embedding <#> '[0.5, 0.2, ...]'::vector) * -1 AS score
FROM products
ORDER BY embedding <#> '[0.5, 0.2, ...]'::vector
LIMIT 5;

Combining Vector Search with Filters

-- Semantic search with category filter
SELECT
    id,
    name,
    category,
    embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM products
WHERE category = 'electronics'
  AND price BETWEEN 100 AND 1000
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

-- Find similar documents created in last 30 days
SELECT
    id,
    title,
    created_at,
    embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
WHERE created_at > CURRENT_DATE - INTERVAL '30 days'
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 20;
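One caveat: with an approximate index, the WHERE clause is applied after candidates come back from the index, so heavily filtered queries can return fewer rows than the LIMIT. If you are on pgvector 0.8 or later, iterative index scans can compensate; a sketch:

-- Assumption: pgvector 0.8+. Iterative scans keep searching the index until
-- enough rows survive the filter (or an internal scan limit is reached)
SET hnsw.iterative_scan = relaxed_order;     -- for HNSW indexes
SET ivfflat.iterative_scan = relaxed_order;  -- for IVFFlat indexes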

Indexing for Performance

Index Types

pgvector supports two index types for approximate nearest neighbor (ANN) search:

IVFFlat (Inverted File with Flat compression)

  • Faster index build time
  • Uses less memory
  • Good for smaller datasets (< 1M vectors)

HNSW (Hierarchical Navigable Small World)

  • Better query performance
  • Higher recall
  • Good for larger datasets
  • More memory intensive

Creating IVFFlat Indexes

-- Create IVFFlat index with cosine distance
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- Create IVFFlat index with L2 distance
CREATE INDEX ON products
USING ivfflat (embedding vector_l2_ops)
WITH (lists = 100);

-- Create IVFFlat index with inner product
CREATE INDEX ON user_preferences
USING ivfflat (preference_vector vector_ip_ops)
WITH (lists = 50);

Choosing the lists parameter:

  • Start with lists = rows / 1000 for datasets up to 1M rows, and sqrt(rows) above that
  • Minimum: 10 lists
  • Too few lists: slower queries
  • Too many lists: slower index builds and lower recall at a fixed probes setting
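A quick sketch for deriving a starting value from the planner's row estimate (run ANALYZE first so reltuples is populated):

-- Suggest a starting lists value based on the current row estimate
SELECT GREATEST(10, (reltuples / 1000)::int) AS suggested_lists
FROM pg_class
WHERE relname = 'documents';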

Creating HNSW Indexes

-- Create HNSW index with cosine distance
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Create HNSW index with L2 distance
CREATE INDEX ON products
USING hnsw (embedding vector_l2_ops)
WITH (m = 16, ef_construction = 64);

-- Create HNSW index with inner product
CREATE INDEX ON recommendations
USING hnsw (item_vector vector_ip_ops)
WITH (m = 24, ef_construction = 128);

HNSW Parameters:

  • m: Maximum connections per node per layer (default 16; typical values 16-64)
    • Higher m = better recall but more memory
  • ef_construction: Size of the dynamic candidate list during build (default 64; typical values 64-200)
    • Higher = better quality but slower build

Index Performance Tuning

-- Set probes for IVFFlat queries (higher = better recall, slower)
SET ivfflat.probes = 10;  -- Default: 1, range: 1 to lists

-- Set ef_search for HNSW queries (higher = better recall, slower)
SET hnsw.ef_search = 100;  -- Default: 40
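To scope these settings to a single query rather than the whole session, SET LOCAL inside a transaction works well:

-- Applies only until COMMIT; other sessions and later queries are unaffected
BEGIN;
SET LOCAL hnsw.ef_search = 100;
SELECT id FROM documents ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector LIMIT 10;
COMMIT;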

Index Recommendations

Small datasets (< 10K vectors):

  • No index needed (exact search is fast enough)

Medium datasets (10K - 1M vectors):

  • Use IVFFlat with lists = rows / 1000
  • Cosine distance for text embeddings
  • L2 distance for image embeddings

Large datasets (> 1M vectors):

  • Use HNSW with m = 16, ef_construction = 64
  • Increase m and ef_construction for better recall
  • Monitor memory usage

Building AI Applications

Semantic Search Application

-- Create semantic search function
CREATE OR REPLACE FUNCTION search_documents(
    query_embedding vector(1536),
    match_threshold FLOAT DEFAULT 0.7,
    match_count INT DEFAULT 10
)
RETURNS TABLE (
    id INTEGER,
    title TEXT,
    content TEXT,
    similarity FLOAT
)
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY
    SELECT
        d.id,
        d.title,
        d.content,
        1 - (d.embedding <=> query_embedding) AS similarity
    FROM documents d
    WHERE 1 - (d.embedding <=> query_embedding) > match_threshold
    ORDER BY d.embedding <=> query_embedding
    LIMIT match_count;
END;
$$;

-- Use the function
SELECT * FROM search_documents(
    '[0.1, 0.2, 0.3, ...]'::vector,
    0.7,
    10
);

Hybrid Search (Vector + Text)

-- Combine vector similarity with full-text search
SELECT
    id,
    title,
    content,
    ts_rank(to_tsvector('english', content), query) AS text_score,
    1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS vector_score,
    -- Weighted combination
    (ts_rank(to_tsvector('english', content), query) * 0.3 +
     (1 - (embedding <=> '[0.1, 0.2, ...]'::vector)) * 0.7) AS combined_score
FROM documents, plainto_tsquery('english', 'search query') query
WHERE to_tsvector('english', content) @@ query
   OR (1 - (embedding <=> '[0.1, 0.2, ...]'::vector)) > 0.7
ORDER BY combined_score DESC
LIMIT 20;
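Weighted blending is sensitive to the relative scales of ts_rank and vector scores. A scale-free alternative worth sketching is reciprocal rank fusion (RRF), which combines ranks rather than raw scores; the constant 60 below is a common default, and the query vector is a placeholder as above:

-- Rank the top candidates from each search, then fuse with 1 / (k + rank)
WITH vector_hits AS (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector) AS rank
    FROM documents
    ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
    LIMIT 50
),
text_hits AS (
    SELECT id,
           ROW_NUMBER() OVER (
               ORDER BY ts_rank(to_tsvector('english', content),
                                plainto_tsquery('english', 'search query')) DESC) AS rank
    FROM documents
    WHERE to_tsvector('english', content) @@ plainto_tsquery('english', 'search query')
    ORDER BY rank
    LIMIT 50
)
SELECT
    COALESCE(v.id, t.id) AS id,
    COALESCE(1.0 / (60 + v.rank), 0) + COALESCE(1.0 / (60 + t.rank), 0) AS rrf_score
FROM vector_hits v
FULL OUTER JOIN text_hits t ON v.id = t.id
ORDER BY rrf_score DESC
LIMIT 20;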

Recommendation System

-- Create collaborative filtering recommendation
CREATE TABLE user_item_interactions (
    user_id INTEGER,
    item_id INTEGER,
    interaction_type TEXT,
    embedding vector(128),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Find similar users
CREATE OR REPLACE FUNCTION find_similar_users(
    target_user_id INTEGER,
    limit_count INT DEFAULT 10
)
RETURNS TABLE (
    user_id INTEGER,
    similarity FLOAT
)
LANGUAGE sql
AS $$
    WITH target_vector AS (
        SELECT preference_vector
        FROM user_preferences
        WHERE user_id = target_user_id
    )
    SELECT
        up.user_id,
        1 - (up.preference_vector <=> tv.preference_vector) AS similarity
    FROM user_preferences up, target_vector tv
    WHERE up.user_id != target_user_id
    ORDER BY up.preference_vector <=> tv.preference_vector
    LIMIT limit_count;
$$;

-- Get recommendations based on similar users
SELECT DISTINCT p.id, p.name, p.description
FROM find_similar_users(123, 5) su
JOIN user_item_interactions uii ON uii.user_id = su.user_id
JOIN products p ON p.id = uii.item_id
WHERE uii.interaction_type = 'purchase'
  AND NOT EXISTS (
      SELECT 1 FROM user_item_interactions
      WHERE user_id = 123 AND item_id = p.id
  )
LIMIT 10;

RAG (Retrieval-Augmented Generation)

-- Create knowledge base table for RAG
CREATE TABLE knowledge_base (
    id SERIAL PRIMARY KEY,
    source TEXT NOT NULL,
    chunk_text TEXT NOT NULL,
    chunk_index INTEGER,
    embedding vector(1536),
    metadata JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create index for fast retrieval
CREATE INDEX ON knowledge_base
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Retrieve relevant context for LLM
CREATE OR REPLACE FUNCTION get_rag_context(
    query_embedding vector(1536),
    max_chunks INT DEFAULT 5
)
RETURNS TABLE (
    chunk_text TEXT,
    source TEXT,
    similarity FLOAT
)
LANGUAGE sql
AS $$
    SELECT
        chunk_text,
        source,
        1 - (embedding <=> query_embedding) AS similarity
    FROM knowledge_base
    ORDER BY embedding <=> query_embedding
    LIMIT max_chunks;
$$;
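The retrieved chunks are typically stitched into the LLM prompt. One way to assemble them directly in SQL, with the same placeholder embedding as above:

-- Concatenate the top chunks into a single context string
SELECT string_agg(chunk_text, E'\n---\n' ORDER BY similarity DESC) AS context
FROM get_rag_context('[0.1, 0.2, ...]'::vector, 5);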

Integration with Embedding Models

Python Integration with OpenAI

import openai
import psycopg2
from psycopg2.extras import execute_values

# Initialize OpenAI client
client = openai.OpenAI(api_key="your-api-key")

# Connect to PostgreSQL
conn = psycopg2.connect(
    host="localhost",
    database="vectordb",
    user="postgres",
    password="password"
)

def get_embedding(text, model="text-embedding-3-small"):
    """Get embedding from OpenAI"""
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding

def insert_document(title, content):
    """Insert document with embedding"""
    embedding = get_embedding(content)

    with conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO documents (title, content, embedding)
            VALUES (%s, %s, %s)
            RETURNING id
            """,
            (title, content, embedding)
        )
        doc_id = cur.fetchone()[0]
        conn.commit()

    return doc_id

def semantic_search(query, limit=10):
    """Search documents by semantic similarity"""
    query_embedding = get_embedding(query)

    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT
                id,
                title,
                content,
                1 - (embedding <=> %s::vector) AS similarity
            FROM documents
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (query_embedding, query_embedding, limit)
        )
        results = cur.fetchall()

    return results

# Usage
doc_id = insert_document(
    "PostgreSQL Vector Search",
    "Learn how to implement semantic search with pgvector..."
)

results = semantic_search("How to use vector search?", limit=5)
for id, title, content, similarity in results:
    print(f"{title} (similarity: {similarity:.3f})")

Python Integration with Sentence Transformers

from sentence_transformers import SentenceTransformer
import psycopg2
from psycopg2.extras import execute_values

# Connect to PostgreSQL (or reuse the connection from the previous example)
conn = psycopg2.connect(
    host="localhost",
    database="vectordb",
    user="postgres",
    password="password"
)

# Load model (produces 384-dimensional embeddings)
model = SentenceTransformer('all-MiniLM-L6-v2')

def get_embedding(text):
    """Get embedding using sentence transformers"""
    return model.encode(text).tolist()

def batch_insert_documents(documents):
    """Batch insert documents with embeddings"""
    with conn.cursor() as cur:
        data = [
            (doc['title'], doc['content'], get_embedding(doc['content']))
            for doc in documents
        ]

        execute_values(
            cur,
            """
            INSERT INTO documents (title, content, embedding)
            VALUES %s
            """,
            data,
            template="(%s, %s, %s::vector)"
        )
        conn.commit()

# Batch insert
documents = [
    {"title": "Doc 1", "content": "Content 1..."},
    {"title": "Doc 2", "content": "Content 2..."},
]
batch_insert_documents(documents)

Node.js Integration

const { Pool } = require('pg');
const { OpenAI } = require('openai');

const pool = new Pool({
    host: 'localhost',
    database: 'vectordb',
    user: 'postgres',
    password: 'password'
});

const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY
});

async function getEmbedding(text) {
    const response = await openai.embeddings.create({
        model: "text-embedding-3-small",
        input: text
    });
    return response.data[0].embedding;
}

async function insertDocument(title, content) {
    const embedding = await getEmbedding(content);

    const result = await pool.query(
        // pgvector accepts its '[...]' text format, which JSON.stringify produces
        'INSERT INTO documents (title, content, embedding) VALUES ($1, $2, $3) RETURNING id',
        [title, content, JSON.stringify(embedding)]
    );

    return result.rows[0].id;
}

async function semanticSearch(query, limit = 10) {
    const queryEmbedding = await getEmbedding(query);

    const result = await pool.query(
        `SELECT
            id,
            title,
            content,
            1 - (embedding <=> $1::vector) AS similarity
        FROM documents
        ORDER BY embedding <=> $1::vector
        LIMIT $2`,
        [JSON.stringify(queryEmbedding), limit]
    );

    return result.rows;
}

// Usage
(async () => {
    const docId = await insertDocument(
        'Vector Search in PostgreSQL',
        'Complete guide to implementing vector search...'
    );

    const results = await semanticSearch('vector database tutorial', 5);
    results.forEach(r => {
        console.log(`${r.title} (similarity: ${r.similarity.toFixed(3)})`);
    });
})();

Performance Optimization

Optimize Memory Usage

-- Set appropriate work memory for index builds
SET maintenance_work_mem = '2GB';

-- Build index
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Reset to the server default for normal operations
RESET maintenance_work_mem;
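pgvector 0.6+ can also build HNSW indexes with parallel workers, which often matters as much as memory for large builds:

-- Allow more parallel workers during index builds (plus the leader process)
SET max_parallel_maintenance_workers = 7;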

Partition Large Tables

-- Partition by date for time-series data
CREATE TABLE documents_partitioned (
    id SERIAL,
    title TEXT,
    content TEXT,
    embedding vector(1536),
    created_at TIMESTAMP
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE documents_2025_q1 PARTITION OF documents_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');

CREATE TABLE documents_2025_q2 PARTITION OF documents_partitioned
    FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');

-- Index each partition
CREATE INDEX ON documents_2025_q1 USING hnsw (embedding vector_cosine_ops);
CREATE INDEX ON documents_2025_q2 USING hnsw (embedding vector_cosine_ops);

Query Optimization

-- Analyze query performance
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, title
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

-- Use prepared statements
PREPARE search_docs AS
    SELECT id, title, embedding <=> $1 AS distance
    FROM documents
    ORDER BY embedding <=> $1
    LIMIT $2;

EXECUTE search_docs('[0.1, 0.2, ...]'::vector, 10);

Batch Processing

-- Batch update embeddings
UPDATE documents d
SET embedding = u.new_embedding
FROM (VALUES
    (1, '[0.1, 0.2, ...]'::vector),
    (2, '[0.3, 0.4, ...]'::vector),
    (3, '[0.5, 0.6, ...]'::vector)
) AS u(id, new_embedding)
WHERE d.id = u.id;

Monitoring and Maintenance

Monitor Index Health

-- Check index size (pg_stat_user_indexes uses relname/indexrelname,
-- not tablename/indexname)
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%embedding%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Check index usage
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%embedding%';

Rebuild Indexes

-- Rebuild index if needed (after bulk updates)
REINDEX INDEX documents_embedding_idx;

-- Rebuild concurrently (no downtime); on PostgreSQL 12+ you can also use
-- REINDEX INDEX CONCURRENTLY documents_embedding_idx;
CREATE INDEX CONCURRENTLY documents_embedding_idx_new
ON documents USING hnsw (embedding vector_cosine_ops);

DROP INDEX documents_embedding_idx;
ALTER INDEX documents_embedding_idx_new RENAME TO documents_embedding_idx;

Vacuum and Analyze

-- Vacuum table to reclaim space
VACUUM ANALYZE documents;

-- Auto-vacuum settings for vector tables
ALTER TABLE documents SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);

Best Practices

1. Choose the Right Embedding Model

  • Text: Use models like OpenAI's text-embedding-3-small (1536d) or Sentence Transformers (384d)
  • Images: Use CLIP (512d) or similar vision models
  • Multimodal: Use models that support multiple input types
  • Dimension considerations: higher dimensions capture more nuance but cost more storage and slower queries (see the halfvec sketch below)
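If storage or index size becomes a concern at higher dimensions, pgvector 0.7+ also ships a half-precision halfvec type, which halves storage for a small accuracy cost; a minimal sketch:

-- Assumption: pgvector 0.7+. halfvec stores 16-bit floats instead of 32-bit
CREATE TABLE documents_half (
    id SERIAL PRIMARY KEY,
    embedding halfvec(1536)
);

CREATE INDEX ON documents_half USING hnsw (embedding halfvec_cosine_ops);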

2. Normalize Vectors for Cosine Distance

-- Normalize vectors before storage; pgvector 0.7+ provides l2_normalize()
CREATE OR REPLACE FUNCTION normalize_vector(v vector)
RETURNS vector AS $$
    SELECT l2_normalize(v);
$$ LANGUAGE SQL IMMUTABLE;

-- Insert with normalization
INSERT INTO documents (title, content, embedding)
VALUES ('Title', 'Content', normalize_vector('[0.1, 0.2, ...]'::vector));

-- With unit-length vectors, inner product (<#>) yields the same ordering as
-- cosine distance (<=>) and is cheaper to compute.

3. Use Appropriate Index Parameters

-- For high recall requirements
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);
SET hnsw.ef_search = 200;

-- For balanced performance
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
SET hnsw.ef_search = 100;

-- For speed over accuracy
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
SET ivfflat.probes = 5;

4. Implement Caching

-- Create materialized view for frequently searched vectors
-- (assumes a search_logs table that records query text and embeddings)
CREATE MATERIALIZED VIEW popular_searches AS
SELECT
    query_text,
    query_embedding,
    COUNT(*) AS search_count
FROM search_logs
GROUP BY query_text, query_embedding
HAVING COUNT(*) > 10;

CREATE INDEX ON popular_searches USING hnsw (query_embedding vector_cosine_ops);

-- REFRESH ... CONCURRENTLY requires a unique index on the view
CREATE UNIQUE INDEX ON popular_searches (query_text);

-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY popular_searches;

5. Handle Dimensionality Mismatch

-- Validate vector dimensions
-- (a column declared as vector(1536) already enforces this; an explicit CHECK
-- is mainly useful for columns declared as untyped vector)
ALTER TABLE documents
ADD CONSTRAINT check_embedding_dim
CHECK (vector_dims(embedding) = 1536);

-- Handle dynamic dimensions with a function
CREATE OR REPLACE FUNCTION validate_and_insert(
    p_title TEXT,
    p_content TEXT,
    p_embedding vector
)
RETURNS INTEGER AS $$
DECLARE
    v_id INTEGER;
BEGIN
    IF vector_dims(p_embedding) != 1536 THEN
        RAISE EXCEPTION 'Invalid embedding dimension: expected 1536, got %',
            vector_dims(p_embedding);
    END IF;

    INSERT INTO documents (title, content, embedding)
    VALUES (p_title, p_content, p_embedding)
    RETURNING id INTO v_id;

    RETURN v_id;
END;
$$ LANGUAGE plpgsql;

Troubleshooting

Common Issues

Issue 1: Slow queries without index

-- Check if index is being used
EXPLAIN (ANALYZE)
SELECT * FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

-- Look for "Index Scan using" in the output
-- If you see "Seq Scan", the index isn't being used
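For diagnosis only, you can discourage sequential scans to check whether the planner is able to use the vector index at all:

-- If the plan still shows Seq Scan with this off, the query shape is the
-- problem (e.g., missing ORDER BY <operator> ... LIMIT, or mismatched opclass)
SET enable_seqscan = off;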

Issue 2: Out of memory during index build

-- Increase maintenance_work_mem
SET maintenance_work_mem = '4GB';

-- Or use IVFFlat instead of HNSW
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

Issue 3: Low recall

-- Increase search parameters
SET hnsw.ef_search = 200;  -- Higher = better recall
SET ivfflat.probes = 20;   -- Higher = better recall

-- Or rebuild index with better parameters
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);
