Azure Database for PostgreSQL: Flexible Server Engineering Guide

Azure Database for PostgreSQL Flexible Server is Microsoft's current recommended managed PostgreSQL offering, replacing the older Single Server deployment model (which reached end-of-life in March 2025). Flexible Server gives you meaningful control over maintenance windows, HA configuration, and PostgreSQL parameter tuning - things the Single Server model papered over with rigid defaults. It supports PostgreSQL versions 11 through 18 (note: versions 11, 12, and 13 are under Extended Support, which is a paid tier; version 11 standard support ended November 2024) and runs on the standard Azure compute fabric, meaning you get the same underlying VM hardware as any other Azure workload.

The service sits in a fully managed tier - you don't manage OS patches, major version upgrades can be performed in-place, and storage can auto-grow without manual intervention once that option is enabled. But "fully managed" does not mean "zero operational surface". Connection limits, parameter tuning, extension compatibility, and networking topology still require deliberate engineering decisions.

High Availability and Failover

Flexible Server offers two HA models: zonal (standby in the same availability zone as the primary) and zone-redundant (standby in a physically separate zone). Zone-redundant HA is the production-grade choice - it provides automatic failover with near-zero data loss during both planned maintenance and unplanned outages, using synchronous replication to the standby.

Failover time in zone-redundant mode targets 60-120 seconds for unplanned failures. Connections fail during that window, so an application without retry logic and exponential backoff will surface the failover as user-facing errors. Microsoft added Premium SSD v2 HA support in 2025, which reduces failover latency further and allows independent configuration of IOPS, throughput, and capacity, so you can right-size storage performance independently of disk size rather than accepting a fixed IOPS-per-GiB ratio.
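A minimal sketch of client-side retry with exponential backoff, sized so the cumulative wait can span a 60-120 second failover. The function name, its parameters, and the default values are illustrative assumptions, not part of any Azure SDK; the jitter factor is one common choice among several.

```python
import random
import time


def retry_with_backoff(operation, *, retryable=(ConnectionError,),
                       max_attempts=10, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted.

    The delay ceiling doubles after each failure (capped at max_delay).
    Each actual delay is 50-100% of the ceiling, so that many clients
    do not all reconnect at the same instant after a failover.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(ceiling * random.uniform(0.5, 1.0))
```

With these illustrative defaults the total wait before giving up falls between roughly 75 and 151 seconds. When connecting with psycopg2, passing `retryable=(psycopg2.OperationalError,)` and wrapping the `psycopg2.connect(...)` call in a lambda is one way to apply it.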

The HA configuration introduces a cost multiplier: you are provisioning two compute instances and paying for standby storage. This is worth knowing upfront when sizing budgets, since HA nearly doubles the compute cost of a deployment. For non-production environments, disabling HA and using point-in-time restore as a recovery mechanism is a reasonable trade-off.

Point-in-Time Restore and Backups

Backups are automated, taken as volume snapshots combined with transaction log archiving. The default retention window is 7 days, configurable up to 35 days. Point-in-time restore creates a new server from backup - it does not restore in-place - so factor the provisioning time (typically a few minutes) into your RTO calculations. On-demand backups reached general availability in 2025, letting you take an explicit backup before a risky migration or schema change rather than relying solely on the scheduled window.
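As a rough illustration of the retention window described above, the following sketch checks whether a target timestamp is still restorable. It is a simplification under stated assumptions: the function names are hypothetical, and it ignores details such as the server's creation time and the short delay before the most recent transaction logs are archived.

```python
from datetime import datetime, timedelta, timezone


def restore_window(now, retention_days=7):
    """Return (earliest, latest) restorable instants for a retention setting."""
    if not 7 <= retention_days <= 35:
        raise ValueError("retention_days must be between 7 and 35")
    return now - timedelta(days=retention_days), now


def can_restore_to(target, now, retention_days=7):
    """True if target falls inside the retention window ending at now."""
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # A point 10 days back is out of reach with the 7-day default.
    print(can_restore_to(now - timedelta(days=10), now))
    # Raising retention to 14 days brings the same point into range.
    print(can_restore_to(now - timedelta(days=10), now, retention_days=14))
```

A check like this is mostly useful in runbooks or pre-migration scripts, where it makes the dependency on the retention setting explicit before anyone relies on a restore point that no longer exists.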

Read replicas operate as streaming replication targets, but they do not participate in backup. Restore operations are only available on servers in the primary role, so a replica cannot serve as a restore source while it is still replicating. After you promote a replica, backups covering its time as a replica remain inaccessible until the promotion has completed.

Networking and Identity Integration

Flexible Server supports two network access modes: public access (with firewall rules) and private access via VNet injection or Private Link. VNet injection deploys the server directly into a delegated subnet, making it accessible only from within the VNet without any public IP. Private Link creates a private endpoint in your VNet, and traffic routes over the Microsoft backbone - this model is compatible with hub-and-spoke network topologies and works well when the PostgreSQL instance needs to be reachable from multiple VNets or on-premises environments via ExpressRoute.

Microsoft Entra ID (formerly Azure Active Directory) authentication is supported natively alongside password-based auth. You can configure an Entra admin at the server level, map Entra users and groups to PostgreSQL roles, and disable password authentication entirely for environments with strict audit requirements. Managed Identity is the recommended approach for application connections - a system-assigned or user-assigned identity acquires a short-lived access token, which gets passed in the password field of the connection string. No stored credentials, no rotation scripts.

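import psycopg2
from azure.identity import DefaultAzureCredential

# DefaultAzureCredential uses a managed identity when running in Azure and
# falls back to developer credentials (for example, the Azure CLI) locally.
credential = DefaultAzureCredential()

# Request a token scoped to Azure Database for PostgreSQL.
token = credential.get_token("https://ossrdbms-aad.database.windows.net/.default")

conn = psycopg2.connect(
    host="myserver.postgres.database.azure.com",
    dbname="mydb",
    # Flexible Server uses the plain identity/username; the @servername
    # suffix is a Single Server pattern.
    user="myapp",
    password=token.token,  # the access token is passed in place of a password
    sslmode="require",
)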

This approach integrates cleanly with workload identity in AKS, where a Kubernetes service account is federated with an Entra identity, so pods obtain tokens without any secret management overhead.
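Because the access token is short-lived, code that opens connections over time needs a fresh token for each new connection attempt. The sketch below is one way to reuse a token until shortly before it expires. `TokenCache`, `refresh_margin`, and `clock` are hypothetical names introduced for illustration; the only assumption about the token object is that it exposes `.token` and `.expires_on` (a Unix timestamp), as the object returned by `credential.get_token(...)` does. Some credential implementations may already cache tokens internally, in which case this wrapper is redundant.

```python
import time


class TokenCache:
    """Reuse an access token until shortly before it expires.

    fetch_token is any zero-argument callable returning an object with
    `.token` (str) and `.expires_on` (Unix timestamp in seconds).
    """

    def __init__(self, fetch_token, refresh_margin=300, clock=time.time):
        self._fetch_token = fetch_token
        self._refresh_margin = refresh_margin  # seconds before expiry to refresh
        self._clock = clock
        self._cached = None

    def get(self):
        """Return a token string that is valid for at least refresh_margin seconds."""
        stale = (
            self._cached is None
            or self._cached.expires_on - self._clock() < self._refresh_margin
        )
        if stale:
            self._cached = self._fetch_token()
        return self._cached.token
```

With the credential from the example above, usage might look like `cache = TokenCache(lambda: credential.get_token("https://ossrdbms-aad.database.windows.net/.default"))`, followed by `password=cache.get()` each time a connection is opened.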

Compute Tiers and Storage

Flexible Server offers three compute tiers. Burstable (B-series VMs) is credit-based CPU - appropriate for dev/test or workloads with intermittent peaks, but unsuitable for sustained high-throughput production databases. Built-in PgBouncer is not available on Burstable instances, which is a concrete limitation worth noting when evaluating costs. General Purpose provides balanced compute with 4 GiB RAM per vCore and suits most transactional workloads. Memory Optimized goes up to 8 GiB per vCore and targets workloads with large working sets that benefit from keeping more data in shared_buffers and the OS page cache.

Storage options are Premium SSD and Premium SSD v2. Premium SSD v2 is the default recommendation for new deployments - it allows independent tuning of IOPS (up to 80,000), throughput (up to 1,200 MB/s), and capacity (up to 64 TiB) without requiring a storage tier upgrade. With the original Premium SSD, IOPS and throughput are tied to the provisioned disk size, so you often end up over-provisioning capacity to hit a performance target. Premium SSD v2 eliminates that coupling.
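To make the three independent dimensions concrete, here is a small sketch that checks a requested Premium SSD v2 configuration against the ceilings quoted above. The function and dictionary names are hypothetical, and the check is deliberately incomplete: it does not model any further constraints the service may apply, such as limits tied to the compute size.

```python
# Ceilings as quoted in the text above; treat them as a snapshot, not a contract.
PREMIUM_SSD_V2_LIMITS = {
    "iops": 80_000,
    "throughput_mbps": 1_200,
    "capacity_gib": 64 * 1024,  # 64 TiB expressed in GiB
}


def check_ssd_v2_request(iops, throughput_mbps, capacity_gib):
    """Return a list of human-readable problems; an empty list means within limits."""
    requested = {
        "iops": iops,
        "throughput_mbps": throughput_mbps,
        "capacity_gib": capacity_gib,
    }
    problems = []
    for name, value in requested.items():
        limit = PREMIUM_SSD_V2_LIMITS[name]
        if value <= 0:
            problems.append(f"{name} must be positive")
        elif value > limit:
            problems.append(f"{name}={value} exceeds limit {limit}")
    return problems
```

Note that a small but fast disk (for example 512 GiB at 20,000 IOPS) passes this check, which is exactly the combination that size-coupled storage makes awkward.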

Connection management on Flexible Server defaults to max_connections scaled to available memory, which is typically a few hundred connections on smaller instances. For workloads with many short-lived connections - common in serverless or containerized environments - the built-in PgBouncer runs as a sidecar on the same VM and proxies connections in transaction pooling mode. The recommended connection count through PgBouncer is 2 to 5 times the vCore count. Statement pooling mode is more restrictive still and does not support prepared statements, so transaction mode is the practical default.
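The sizing rule above is simple arithmetic, sketched here together with a connection string that targets the pooler instead of PostgreSQL directly. The helper names are hypothetical. Port 6432 is the port the built-in PgBouncer is understood to listen on; treat it as an assumption and confirm it against your server's configuration.

```python
def pgbouncer_pool_range(vcores):
    """Return (low, high) connection counts using the 2x-5x vCore guideline."""
    if vcores < 1:
        raise ValueError("vcores must be at least 1")
    return 2 * vcores, 5 * vcores


def pgbouncer_dsn(host, dbname, user, port=6432):
    """Build a libpq-style DSN pointing at the pooler port (assumed 6432).

    The password is intentionally left out so it can be supplied separately,
    for example as a short-lived access token.
    """
    return f"host={host} port={port} dbname={dbname} user={user} sslmode=require"


if __name__ == "__main__":
    low, high = pgbouncer_pool_range(4)
    print(f"4 vCores: aim for {low}-{high} connections")
    print(pgbouncer_dsn("myserver.postgres.database.azure.com", "mydb", "myapp"))
```

With psycopg2, the resulting string can be passed as the first argument to `psycopg2.connect`, with `password=...` supplied as a keyword argument.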

Azure PostgreSQL vs. AWS RDS PostgreSQL and Aurora

When comparing against AWS, the relevant options are RDS for PostgreSQL (community PostgreSQL on managed EC2) and Aurora PostgreSQL (AWS's custom storage engine with a PostgreSQL-compatible frontend).

RDS PostgreSQL and Azure Flexible Server are architecturally similar - both run standard PostgreSQL on managed compute with streaming replication for HA. The functional differences are mostly in ecosystem integration: RDS connects cleanly to IAM, VPC, and S3; Flexible Server connects to Entra ID, VNets, and Azure Storage. Version support timelines are comparable, though both services typically lag the upstream community release by one to several months.

Aurora PostgreSQL diverges significantly at the storage layer. Aurora uses a distributed, log-structured storage engine that keeps six copies of the data across three AZs, decoupled from the compute layer. Because replicas read from that shared volume instead of maintaining their own copy, Aurora supports up to 15 read replicas with minimal replication lag. RDS allows the same replica count (Flexible Server allows up to 5), but each replica there is a full streaming-replication copy, so the practical difference is lag and provisioning time, not the headline number. Shared storage also allows nearly instant compute scaling because storage is not attached to a single instance. The trade-off is that Aurora PostgreSQL's compatibility with community PostgreSQL is imperfect - certain extensions, replication slots, and behavioral edge cases differ, and Aurora typically follows community releases by roughly 2.5 to 5 months for recent major versions (PG 17 had roughly a 2.5-month lag, PG 16 roughly 4.5 months).

For Azure-centric shops, Flexible Server is the obvious path - the Entra ID integration, Private Link topology, and native Azure Monitor telemetry all reduce operational friction compared to running a PostgreSQL-compatible database on AWS inside an Azure-heavy architecture. For teams already deep in AWS, Aurora PostgreSQL offers a better read scale-out story and stronger I/O performance on write-heavy workloads, at a higher cost per vCPU-hour. RDS PostgreSQL remains the closest AWS equivalent to Flexible Server in terms of architectural simplicity and predictability.

One concrete limitation of Flexible Server relative to Aurora: Flexible Server does not decouple compute and storage. Scaling compute requires a brief restart, so it is best scheduled like planned maintenance, and storage capacity is coupled to the instance, not shared across replicas. Aurora's shared storage means adding or removing read replicas takes seconds with no data copying. If your workload has unpredictable read spikes that require rapid horizontal scaling, that difference matters.