OpenSearch Backup and Restore: Snapshots, Repositories, and Recovery

Backups in OpenSearch are implemented through the snapshot and restore mechanism. Snapshots capture the state of your cluster's indices — data, mappings, settings, and optionally cluster state — and store them in an external repository for disaster recovery.

This guide covers configuring repositories, creating snapshots, restoring data, automating backup schedules, and planning for disaster recovery.

Snapshot Repositories

Before creating snapshots, you need a snapshot repository — a storage location where snapshot data is written.

S3 Repository (Most Common)

# Install the repository-s3 plugin (if not already installed)
# bin/opensearch-plugin install repository-s3

# Register the repository
PUT /_snapshot/my-s3-repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-opensearch-backups",
    "region": "us-east-1",
    "base_path": "snapshots",
    "server_side_encryption": true,
    "storage_class": "standard_ia"
  }
}

Configure S3 credentials via the OpenSearch keystore:

bin/opensearch-keystore add s3.client.default.access_key
bin/opensearch-keystore add s3.client.default.secret_key

Or use IAM roles (recommended for EC2/EKS deployments) — no keystore entries needed.
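If you add or rotate keystore credentials on a running cluster, nodes can pick up the new values without a restart by reloading secure settings:

POST /_nodes/reload_secure_settings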

GCS Repository

PUT /_snapshot/my-gcs-repo
{
  "type": "gcs",
  "settings": {
    "bucket": "my-opensearch-backups",
    "base_path": "snapshots"
  }
}
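As with S3, a plugin is required, and credentials typically go into the keystore. A likely setup (the service-account file path is illustrative):

# bin/opensearch-plugin install repository-gcs
bin/opensearch-keystore add-file gcs.client.default.credentials_file /path/to/service-account.json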

Shared Filesystem Repository

For on-premises or development environments:

PUT /_snapshot/my-fs-repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backup/opensearch-snapshots",
    "compress": true
  }
}

The path must be configured in opensearch.yml as an allowed repository path:

path.repo: ["/mnt/backup/opensearch-snapshots"]

All data nodes must have access to the shared filesystem path.

Verify Repository

POST /_snapshot/my-s3-repo/_verify

This confirms all nodes can read and write to the repository.

Creating Snapshots

Full Cluster Snapshot

PUT /_snapshot/my-s3-repo/snapshot-2025-06-15
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": true
}

  • include_global_state: true captures cluster settings, index templates, and ingest pipelines.
  • Snapshots are incremental — only data that changed since the last snapshot is transferred.

Selective Snapshot

Back up specific indices:

PUT /_snapshot/my-s3-repo/snapshot-logs-june
{
  "indices": "logs-2025.06.*",
  "ignore_unavailable": true,
  "include_global_state": false
}

Monitor Snapshot Progress

GET /_snapshot/my-s3-repo/snapshot-2025-06-15/_status

List All Snapshots

GET /_snapshot/my-s3-repo/_all
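To remove a snapshot you no longer need, delete it by name. Segment data still referenced by other snapshots is retained, so deleting one incremental snapshot does not break the others:

DELETE /_snapshot/my-s3-repo/snapshot-2025-06-15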

Restoring from Snapshots

Restore All Indices

POST /_snapshot/my-s3-repo/snapshot-2025-06-15/_restore
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": false
}

Indices being restored must not already exist in the cluster. Either delete them first or use rename patterns.

Restore with Renaming

Restore to different index names (useful for testing or comparison):

POST /_snapshot/my-s3-repo/snapshot-2025-06-15/_restore
{
  "indices": "products",
  "rename_pattern": "(.+)",
  "rename_replacement": "restored-$1",
  "include_global_state": false
}

Restore Specific Indices

POST /_snapshot/my-s3-repo/snapshot-2025-06-15/_restore
{
  "indices": "products,orders",
  "ignore_unavailable": true,
  "include_global_state": false
}

Restore with Modified Settings

Change shard count or replicas during restore:

POST /_snapshot/my-s3-repo/snapshot-2025-06-15/_restore
{
  "indices": "products",
  "index_settings": {
    "index.number_of_replicas": 0
  },
  "ignore_index_settings": [
    "index.refresh_interval"
  ]
}

Setting replicas to 0 during restore speeds up recovery — add replicas back after the restore completes.
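Once the restore finishes, replicas can be added back with an index settings update (index name illustrative):

PUT /products/_settings
{
  "index.number_of_replicas": 1
}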

Automating Backups with Snapshot Management (SM)

OpenSearch's Snapshot Management policy automates recurring snapshots:

POST /_plugins/_sm/policies/daily-backup
{
  "description": "Daily backup of all indices",
  "creation": {
    "schedule": {
      "cron": {
        "expression": "0 2 * * *",
        "timezone": "UTC"
      }
    }
  },
  "deletion": {
    "schedule": {
      "cron": {
        "expression": "0 3 * * *",
        "timezone": "UTC"
      }
    },
    "condition": {
      "max_age": "30d",
      "max_count": 30,
      "min_count": 7
    }
  },
  "snapshot_config": {
    "repository": "my-s3-repo",
    "indices": "*",
    "ignore_unavailable": true,
    "include_global_state": false
  }
}

This policy:

  • Creates a snapshot daily at 2 AM UTC
  • Deletes snapshots older than 30 days, keeping at least 7 and at most 30
  • Runs deletion checks daily at 3 AM UTC

Check SM Policy Status

GET /_plugins/_sm/policies/daily-backup
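For runtime state, including the outcome of the most recent creation and deletion runs, the _explain endpoint is also available:

GET /_plugins/_sm/policies/daily-backup/_explain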

Disaster Recovery Planning

Define RPO and RTO

  • RPO (Recovery Point Objective): the maximum acceptable data loss. Determines snapshot frequency; a 1-hour RPO means hourly snapshots.
  • RTO (Recovery Time Objective): the maximum acceptable downtime. Determines how ready your restore infrastructure must be.

Snapshot Frequency Recommendations

  • Production e-commerce: snapshot every 1–4 hours, retain 30 days
  • Log aggregation: snapshot daily, retain 7–14 days
  • Compliance/audit indices: snapshot daily, retain 1–7 years (archive tier)
  • Development/staging: snapshot weekly, retain 7 days

Cross-Region Backups

For disaster recovery against regional outages, replicate snapshots to a secondary region:

  • S3 Cross-Region Replication: Configure S3 CRR on the snapshot bucket to automatically replicate to another region.
  • Dual-repository approach: Register snapshot repositories in two regions and create snapshots to both.

Testing Restores

Untested backups are not backups. Regularly validate your snapshots:

  1. Restore to a staging cluster on a schedule (monthly minimum)
  2. Verify document counts match production
  3. Run sample queries to validate data integrity
  4. Measure restore time to validate your RTO
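Step 2 can be as simple as comparing _count results between the two clusters (index names illustrative; the restored name assumes a rename pattern was used):

# On production
GET /products/_count

# On the staging cluster
GET /restored-products/_count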

Performance Tuning

Snapshot Speed

Snapshot throughput is limited by:

  • Node I/O bandwidth: Snapshots read from disk. Avoid snapshotting during peak ingestion.
  • Repository throughput: S3/GCS network bandwidth. Use max_snapshot_bytes_per_sec to cap if snapshots impact query performance.
  • Concurrent snapshots: Only one snapshot can run at a time per repository. You can also cap concurrent snapshot operations cluster-wide:

PUT /_cluster/settings
{
  "persistent": {
    "cluster.max_concurrent_snapshot_operations": 1
  }
}
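The max_snapshot_bytes_per_sec throttle mentioned above is a repository setting; re-registering the repository with updated settings applies it (bucket name and value illustrative):

PUT /_snapshot/my-s3-repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-opensearch-backups",
    "max_snapshot_bytes_per_sec": "100mb"
  }
}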

Restore Speed

Speed up restores:

PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "500mb"
  }
}

Also:

  • Restore with number_of_replicas: 0, then add replicas after
  • Increase indices.recovery.max_concurrent_file_chunks
  • Ensure target cluster has sufficient disk space (2x the restored index size for safety)
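The file-chunk setting is dynamic and can be raised alongside the bandwidth cap (value illustrative; higher values increase per-shard recovery parallelism at the cost of more concurrent I/O):

PUT /_cluster/settings
{
  "transient": {
    "indices.recovery.max_concurrent_file_chunks": 4
  }
}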

Frequently Asked Questions

Q: Are snapshots incremental?

Yes. The first snapshot of an index captures all data. Subsequent snapshots only store segments that changed since the last snapshot. This makes frequent snapshots efficient in both time and storage.

Q: Can I restore a snapshot to a different OpenSearch version?

You can restore snapshots to the same major version or one major version higher. For example, a snapshot from OpenSearch 1.x can be restored to OpenSearch 2.x, but not vice versa.

Q: How much storage do snapshots use?

Roughly equal to the primary shard size of the snapshotted indices (replicas aren't snapshotted). Incremental snapshots after the first are much smaller — typically 1–10% of the full size depending on data change rate.

Q: Can I snapshot a single index without affecting the cluster?

Yes. Snapshots are non-blocking — they don't lock indices or stop indexing. There's some I/O overhead, so schedule snapshots during low-traffic periods for large clusters.

Q: What happens if a snapshot fails midway?

Partial snapshots are marked as failed and should not be used for restore. The repository and existing complete snapshots remain unaffected. Investigate the failure cause (disk space, permissions, network), fix it, and re-run.

Q: How do I migrate data between clusters using snapshots?

Register the same snapshot repository on both clusters. Create a snapshot on the source cluster, then restore it on the target cluster. This is the recommended approach for cluster migrations and is covered in detail in our OpenSearch Migration Guide.
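When the same repository is registered on two clusters, it is safest to mark it read-only on the cluster that only restores, so that a single cluster ever writes to it:

PUT /_snapshot/my-s3-repo
{
  "type": "s3",
  "settings": {
    "bucket": "my-opensearch-backups",
    "readonly": true
  }
}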
