Manual snapshots for Amazon OpenSearch Service clusters

What are snapshots?

Snapshots are backups of your OpenSearch cluster's indices and state. They capture the complete data and configuration at a specific point in time, allowing you to restore your cluster in case of data loss or corruption.

Why manual snapshots?

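Amazon OpenSearch Service takes automated snapshots on its own (hourly on domains running OpenSearch or Elasticsearch 5.3 and later, daily on older versions) and retains them for 14 days. Automated snapshots can only be used to recover the domain that produced them, so manual snapshots are essential for: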

Long-term retention: Keep backups beyond the 14-day automatic retention period
Migration: Move data between clusters or AWS Regions
Compliance: Meet regulatory requirements for data retention
Custom scheduling: Create snapshots at specific times that align with your maintenance windows

Setting up S3 as a snapshot repository

1. Create an S3 bucket

aws s3 mb s3://my-opensearch-snapshots --region us-east-1
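The same bucket can be created from Python. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names are hypothetical. It shows one detail the CLI hides: the S3 API rejects an explicit LocationConstraint for us-east-1, so the configuration block is only sent for other Regions.

```python
def create_bucket_kwargs(bucket: str, region: str) -> dict:
    """Build the keyword arguments for an S3 create_bucket call."""
    kwargs = {"Bucket": bucket}
    # us-east-1 is the default location and must not be named explicitly
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_snapshot_bucket(s3_client, bucket: str, region: str) -> None:
    """Create the snapshot bucket with a client from boto3.client('s3')."""
    s3_client.create_bucket(**create_bucket_kwargs(bucket, region))
```

A call such as create_snapshot_bucket(boto3.client("s3", region_name="us-east-1"), "my-opensearch-snapshots", "us-east-1") would mirror the CLI command above.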

2. Create an IAM role for OpenSearch

Create a trust policy that allows OpenSearch to assume the role:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Service": "es.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}

Create a permissions policy for S3 access:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Action": [
      "s3:ListBucket",
      "s3:GetObject",
      "s3:PutObject",
      "s3:DeleteObject"
    ],
    "Effect": "Allow",
    "Resource": [
      "arn:aws:s3:::my-opensearch-snapshots",
      "arn:aws:s3:::my-opensearch-snapshots/*"
    ]
  }]
}
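The two policy documents above can also be generated and attached from Python. This is a sketch under stated assumptions: the function names are hypothetical, and the iam argument is expected to be a boto3 IAM client. It also builds a third policy that the text does not show, for the IAM identity that will register the repository; that identity typically needs iam:PassRole on the snapshot role and es:ESHttpPut on the domain.

```python
import json


def trust_policy() -> dict:
    """Trust policy that lets OpenSearch Service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "es.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def s3_policy(bucket: str) -> dict:
    """S3 permissions for the snapshot bucket, as in the policy above."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Action": ["s3:ListBucket", "s3:GetObject",
                       "s3:PutObject", "s3:DeleteObject"],
            "Effect": "Allow",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }


def caller_policy(role_arn: str, domain_arn: str) -> dict:
    """Assumed policy for the identity that registers the repository."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "iam:PassRole", "Resource": role_arn},
            {"Effect": "Allow", "Action": "es:ESHttpPut",
             "Resource": f"{domain_arn}/*"},
        ],
    }


def create_snapshot_role(iam, role_name: str, bucket: str) -> str:
    """Create the role, attach the S3 policy inline, and return the role ARN."""
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy()),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="snapshot-s3-access",  # hypothetical policy name
        PolicyDocument=json.dumps(s3_policy(bucket)),
    )
    return role["Role"]["Arn"]
```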

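3. Register the snapshot repository

The curl command below is shown unsigned for brevity. On a domain whose access policy relies on IAM, the request must be signed with AWS Signature Version 4 (for example with curl's --aws-sigv4 option, or with a signing client such as the requests_aws4auth library used later in this guide) by an IAM identity that is allowed iam:PassRole on the snapshot role.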

curl -XPUT 'https://my-domain.us-east-1.es.amazonaws.com/_snapshot/my-snapshot-repo' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "s3",
    "settings": {
      "bucket": "my-opensearch-snapshots",
      "region": "us-east-1",
      "role_arn": "arn:aws:iam::123456789012:role/OpenSearchSnapshotRole"
    }
  }'
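A signed version of the registration request might look like the sketch below. It assumes that boto3, requests, and requests_aws4auth are installed and that the caller's credentials are allowed to pass the snapshot role; the function names are hypothetical. The optional readonly flag is a standard S3 repository setting that is useful when a second cluster should only read from the bucket.

```python
def repository_body(bucket: str, region: str, role_arn: str,
                    readonly: bool = False) -> dict:
    """Build the JSON body for PUT /_snapshot/<repository>."""
    settings = {"bucket": bucket, "region": region, "role_arn": role_arn}
    if readonly:
        settings["readonly"] = True
    return {"type": "s3", "settings": settings}


def register_repository(host: str, repo: str, region: str, body: dict) -> dict:
    """Send a SigV4-signed registration request and return the JSON reply."""
    # Third-party imports live here so repository_body has no dependencies
    import boto3
    import requests
    from requests_aws4auth import AWS4Auth

    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       region, "es", session_token=credentials.token)
    r = requests.put(f"{host}/_snapshot/{repo}", auth=awsauth,
                     json=body, timeout=30)
    r.raise_for_status()  # surface permission or signing errors immediately
    return r.json()
```

For the values used in this guide, the call would be register_repository("https://my-domain.us-east-1.es.amazonaws.com", "my-snapshot-repo", "us-east-1", repository_body("my-opensearch-snapshots", "us-east-1", "arn:aws:iam::123456789012:role/OpenSearchSnapshotRole")).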

Taking your first snapshot

Execute a manual snapshot:

curl -XPUT 'https://my-domain.us-east-1.es.amazonaws.com/_snapshot/my-snapshot-repo/snapshot-1?wait_for_completion=false'

Check snapshot status:

curl -XGET 'https://my-domain.us-east-1.es.amazonaws.com/_snapshot/my-snapshot-repo/snapshot-1'
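Because the snapshot request returns immediately, a script usually has to poll the status endpoint until the snapshot reaches a final state. The sketch below assumes the documented response shape, a "snapshots" array whose entries carry a "state" field, and treats SUCCESS, PARTIAL, and FAILED as final. The fetch argument is a hypothetical callable that performs the signed GET request and returns the parsed JSON.

```python
import time
from typing import Callable

# States after which the snapshot will not change any further
TERMINAL_STATES = {"SUCCESS", "PARTIAL", "FAILED"}


def snapshot_state(response: dict) -> str:
    """Extract the state of the first snapshot in a status response."""
    return response["snapshots"][0]["state"]


def wait_for_snapshot(fetch: Callable[[], dict], interval: float = 30.0,
                      max_checks: int = 120) -> str:
    """Poll until the snapshot reaches a terminal state and return it."""
    for _ in range(max_checks):
        state = snapshot_state(fetch())
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError("snapshot did not finish within the polling budget")
```

A PARTIAL or FAILED result should be treated as a failed backup and investigated rather than retried blindly.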

Automating periodic snapshots

Manual snapshots are not scheduled for you. Either use the snapshot action in an Index State Management (ISM) policy, or trigger a Lambda function from an EventBridge schedule:

Lambda function example:

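import datetime

import boto3
import requests
from requests_aws4auth import AWS4Auth

def lambda_handler(event, context):
    host = 'https://my-domain.us-east-1.es.amazonaws.com'
    region = 'us-east-1'
    service = 'es'

    # Sign the request with the Lambda execution role's credentials
    credentials = boto3.Session().get_credentials()
    awsauth = AWS4Auth(credentials.access_key, credentials.secret_key,
                       region, service, session_token=credentials.token)

    # Snapshot names must be lowercase, so build the name from the current
    # UTC time instead of event["time"], which contains uppercase letters
    timestamp = datetime.datetime.now(datetime.timezone.utc).strftime('%Y-%m-%d-%H-%M-%S')
    url = f'{host}/_snapshot/my-snapshot-repo/snapshot-{timestamp}'

    r = requests.put(url, auth=awsauth, timeout=30)
    r.raise_for_status()  # fail the invocation if the request is rejected
    return {'statusCode': r.status_code, 'body': r.text}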

EventBridge rule: Schedule the Lambda function to run daily at 02:00 UTC with the cron expression cron(0 2 * * ? *).
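The schedule itself can be created with boto3. The following sketch assumes an EventBridge client and a Lambda client from boto3.client("events") and boto3.client("lambda"); the rule name, statement ID, and target ID are hypothetical. It creates the rule, grants EventBridge permission to invoke the function, and attaches the function as the target.

```python
DAILY_2AM_UTC = "cron(0 2 * * ? *)"


def schedule_daily_snapshot(events, lambda_client, function_arn: str,
                            rule_name: str = "daily-opensearch-snapshot") -> str:
    """Create the schedule, wire it to the function, and return the rule ARN."""
    rule_arn = events.put_rule(
        Name=rule_name,
        ScheduleExpression=DAILY_2AM_UTC,
        State="ENABLED",
    )["RuleArn"]

    # Without this permission EventBridge cannot invoke the function
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "snapshot-lambda", "Arn": function_arn}],
    )
    return rule_arn
```

Note that add_permission raises an error if a statement with the same ID already exists, so a rerun of this sketch would need to handle that case.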

Restoring to another cluster

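1. Register the same S3 repository on the target cluster

The target domain needs the same IAM setup and request signing as the source. If the source cluster keeps writing to this repository, consider adding "readonly": true to the settings on the target so that only one cluster writes to the bucket.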

curl -XPUT 'https://target-domain.us-east-1.es.amazonaws.com/_snapshot/my-snapshot-repo' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "s3",
    "settings": {
      "bucket": "my-opensearch-snapshots",
      "region": "us-east-1",
      "role_arn": "arn:aws:iam::123456789012:role/OpenSearchSnapshotRole"
    }
  }'

2. Restore the snapshot

curl -XPOST 'https://target-domain.us-east-1.es.amazonaws.com/_snapshot/my-snapshot-repo/snapshot-1/_restore' \
  -H 'Content-Type: application/json' \
  -d '{
    "indices": "index-1,index-2",
    "ignore_unavailable": true,
    "include_global_state": false
  }'
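A restore fails if an open index with the same name already exists on the target. One common way around that is to rename indices during the restore with the rename_pattern and rename_replacement parameters. The sketch below builds such a request body; the function name and the prefix are hypothetical.

```python
from typing import Iterable, Optional


def restore_body(indices: Iterable[str],
                 rename_prefix: Optional[str] = None) -> dict:
    """Build the JSON body for POST /_snapshot/<repo>/<snapshot>/_restore."""
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": True,
        "include_global_state": False,
    }
    if rename_prefix:
        # Capture the whole index name and put the prefix in front of it
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = f"{rename_prefix}$1"
    return body
```

With rename_prefix="restored-", index-1 from the snapshot would be restored as restored-index-1, leaving any existing index-1 untouched.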

3. Monitor restore progress

curl -XGET 'https://target-domain.us-east-1.es.amazonaws.com/_recovery'
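The _recovery response lists every shard recovery on the cluster, not only snapshot restores, so it helps to filter it. The sketch below assumes the documented response shape: one key per index, each with a "shards" array whose entries carry "type" and "stage" fields. The function names are hypothetical.

```python
def restore_progress(recovery: dict) -> dict:
    """Map each index to (finished_shards, total_shards) for snapshot restores."""
    progress = {}
    for index, info in recovery.items():
        # Keep only recoveries that were started by a snapshot restore
        shards = [s for s in info.get("shards", [])
                  if str(s.get("type", "")).upper() == "SNAPSHOT"]
        if shards:
            done = sum(1 for s in shards
                       if str(s.get("stage", "")).upper() == "DONE")
            progress[index] = (done, len(shards))
    return progress


def restore_complete(recovery: dict) -> bool:
    """True when every shard being restored from a snapshot is done."""
    progress = restore_progress(recovery)
    return bool(progress) and all(d == t for d, t in progress.values())
```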

Best practices

Test your restore process regularly
Monitor snapshot size and duration
Use index filtering to snapshot only necessary indices
Implement proper IAM policies with least privilege access
Consider S3 Cross-Region Replication on the snapshot bucket for disaster recovery
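Index filtering is done by sending a request body with the snapshot request. A minimal sketch, assuming the standard indices, ignore_unavailable, and include_global_state parameters; the function name and the example patterns are hypothetical.

```python
from typing import Iterable


def snapshot_body(include: Iterable[str], exclude: Iterable[str] = ()) -> dict:
    """Build the JSON body for PUT /_snapshot/<repo>/<snapshot>."""
    # Exclusions use the "-pattern" form and follow the inclusions
    patterns = list(include) + [f"-{p}" for p in exclude]
    return {
        "indices": ",".join(patterns),
        "ignore_unavailable": True,
        "include_global_state": False,
    }
```

For example, snapshot_body(["logs-*"], [".kibana*"]) limits the snapshot to the logs indices, which keeps both snapshot size and duration down.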