Amazon OpenSearch Service Cross-Cluster Replication: Complete Setup Guide

Amazon OpenSearch Service provides native support for cross-cluster replication (CCR), enabling you to replicate indices between domains across different AWS regions or within the same region. This feature is essential for disaster recovery, geographic distribution, and maintaining data consistency across your OpenSearch infrastructure. This guide provides step-by-step instructions for setting up CCR on Amazon OpenSearch Service.

Official Documentation: For the most up-to-date information, refer to the AWS OpenSearch Service Cross-Cluster Replication documentation.

CCR follows an active-passive model: writes go to a leader index, and a follower index on another domain pulls those changes and stays read-only. Typical uses are disaster recovery across AWS Regions, lower read latency by placing data closer to users, and separating read traffic from write traffic across domains.

Prerequisites

Before setting up CCR, ensure you have:

  1. Two or more Amazon OpenSearch Service domains (leader and follower)
  2. Elasticsearch 7.10 or OpenSearch 1.1 or later on every domain (the replication plugin first shipped in OpenSearch 1.1)
  3. Appropriate IAM permissions for CCR operations
  4. Network connectivity between domains (VPC peering, VPN, or public access)
  5. Security configurations (fine-grained access control and node-to-node encryption enabled on both domains)
  6. Sufficient storage on follower domains
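
The version and security prerequisites can be checked from the command line before going further. The sketch below is illustrative, not official tooling: engine_supports_ccr and show_ccr_prereqs are hypothetical helper names, and the minimum versions encoded in it (OpenSearch 1.1, Elasticsearch 7.10) are an assumption based on what Amazon OpenSearch Service documents for CCR, so confirm them against the current documentation.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of any AWS tooling.

# Succeeds when an EngineVersion string (for example OpenSearch_1.3 or
# Elasticsearch_7.10) meets the assumed CCR minimum for its engine.
engine_supports_ccr() {
  engine=${1%%_*}
  version=${1#*_}
  major=${version%%.*}
  minor=${version#*.}
  case $engine in
    OpenSearch)    min_major=1; min_minor=1 ;;
    Elasticsearch) min_major=7; min_minor=10 ;;
    *)             return 1 ;;
  esac
  [ "$major" -gt "$min_major" ] ||
    { [ "$major" -eq "$min_major" ] && [ "$minor" -ge "$min_minor" ]; }
}

# Prints the engine version, fine-grained access control state, and
# node-to-node encryption state for one domain.
show_ccr_prereqs() {
  # $1 = domain name, $2 = Region
  aws opensearch describe-domain \
    --domain-name "$1" \
    --region "$2" \
    --query 'DomainStatus.[EngineVersion, AdvancedSecurityOptions.Enabled, NodeToNodeEncryptionOptions.Enabled]' \
    --output text
}
```

For example, show_ccr_prereqs leader-domain us-east-1 prints one tab-separated line per domain, and the first field can be passed to engine_supports_ccr.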

Step 1: Configure Network Connectivity

1.1 VPC Configuration

For domains in different VPCs, set up VPC peering:

# Create VPC peering connection (run in the requester's Region)
aws ec2 create-vpc-peering-connection \
  --vpc-id vpc-12345678 \
  --peer-vpc-id vpc-87654321 \
  --peer-region us-west-2

# Accept the peering connection (must be run in the peer's Region)
aws ec2 accept-vpc-peering-connection \
  --region us-west-2 \
  --vpc-peering-connection-id pcx-12345678

# Route traffic for the peer VPC's CIDR block through the peering connection;
# add the matching return route to the peer VPC's route table as well
aws ec2 create-route \
  --route-table-id rtb-12345678 \
  --destination-cidr-block 10.0.0.0/16 \
  --vpc-peering-connection-id pcx-12345678

1.2 Security Group Configuration

Configure security groups to allow HTTPS traffic between the domains. Amazon OpenSearch Service domains accept connections only on ports 80 and 443; the transport port 9300 is not exposed on a managed domain, so it needs no rule:

# Create security group rule for OpenSearch HTTPS
aws ec2 authorize-security-group-ingress \
  --group-id sg-12345678 \
  --protocol tcp \
  --port 443 \
  --source-group sg-87654321

1.3 Domain Access Policy

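Ensure your domains have appropriate access policies. Prefer es:ESHttp* over es:* so that the policy grants HTTP access without domain administration rights. On the leader domain, the policy must also allow es:ESCrossClusterGet on the domain ARN itself (without the trailing /*) so that the follower can pull changes:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/OpenSearchCCRRole"
      },
      "Action": "es:ESHttp*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/leader-domain/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "es:ESCrossClusterGet",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/leader-domain"
    }
  ]
}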

Step 2: Set Up IAM Permissions

2.1 Create IAM Role for CCR

Define the permissions policy for CCR operations and save it as opensearch-ccr-policy.json, the file that the create-policy command in 2.2 reads:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "es:DescribeElasticsearchDomain",
        "es:ListTags",
        "es:DescribeElasticsearchDomains",
        "es:ESHttpGet",
        "es:ESHttpPut",
        "es:ESHttpPost",
        "es:ESHttpDelete"
      ],
      "Resource": [
        "arn:aws:es:us-east-1:123456789012:domain/leader-domain/*",
        "arn:aws:es:us-west-2:123456789012:domain/follower-domain/*"
      ]
    }
  ]
}

2.2 Create IAM Policy

# Create IAM policy
aws iam create-policy \
  --policy-name OpenSearchCCRPolicy \
  --policy-document file://opensearch-ccr-policy.json

# Create IAM role
aws iam create-role \
  --role-name OpenSearchCCRRole \
  --assume-role-policy-document file://trust-policy.json

# Attach policy to role
aws iam attach-role-policy \
  --role-name OpenSearchCCRRole \
  --policy-arn arn:aws:iam::123456789012:policy/OpenSearchCCRPolicy
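
The create-role call above reads trust-policy.json, which this guide does not show. A minimal sketch of one possible trust policy follows. trust_policy is a hypothetical helper, and the principal ARN passed to it is a placeholder assumption, since the right principal depends on who or what will assume OpenSearchCCRRole in your account.

```shell
#!/bin/sh
# Hypothetical helper: print a trust policy that lets a single IAM principal
# assume the role. The principal ARN is an assumption; replace it with the
# user, role, or service that actually signs your requests.
trust_policy() {
  principal_arn=$1
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "${principal_arn}" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Usage:
#   trust_policy arn:aws:iam::123456789012:user/ccr-operator > trust-policy.json
```

Generating the file this way keeps the principal in one place, so the same helper can produce a policy for a different operator or automation role.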

Step 3: Configure Fine-Grained Access Control

3.1 Enable Fine-Grained Access Control

Ensure fine-grained access control is enabled on both domains. The passwords shown throughout this guide are placeholders; substitute your own and avoid leaving real credentials in shell history:

# Update domain configuration
aws opensearch update-domain-config \
  --region us-east-1 \
  --domain-name leader-domain \
  --advanced-security-options '{
    "Enabled": true,
    "InternalUserDatabaseEnabled": true,
    "MasterUserOptions": {
      "MasterUserName": "admin",
      "MasterUserPassword": "SecurePassword123!"
    }
  }'

aws opensearch update-domain-config \
  --region us-west-2 \
  --domain-name follower-domain \
  --advanced-security-options '{
    "Enabled": true,
    "InternalUserDatabaseEnabled": true,
    "MasterUserOptions": {
      "MasterUserName": "admin",
      "MasterUserPassword": "SecurePassword123!"
    }
  }'

3.2 Create CCR-Specific Roles

Create roles with the permissions that the replication plugin checks. In OpenSearch, the security REST API lives under _plugins/_security/api, and replication has its own action names:

# On leader domain - create role that lets the follower read changes
curl -X PUT "https://leader-domain-endpoint:443/_plugins/_security/api/roles/ccr_leader_role" \
  -H "Content-Type: application/json" \
  -u admin:SecurePassword123! \
  -d '{
    "index_permissions": [
      {
        "index_patterns": ["*"],
        "allowed_actions": [
          "indices:admin/plugins/replication/index/setup/validate",
          "indices:data/read/plugins/replication/changes",
          "indices:data/read/plugins/replication/file_chunk"
        ]
      }
    ]
  }'

# On follower domain - create role for CCR operations
curl -X PUT "https://follower-domain-endpoint:443/_plugins/_security/api/roles/ccr_follower_role" \
  -H "Content-Type: application/json" \
  -u admin:SecurePassword123! \
  -d '{
    "cluster_permissions": [
      "cluster:admin/plugins/replication/autofollow/update"
    ],
    "index_permissions": [
      {
        "index_patterns": ["*"],
        "allowed_actions": [
          "indices:admin/plugins/replication/index/setup/validate",
          "indices:data/write/plugins/replication/changes",
          "indices:admin/plugins/replication/index/start",
          "indices:admin/plugins/replication/index/pause",
          "indices:admin/plugins/replication/index/resume",
          "indices:admin/plugins/replication/index/stop",
          "indices:admin/plugins/replication/index/update",
          "indices:admin/plugins/replication/index/status_check"
        ]
      }
    ]
  }'

3.3 Create Users and Assign Roles

# Create user on leader domain
curl -X PUT "https://leader-domain-endpoint:443/_plugins/_security/api/internalusers/ccr_user" \
  -H "Content-Type: application/json" \
  -u admin:SecurePassword123! \
  -d '{
    "password": "SecureCCRPassword123!",
    "opendistro_security_roles": ["ccr_leader_role"],
    "attributes": { "full_name": "CCR User" }
  }'

# Create user on follower domain
curl -X PUT "https://follower-domain-endpoint:443/_plugins/_security/api/internalusers/ccr_user" \
  -H "Content-Type: application/json" \
  -u admin:SecurePassword123! \
  -d '{
    "password": "SecureCCRPassword123!",
    "opendistro_security_roles": ["ccr_follower_role"],
    "attributes": { "full_name": "CCR User" }
  }'

Step 4: Configure Remote Cluster Connection

4.1 Add Remote Cluster on Follower

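On Amazon OpenSearch Service you do not set cluster.remote.* seeds yourself. A remote cluster is registered through a cross-cluster connection that the follower domain requests and the leader domain accepts. The connection alias (leader_cluster here) is the name that later replication calls refer to:

# Request the connection from the follower domain
aws opensearch create-outbound-connection \
  --region us-west-2 \
  --connection-alias leader_cluster \
  --local-domain-info '{"AWSDomainInformation":{"OwnerId":"123456789012","DomainName":"follower-domain","Region":"us-west-2"}}' \
  --remote-domain-info '{"AWSDomainInformation":{"OwnerId":"123456789012","DomainName":"leader-domain","Region":"us-east-1"}}'

# Accept it on the leader domain, using the connection ID returned above
aws opensearch accept-inbound-connection \
  --region us-east-1 \
  --connection-id <connection-id>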

4.2 Verify Remote Cluster Connection

# Check remote cluster info
curl -X GET "https://follower-domain-endpoint:443/_remote/info" \
  -u ccr_user:SecureCCRPassword123!

# Expected response should show leader_cluster as connected

Step 5: Set Up Cross-Cluster Replication

5.1 Create Index on Leader Domain

First, create an index on the leader domain that you want to replicate:

# Create test index on leader (as admin: the CCR role on the leader only
# grants the permissions replication needs, not index creation or writes)
curl -X PUT "https://leader-domain-endpoint:443/test-index" \
  -H "Content-Type: application/json" \
  -u admin:SecurePassword123! \
  -d '{
    "settings": {
      "index.number_of_shards": 3,
      "index.number_of_replicas": 1
    }
  }'

# Add some test data
curl -X POST "https://leader-domain-endpoint:443/test-index/_doc" \
  -H "Content-Type: application/json" \
  -u admin:SecurePassword123! \
  -d '{
    "message": "Test document for CCR",
    "timestamp": "2024-01-01T00:00:00Z",
    "region": "us-east-1"
  }'

5.2 Start Following the Index

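On the follower domain, start replicating the leader index. The request names the connection alias, the leader index, and the roles that the leader and follower use to authorize replication:

curl -X PUT "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_start?pretty" \
  -H "Content-Type: application/json" \
  -u ccr_user:SecureCCRPassword123! \
  -d '{
    "leader_alias": "leader_cluster",
    "leader_index": "test-index",
    "use_roles": {
      "leader_cluster_role": "ccr_leader_role",
      "follower_cluster_role": "ccr_follower_role"
    }
  }'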

5.3 Verify Replication Status

Check the status of replication for the follower index. A healthy index reports SYNCING once the initial copy (BOOTSTRAPPING) has finished:

# Check replication status
curl -X GET "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_status?pretty" \
  -u ccr_user:SecureCCRPassword123!

# Check index status
curl -X GET "https://follower-domain-endpoint:443/test-index/_stats" \
  -u ccr_user:SecureCCRPassword123!
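
In scripts it is often convenient to wait until a follower index has finished its initial copy. The sketch below polls the replication plugin's status endpoint (GET _plugins/_replication/<index>/_status) until it reports SYNCING. replication_state and wait_for_syncing are hypothetical helpers, CCR_AUTH is an assumed environment variable holding user:password, and the 10-second interval is arbitrary.

```shell
#!/bin/sh
# Hypothetical helpers; CCR_AUTH (user:password) is an assumed environment variable.

# Print the value of the "status" field from a _status response on stdin.
replication_state() {
  tr -d '\n' | sed -n 's/.*"status" *: *"\([^"]*\)".*/\1/p'
}

# Poll a _status URL until it reports SYNCING or the attempts run out.
wait_for_syncing() {
  # $1 = full _status URL, $2 = maximum attempts (default 30)
  url=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    state=$(curl -s -u "$CCR_AUTH" "$url" | replication_state)
    [ "$state" = "SYNCING" ] && return 0
    tries=$((tries - 1))
    sleep 10
  done
  return 1
}

# Usage:
#   wait_for_syncing "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_status"
```

The text parsing is deliberately simple so that the sketch needs no JSON tool; a script that already depends on a JSON parser should use it instead.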

Step 6: Advanced CCR Configuration

6.1 Auto-Follow Patterns

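Set up replication rules (auto-follow) so that leader indices matching a pattern are replicated automatically. Each rule has its own name and takes a single pattern, so repeat the call with a different name for metrics-* and events-*:

curl -X POST "https://follower-domain-endpoint:443/_plugins/_replication/_autofollow" \
  -H "Content-Type: application/json" \
  -u ccr_user:SecureCCRPassword123! \
  -d '{
    "leader_alias": "leader_cluster",
    "name": "logs-rule",
    "pattern": "logs-*",
    "use_roles": {
      "leader_cluster_role": "ccr_leader_role",
      "follower_cluster_role": "ccr_follower_role"
    }
  }'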
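
Because the replication plugin's auto-follow endpoint (POST _plugins/_replication/_autofollow) accepts one pattern per rule, covering several index families means one call per pattern. The loop below is a sketch under stated assumptions: rule_name, autofollow_body, and create_autofollow_rules are hypothetical helpers, FOLLOWER_URL and CCR_AUTH are assumed environment variables, and the alias and role names are the ones used in this guide.

```shell
#!/bin/sh
# Hypothetical helpers. FOLLOWER_URL (follower endpoint) and CCR_AUTH
# (user:password) are assumed environment variables.

# Derive a rule name from a pattern: logs-* becomes logs-rule.
rule_name() {
  base=${1%-\*}
  printf '%s-rule' "$base"
}

# Build the JSON body for one replication rule.
autofollow_body() {
  # $1 = rule name, $2 = index pattern
  printf '{"leader_alias":"leader_cluster","name":"%s","pattern":"%s","use_roles":{"leader_cluster_role":"ccr_leader_role","follower_cluster_role":"ccr_follower_role"}}' "$1" "$2"
}

# Create one rule per pattern on the follower domain.
create_autofollow_rules() {
  for pattern in "$@"; do
    curl -s -X POST "$FOLLOWER_URL/_plugins/_replication/_autofollow" \
      -H "Content-Type: application/json" \
      -u "$CCR_AUTH" \
      -d "$(autofollow_body "$(rule_name "$pattern")" "$pattern")" || return 1
  done
}

# Usage: create_autofollow_rules 'logs-*' 'metrics-*' 'events-*'
```

Quote each pattern when calling the function so that the shell does not expand the asterisk against local file names.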

6.2 Configure Replication Settings

Change index settings on a follower index that is already replicating with the _update endpoint:

curl -X PUT "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_update" \
  -H "Content-Type: application/json" \
  -u ccr_user:SecureCCRPassword123! \
  -d '{
    "settings": {
      "index.number_of_replicas": 0
    }
  }'

6.3 Cross-Region Replication

For cross-Region replication, the replication calls themselves do not change. The Regions are fixed when the cross-cluster connection is created (requested with aws opensearch create-outbound-connection and accepted with aws opensearch accept-inbound-connection), so confirm that the connection lists the expected Region on each side and has reached the ACTIVE state:

# List outbound connections in the follower's Region
aws opensearch describe-outbound-connections \
  --region us-west-2

# List inbound connections in the leader's Region
aws opensearch describe-inbound-connections \
  --region us-east-1

Step 7: Monitoring and Management

7.1 Monitor CCR Status

Regularly check the health of your CCR setup:

# Get replication stats for all follower indices on this domain
curl -X GET "https://follower-domain-endpoint:443/_plugins/_replication/follower_stats?pretty" \
  -u ccr_user:SecureCCRPassword123!

# Get replication status for a specific index
curl -X GET "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_status?pretty" \
  -u ccr_user:SecureCCRPassword123!

7.2 CloudWatch Integration

Set up CloudWatch monitoring for CCR metrics:

# Create CloudWatch dashboard for CCR monitoring
aws cloudwatch put-dashboard \
  --dashboard-name "OpenSearch-CCR-Monitoring" \
  --dashboard-body file://ccr-dashboard.json

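Example dashboard configuration. AWS/ES metrics are published per domain and account, so each metric needs both the DomainName and ClientId dimensions, and cluster status is reported as separate ClusterStatus.green, ClusterStatus.yellow, and ClusterStatus.red metrics. The follower's metric carries its own region because it lives in us-west-2:

{
  "widgets": [
    {
      "type": "metric",
      "properties": {
        "metrics": [
          ["AWS/ES", "ClusterStatus.red", "DomainName", "leader-domain", "ClientId", "123456789012"],
          ["AWS/ES", "ClusterStatus.red", "DomainName", "follower-domain", "ClientId", "123456789012", { "region": "us-west-2" }]
        ],
        "period": 300,
        "stat": "Maximum",
        "region": "us-east-1",
        "title": "Cluster Status (red)"
      }
    }
  ]
}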
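
A dashboard shows problems only when someone is looking at it, so an alarm on failed replication is a useful complement. The sketch below assumes that the follower domain publishes a ReplicationNumFailedIndices metric in the AWS/ES namespace; confirm the metric name in CloudWatch for your domain before relying on it. create_failed_indices_alarm is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper. The metric name ReplicationNumFailedIndices is an
# assumption; verify that it appears for your follower domain in CloudWatch.
create_failed_indices_alarm() {
  # $1 = follower domain name, $2 = AWS account ID, $3 = follower Region
  aws cloudwatch put-metric-alarm \
    --region "$3" \
    --alarm-name "ccr-failed-indices-$1" \
    --namespace AWS/ES \
    --metric-name ReplicationNumFailedIndices \
    --dimensions "Name=DomainName,Value=$1" "Name=ClientId,Value=$2" \
    --statistic Maximum \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 0 \
    --comparison-operator GreaterThanThreshold
}

# Usage: create_failed_indices_alarm follower-domain 123456789012 us-west-2
```

As written, the alarm only changes state. Add --alarm-actions with an SNS topic ARN of your own to be notified when it fires.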

7.3 Pause and Resume Replication

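# Pause replication
curl -X POST "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_pause" \
  -H "Content-Type: application/json" \
  -u ccr_user:SecureCCRPassword123! \
  -d '{}'

# Resume replication (not possible once an index has been paused for more than 12 hours)
curl -X POST "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_resume" \
  -H "Content-Type: application/json" \
  -u ccr_user:SecureCCRPassword123! \
  -d '{}'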

7.4 Stop Following

To stop following an index and make it a regular, writable index:

curl -X POST "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_stop" \
  -H "Content-Type: application/json" \
  -u ccr_user:SecureCCRPassword123! \
  -d '{}'

Step 8: Troubleshooting Common Issues

8.1 Connection Issues

If remote cluster connection fails:

# Check domain health
aws opensearch describe-domain \
  --domain-name leader-domain

# Check security group rules
aws ec2 describe-security-groups \
  --group-ids sg-12345678

# Test connectivity
telnet leader-domain-endpoint 443

8.2 Authentication Issues

If authentication fails:

# Test authentication and show the roles the user resolves to
curl -X GET "https://leader-domain-endpoint:443/_plugins/_security/authinfo?pretty" \
  -u ccr_user:SecureCCRPassword123!

# Check the user definition
curl -X GET "https://leader-domain-endpoint:443/_plugins/_security/api/internalusers/ccr_user" \
  -u admin:SecurePassword123!

8.3 Replication Lag

Monitor replication lag and performance:

# Check replication status, including checkpoints
curl -X GET "https://follower-domain-endpoint:443/_plugins/_replication/test-index/_status?pretty" \
  -u ccr_user:SecureCCRPassword123!

# Look for these fields under "syncing_details":
# - "leader_checkpoint": Latest operation recorded on the leader index
# - "follower_checkpoint": Latest operation applied on the follower index
# The gap between the two is the replication lag, measured in operations
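
Replication lag can be turned into a single number by subtracting the follower's checkpoint from the leader's. The sketch below does this for the response body of the replication plugin's status endpoint (GET _plugins/_replication/<index>/_status), assuming it carries leader_checkpoint and follower_checkpoint fields. checkpoint and replication_lag are hypothetical helper names, and the text parsing avoids a dependency on a JSON tool.

```shell
#!/bin/sh
# Hypothetical helpers for reading checkpoints out of a _status response.

# Print the integer value of one JSON field read from stdin.
checkpoint() {
  # $1 = field name, for example leader_checkpoint
  tr -d '\n' | sed -n "s/.*\"$1\" *: *\(-\{0,1\}[0-9][0-9]*\).*/\1/p"
}

# Print leader_checkpoint minus follower_checkpoint for a _status body.
replication_lag() {
  # $1 = JSON body returned by the _status endpoint
  leader=$(printf '%s' "$1" | checkpoint leader_checkpoint)
  follower=$(printf '%s' "$1" | checkpoint follower_checkpoint)
  [ -n "$leader" ] && [ -n "$follower" ] || return 1
  echo $(( leader - follower ))
}

# Usage:
#   body=$(curl -s -u "$CCR_AUTH" "$FOLLOWER_URL/_plugins/_replication/test-index/_status")
#   replication_lag "$body"
```

A lag that stays at zero or returns to it quickly is healthy; a lag that keeps growing suggests the follower cannot keep up with the leader's write rate.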

Step 9: Best Practices for AWS

9.1 Performance Optimization

  1. Use appropriate instance types for your workload
  2. Monitor CloudWatch metrics for performance bottlenecks
  3. Adjust bulk indexing batch sizes on the leader to your data volume, and watch replication lag after each change
  4. Use dedicated master nodes for production workloads
  5. Consider using UltraWarm for cost optimization

9.2 Security Considerations

  1. Use IAM roles instead of access keys
  2. Enable encryption at rest and in transit
  3. Use VPC endpoints for secure communication
  4. Implement least privilege access control
  5. Regularly rotate passwords and certificates

9.3 Cost Optimization

  1. Use Reserved Instances for predictable workloads
  2. Monitor storage usage and adjust EBS volumes
  3. Use UltraWarm for infrequently accessed, read-only data
  4. Weigh the added cost of Multi-AZ deployments against your availability requirements
  5. Monitor data transfer costs between regions

9.4 Monitoring and Alerting

  1. Set up CloudWatch alarms for CCR metrics
  2. Monitor domain health and performance
  3. Track CCR metrics in your monitoring system
  4. Set up dashboards for CCR status visualization
  5. Use AWS X-Ray for tracing cross-region requests

Step 10: Disaster Recovery Planning

10.1 Multi-Region Setup

For disaster recovery, keep the follower domain in a different Region from the leader (here, us-west-2 following us-east-1) and check regularly from the follower's Region that replication is still running:

# Confirm that the follower in us-west-2 is still syncing from us-east-1
curl -X GET "https://follower-domain.us-west-2.es.amazonaws.com:443/_plugins/_replication/test-index/_status?pretty" \
  -u ccr_user:SecureCCRPassword123!

10.2 Failover Procedures

Document and test your failover procedures:

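  1. Monitor leader domain health
  2. Promote the follower when needed by stopping replication on it, which turns each follower index into a regular, writable index
  3. Update application endpoints to point at the promoted domain
  4. Reconfigure CCR for the new topology, for example by replicating from the promoted domain to a rebuilt domain in the original Region
  5. Test failback procedures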
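
The promotion step above can be sketched as a small script. It assumes the replication plugin's _stop and _status endpoints; promote_follower is a hypothetical helper, and FOLLOWER_URL and ADMIN_AUTH are assumed environment variables. Treat it as a starting point for a runbook, not a complete failover procedure.

```shell
#!/bin/sh
# Hypothetical helper. FOLLOWER_URL (follower endpoint) and ADMIN_AUTH
# (user:password) are assumed environment variables.
promote_follower() {
  index=$1
  # Stop replication; the follower index becomes a regular, writable index.
  curl -s -X POST "$FOLLOWER_URL/_plugins/_replication/$index/_stop" \
    -H "Content-Type: application/json" \
    -u "$ADMIN_AUTH" \
    -d '{}' || return 1
  echo
  # Print the replication status so the operator can confirm that the index
  # is no longer syncing before applications are repointed.
  curl -s -u "$ADMIN_AUTH" \
    "$FOLLOWER_URL/_plugins/_replication/$index/_status"
  echo
}

# Usage: promote_follower test-index
```

Stopping replication cannot be undone for an existing follower index, so run this only once the decision to fail over has been made.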

Conclusion

Cross-cluster replication in Amazon OpenSearch Service provides a robust, managed solution for data replication across domains and regions. By following this guide, you can set up a reliable CCR infrastructure that leverages AWS's managed services for high availability, security, and scalability.

Key advantages of using Amazon OpenSearch Service for CCR include:

  • Managed infrastructure with automatic updates and patches
  • Built-in security with fine-grained access control
  • CloudWatch integration for comprehensive monitoring
  • Multi-region support for disaster recovery
  • Cost optimization with various instance types and storage options

For self-managed OpenSearch clusters, consider reading our guide on OpenSearch Cross-Cluster Replication which covers manual configuration and management.
