Elasticsearch Error: Too many open files - Common Causes & Fixes

java.io.IOException: Too many open files (or EMFILE in syscall traces) is logged when the Elasticsearch JVM hits the per-process file descriptor limit imposed by the operating system. Each Lucene segment file, each network socket, and each open index contributes to the count; once the limit is reached, no new index, shard, or client connection can be opened. Elasticsearch recommends setting nofile to at least 65535, and the official RPM and DEB packages set that value by default.

What This Error Means

Linux keeps two file-descriptor limits per process: a soft limit, which is the one actually enforced, and a hard limit, which caps how far the soft limit can be raised (ulimit -n shows the soft limit). Each open shard typically uses tens to hundreds of file descriptors for its Lucene segment files; each client HTTP/transport connection uses one. When the running JVM tries to open a descriptor that would exceed the soft limit, the kernel returns EMFILE and the JVM surfaces it as IOException: Too many open files. The system-wide fs.file-max sysctl is a separate, much higher ceiling - exhausting it would affect every process on the host.
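
As a quick illustration of the soft/hard distinction, the Python sketch below reads both limits with the standard library's resource module. It reports the limits of the process that runs it (the Python interpreter), not those of Elasticsearch, and the function names are made up for this example.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # The soft value is what the kernel enforces; the hard value is the
    # ceiling an unprivileged process may raise its soft limit to.
    print(f"soft={describe(soft)} hard={describe(hard)}")
```

Running it from the same shell or service context that launches Elasticsearch gives a rough idea of what a child process would inherit, under the assumption that nothing in between changes the limit.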

Common Causes

  1. nofile limit not raised for the Elasticsearch process. How to confirm: cat /proc/$(pgrep -f elasticsearch)/limits | grep 'open files'. Production should show 65535 or higher in the soft column. If pgrep matches more than one process, substitute the PID of the Elasticsearch server JVM.
  2. Systemd unit overrides drop the limit back to a default. How to confirm: systemctl show elasticsearch | grep LimitNOFILE should print LimitNOFILE=65535.
  3. Excessive shard count on the node. How to confirm: GET _cat/allocation?v&h=node,shards - rule-of-thumb cap is ~20 shards per GB of heap.
  4. Long-lived connections leaking from a misbehaving client. How to confirm: lsof -a -p <pid> -i TCP | wc -l and compare to total lsof -p <pid> | wc -l. A TCP count that dominates the total points at connections rather than segment files.
  5. System-wide fs.file-max set too low on tiny hosts. How to confirm: sysctl fs.file-max should be in the hundreds of thousands.
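
The first check in the list can be scripted. The following is a minimal Python sketch, assuming the usual column layout of /proc/<pid>/limits; the sample text, the 4096 value, and the function names are illustrative and not taken from a real node.

```python
RECOMMENDED_NOFILE = 65535  # minimum named in this article

# Illustrative excerpt of /proc/<pid>/limits (not captured from a real host).
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max processes             4096                 4096                 processes
Max open files            4096                 65535                files
"""


def _to_int(value):
    """Convert a limits column to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: ['Max', 'open', 'files', soft, hard, 'files']
            return _to_int(fields[3]), _to_int(fields[4])
    raise ValueError("no 'Max open files' row found")


def soft_limit_ok(limits_text, minimum=RECOMMENDED_NOFILE):
    """True when the soft limit is unlimited or at least `minimum`."""
    soft, _hard = parse_open_files(limits_text)
    return soft is None or soft >= minimum


if __name__ == "__main__":
    print(parse_open_files(SAMPLE_LIMITS), soft_limit_ok(SAMPLE_LIMITS))
```

In the sample, the hard limit is already 65535 but the soft limit is not, which is the situation causes 1 and 2 describe.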

How to Fix Too Many Open Files

  1. Verify the current limits for the running Elasticsearch process:

    cat /proc/$(pgrep -f elasticsearch)/limits | grep -E 'Max open files'
    

    The output names both soft and hard limits.

  2. Check the limit Elasticsearch sees from inside the JVM:

    GET /_nodes/stats/process?filter_path=**.max_file_descriptors
    
  3. Raise the limit via systemd (preferred for RPM/DEB installs and any other install that runs as a systemd service): create /etc/systemd/system/elasticsearch.service.d/override.conf:

    [Service]
    LimitNOFILE=65535
    

    Then reload and restart:

    sudo systemctl daemon-reload
    sudo systemctl restart elasticsearch
    
  4. Or set via /etc/security/limits.conf when starting Elasticsearch from a shell:

    elasticsearch soft nofile 65535
    elasticsearch hard nofile 65535
    

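    Log out and back in, then start Elasticsearch from the new session, for the changes to apply. These lines do not affect a systemd-managed service; use step 3 for that.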

  5. Verify system-wide ceiling:

    sysctl fs.file-max
    sudo sysctl -w fs.file-max=1048576   # if too low
    

    Persist the value by adding fs.file-max=1048576 to /etc/sysctl.conf or to a file under /etc/sysctl.d/.

  6. Reduce shard count if descriptors are saturated by shards. Consolidate small indices and force-merge read-only indices to fewer segments:

    POST /<index>/_forcemerge?max_num_segments=1
    
  7. Hunt for connection leaks. Look at lsof -p <es_pid> and group by remote address to find clients that aren't pooling connections.
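
The grouping in step 7 can be sketched in Python. This assumes lsof was run with -n -P so that addresses and ports stay numeric, and that each connection appears as local->remote in the NAME column; the sample lines and the function name are invented for illustration.

```python
from collections import Counter

# Illustrative output of: lsof -a -p <es_pid> -i TCP -n -P
SAMPLE_LSOF = """\
COMMAND  PID USER          FD   TYPE DEVICE SIZE/OFF NODE NAME
java    1234 elasticsearch 301u IPv4  11111      0t0  TCP 10.0.0.5:9200->10.0.0.21:51234 (ESTABLISHED)
java    1234 elasticsearch 302u IPv4  11112      0t0  TCP 10.0.0.5:9200->10.0.0.21:51240 (ESTABLISHED)
java    1234 elasticsearch 303u IPv4  11113      0t0  TCP 10.0.0.5:9200->10.0.0.22:40000 (ESTABLISHED)
java    1234 elasticsearch 304u IPv4  11114      0t0  TCP *:9200 (LISTEN)
"""


def connections_by_remote(lsof_output):
    """Count connections per remote host, ignoring listening sockets."""
    counts = Counter()
    for line in lsof_output.splitlines():
        for token in line.split():
            if "->" in token:
                remote = token.split("->", 1)[1]
                # Drop the port; rsplit keeps bracketed IPv6 hosts intact.
                host = remote.rsplit(":", 1)[0]
                counts[host] += 1
                break
    return counts


if __name__ == "__main__":
    for host, count in connections_by_remote(SAMPLE_LSOF).most_common():
        print(host, count)
```

A single remote address holding a large share of the connections is a hint, not proof, that the client behind it is not pooling.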

Resolve Too Many Open Files Automatically with Pulse

Pulse is an AI DBA for Elasticsearch and OpenSearch. When IOException: Too many open files (or EMFILE) shows up in your cluster, Pulse:

  • Reads process.open_file_descriptors and process.max_file_descriptors from _nodes/stats/process, cross-checks /proc/<pid>/limits, systemctl show elasticsearch | grep LimitNOFILE, /etc/security/limits.conf, and sysctl fs.file-max so the three layers (kernel, systemd, JVM) are reconciled in one place
  • Identifies which of the five causes applies: nofile limit never raised, missing LimitNOFILE systemd directive, shard-count saturation (against the ~20-shards-per-GB-heap rule of thumb), leaked long-lived client connections (visible in lsof -a -p <pid> -i TCP), or system-wide fs.file-max set too low
  • Generates the exact remediation: the /etc/systemd/system/elasticsearch.service.d/override.conf snippet setting LimitNOFILE=65535, the matching limits.conf lines, the sysctl -w fs.file-max=1048576 value, the _forcemerge?max_num_segments=1 plan for read-only indices, or the client-side connection-pool fix
  • Applies the systemd drop-in and sysctl changes automatically with operator approval; leaves the rolling restart and shard consolidation as a one-click PR

Pulse runs trend-based exhaustion forecasts on descriptor usage per node, alerting before saturation rather than after the bootstrap check fails or new indices stop opening.
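
To make the idea of a trend-based forecast concrete, here is a generic Python sketch that fits a straight line to descriptor samples and extrapolates to the limit. It is not a description of how Pulse computes its forecasts; the least-squares approach, the function name, and the sample values are all assumptions chosen for illustration.

```python
def seconds_until_exhaustion(samples, max_fds):
    """Estimate seconds after the last sample until usage reaches max_fds.

    samples: list of (timestamp_seconds, open_file_descriptors) pairs.
    Returns None when the fitted trend is flat or decreasing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    # Ordinary least-squares slope: descriptors gained per second.
    slope = sum((t - mean_t) * (y - mean_y) for t, y in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_t
    t_hit = (max_fds - intercept) / slope
    return max(0.0, t_hit - samples[-1][0])


if __name__ == "__main__":
    history = [(0, 1000), (60, 1600), (120, 2200)]  # made-up readings
    print(seconds_until_exhaustion(history, 65535))
```

A straight line is the simplest possible model; real descriptor usage tends to move in steps as segments merge and indices roll over, so any such estimate is rough.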

Start a free trial to connect your cluster.

Frequently Asked Questions

Q: How many open files does Elasticsearch need?
A: Elasticsearch documentation recommends a minimum of 65535 (the nofile soft limit). Production nodes with many shards or high client concurrency may need higher. Below this, the bootstrap check fails and the node refuses to start in production mode.

Q: Why does Elasticsearch open so many file descriptors?
A: Each Lucene segment uses multiple file descriptors (one per file in the segment), each open shard holds segments, and each HTTP/transport connection adds one. A node with 200 shards and 4 clients can easily use 10,000+ descriptors.
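
The arithmetic behind that figure can be written out as a short Python sketch. The 50 descriptors per shard is an assumed value inside the "tens to hundreds" range, not a measurement, and the function names are hypothetical.

```python
def estimate_descriptors(shards, fds_per_shard, connections):
    """Rough descriptor count: segment files plus one per connection."""
    return shards * fds_per_shard + connections


def fraction_of_limit(estimate, limit=65535):
    """Share of the nofile limit that the estimate would consume."""
    return estimate / limit


if __name__ == "__main__":
    # 200 shards at an assumed 50 descriptors each, plus 4 client connections.
    total = estimate_descriptors(shards=200, fds_per_shard=50, connections=4)
    print(total, round(fraction_of_limit(total), 3))
```

The shard term dominates: at this assumed rate, the four client connections are a rounding error next to the segment files.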

Q: Does raising the file descriptor limit hurt performance?
A: No. The limit caps the count but does not reserve resources. Raising it from 65535 to 1048576 has no measurable runtime cost.

Q: Can I check the limit without restarting Elasticsearch?
A: Yes. cat /proc/<pid>/limits shows the live limits for the running process, and the _nodes/stats/process API reports max_file_descriptors. Changes made through systemd or limits.conf, however, take effect only when the process is restarted.
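
The same API also reports open_file_descriptors, so usage can be compared against the limit per node. Below is a minimal Python sketch that works on an already-decoded, unfiltered _nodes/stats/process response; the sample document, node name, and 0.75 threshold are made up for illustration.

```python
# Trimmed, illustrative shape of a GET /_nodes/stats/process response.
SAMPLE_STATS = {
    "nodes": {
        "abc123": {
            "name": "es-1",
            "process": {
                "open_file_descriptors": 52428,
                "max_file_descriptors": 65535,
            },
        }
    }
}


def descriptor_usage(nodes_stats):
    """Map node name to open/max descriptor ratio."""
    usage = {}
    for node in nodes_stats["nodes"].values():
        process = node["process"]
        usage[node["name"]] = (
            process["open_file_descriptors"] / process["max_file_descriptors"]
        )
    return usage


def nodes_above(usage, threshold):
    """Names of nodes whose usage ratio exceeds the threshold, sorted."""
    return sorted(name for name, ratio in usage.items() if ratio > threshold)


if __name__ == "__main__":
    usage = descriptor_usage(SAMPLE_STATS)
    print(usage, nodes_above(usage, 0.75))
```

Fetching the JSON is left out on purpose; any HTTP client that can reach the cluster would do.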

Q: Will reducing shards fix "Too many open files"?
A: Often, yes. Each open shard contributes tens to hundreds of file descriptors. Consolidating small indices and force-merging read-only indices to one segment drops descriptor usage substantially.
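
To see which nodes are candidates for consolidation, the ~20-shards-per-GB-of-heap rule of thumb can be applied to _cat/allocation?v&h=node,shards output. The Python sketch below assumes that two-column text format; heap sizes are supplied by the caller, and the sample rows and function names are illustrative.

```python
SHARDS_PER_GB_HEAP = 20  # rule of thumb quoted in this article

# Illustrative output of: GET _cat/allocation?v&h=node,shards
SAMPLE_ALLOCATION = "node shards\nes-1 700\nes-2 380\nUNASSIGNED 5\n"


def parse_cat_allocation(text):
    """Return {node: shard_count}, skipping the header and unassigned row."""
    rows = {}
    for line in text.splitlines()[1:]:  # first line is the ?v header
        parts = line.split()
        if len(parts) != 2 or parts[0] == "UNASSIGNED":
            continue
        node, shards = parts
        rows[node] = int(shards)
    return rows


def over_budget(shards_by_node, heap_gb_by_node):
    """Return {node: (shards, budget)} for nodes above the rule of thumb."""
    flagged = {}
    for node, shards in shards_by_node.items():
        budget = SHARDS_PER_GB_HEAP * heap_gb_by_node[node]
        if shards > budget:
            flagged[node] = (shards, budget)
    return flagged


if __name__ == "__main__":
    shards = parse_cat_allocation(SAMPLE_ALLOCATION)
    print(over_budget(shards, {"es-1": 30, "es-2": 30}))
```

With 30 GB of heap the budget is 600 shards, so only the first sample node is flagged.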

Q: Is ulimit -n enough on systemd-managed installs?
A: No. Systemd does not honor /etc/security/limits.conf for service units, and a ulimit -n run in your shell affects only processes started from that shell. Set LimitNOFILE in the systemd unit (or a drop-in .conf) for systemd-managed Elasticsearch.

Q: What's the fastest way to diagnose "Too many open files" in production?
A: Pulse, the AI DBA for Elasticsearch and OpenSearch, reconciles /proc/<pid>/limits, the systemd unit's LimitNOFILE, /etc/security/limits.conf, and the JVM's reported max_file_descriptors in one view, then names whether the cause is misconfigured limits, shard saturation, or a client connection leak. It applies the systemd drop-in fix after approval and tracks descriptor usage trends so the next exhaustion event triggers an alert in advance.
