ClickHouse DB::Exception: Child process did not exit normally

The "DB::Exception: Child process did not exit normally" error in ClickHouse means a child process spawned by the server terminated unexpectedly: it crashed, was killed by a signal, or returned a non-zero exit code. The corresponding error code, CHILD_WAS_NOT_EXITED_NORMALLY, is most commonly encountered when using executable UDFs, executable table functions, or dictionary sources that invoke external programs.

Impact

When this error appears, the query or operation that depended on the child process will fail. If the external program is used for a dictionary source, the dictionary may fail to load or refresh, potentially affecting all queries that rely on it. Repeated occurrences may indicate a systemic issue with the external program, causing ongoing query failures.

Common Causes

  1. The external program invoked by ClickHouse crashed due to a bug or unhandled exception
  2. The child process was killed by the OOM killer due to excessive memory usage
  3. A fatal signal (SIGSEGV, SIGABRT, SIGKILL) was sent to the child process
  4. The external script (e.g., Python or Bash) has a syntax error or fails at runtime
  5. Required dependencies or environment variables for the external program are missing
  6. Resource limits (memory, CPU time) caused the child process to be terminated

Troubleshooting and Resolution Steps

  1. Check the ClickHouse error log for the exit code or signal:

    grep -i "CHILD_WAS_NOT_EXITED_NORMALLY\|child process" /var/log/clickhouse-server/clickhouse-server.err.log | tail -10
    

    The log often includes the signal number or exit code that terminated the process.

  2. Test the external program manually:

    # Run the program as the clickhouse user with sample input
    echo "test_input" | sudo -u clickhouse /path/to/your/script.sh
    echo $?
    

    Check whether it runs successfully and returns exit code 0.

  3. Check for OOM kills:

    dmesg | grep -i "oom\|killed process" | tail -10
    

    If the child process was OOM-killed, you need to either optimize its memory usage or increase available memory.

  4. Verify dependencies and environment:

    sudo -u clickhouse which python3
    sudo -u clickhouse env
    

    Ensure all required interpreters, libraries, and environment variables are available to the ClickHouse user.

  5. Check file permissions:

    ls -la /path/to/your/script.sh
    sudo -u clickhouse test -x /path/to/your/script.sh && echo "Executable" || echo "Not executable"

  6. Add error handling to your script to produce meaningful output on failure:

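    #!/bin/bash
    # Exit on errors, unset variables, and failures anywhere in a pipeline
    set -euo pipefail
    # Append stderr to a log file so earlier failures are not overwritten
    exec 2>>/tmp/clickhouse_udf_errors.log
    # Your logic here

  7. Review resource limits for the ClickHouse process, as child processes inherit these: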

    cat /proc/$(pidof clickhouse-server)/limits
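
The file, permission, and exit-code checks from the steps above can be combined into one small preflight function. This is a minimal sketch, not part of ClickHouse: the function name preflight_check and the default sample input are hypothetical, and it assumes the script reads its input from stdin.

```shell
#!/bin/bash
# Hypothetical helper: preflight_check SCRIPT [SAMPLE_INPUT]
# Verifies that the script exists, is executable by the current user,
# and exits with status 0 when fed one line of sample input.
preflight_check() {
    local script=$1
    local sample=${2:-test_input}

    if [ ! -f "$script" ]; then
        echo "FAIL: $script does not exist"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "FAIL: $script is not executable by $(id -un)"
        return 1
    fi

    # The status of a pipeline is the status of its last command,
    # which here is the script under test.
    local status=0
    printf '%s\n' "$sample" | "$script" >/dev/null 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAIL: $script exited with status $status"
        return 1
    fi
    echo "OK: $script"
}
```

To approximate the server's execution context, source the function in a shell started with sudo -u clickhouse and call preflight_check /path/to/your/script.sh there.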
    

Best Practices

  • Always test external scripts thoroughly outside of ClickHouse before configuring them as UDFs or dictionary sources.
  • Include robust error handling and logging in external scripts so failures produce actionable diagnostics.
  • Set memory and timeout limits for external programs to prevent them from consuming excessive resources.
  • Use lightweight, simple scripts that do one thing well rather than complex programs with many dependencies.
  • Monitor ClickHouse logs for CHILD_WAS_NOT_EXITED_NORMALLY errors as part of routine health checks.
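
One way to apply the memory and timeout limits mentioned above is to wrap the external program in a small launcher. The sketch below assumes a Linux host with GNU coreutils timeout installed. The function name run_limited and the limit values in the example call are illustrative assumptions, not ClickHouse settings.

```shell
#!/bin/bash
# Hypothetical wrapper: run_limited MEM_KB SECONDS COMMAND [ARGS...]
run_limited() {
    local mem_kb=$1 secs=$2
    shift 2
    (
        # Cap virtual memory (in KiB) for this subshell and its children.
        ulimit -v "$mem_kb" || exit 125
        # timeout sends SIGTERM after $secs seconds and then exits with 124.
        exec timeout "$secs" "$@"
    )
}

# Example: allow about 1 GiB of virtual memory and 5 seconds of runtime.
# run_limited 1048576 5 /path/to/your/script.sh
```

With this approach, a runaway program is stopped by a predictable limit instead of by the system-wide OOM killer, and the wrapper's exit status (124 for a timeout) points to the cause.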

Frequently Asked Questions

Q: How can I tell whether the child process was killed by a signal or exited with an error code?
A: The ClickHouse error log includes this information. A signal-based termination will mention a signal number (e.g., signal 9 for SIGKILL, signal 11 for SIGSEGV), while a normal exit with error will show a non-zero exit code.
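
When you run the program by hand, the shell encodes the same distinction in the exit status: a value above 128 conventionally means the process was killed by signal number status minus 128. The sketch below decodes that convention. It describes shell behavior, not the format of the ClickHouse log, and the function name describe_status is hypothetical. A program can also exit on its own with a value above 128, so treat the result as a hint.

```shell
#!/bin/bash
# Hypothetical helper: describe_status STATUS
# Interprets a shell exit status using the 128 + signal-number convention.
describe_status() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "exited normally"
    elif [ "$status" -gt 128 ]; then
        local sig=$((status - 128))
        # kill -l translates a signal number into its name (9 -> KILL).
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with error code $status"
    fi
}

# Example:
# echo "test_input" | /path/to/your/script.sh
# describe_status $?
```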

Q: Can I increase the timeout for child processes?
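A: Yes. For executable UDFs and table functions, you can set command_read_timeout (how long ClickHouse waits for data on the program's stdout, in milliseconds) and command_termination_timeout (how long the program has to shut down after its pipe is closed, in seconds) in the function configuration. Increase these if your program needs more time.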

Q: Why does my Python script work from the command line but fail when called from ClickHouse?
A: The ClickHouse process runs as the clickhouse user, which has a different environment than your regular user. Common issues include missing Python packages, different PATH settings, and missing environment variables. Test with sudo -u clickhouse to replicate the exact execution context.
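
If sudo is not available, a rough approximation is to run the script with a nearly empty environment, which exposes hidden dependencies on variables set in your login shell. This is a sketch under assumptions: the function name run_in_minimal_env is hypothetical, and the PATH and HOME values are placeholders, not the values the ClickHouse server actually uses.

```shell
#!/bin/bash
# Hypothetical helper: run_in_minimal_env COMMAND [ARGS...]
run_in_minimal_env() {
    # env -i discards the inherited environment; only the variables
    # listed here are visible to the command.
    env -i PATH=/usr/bin:/bin HOME=/nonexistent "$@"
}

# Example:
# echo "test_input" | run_in_minimal_env /path/to/your/script.sh
```

A script that passes in your shell but fails here most likely depends on a PATH entry, a virtual environment, or an exported variable that the server's environment does not provide.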

Q: Does ClickHouse retry when a child process fails?
A: No, ClickHouse does not automatically retry failed child process executions. The query will fail immediately with the CHILD_WAS_NOT_EXITED_NORMALLY error. Implement retry logic at the application level if needed.
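
Application-level retry can be as simple as a loop around the client call. The following is a minimal sketch, with a hypothetical function name retry and a fixed delay between attempts. Retrying only helps with transient failures such as a temporary memory shortage, not with a script that fails deterministically.

```shell
#!/bin/bash
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, and
# returns the last exit status on failure.
retry() {
    local max=$1 delay=$2 attempt=1 status=0
    shift 2
    while :; do
        status=0
        "$@" || status=$?
        [ "$status" -eq 0 ] && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts (last status $status)" >&2
            return "$status"
        fi
        echo "attempt $attempt failed with status $status; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example (my_udf is a placeholder function name):
# retry 3 2 clickhouse-client --query "SELECT my_udf(1)"
```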
