The "DB::Exception: Async load failed" error in ClickHouse occurs when the server's asynchronous loading mechanism for tables or databases encounters a failure during startup or lazy initialization. The error code is ASYNC_LOAD_FAILED. ClickHouse loads tables asynchronously to speed up server start time, and this error means one or more tables could not be loaded successfully in the background.
Impact
The consequences of this error depend on which tables failed to load:
- Affected tables will be inaccessible -- queries referencing them will fail
- Views or materialized views that depend on the failed table will also be broken
- If the failure involves system tables or critical metadata tables, broader server functionality may be degraded
- The server itself continues running and can serve queries against tables that loaded successfully
Common Causes
- Corrupted table metadata -- The .sql file defining the table in the metadata directory is malformed or references a nonexistent engine or setting.
- Missing data files -- The table's data directory or critical files within it (e.g., format_version.txt, part directories) have been deleted or moved.
- Incompatible schema changes -- A server upgrade introduced breaking changes in how certain table engines or data types are handled.
- Disk errors -- I/O errors when reading metadata or data files from storage prevent the table from loading.
- Dependency on unavailable resources -- Tables using external dictionaries, remote databases, or specific configurations that are not available at load time.
- Out of memory during loading -- Loading a large number of tables simultaneously can exhaust available memory.
Troubleshooting and Resolution Steps
Identify which tables failed to load by checking the server logs:
grep -i "async load failed\|Cannot load table\|Failed to load" /var/log/clickhouse-server/clickhouse-server.log | tail -30
The log entries will specify the database and table names.
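When the log has to be scanned from a script rather than a shell, the same filter can be sketched in Python. This is a minimal sketch that mirrors the grep patterns above; the function name `recent_load_failures` is hypothetical, and it makes no assumption about the log line layout beyond those three phrases.

```python
import re
from collections import deque
from typing import Iterable, List

# Same three phrases as the grep command, matched case-insensitively.
LOAD_FAILURE = re.compile(
    r"async load failed|cannot load table|failed to load", re.IGNORECASE
)


def recent_load_failures(lines: Iterable[str], limit: int = 30) -> List[str]:
    """Return the last `limit` lines that mention a load failure."""
    # A bounded deque keeps only the newest matches, like `tail -30`.
    matches = deque(maxlen=limit)
    for line in lines:
        if LOAD_FAILURE.search(line):
            matches.append(line.rstrip("\n"))
    return list(matches)
```

To use it, open /var/log/clickhouse-server/clickhouse-server.log and pass the file object as `lines`; the file is read one line at a time, so large logs are not loaded into memory.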
Check the table metadata file for the failing table:
cat /var/lib/clickhouse/metadata/<database>/<table>.sql
Verify the SQL is syntactically valid and references supported engines and settings.
Verify the data directory exists and is readable:
ls -la /var/lib/clickhouse/data/<database>/<table>/
Check for missing directories, permission issues, or empty part directories.
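The three checks above (missing directory, permissions, empty part directories) can be sketched as a small Python helper. The function name `inspect_table_dir` and the report layout are hypothetical. The sketch assumes every subdirectory other than detached/ is a part directory, and its permission results only mean something when it runs as the user the server runs as.

```python
import os
from typing import Dict, List


def inspect_table_dir(table_dir: str) -> Dict[str, List[str]]:
    """Report problems that are visible from the filesystem alone."""
    report: Dict[str, List[str]] = {"problems": [], "empty_dirs": []}
    if not os.path.isdir(table_dir):
        report["problems"].append(f"missing directory: {table_dir}")
        return report
    # R_OK is needed to list the directory, X_OK to enter it.
    if not os.access(table_dir, os.R_OK | os.X_OK):
        report["problems"].append(f"not readable by this user: {table_dir}")
        return report
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        # detached/ is often legitimately empty, so it is skipped here.
        if name == "detached" or not os.path.isdir(path):
            continue
        if not os.access(path, os.R_OK | os.X_OK):
            report["problems"].append(f"not readable by this user: {path}")
        elif not os.listdir(path):
            report["empty_dirs"].append(name)
    return report
```

An empty `problems` list and an empty `empty_dirs` list do not prove the table is healthy; they only rule out the filesystem-level causes listed in this step.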
Try loading the table manually after the server starts:
SYSTEM RELOAD TABLE <database>.<table>;
The error message from this command is often more descriptive than the startup log.
Check for disk errors:
dmesg | grep -i "error\|fault\|i/o"
smartctl -a /dev/sda  # or the relevant device
If the table metadata is corrupted, restore it from a backup or recreate the table:
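-- If you have a backup of the CREATE TABLE statement
-- Caution: DROP TABLE also removes the table's data, so copy intact part directories elsewhere first
DROP TABLE IF EXISTS <database>.<table>;
CREATE TABLE <database>.<table> (...) ENGINE = ...;
-- Then attach existing data parts if they are intact
-- (ATTACH PARTITION reads parts from the table's detached/ directory)
ALTER TABLE <database>.<table> ATTACH PARTITION ...;
Disable async loading temporarily to get more detailed error output during startup: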
<!-- config.xml -->
<async_load_databases>false</async_load_databases>
Restart the server. Synchronous loading will produce clearer error messages at the cost of a slower startup.
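Before restarting, it can help to confirm that the edited file really carries the setting. The following is a minimal sketch with a hypothetical function name, `async_load_setting`. It assumes the setting is a direct child of the root element, as in the fragment above, and it reads one XML document only, so overrides placed in other files (for example under config.d/) are not merged.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def async_load_setting(config_xml: str) -> Optional[str]:
    """Return the text of <async_load_databases>, or None if it is not set."""
    root = ET.fromstring(config_xml)
    # Only a direct child of the root element is considered.
    node = root.find("async_load_databases")
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

Pass the contents of config.xml as a string. A return value of None means the file does not set the option, in which case the server's default applies.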
Best Practices
- Regularly back up table metadata (the metadata/ directory) in addition to data backups.
- Monitor ClickHouse startup logs for loading failures, especially after server upgrades or configuration changes.
- Test server upgrades in a staging environment before applying them to production, paying attention to any deprecation warnings.
- Keep disk health monitoring in place and replace failing drives proactively.
- If managing many tables (thousands or more), ensure sufficient memory is available for the loading phase.
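The metadata backup recommended above can be sketched as a timestamped archive of the metadata/ directory. The function name `backup_metadata` and the archive naming scheme are hypothetical. Symlinks are followed on the assumption that some database directories under metadata/ may be symlinks; if that does not apply to your installation, the option is harmless.

```python
import os
import tarfile
import time


def backup_metadata(metadata_dir: str, dest_dir: str) -> str:
    """Write a timestamped .tar.gz of the metadata directory and return its path."""
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive_path = os.path.join(dest_dir, f"clickhouse-metadata-{stamp}.tar.gz")
    # dereference=True stores the files behind symlinks, not the links themselves
    # (assumption: database directories under metadata/ may be symlinks).
    with tarfile.open(archive_path, "w:gz", dereference=True) as archive:
        archive.add(metadata_dir, arcname="metadata")
    return archive_path
```

For example, `backup_metadata("/var/lib/clickhouse/metadata", "/backups/clickhouse")` would write one archive per run, where the destination path is a placeholder. This covers table definitions only and does not replace a data backup.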
Frequently Asked Questions
Q: Does ASYNC_LOAD_FAILED mean all my data is lost?
A: No. The data parts on disk are typically unaffected. The error means ClickHouse could not initialize the in-memory representation of the table. Once the underlying cause is fixed, the table will load normally and all data will be accessible again.
Q: Can I access other tables while one table has failed to load?
A: Yes. Asynchronous loading is per-table. Tables that loaded successfully are fully operational. Only queries that reference the failed table will encounter errors.
Q: Does this error occur only during server startup?
A: It most commonly appears during startup, but it can also occur when a table is lazily loaded on first access (if deferred loading is enabled) or when using SYSTEM RELOAD TABLE.
Q: How do I know if the table's data parts are intact even though loading failed?
A: Inspect the data directory directly on disk. Each part should be a directory containing files like checksums.txt, columns.txt, and the actual column data files. You can also run CHECK TABLE once the table is loadable to verify data integrity.
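The on-disk inspection described in the last answer can be sketched as follows. The function name `parts_missing_files` is hypothetical. The sketch assumes every subdirectory of the table directory except detached/ is a part, and it checks only for the two files named above, so it is not a substitute for CHECK TABLE.

```python
import os
from typing import Dict, List

# Files the answer above says each part directory should contain.
REQUIRED_FILES = ("checksums.txt", "columns.txt")


def parts_missing_files(table_dir: str) -> Dict[str, List[str]]:
    """Map each part directory to the required files it lacks."""
    missing: Dict[str, List[str]] = {}
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        # Skip plain files (e.g. format_version.txt) and the detached/ directory.
        if name == "detached" or not os.path.isdir(path):
            continue
        absent = [
            required
            for required in REQUIRED_FILES
            if not os.path.isfile(os.path.join(path, required))
        ]
        if absent:
            missing[name] = absent
    return missing
```

An empty result means every part directory contains both files. It says nothing about whether the column data itself is intact.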