ClickHouse DB::Exception: Refresh failed

The "DB::Exception: Refresh failed" error in ClickHouse indicates that a scheduled refresh of a refreshable materialized view did not complete successfully. The REFRESH_FAILED error code is raised when the periodic query execution that populates a refreshable MV encounters an error during its run.

Impact

The materialized view retains its data from the last successful refresh. The stale data remains queryable, but it does not reflect the latest source data. If refreshes continue to fail, the data grows increasingly out of date. Downstream dashboards and applications that depend on the MV will show stale results until the refresh succeeds.

Common Causes

  1. The underlying source table was dropped, renamed, or had its schema changed
  2. The refresh query itself fails due to resource limits (memory, execution time)
  3. A dependent dictionary or external table is unavailable during the refresh window
  4. The target table's schema is incompatible with the refresh query's output after an ALTER
  5. Disk space exhaustion preventing the refresh from writing new data
  6. Network errors when the refresh query involves remote tables or distributed queries
  7. Concurrent DDL operations that lock resources needed by the refresh

Troubleshooting and Resolution Steps

  1. Check the status of refreshable materialized views:

    SELECT database, view, status, last_success_time,
           last_refresh_time, next_refresh_time, exception
    FROM system.view_refreshes;
    
  2. Look at the specific error that caused the refresh failure (the exception column is populated when the last refresh attempt failed):

    SELECT database, view, status, exception
    FROM system.view_refreshes
    WHERE exception != '';
    
  3. Check the query log for the failed refresh query:

    SELECT event_time, query, exception, memory_usage, query_duration_ms
    FROM system.query_log
    WHERE query LIKE '%REFRESH%' OR query LIKE '%your_mv_name%'
    ORDER BY event_time DESC
    LIMIT 10;
    
  4. If the failure is due to resource limits, apply the relevant query settings to the refresh query. Regular query settings (such as max_execution_time or max_memory_usage) go in a SETTINGS clause at the end of the view's SELECT, not in the REFRESH ... SETTINGS clause (which only accepts refresh-specific settings like refresh_retries). Recreate the view with the limits applied:

    CREATE OR REPLACE MATERIALIZED VIEW your_database.your_mv
    REFRESH EVERY 1 HOUR
    ENGINE = MergeTree() ORDER BY (key_col)
    AS SELECT ... FROM source_table
    SETTINGS max_execution_time = 3600, max_memory_usage = 20000000000;
    

    To add automatic refresh retries, use the refresh-specific settings:

    ALTER TABLE your_database.your_mv
    MODIFY REFRESH EVERY 1 HOUR
    SETTINGS refresh_retries = 5, refresh_retry_initial_backoff_ms = 500;
    
  5. Manually trigger a refresh to test after fixing the underlying issue:

    SYSTEM REFRESH VIEW your_database.your_mv;
    
  6. If the source schema changed, recreate the MV with the updated definition:

    DROP TABLE your_database.your_mv;
    CREATE MATERIALIZED VIEW your_database.your_mv
    REFRESH EVERY 1 HOUR
    ENGINE = MergeTree() ORDER BY (key_col)
    AS SELECT ... FROM source_table;
    
  7. Check disk space if the refresh fails with write errors:

    df -h /var/lib/clickhouse/
    

Best Practices

  • Monitor system.view_refreshes and alert on views with a non-empty exception column or views whose last_success_time is older than expected.
  • Set resource limits on refresh queries to prevent them from consuming excessive memory or CPU during peak hours.
  • Schedule refreshes during off-peak hours if the refresh query is resource-intensive.
  • Use REFRESH EVERY ... OFFSET to stagger refreshes of multiple MVs and avoid resource contention.
  • Test refresh queries manually before setting up the schedule to ensure they complete successfully.
  • Keep refreshable MV definitions as simple as possible to reduce the chance of failure.

Frequently Asked Questions

Q: Does the old data remain available when a refresh fails?
A: Yes. Refreshable materialized views use an atomic swap mechanism. The old data remains intact and queryable until a new refresh succeeds. A failed refresh does not corrupt or remove existing data.

Q: Can I set up retry logic for failed refreshes?
A: Yes, and some retrying happens automatically. The refresh_retries setting defaults to 2, so a failed refresh is retried up to two times (with an exponential backoff starting at refresh_retry_initial_backoff_ms, default 100 ms, and capped at refresh_retry_max_backoff_ms, default 60000 ms) before the view waits for its next scheduled run. Set refresh_retries = 0 to disable retries or -1 for infinite retries. You can also manually trigger a retry using SYSTEM REFRESH VIEW.

Q: How do I check the refresh history?
A: The system.view_refreshes table shows the current status and last refresh times. For a complete history, check system.query_log for queries associated with the view's refresh operations.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.