Range-Hashed Dictionary Layout in ClickHouse

The range_hashed dictionary layout stores values that are valid only for a specific interval — typically a date range — and lets you look them up with dictGet(dict, attr, key, range_value). It is the right tool for slowly changing dimensions like price lists, tax rates, currency conversion rates, or feature flags that change over time.

Unlike the flat, hashed, or cache layouts (covered in What Is a ClickHouse Dictionary), range_hashed adds an ordered array of ranges alongside each key, so a single key can map to different values depending on the range value you pass at lookup time.

How range_hashed Works

The dictionary is stored in memory as a hash table keyed by the primary key, with an ordered array of (range_min, range_max, value) entries per key. At lookup time you supply both the key and a range value; ClickHouse finds the entry whose interval contains that value and returns the attribute.

The range columns can be any numeric type — Date, DateTime, UInt64, Int32, and others — but the values must fit within Int64. The intervals are closed on both ends: a row with range_min = '2024-01-01' and range_max = '2024-01-31' matches 2024-01-01 through 2024-01-31 inclusive.

Creating a range_hashed Dictionary

Start with a source table holding the keyed intervals and their values:

CREATE TABLE discounts
(
    advertiser_id       UInt64,
    discount_start_date Date,
    discount_end_date   Date,
    amount              Float64
)
ENGINE = MergeTree
ORDER BY advertiser_id;

Then define the dictionary. The RANGE(MIN ... MAX ...) clause names the two columns that bound each interval:

CREATE DICTIONARY discounts_dict
(
    advertiser_id       UInt64,
    discount_start_date Date,
    discount_end_date   Date,
    amount              Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE 'discounts'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED())
RANGE(MIN discount_start_date MAX discount_end_date);

Look up a value by passing the key and the date you want to resolve:

SELECT dictGet('discounts_dict', 'amount', toUInt64(1), toDate('2024-06-15')) AS amount;

ClickHouse returns the amount from whichever interval for advertiser_id = 1 contains 2024-06-15.

Open Intervals

Real data is rarely fully bounded. A price might apply "from launch, forever" (no end date) or "everything before the first change" (no start date). range_hashed supports these open-ended intervals.

The simplest approach is to make the range columns Nullable in the source table and leave the open side NULL:

CREATE TABLE rates
(
    id         UInt64,
    date_start Nullable(Date),
    date_end   Nullable(Date),
    rate       Decimal64(4)
)
ENGINE = Log;

INSERT INTO rates VALUES
    (1, NULL,         '2021-03-13', 99),   -- open start: everything up to 2021-03-13
    (1, '2021-03-14', '2021-03-16', 100),  -- closed interval
    (1, '2021-03-17', NULL,         101),  -- open end: from 2021-03-17 onward
    (2, '2021-03-14', NULL,         200),
    (3, NULL,         '2021-03-14', 300),
    (4, '2021-03-14', '2021-03-14', 400);  -- single-day interval

The dictionary definition itself uses non-nullable Date columns:

CREATE DICTIONARY rates_dict
(
    id         UInt64,
    date_start Date,
    date_end   Date,
    rate       Decimal64(4)
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'rates'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED())
RANGE(MIN date_start MAX date_end);

When the dictionary loads, ClickHouse interprets NULL range boundaries as open ends:

  • A NULL range_min is converted to 1970-01-01 (the Date epoch, i.e. 0), making the interval open on the lower end — it matches everything up to range_max.
  • A NULL range_max is treated as the maximal possible value, making the interval open on the upper end — it matches everything from range_min onward.

If you prefer not to rely on Nullable source columns, you can encode open intervals directly with sentinel values: use 1970-01-01 (or 0 for numeric ranges) for an open range_min, and a far-future date for an open range_max.

-- Resolve the rate for id=1 on three different dates
SELECT
    dictGet('rates_dict', 'rate', toUInt64(1), toDate('2021-03-10')) AS before,   -- 99  (open-start row)
    dictGet('rates_dict', 'rate', toUInt64(1), toDate('2021-03-15')) AS during,   -- 100 (closed row)
    dictGet('rates_dict', 'rate', toUInt64(1), toDate('2021-03-20')) AS after_;   -- 101 (open-end row)

Overlapping Ranges and range_lookup_strategy

When more than one interval for the same key contains the lookup value, range_lookup_strategy decides which one wins:

Strategy Behavior
min (default) Returns the matching range with the minimal range_min; if several tie, the one with the minimal range_max.
max Returns the matching range with the maximal range_min; if several tie, the one with the maximal range_max.

max is the common choice when newer, more specific intervals should override older catch-all intervals (for example, a default rate plus a temporary promotional rate). Set it in the LAYOUT clause:

CREATE DICTIONARY discounts_dict
(
    advertiser_id       UInt64,
    discount_start_date Date,
    discount_end_date   Date,
    amount              Float64
)
PRIMARY KEY advertiser_id
SOURCE(CLICKHOUSE(TABLE 'discounts'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(RANGE_HASHED(range_lookup_strategy 'max'))
RANGE(MIN discount_start_date MAX discount_end_date);

If your intervals are guaranteed non-overlapping, the strategy is irrelevant — every lookup matches at most one range.

Composite Keys: complex_key_range_hashed

When the lookup key spans more than one column, use the complex_key_range_hashed layout. The PRIMARY KEY lists all key columns:

CREATE DICTIONARY range_dictionary
(
    CountryID UInt64,
    CountryKey String,
    StartDate Date,
    EndDate   Date,
    Tax       Float64 DEFAULT 0.2
)
PRIMARY KEY CountryID, CountryKey
SOURCE(CLICKHOUSE(TABLE 'date_table'))
LIFETIME(MIN 1 MAX 1000)
LAYOUT(COMPLEX_KEY_RANGE_HASHED())
RANGE(MIN StartDate MAX EndDate);

With a composite key, pass the key columns as a tuple to dictGet:

SELECT dictGet('range_dictionary', 'Tax', (toUInt64(1), 'US'), toDate('2024-06-15'));

Best Practices

  1. Keep ranges non-overlapping when you can. Non-overlapping intervals make lookups unambiguous and let you ignore range_lookup_strategy entirely. Validate your source data before loading.

  2. Choose the strategy deliberately. If overlaps are intentional (a default tier plus overrides), set range_lookup_strategy 'max' so the most specific interval wins, and document the choice.

  3. Be explicit about open intervals. Decide up front whether you use Nullable source columns (NULL range_min loads as 1970-01-01; NULL range_max is treated as the maximal possible value) or explicit sentinel values. Mixing both in one table makes the data harder to reason about.

  4. Watch memory. range_hashed keeps every key and every range entry in memory. A dimension with many keys and many historical intervals per key can grow large. See What Is a ClickHouse External Dictionary for sourcing and lifetime tuning.

  5. Tune LIFETIME for how often the source changes. Price lists that change daily need a short LIFETIME; static historical tables can use a long one to avoid unnecessary reloads.

  6. Use the dictionary for backfills and enrichment. Range lookups pair naturally with the dictionary-based backfill pattern and the enrichment/update pattern when you need point-in-time values.

Common Issues

  • Lookup returns the default value. No interval contained the range value for that key. Confirm the date is within an interval and that the key exists. Remember intervals are inclusive on both ends.
  • Wrong value with overlapping ranges. You are likely getting the default min strategy. Switch to range_lookup_strategy 'max' if the newest interval should win.
  • Open end not matching. If you stored a real far-future date as a sentinel, make sure it is actually past your lookup values; prefer NULL in a Nullable source column so ClickHouse treats range_max as maximal.
  • Range type too large. Range values must fit in Int64. A DateTime64 with sub-second precision or a very large UInt64 can overflow this constraint.

How Pulse Helps

Range-hashed dictionaries are easy to define but easy to misconfigure: overlapping intervals, the wrong range_lookup_strategy, open-interval encoding mistakes, and dictionaries that quietly outgrow available memory all surface as wrong results or load failures rather than obvious errors. Pulse monitors dictionary load status, memory footprint, and reload health across your ClickHouse fleet, and our team helps design slowly-changing-dimension lookups — including open and overlapping intervals — so they return correct values and stay within resource budgets as your data grows.

Frequently Asked Questions

Q: What types can the range columns be?

Any numeric type whose values fit in Int64Date, DateTime, UInt64, Int32, and similar. Date and DateTime are by far the most common because the typical use case is time-based validity intervals.

Q: Are range intervals inclusive or exclusive?

Both bounds are inclusive. A row with range_min = 2024-01-01 and range_max = 2024-01-31 matches every date from January 1 through January 31, including both endpoints.

Q: How do I represent an interval with no end date?

Make the range_max column Nullable in the source table and store NULL — ClickHouse treats a NULL range_max as the maximal possible value, making the interval open-ended. Alternatively store an explicit far-future sentinel date.

Q: What happens when ranges overlap and a lookup matches several?

The range_lookup_strategy setting decides. The default min returns the range with the smallest range_min; max returns the range with the largest range_min. Set it inside the LAYOUT(RANGE_HASHED(...)) clause.

Q: How do I look up with a multi-column key?

Use the complex_key_range_hashed layout, list all key columns in PRIMARY KEY, and pass them as a tuple to dictGet, e.g. dictGet(dict, attr, (key1, key2), range_value).

Q: Is the whole dictionary kept in memory?

Yes. Like other hashed layouts, range_hashed stores all keys, ranges, and attributes in RAM. Size it against your interval count, and see What Is a ClickHouse External Dictionary for sourcing and lifetime options.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.