ClickHouse Named Collections: Securely Storing Credentials

Named collections let you store a reusable set of connection parameters -- host, port, user, password, URL, access keys -- under a single name, then reference that name in dictionaries, table functions, and integrations instead of inlining the secrets. For dictionaries this is the difference between a CREATE DICTIONARY statement that exposes a plaintext password to anyone who can run SHOW CREATE DICTIONARY and one that reveals nothing but a collection name.

This guide focuses on using named collections to secure dictionary credentials. For a broader troubleshooting reference on named collections (config vs. DDL, cluster consistency, the NAMED_COLLECTION_DOESNT_EXIST error), see Named collection doesn't exist.

The Problem: Credentials Leak Into Dictionary Definitions

A dictionary that connects to an external database normally embeds the credentials directly in its SOURCE clause:

CREATE DICTIONARY users_dict (id UInt64, name String)
PRIMARY KEY id
SOURCE(POSTGRESQL(
    host 'pg.internal' port 5432
    user 'ch_reader' password 'S3cr3tP@ss'
    db 'app' table 'users'
))
LIFETIME(MIN 300 MAX 360)
LAYOUT(HASHED());

Anyone with SHOW DICTIONARIES and SELECT on system.dictionaries -- or who can run SHOW CREATE DICTIONARY users_dict -- can read that password in clear text. Named collections move the secret out of the statement and behind an access-controlled name.

Creating a Named Collection

There are two ways to define a collection, and both work identically for dictionaries.

Via DDL (recommended)

DDL collections take effect immediately and persist across restarts. By default they use local storage and are not automatically propagated to other cluster nodes; cluster-wide propagation requires either the ON CLUSTER clause or configuring keeper/zookeeper as the storage backend:

CREATE NAMED COLLECTION pg_app AS
    host = 'pg.internal',
    port = 5432,
    user = 'ch_reader',
    password = 'S3cr3tP@ss',
    database = 'app';

Creating or modifying collections via DDL requires the NAMED COLLECTION ADMIN privilege (the legacy alias is NAMED COLLECTION CONTROL, also settable as named_collection_control in the user profile). Enable it in the user profile:

<clickhouse>
    <users>
        <etl_admin>
            <named_collection_control>1</named_collection_control>
        </etl_admin>
    </users>
</clickhouse>

Via XML configuration

Alternatively, define the collection in a config file such as /etc/clickhouse-server/config.d/named_collections.xml:

<clickhouse>
    <named_collections>
        <pg_app>
            <host>pg.internal</host>
            <port>5432</port>
            <user>ch_reader</user>
            <password>S3cr3tP@ss</password>
            <database>app</database>
        </pg_app>
    </named_collections>
</clickhouse>

Config-based collections require SYSTEM RELOAD CONFIG (or a restart) to load, and the file must be distributed to every node in a cluster yourself. DDL collections avoid both chores, which is why they are the recommended approach in current versions.

Using Named Collections in Dictionary Sources

Reference the collection by NAME inside the source function. The collection supplies the connection details; you only specify what differs per dictionary (the table, and optionally a query).

PostgreSQL dictionary source

CREATE DICTIONARY users_dict (id UInt64, name String)
PRIMARY KEY id
SOURCE(POSTGRESQL(NAME pg_app TABLE 'users'))
LIFETIME(MIN 300 MAX 360)
LAYOUT(HASHED());

SELECT dictGet('users_dict', 'name', 42);

The password never appears in the statement, so SHOW CREATE DICTIONARY users_dict shows only NAME pg_app.

MySQL dictionary source

CREATE NAMED COLLECTION mysql_app AS
    host = 'mysql.internal', port = 3306,
    user = 'ch_reader', password = 'S3cr3tP@ss',
    database = 'app';

CREATE DICTIONARY products_dict (id UInt64, sku String)
PRIMARY KEY id
SOURCE(MYSQL(NAME mysql_app TABLE 'products'))
LIFETIME(MIN 300 MAX 360)
LAYOUT(HASHED());

ClickHouse, MongoDB, and other sources

The same NAME pattern works for the CLICKHOUSE, MONGODB, and other supported dictionary sources:

CREATE NAMED COLLECTION remote_ch AS
    host = 'ch-2.internal', port = 9000,
    user = 'dict_reader', password = 'S3cr3tP@ss',
    database = 'default';

CREATE DICTIONARY lookup_dict (a Int64, b String)
PRIMARY KEY a
SOURCE(CLICKHOUSE(NAME remote_ch TABLE 'reference_data'))
LIFETIME(MIN 1 MAX 2)
LAYOUT(HASHED());

For background on dictionary mechanics and source types, see What is a ClickHouse external dictionary and What is a ClickHouse dictionary.

Overriding vs. Locking Parameters

By passing extra arguments alongside NAME, callers can override individual collection keys -- handy for reusing one collection across several tables, dangerous when the keys are secrets. For example, SOURCE(POSTGRESQL(NAME pg_app TABLE 'orders')) overrides only the table while reusing the stored credentials.

Whether overrides are allowed at all is governed by the server setting allow_named_collection_override_by_default (enabled by default) and per-key markers:

CREATE NAMED COLLECTION pg_app AS
    host = 'pg.internal' NOT OVERRIDABLE,
    port = 5432 NOT OVERRIDABLE,
    user = 'ch_reader' NOT OVERRIDABLE,
    password = 'S3cr3tP@ss' NOT OVERRIDABLE,
    database = 'app' OVERRIDABLE;

A key marked OVERRIDABLE can always be overridden in a query; NOT OVERRIDABLE can never be. Keys with no marker fall back to allow_named_collection_override_by_default.

Security implication: if override is allowed, a user can supply their own host/port and trick the server into sending the stored credentials to a destination they control, effectively exfiltrating the password. When a collection holds sensitive credentials, mark them NOT OVERRIDABLE -- or disable allow_named_collection_override_by_default server-wide.

Access Control: Who Can Use and Who Can See

Named collections are governed by their own privilege family. The key grants for securing dictionary credentials:

Privilege What it allows Notes
NAMED COLLECTION (alias USE NAMED COLLECTION) Reference the collection in a dictionary/query Added in 23.7; required to create a dictionary that uses the collection
SHOW NAMED COLLECTIONS See that the collection exists and list its keys Does not reveal secret values
SHOW NAMED COLLECTIONS SECRETS Read the stored secret values Grant this sparingly -- it exposes passwords
CREATE / ALTER / DROP NAMED COLLECTION Manage collections via DDL Covered by NAMED COLLECTION ADMIN / named_collection_control

The whole point of named collections for security is that a dictionary author needs only USE NAMED COLLECTION, not the ability to read the secret:

-- ETL user can build dictionaries from the collection...
GRANT NAMED COLLECTION ON pg_app TO etl_user;

-- ...but cannot read the password it contains.
-- (No SHOW NAMED COLLECTIONS SECRETS grant for etl_user.)

With this split, an analyst can run CREATE DICTIONARY ... SOURCE(POSTGRESQL(NAME pg_app ...)) and query the dictionary, while the PostgreSQL password stays invisible to them.

Modifying and Inspecting Collections

Update a DDL-based collection without recreating it:

ALTER NAMED COLLECTION pg_app SET password = 'newRotatedPass';
ALTER NAMED COLLECTION pg_app DELETE port;

List collections and (with the right grant) view their definitions:

SHOW NAMED COLLECTIONS;
SHOW CREATE NAMED COLLECTION pg_app;

SHOW NAMED COLLECTIONS lists keys but masks secret values unless you also hold SHOW NAMED COLLECTIONS SECRETS. After rotating a password with ALTER NAMED COLLECTION, dictionaries pick up the new credential on their next refresh (per their LIFETIME) or after SYSTEM RELOAD DICTIONARY.

Encrypting Collections at Rest

DDL collections are stored on disk (or in Keeper) and can be encrypted so the credentials are not readable from the filesystem. Configure the storage backend:

<clickhouse>
    <named_collections_storage>
        <type>local_encrypted</type>
        <key_hex>bebec0cabebec0cabebec0ca</key_hex>
        <algorithm>aes_128_ctr</algorithm>
        <path>named_collections</path>
    </named_collections_storage>
</clickhouse>

Supported type values include local, keeper, zookeeper, and their _encrypted variants. The keeper_encrypted / zookeeper_encrypted options are useful in a cluster, keeping a single encrypted source of truth in Keeper.

Best Practices

  1. Prefer DDL collections. They persist and take effect immediately without a SYSTEM RELOAD CONFIG. For cluster-wide availability, use the ON CLUSTER clause when creating them or configure keeper/zookeeper as the storage backend -- otherwise DDL collections are local to each node.
  2. Mark secrets NOT OVERRIDABLE (or disable allow_named_collection_override_by_default) so users cannot redirect stored credentials to a host they control.
  3. Separate USE from SECRETS. Grant NAMED COLLECTION to dictionary authors; reserve SHOW NAMED COLLECTIONS SECRETS for administrators.
  4. Encrypt at rest with a *_encrypted storage backend if the data directory or Keeper contents are not otherwise protected.
  5. Rotate via ALTER NAMED COLLECTION and reload affected dictionaries, rather than editing dozens of dictionary definitions.
  6. Keep definitions in version control -- migration scripts for DDL, or the XML files -- and use a consistent naming convention like source_env (e.g. pg_app_prod).

Common Issues

  • NAMED_COLLECTION_DOESNT_EXIST -- the name is misspelled, the collection was not created, or a config-based collection was added without SYSTEM RELOAD CONFIG. See Named collection doesn't exist.
  • Not enough privileges when creating a dictionary -- the user lacks the NAMED COLLECTION (USE) grant on that collection.
  • Dictionary loads with stale credentials -- after ALTER NAMED COLLECTION, run SYSTEM RELOAD DICTIONARY or wait for the next LIFETIME refresh.
  • Cluster node missing the collection -- config-based collections must be present on every node; DDL collections are local by default and must be created with ON CLUSTER or stored in keeper/zookeeper to be available cluster-wide.

How Pulse Helps

Misconfigured named collections surface in subtle ways: a dictionary that silently fails to refresh after a password rotation, an overridable secret that should have been locked down, or a config-based collection that exists on some cluster nodes but not others. Pulse monitors ClickHouse dictionary health and configuration consistency across your cluster, flagging dictionaries that are failing to load, credential settings that diverge between nodes, and access-control gaps -- so credential issues are caught before they break production lookups. Pulse is run by ClickHouse experts who can advise on securing integrations end to end.

Frequently Asked Questions

Q: Does using a named collection actually hide the password from dictionary authors?

Yes, if access is split correctly. A user with only the NAMED COLLECTION (USE) grant can create and query a dictionary that references the collection, but cannot read the stored password unless they also hold SHOW NAMED COLLECTIONS SECRETS. The SHOW CREATE DICTIONARY output shows only the collection name.

Q: Can a user override the stored credentials in a query?

Only for keys that are overridable. With allow_named_collection_override_by_default enabled (the default) and no per-key marker, a caller can override any key -- including secrets -- which is a real exfiltration risk. Mark sensitive keys NOT OVERRIDABLE, or disable the setting server-wide.

Q: Should I define collections in XML config or via DDL?

DDL is recommended for credentials: collections take effect immediately, persist across restarts, and can be encrypted at rest. For cluster-wide availability use ON CLUSTER when creating them, or configure keeper/zookeeper storage -- without either, DDL collections are local to each node. XML config collections require SYSTEM RELOAD CONFIG and manual distribution to every node.

Q: How do I rotate a credential used by many dictionaries?

Run ALTER NAMED COLLECTION <name> SET password = '...' once. Dictionaries referencing the collection pick up the new value on their next LIFETIME refresh or immediately after SYSTEM RELOAD DICTIONARY -- you never touch the individual dictionary definitions.

Q: Which dictionary sources support named collections?

The NAME argument works with PostgreSQL, MySQL, ClickHouse, MongoDB, and other supported sources. See What is a ClickHouse external dictionary for the full list of source types and how dictionaries refresh.

Q: Are named collections encrypted by default?

No. Stored DDL collections are readable from the data directory unless you configure a *_encrypted storage backend (local_encrypted, keeper_encrypted, or zookeeper_encrypted) with a key and algorithm. Encryption protects the credentials at rest, not in transit.

Subscribe to the Pulse Newsletter

Get early access to new Pulse features, insightful blogs & exclusive events , webinars, and workshops.

We use cookies to provide an optimized user experience and understand our traffic. To learn more, read our use of cookies; otherwise, please choose 'Accept Cookies' to continue using our website.