ClickHouse® is a column-oriented database management system (DBMS) for online analytical processing (OLAP). A popular database with nearly 25,000 GitHub stars, styling itself as "a free analytics DBMS for big data," ClickHouse has achieved widespread adoption and helps engineers everywhere perform real-time analytics at scale. Benchmarked at up to 100x faster than Hive or MySQL for analytical queries, ClickHouse is adopted by many engineering teams to serve queries at very low latencies across large datasets. To achieve this performance at scale compared to standard data warehouses like Snowflake, it makes some architectural tradeoffs that users should keep in mind. For this reason, consider conforming your data to ClickHouse best practices before ingestion. Decodable makes it easy to transform and prepare data, allowing you to get the most out of it when the data lands in ClickHouse.
Connections come in two flavors: source and sink. Source connections read from an external system and write to a Decodable stream, while sink connections read from a stream and write to an external system. ClickHouse connectors can only be used in the sink role.
Configure As A Sink
To create and configure a connector for ClickHouse, sign in to the Decodable Web Console, navigate to the Connections tab, click New Connection, and follow the steps below. For examples of using the command-line tools or scripting, see the How To guides.
The connector type will default to sink, since that is the only option for ClickHouse connectors.
Specify the hostname and port to connect to your ClickHouse database. Note that these access parameters must point at ClickHouse's HTTP interface (port 8123 by default), not the native TCP interface (port 9000 by default).
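A quick way to confirm you have the HTTP address rather than the native TCP one is to hit ClickHouse's `/ping` health-check endpoint, which returns `Ok.` on the HTTP interface. The sketch below builds the relevant URLs; the hostname is a placeholder.

```python
# Sketch: building the HTTP endpoint a ClickHouse sink needs.
# ClickHouse serves HTTP on port 8123 by default, while the native TCP
# protocol uses port 9000; the sink requires the HTTP address.

HTTP_PORT = 8123        # ClickHouse HTTP interface (use this)
NATIVE_TCP_PORT = 9000  # ClickHouse native protocol (not what the sink wants)

def http_endpoint(hostname: str, port: int = HTTP_PORT) -> str:
    """Return the HTTP base URL for a ClickHouse host."""
    return f"http://{hostname}:{port}"

def ping_url(hostname: str, port: int = HTTP_PORT) -> str:
    """URL of ClickHouse's HTTP health check; a GET here returns 'Ok.'"""
    return http_endpoint(hostname, port) + "/ping"

print(ping_url("clickhouse.example.com"))
# → http://clickhouse.example.com:8123/ping
```

If a GET against the `/ping` URL does not answer `Ok.`, you are likely pointing at the native TCP port or the wrong host.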
Provide the Cluster, Database, and Table name that the sink should write to. Decodable does not create any of these entities for you, so they must already exist in your ClickHouse instance in order for Decodable to successfully write data.
Provide the username and password of the user on whose behalf the connection is being made.
Note that while Decodable can write APPEND-only data to both distributed and non-distributed tables, CHANGE stream data can only be written to non-distributed tables. As such, it is generally recommended that you avoid the Distributed table engine for ClickHouse sink tables.
It is also strongly recommended that you avoid writing to buffer tables, as their flush strategy nullifies Decodable's delivery guarantees.
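Because Decodable does not create the target table, it must exist before the connection starts. The sketch below pre-creates a hypothetical table (the database, table, and column names are placeholders) using a plain, non-distributed MergeTree engine per the guidance above, sent through ClickHouse's HTTP interface.

```python
# Sketch: pre-creating a target table, since Decodable will not create it.
# Database/table/column names are hypothetical. Per the guidance above,
# the table uses a plain (non-distributed) MergeTree engine and is not a
# Buffer table.
from urllib import request

DDL = """
CREATE TABLE IF NOT EXISTS analytics.page_views (
    user_id    UInt64,
    url        String,
    viewed_at  DateTime
)
ENGINE = MergeTree
ORDER BY (user_id, viewed_at)
"""

def run_ddl(host: str, user: str, password: str, port: int = 8123) -> None:
    """POST the DDL to ClickHouse's HTTP interface."""
    req = request.Request(
        f"http://{host}:{port}/",
        data=DDL.encode(),
        # ClickHouse accepts credentials via these HTTP headers.
        headers={"X-ClickHouse-User": user, "X-ClickHouse-Key": password},
    )
    request.urlopen(req)  # raises on a non-2xx response

# Example (requires a reachable ClickHouse instance):
# run_ddl("clickhouse.example.com", "decodable_writer", "s3cret")
```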
Configure the ClickHouse connector using the Decodable CLI
If you are using the Decodable CLI to create or edit a connection to a ClickHouse database, then use the following table as a reference for what properties are required and supported.
| Property | Required? | Description |
|---|---|---|
| `hostname` | Required | The hostname of your ClickHouse database. |
| `port` | Required | The port number to use when connecting to the host. |
| `cluster-name` | Required | The name of the ClickHouse cluster to connect to. |
| `database-name` | Required | The name of the ClickHouse database to connect to. |
| `table-name` | Required | The name of the table to send data to. |
| `username` | Required | The name of a ClickHouse user with sufficient permissions to connect to the database. |
| `password` | Required | The password for the ClickHouse user. |
Delivery guarantee: at least once.
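Since every property in the table above is required, it can be worth validating a property set before creating the connection. A minimal sketch (all values are placeholders):

```python
# Sketch: the required property set for a ClickHouse sink connection,
# checked for completeness before creating the connection.
# All values below are placeholders.

REQUIRED = {
    "hostname", "port", "cluster-name",
    "database-name", "table-name", "username", "password",
}

props = {
    "hostname": "clickhouse.example.com",
    "port": "8123",
    "cluster-name": "main",
    "database-name": "analytics",
    "table-name": "page_views",
    "username": "decodable_writer",
    "password": "s3cret",
}

missing = REQUIRED - props.keys()
if missing:
    raise ValueError(f"missing required connection properties: {sorted(missing)}")
```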
The ClickHouse sink connects to a ClickHouse table through HTTP requests. ClickHouse sinks can materialize data from streams of both the APPEND and CHANGE types; however, be aware that ClickHouse sinks handling CHANGE streams (particularly those with frequent update and delete records) may perform significantly slower than with pure APPEND streams.
When a ClickHouse sink is connected to a CHANGE stream, a primary key must be specified in the schema.
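The primary key is what lets change records be resolved against existing rows. The sketch below is an illustration of that idea, not Decodable's actual implementation: each insert/update/delete record is applied to the current state keyed by the primary key, and without that key there would be no way to tell which row an update or delete refers to.

```python
# Sketch (not Decodable's actual implementation): why a CHANGE stream
# needs a primary key. Each change record is applied to the current
# state keyed by that primary key.

def materialize(changes, key="user_id"):
    state = {}
    for change in changes:
        op, row = change["op"], change["row"]
        if op in ("insert", "update"):
            state[row[key]] = row      # upsert by primary key
        elif op == "delete":
            state.pop(row[key], None)  # remove by primary key
    return list(state.values())

changes = [
    {"op": "insert", "row": {"user_id": 1, "url": "/home"}},
    {"op": "insert", "row": {"user_id": 2, "url": "/about"}},
    {"op": "update", "row": {"user_id": 1, "url": "/pricing"}},
    {"op": "delete", "row": {"user_id": 2, "url": "/about"}},
]
print(materialize(changes))
# → [{'user_id': 1, 'url': '/pricing'}]
```

This per-key resolution work is also a hint at why CHANGE streams with many updates and deletes run slower than pure APPEND streams, which can simply be batched and written.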
Data Type Mappings
Below is a mapping from data types found in Decodable streams to their corresponding types in ClickHouse. See the documentation for more information on ClickHouse's types and Decodable's supported types.
| Decodable Type | ClickHouse Type |
|---|---|
| VARCHAR | String / IP / UUID |
| STRING | String / Enum |
| DECIMAL | Decimal / Int128 / Int256 / UInt64 / UInt128 / UInt256 |
| SMALLINT | Int16 / UInt8 |
| INT | Int32 / UInt16 / Interval |
| BIGINT | Int64 / UInt32 |
| INTERVAL | Int32 / Int64 |
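For schema checks in tooling, the table above can be expressed as a simple lookup. The sketch below encodes exactly the mappings listed:

```python
# Sketch: the Decodable-to-ClickHouse type mapping table as a lookup.

DECODABLE_TO_CLICKHOUSE = {
    "VARCHAR":  ["String", "IP", "UUID"],
    "STRING":   ["String", "Enum"],
    "DECIMAL":  ["Decimal", "Int128", "Int256", "UInt64", "UInt128", "UInt256"],
    "SMALLINT": ["Int16", "UInt8"],
    "INT":      ["Int32", "UInt16", "Interval"],
    "BIGINT":   ["Int64", "UInt32"],
    "INTERVAL": ["Int32", "Int64"],
}

def compatible(decodable_type: str, clickhouse_type: str) -> bool:
    """True if the Decodable type can map to the given ClickHouse column type."""
    return clickhouse_type in DECODABLE_TO_CLICKHOUSE.get(decodable_type.upper(), [])

print(compatible("BIGINT", "Int64"))   # → True
print(compatible("BIGINT", "Int128"))  # → False
```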