ClickHouse sink connector

Features

  • Connector name: clickhouse

  • Delivery guarantee: At least once

  • Supported task sizes: S, M, L

  • Multiplex capability: A single instance of this connector can write to a single ClickHouse table.

  • Supported stream types: Append and change streams. A ClickHouse connection handling change streams, particularly change streams with frequent UPDATE and DELETE records, might operate significantly slower than one handling pure append streams.

The ClickHouse Connector connects to a ClickHouse table through HTTP requests.
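Because the connector talks to ClickHouse over HTTP, it can help to confirm that the HTTP interface is reachable before you configure the connection. The following is a minimal sketch, not part of Decodable itself. It relies on the /ping health-check endpoint of the ClickHouse HTTP interface; the function names and the host name in the usage note are illustrative assumptions.

```python
import http.client
from urllib.parse import urlunsplit
from urllib.request import urlopen


def build_ping_url(hostname: str, port: int, secure: bool = False) -> str:
    """Build the URL of the ClickHouse HTTP /ping health-check endpoint."""
    scheme = "https" if secure else "http"
    return urlunsplit((scheme, f"{hostname}:{port}", "/ping", "", ""))


def is_reachable(hostname: str, port: int, secure: bool = False,
                 timeout: float = 5.0) -> bool:
    """Return True if the ClickHouse HTTP interface answers the ping."""
    url = build_ping_url(hostname, port, secure)
    try:
        with urlopen(url, timeout=timeout) as response:
            # A healthy server replies with status 200 and the body "Ok."
            return response.status == 200 and response.read().strip() == b"Ok."
    except (OSError, http.client.HTTPException):
        # Covers DNS failures, refused connections, timeouts, HTTP errors,
        # and malformed replies (for example from a non-HTTP port).
        return False
```

For example, is_reachable("clickhouse.example.com", 8123) checks a hypothetical host on 8123, the default plain-HTTP port of ClickHouse. A False result from a host you know is running may mean you are pointing at the native TCP port instead of the HTTP port.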

Configuration properties

All of the following properties are required, and none has a default value.

  • hostname: The host name for your ClickHouse database. Make sure you specify the web address for ClickHouse, not the native TCP address.

  • port: The port number where your database accepts connections.

  • cluster-name: The name of your ClickHouse cluster.

  • database-name: The name of your ClickHouse database.

  • table-name: The name of your ClickHouse table.

  • username: The username to use to authenticate to ClickHouse.

  • password: The password associated with the username. This must be provided as a secret resource.
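As an illustration of a complete property set, the sketch below holds the seven properties in a Python dictionary and reports any that are missing. It is not the format Decodable itself uses to define connections; every value, including the secret name, is a hypothetical placeholder, and the helper function is an assumption made for this example.

```python
REQUIRED_PROPERTIES = (
    "hostname",
    "port",
    "cluster-name",
    "database-name",
    "table-name",
    "username",
    "password",
)


def missing_properties(config: dict) -> list:
    """Return the required property names that are absent or blank."""
    missing = []
    for name in REQUIRED_PROPERTIES:
        value = config.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# All values below are hypothetical placeholders.
example_config = {
    "hostname": "clickhouse.example.com",  # HTTP address, not native TCP
    "port": "8123",
    "cluster-name": "my_cluster",
    "database-name": "analytics",
    "table-name": "events",
    "username": "decodable_writer",
    # Refers to a secret resource rather than a literal password;
    # the secret name is made up for this example.
    "password": "clickhouse_password_secret",
}

print(missing_properties(example_config))
```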

Prerequisites

  • Your ClickHouse database must be accessible from the Decodable network. Connectivity options include AWS PrivateLink, SSH tunnels, and allowing connections from the Decodable published IP addresses.

  • You must create the ClickHouse database, cluster, and table to send data to. Decodable doesn’t create any of these entities for you.

  • Your Decodable stream’s schema must match the schema of the target table. For information on how ClickHouse types map to Decodable types, see Data type mappings.

  • If you want to send Change Data Capture (CDC) records to ClickHouse, then you must specify one or more fields in the source stream schema to use as a primary key.

    To do this, first declare the field’s type as non-nullable by entering <type> NOT NULL, for example BIGINT NOT NULL. You can then select that field as a primary key field.
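The primary key rule in the last prerequisite can be sketched as a small check. This is a hedged illustration only: the schema is represented here as a plain dictionary of field names to type strings, which is an assumption of this example and not a Decodable API.

```python
def nullable_key_fields(schema: dict, primary_key: list) -> list:
    """Return primary key fields that are missing or not declared NOT NULL."""
    problems = []
    for field in primary_key:
        declared_type = schema.get(field, "")
        # Normalize whitespace and case before looking for the suffix.
        normalized = " ".join(declared_type.upper().split())
        if not normalized.endswith("NOT NULL"):
            problems.append(field)
    return problems


# Hypothetical stream schema: field name -> declared type.
schema = {
    "order_id": "BIGINT NOT NULL",
    "customer": "STRING",
    "amount": "DECIMAL(10, 2)",
}

print(nullable_key_fields(schema, ["order_id"]))
print(nullable_key_fields(schema, ["order_id", "customer"]))
```

An empty result means every chosen key field is declared NOT NULL and can therefore be used as a primary key field.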

Table types

If you want to send data from a change stream, then you must write to a non-distributed table in ClickHouse.

If you want to send data from an append stream, then you can write to a distributed or a non-distributed table. In addition, we recommend that you avoid sending data to buffer tables, as their flush strategy nullifies Decodable’s delivery guarantees.
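The rules above can be written down as a short check. This sketch assumes you already know the engine of the target table, for instance from the engine column of the ClickHouse system.tables table, where distributed tables report Distributed and buffer tables report Buffer. The function name and the "append" and "change" labels are assumptions made for this example.

```python
def table_problems(stream_type: str, engine: str) -> list:
    """List the reasons a table engine is a poor target for a stream type."""
    if stream_type not in ("append", "change"):
        raise ValueError(f"unknown stream type: {stream_type!r}")

    problems = []
    if stream_type == "change" and engine == "Distributed":
        problems.append("change streams must write to a non-distributed table")
    if engine == "Buffer":
        problems.append(
            "buffer tables are discouraged: their flush strategy "
            "nullifies the delivery guarantee"
        )
    return problems


print(table_problems("append", "Distributed"))  # no problems reported
print(table_problems("change", "Distributed"))  # one problem reported
```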

Connector starting state and offsets

A new sink connection starts reading from the Latest point in the source Decodable stream. This means that only data written to the stream after the connection has started is sent to the external system. If you want to send all the existing data on the source stream to the target system, along with all new data that arrives on the stream, you can override the starting position to Earliest when you start the connection.

When you restart a sink connection, it continues reading from the position it most recently stored in its checkpoint before the connection stopped. You can also discard the connection’s state and restart it afresh from Earliest or Latest, as described above.

For more information, see the Decodable documentation on starting state.
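The starting behavior described in this section reduces to a small decision, sketched below. The function, its parameters, and the returned labels are illustrative assumptions; they do not correspond to a Decodable API.

```python
def resolve_start_position(has_checkpoint: bool,
                           discard_state: bool = False,
                           requested: str = "latest") -> str:
    """Decide where a sink connection begins reading its source stream."""
    if requested not in ("earliest", "latest"):
        raise ValueError(f"unknown start position: {requested!r}")
    # A restarted connection resumes from its checkpoint unless
    # its state is explicitly discarded.
    if has_checkpoint and not discard_state:
        return "checkpoint"
    # A new connection, or one whose state was discarded, starts from the
    # requested position, which defaults to "latest".
    return requested


print(resolve_start_position(has_checkpoint=False))
print(resolve_start_position(has_checkpoint=True))
print(resolve_start_position(True, discard_state=True, requested="earliest"))
```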

Data type mappings

The following table describes the mapping of Decodable data types to their ClickHouse data type counterparts.

Decodable Type    ClickHouse Type

CHAR              String
VARCHAR           String / IP / UUID
STRING            String / Enum
BOOLEAN           UInt8
BYTES             FixedString
DECIMAL           Decimal / Int128 / Int256 / UInt64 / UInt128 / UInt256
TINYINT           Int8
SMALLINT          Int16 / UInt8
INT               Int32 / UInt16 / Interval
BIGINT            Int64 / UInt32
FLOAT             Float32
DOUBLE            Float64
DATE              Date
TIME              DateTime
TIMESTAMP         DateTime
TIMESTAMP_LTZ     DateTime
INTERVAL          Int32 / Int64
ARRAY             Array
MAP               Map
ROW               Not supported
MULTISET          Not supported
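To check a stream schema against a target table before starting a connection, the mapping table can be encoded as a lookup. This is a minimal sketch under two assumptions: the helper names are made up for this example, and type names are matched literally as they appear in the table, so a ClickHouse column declared as IPv4 or Enum8 would need its own handling.

```python
import re

# Decodable type -> ClickHouse types it can be written to, per the table.
TYPE_MAPPINGS = {
    "CHAR": ["String"],
    "VARCHAR": ["String", "IP", "UUID"],
    "STRING": ["String", "Enum"],
    "BOOLEAN": ["UInt8"],
    "BYTES": ["FixedString"],
    "DECIMAL": ["Decimal", "Int128", "Int256", "UInt64", "UInt128", "UInt256"],
    "TINYINT": ["Int8"],
    "SMALLINT": ["Int16", "UInt8"],
    "INT": ["Int32", "UInt16", "Interval"],
    "BIGINT": ["Int64", "UInt32"],
    "FLOAT": ["Float32"],
    "DOUBLE": ["Float64"],
    "DATE": ["Date"],
    "TIME": ["DateTime"],
    "TIMESTAMP": ["DateTime"],
    "TIMESTAMP_LTZ": ["DateTime"],
    "INTERVAL": ["Int32", "Int64"],
    "ARRAY": ["Array"],
    "MAP": ["Map"],
    "ROW": [],       # not supported
    "MULTISET": [],  # not supported
}


def base_name(type_name: str) -> str:
    """Strip parameters and modifiers: 'DECIMAL(10, 2) NOT NULL' -> 'DECIMAL'."""
    match = re.match(r"[A-Za-z0-9_]+", type_name.strip())
    return match.group(0) if match else ""


def is_compatible(decodable_type: str, clickhouse_type: str) -> bool:
    """Return True if the table lists the ClickHouse type for the Decodable type."""
    allowed = TYPE_MAPPINGS.get(base_name(decodable_type).upper(), [])
    return base_name(clickhouse_type) in allowed


print(is_compatible("BIGINT NOT NULL", "Int64"))
print(is_compatible("INT", "Int64"))
```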