DataStax Astra Streaming sink connector

Features

Connector name

astra-streaming

Compatibility

Pulsar 2.9.x or above

Delivery guarantee

At least once

Supported task sizes

S, M, L

Multiplex capability

A single instance of this connector can write to a single topic.

Supported stream types

Append

Configuration properties

Property Description Required Default

Connection

service-url

The URL to connect to your Astra Streaming broker.

For example, pulsar+ssl://broker.example.com:6651.

Yes

admin-url

The URL to connect to your Astra Streaming admin endpoint.

For example, https://broker.example.com.

Yes

Authentication

token

The authentication token associated with your Astra stream. This must be provided as a secret resource.

Yes

Data

topic

The fully qualified name of the topic. For example, persistent://stream/namespace/topic-name.

Yes

key.fields

A semicolon-delimited list of the fields that make up the partition key.

For example: field1;field2.

key.format

The format used to serialize and deserialize the partition key. Must be one of the following:

  • JSON

  • Avro

  • Raw

format

The format for data in the topic. Must be one of the following:

  • JSON

  • Avro

  • Raw

  • Debezium (JSON)

If you want to send CDC data through this connector, then you must select Debezium (JSON).

value.fields-include

If set to ALL, the partition key columns are included in the payload.

Set to EXCEPT_KEY to exclude the partition key columns from the payload.

For an example of how the key.fields, key.format, and value.fields-include arguments work together, see the examples in the Key and Value Formats section in the Apache Flink documentation.

ALL
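
The properties above can be sanity-checked before you create a connection. The following Python sketch is illustrative only and is not part of any Decodable or DataStax API. The property names and allowed values come from the table above; the dictionary layout, the helper names (parse_key_fields, validate, split_record), the exact spelling of the format values, and the token placeholder are assumptions. The topic check relies on the standard Pulsar naming scheme. The last helper shows how key.fields and value.fields-include are expected to interact, following the Flink key and value format semantics referenced above.

```python
import re

# Property names and allowed values are taken from the table above. The dict
# layout, the helper names, and the exact spelling of the format values are
# assumptions made for this sketch.
REQUIRED = ("service-url", "admin-url", "token", "topic")
KEY_FORMATS = {"JSON", "Avro", "Raw"}
VALUE_FORMATS = KEY_FORMATS | {"Debezium (JSON)"}
FIELDS_INCLUDE = {"ALL", "EXCEPT_KEY"}
# Pulsar topic names follow {persistent|non-persistent}://tenant/namespace/topic.
TOPIC_PATTERN = re.compile(r"^(persistent|non-persistent)://[^/]+/[^/]+/[^/]+$")


def parse_key_fields(raw):
    """Split a semicolon-delimited key.fields value into a list of field names."""
    return [name.strip() for name in raw.split(";") if name.strip()]


def validate(props):
    """Return a list of problems found in props; an empty list means none were found."""
    problems = [f"missing required property: {name}"
                for name in REQUIRED if not props.get(name)]
    topic = props.get("topic")
    if topic and not TOPIC_PATTERN.match(topic):
        problems.append(f"topic is not fully qualified: {topic}")
    if "key.format" in props and props["key.format"] not in KEY_FORMATS:
        problems.append(f"unsupported key.format: {props['key.format']}")
    if "format" in props and props["format"] not in VALUE_FORMATS:
        problems.append(f"unsupported format: {props['format']}")
    if props.get("value.fields-include", "ALL") not in FIELDS_INCLUDE:
        problems.append("value.fields-include must be ALL or EXCEPT_KEY")
    return problems


def split_record(record, props):
    """Return (key, value) dicts for one record, per key.fields and value.fields-include."""
    key_fields = parse_key_fields(props.get("key.fields", ""))
    key = {name: record[name] for name in key_fields}
    if props.get("value.fields-include", "ALL") == "EXCEPT_KEY":
        value = {k: v for k, v in record.items() if k not in key}
    else:
        value = dict(record)  # ALL: key columns stay in the payload
    return key, value


# Example values reuse the samples from the table; the token value is a
# hypothetical placeholder for a secret resource.
example_props = {
    "service-url": "pulsar+ssl://broker.example.com:6651",
    "admin-url": "https://broker.example.com",
    "token": "<name-of-secret-resource>",
    "topic": "persistent://stream/namespace/topic-name",
    "key.fields": "field1;field2",
    "key.format": "JSON",
    "format": "JSON",
    "value.fields-include": "EXCEPT_KEY",
}
```

With example_props, a record containing field1, field2, and one other column yields a key holding field1 and field2 and a payload holding only the other column. Switching value.fields-include to ALL keeps all three columns in the payload.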

Connector starting state and offsets

A new sink connection starts reading from the Latest point in the source Decodable stream. This means that only data written to the stream after the connection has started is sent to the external system. When you start the connection, you can override this to Earliest if you want to send all the existing data on the source stream to the target system, along with all new data that arrives on the stream.

When you restart a sink connection, it continues to read data from the point it most recently stored in the checkpoint before the connection stopped. You can also opt to discard the connection’s state and restart it afresh from Earliest or Latest as described above.
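
The starting-state rules above can be summarized as a small decision function. This is a toy model written for illustration, not Decodable's implementation or API. The function name first_offset, its parameters, and the use of integer list indexes as stream positions are all assumptions.

```python
def first_offset(stream_length, checkpoint=None, starting_state="Latest", discard_state=False):
    """Return the index of the first stream record the sink would read in this toy model.

    stream_length: number of records already on the source stream at start time.
    checkpoint: position stored by a previous run, or None for a new connection.
    starting_state: "Latest" (the default) or "Earliest".
    discard_state: True to ignore the stored checkpoint and start afresh.
    """
    if starting_state not in ("Earliest", "Latest"):
        raise ValueError(f"unknown starting state: {starting_state}")
    if checkpoint is not None and not discard_state:
        return checkpoint      # restart: resume from the stored position
    if starting_state == "Earliest":
        return 0               # replay everything already on the stream
    return stream_length       # Latest: only records written from now on
```

In this model, a new connection on a stream that already holds three records starts at index 3 by default and at index 0 with Earliest. A restarted connection with a stored position of 2 resumes at 2 unless its state is discarded.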

Learn more about starting state here.