Apache Hudi sink integration

This is a supported integration that requires manual configuration.

Contact Decodable support if you are interested in native support with a Decodable connector.

Sending a Decodable data stream to Hudi takes two stages:

  1. Creating a sink connector from Decodable to a data source that’s supported by Hudi.

  2. Adding that data source to your Hudi configuration.

Decodable and Hudi both support several of the same technologies, including Apache Kafka, which is used as the intermediary in the steps below.

Add a Kafka sink connector

Follow the configuration steps provided for the Apache Kafka sink connector.

Create a Kafka data source in Hudi

There are multiple ways to ingest data streams into Hudi, such as HoodieStreamer and Kafka Connect. For example, these are the steps when using Kafka Connect:

  1. Create the environment

  2. Set up the schema registry

  3. Create the Hudi Control Topic for coordination of the transactions

  4. Create the Hudi Topic for the Sink and insert data into the topic

  5. Run the Sink connector worker

  6. Add the Hudi Sink to the Connector

  7. Run async compaction and clustering if scheduled

  8. Query via Hive
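
As a rough illustration of step 6, the sketch below builds a connector definition and prepares a request to the Kafka Connect REST API (`POST /connectors`). The property names are assumed from Hudi's Kafka Connect demo configuration and may differ between Hudi versions, so verify them against the Hudi documentation. The topic name, table path, key fields, and URLs are hypothetical placeholders, and the helper function names are made up for this example.

```python
import json
import urllib.request


def build_hudi_sink_definition(topic, base_path, record_key,
                               partition_field, registry_url):
    """Return a Kafka Connect connector definition for a Hudi sink.

    All property names are assumptions taken from Hudi's demo config;
    confirm them for the Hudi version you run.
    """
    return {
        "name": f"hudi-sink-{topic}",
        "config": {
            "connector.class": "org.apache.hudi.connect.HoodieSinkConnector",
            "tasks.max": "1",
            # The Kafka topic that the Decodable sink connector writes to.
            "topics": topic,
            "key.converter": "org.apache.kafka.connect.storage.StringConverter",
            "value.converter": "org.apache.kafka.connect.storage.StringConverter",
            "hoodie.table.name": topic.replace("-", "_"),
            "hoodie.table.type": "MERGE_ON_READ",
            "hoodie.base.path": base_path,
            "hoodie.datasource.write.recordkey.field": record_key,
            "hoodie.datasource.write.partitionpath.field": partition_field,
            "hoodie.schemaprovider.class":
                "org.apache.hudi.schema.SchemaRegistryProvider",
            # Assumed key; newer Hudi releases may use a different prefix.
            "hoodie.deltastreamer.schemaprovider.registry.url":
                f"{registry_url}/subjects/{topic}/versions/latest",
        },
    }


def build_request(connect_url, definition):
    """Prepare (but do not send) the POST /connectors request."""
    return urllib.request.Request(
        f"{connect_url.rstrip('/')}/connectors",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_connector(connect_url, definition):
    """Send the definition to the Connect worker; return the HTTP status."""
    with urllib.request.urlopen(build_request(connect_url, definition)) as resp:
        return resp.status


# Hypothetical values for illustration only.
definition = build_hudi_sink_definition(
    topic="decodable-orders",
    base_path="file:///tmp/hoodie/decodable_orders",
    record_key="order_id",
    partition_field="order_date",
    registry_url="http://localhost:8081",
)
print(json.dumps(definition, indent=2))
```

With a Connect worker listening at an assumed address such as `http://localhost:8083`, calling `register_connector("http://localhost:8083", definition)` submits the definition. Kafka Connect normally responds with `201 Created` when a new connector is accepted.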

For more detailed information, see Hudi’s Kafka Connect documentation.