Apache Druid
Apache Druid® is a real-time analytics database designed for fast slice-and-dice analytics ("OLAP" queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important. Druid is commonly used as the database backend for GUIs of analytical applications, or for highly concurrent APIs that need fast aggregations. Druid works best with event-oriented data.
Common application areas for Druid include:
- Clickstream analytics including web and mobile analytics
- Network telemetry analytics including network performance monitoring
- Server metrics storage
- Supply chain analytics including manufacturing metrics
- Application performance metrics
- Digital marketing/advertising analytics
- Business intelligence/OLAP
Getting Started
Sending a Decodable data stream to Druid is accomplished in two stages: first, create a sink connector to a technology that Druid supports as a data source, and then add that data source to your Druid configuration. Decodable and Druid both support several technologies, including the following:
- Amazon Kinesis
- Apache Kafka
Configure As A Sink
This example demonstrates using Kafka as the sink from Decodable and the source for Druid. Sign in to the Decodable Web Console and follow the configuration steps provided for the Kafka Connector to create a sink connector. For examples of using the command line tools or scripting, see the How To guides.
Create Kafka Data Source
To ingest event data, also known as message data, from Kafka into Druid, you must submit a supervisor spec. When you enable the Kafka indexing service, you can configure supervisors on the Overlord to manage the creation and lifetime of Kafka indexing tasks. Kafka indexing tasks read events using Kafka's own partition and offset mechanism to guarantee exactly-once ingestion.
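As a minimal sketch, a Kafka supervisor spec could look like the following. The topic name, data source name, timestamp column, and broker address here are placeholder assumptions; use the values from the topic your Decodable sink connector writes to. Submit the spec to the Overlord with a POST to `/druid/indexer/v1/supervisor`.

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "decodable-events",
      "timestampSpec": { "column": "timestamp", "format": "iso" },
      "dimensionsSpec": { "dimensions": [] },
      "granularitySpec": { "segmentGranularity": "HOUR", "queryGranularity": "NONE" }
    },
    "ioConfig": {
      "topic": "decodable-events",
      "inputFormat": { "type": "json" },
      "consumerProperties": { "bootstrap.servers": "kafka-broker:9092" },
      "taskCount": 1,
      "replicas": 1,
      "taskDuration": "PT1H"
    },
    "tuningConfig": { "type": "kafka" }
  }
}
```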
The Kafka indexing service supports transactional topics, introduced in Kafka 0.11.x, by default. The consumer for the Kafka indexing service is incompatible with older Kafka brokers; if you are using an older version, refer to the Kafka upgrade guide. Additionally, you can set `isolation.level` to `read_uncommitted` in `consumerProperties` (see the sketch after this list) if either:
- You don't need Druid to consume transactional topics.
- You need Druid to consume older versions of Kafka. Make sure offsets are sequential, since there is no offset gap check in Druid anymore.
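For example, a `consumerProperties` block in the spec's `ioConfig` with the isolation level relaxed might look like this (the broker address is a placeholder):

```json
{
  "consumerProperties": {
    "bootstrap.servers": "kafka-broker:9092",
    "isolation.level": "read_uncommitted"
  }
}
```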
If your Kafka cluster enables consumer-group based ACLs, you can set `group.id` in `consumerProperties` to override the default auto-generated group ID, as in the sketch below.
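For example (the group ID shown is hypothetical):

```json
{
  "consumerProperties": {
    "bootstrap.servers": "kafka-broker:9092",
    "group.id": "decodable-druid-ingest"
  }
}
```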
For more detailed information, please refer to Druid's Kafka documentation.
Reference
| Connector name | druid |
| Type | sink |
| Delivery guarantee | exactly once |
Additional Druid Support
If you are interested in a direct Decodable Connector for Druid, please contact [email protected] or join our Slack community and let us know!