Amazon Redshift sink connector

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more.

The first step in creating a data warehouse is to launch a set of nodes, called an Amazon Redshift cluster. After you provision your cluster, you can upload your dataset and then run data analysis queries. Regardless of the size of the dataset, Amazon Redshift offers fast query performance using the same SQL-based tools and business intelligence applications that you use today.

Getting started

Sending a Decodable data stream to Redshift takes two stages: first, create a sink connector to Kinesis; then, in Redshift, materialize a view from the Kinesis stream and merge its data into Redshift.

Configure as a sink

This example uses Kinesis as both the sink for Decodable and the source for Redshift. Sign in to Decodable Web and create a sink connector by following the configuration steps for the Amazon Kinesis sink connector. For examples of using the command line tools or scripting, see the How To guides.

Configure streaming ingestion

Previously, loading data from a streaming service like Amazon Kinesis Data Streams into Amazon Redshift took several steps. You connected the stream to Amazon Kinesis Data Firehose and waited for Kinesis Data Firehose to stage the data in Amazon S3, using various-sized batches at varying-length buffer intervals. Kinesis Data Firehose then triggered a COPY command to load the data from Amazon S3 into a table in Redshift.

Streaming ingestion skips the preliminary staging in Amazon S3 and provides low-latency, high-speed ingestion of stream data from Kinesis directly into Redshift. To set it up, follow these steps:

  1. Create a Decodable Kinesis sink connection

  2. Create an IAM role

  3. Assign the IAM role to Redshift

  4. Define an external schema

  5. Create a materialized view

  6. Refresh the materialized view

  7. Merge the data
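
As a rough illustration of steps 2 and 4 through 6, the following Python sketch builds an IAM permissions policy for reading a Kinesis stream and the SQL statements that Redshift streaming ingestion typically uses. The helper names (kinesis_read_policy, streaming_ingestion_sql), the schema name kds, the stream name my-stream, the view name my_view, the account ID, and the role name are all hypothetical placeholders, not values from Decodable or your AWS account. The list of Kinesis actions and the use of JSON_PARSE assume JSON-encoded records and should be checked against the current Redshift documentation. The merge in step 7 is left out because it depends on the layout of your target table.

```python
import json


def kinesis_read_policy(stream_arn):
    """Permissions policy that lets the IAM role read one Kinesis stream (step 2)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStreamSummary",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                    "kinesis:ListShards",
                ],
                "Resource": stream_arn,
            },
            {"Effect": "Allow", "Action": "kinesis:ListStreams", "Resource": "*"},
        ],
    }


def streaming_ingestion_sql(schema, role_arn, stream, view):
    """Return the SQL for steps 4 to 6, in the order they would be run."""
    return [
        # Step 4: external schema that maps to Kinesis Data Streams.
        f"CREATE EXTERNAL SCHEMA {schema} FROM KINESIS IAM_ROLE '{role_arn}';",
        # Step 5: materialized view over the stream. The payload column
        # assumes each Kinesis record holds a JSON document.
        f"CREATE MATERIALIZED VIEW {view} AS "
        f"SELECT approximate_arrival_timestamp, partition_key, shard_id, "
        f"sequence_number, JSON_PARSE(kinesis_data) AS payload "
        f'FROM {schema}."{stream}";',
        # Step 6: pull newly arrived stream records into the view.
        f"REFRESH MATERIALIZED VIEW {view};",
    ]


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    account = "123456789012"
    policy = kinesis_read_policy(f"arn:aws:kinesis:us-east-1:{account}:stream/my-stream")
    print(json.dumps(policy, indent=2))

    role = f"arn:aws:iam::{account}:role/example-redshift-streaming-role"
    for statement in streaming_ingestion_sql("kds", role, "my-stream", "my_view"):
        print(statement)
```

The printed statements can be run in order from any SQL client connected to the cluster, after the role has been assigned to Redshift in step 3. Generating them from one function keeps the schema, stream, and view names consistent across the three statements.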

For more detailed information, see the Redshift example in the Decodable GitHub repository, or Redshift’s Streaming Ingestion documentation.