A Stream carries records within Decodable and can be read or written to. It is conceptually similar to a "topic" in Kafka, or a "stream" in AWS Kinesis. Streams always have a schema that specifies its fields. Once a stream is defined, it can be used as an input or output by any number of pipelines or connections.
If you're coming from a batch processing background, you can think of a Decodable stream like a database table. A stream's records are like a table's rows, and its fields are like a table's columns.
There are two types of streams in Decodable.
Append - Each record in the stream is an independent event (eg: records from a click event stream). To create a stream with
APPEND type, define a schema without any key fields.
Change - Records in the stream can either be a new record, or a record that updates/deletes a previous record based on the key field defined in the schema (eg: database change logs). To create a stream with
CHANGE type, define a schema with a key field using
--field key_field="string primary key".
All streams have these common elements:
id (system-generated) - A unique id generated by the system to track this stream. See Concepts - Resource IDs for more information.
name (required) - A name to help you identify this stream which can be used in pipelines when specifying a source or sink.
Stream names are typically short, all lowercase, and with words separated by underscores (
schema (required) - A schema describing all records in the stream. See Schema below for more information.
description - A longer description of the stream. Use this to help you and your team know when and how to use this stream.
watermark - An expression specifying a field to use as event time and an optional interval to allow for late arriving data. The field must be of a supported type. To remove a watermark from an existing Stream, update the value to "", an empty String.
"timestamp_field AS timestamp_field"
"timestamp_field AS timestamp_field - INTERVAL '0.001' SECOND"
"timestamp_field AS timestamp_field - INTERVAL '5' MINUTE"
Supported types for watermark fields
Each stream has a schema that defines the structure of records: their field names, the types of those fields, and additional metadata.
Creating a stream with a single field
decodable stream create \ --name my_stream \ --description "raw data from prod Kafka" \ --field raw_data=STRING # Created stream my_stream (2fc5dc5a)
Viewing the stream we just created
decodable stream get 2fc5dc5a # my_stream # id 2fc5dc5a # description raw data from prod Kafka # schema # 0 raw_data STRING # 1 new_field STRING # create time 2021-11-03T20:28:48Z # update time 2021-11-03T21:39:34Z
Updating the stream to add another field
decodable stream update 2fc5dc5a \ --field raw_data=STRING \ --field new_field=STRING # Updated stream "2fc5dc5a"
Updating a schema
Be sure to supply the entire new schema, not just the new fields!
The following field types are supported.
If there's a specific type that's missing, let us know!
Fixed length character string.
Variable length character string with a max length of n.
Synonym for VARCHAR(2147483647).
Fixed length binary string.
Variable length binary string with a max length of n.
Synonym for VARBINARY(2147483647).
Exact numeric type with a fixed precision (number of digits) of p and a scale (number of digits to the right of the decimal) of s.
p must be in the range of 1 to 31, inclusive. If omitted, it defaults to 10.
s must be in the range of 0 to p, inclusive. If s is specified, p must also be specified. When omitted, it defaults to 0.
Synonym for DECIMAL(p,s).
Synonym for DECIMAL(p,s).
A 1 byte signed integer.
A 2 byte signed integer.
A 4 byte signed integer.
An 8 byte signed integer.
An approximate floating point integer.
Synonym for FLOAT.
Date and Time Types
A date containing the year, month, and day.
A time without a timezone. If specified, p indicates the precision of fractional seconds up to nanosecond-level precision. Does not support leap second adjustments. Values must be within the range of
TIMESTAMP(p) [[WITH | WITHOUT] TIMEZONE]
A date and time with or without a timezone. As with TIME, p specifies the fractional second precision, if it is provided. Values must be within the range of
A synonym for
An array of a type T, and a maximum size of 2^31-1.
Synonym for ARRAY<T>.
An associative array with keys of type K and values of type V. Both K and V support any type. Each unique key can only have one value.
ROW<name type ['description'], ...>
A structure of named fields. name is the field name, type is its type, and description is an optional text description of the field. Field definitions can be repeated as many times as necessary.
A true / false value.
Fields may optional provide one or more SQL constraints on fields.
To mark a field as required.
To indicate that a field is used to uniquely identify a row. This means that the stream created will have
Additional constraints are coming soon!
This page contains copied contents from the Apache Flink® documentation.
See Credits page for details.
Updated about 2 months ago