Accounts

An account is an isolated tenant in Decodable. It exists within a single cloud provider and region (although it can connect to data services anywhere). All resources - connections, streams, pipelines and so on - are created within an account.

Accounts have one or more identity providers (IdPs), which determine where users are allowed to log in from. The list of users from those IdPs can be further restricted, as can the privileges of those users.

See Accounts for more information.

Streams

A stream is a partitioned, partially-ordered sequence of records on the Decodable platform. It's conceptually similar to a "topic" in Kafka, or a "stream" in AWS Kinesis. Streams always have a schema that defines how records are structured. Once streams are defined, they can be used as an input or output by any number of pipelines or connections. Streams present a universal interface to data from different sources, simplifying pipeline development.
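
For example, a stream's schema might define fields like the following sketch, written in SQL-style type notation (the field names and types are hypothetical examples, not a required format):

```
-- Illustrative sketch of a stream schema (field names and
-- types are hypothetical examples).
order_id     STRING
customer_id  STRING
amount       DECIMAL(10, 2)
order_time   TIMESTAMP(3)
```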

See Streams for more information.

👍 Batch Analogy

If you're coming from a batch processing background, you can think of a Decodable stream as a database table: it holds records with a well-defined schema and can be read from or written to. There are important differences between the streaming and batch worlds that we'll get to later, but this is a great place to start.
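
To make the analogy concrete, here is a minimal sketch of the kind of query a pipeline could run against a stream, exactly as you would against a table (the `orders` stream and its fields are hypothetical):

```sql
-- A minimal sketch: within a pipeline, a stream is queried like a table.
-- The `orders` stream and its fields are hypothetical examples.
SELECT order_id, customer_id, amount
FROM orders
WHERE amount > 100
```

The key difference is that this query never finishes: it runs continuously, emitting new results as records arrive on the stream.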

Connections

A connection is a configured instance of a connector that allows Decodable to talk to a messaging system, database, storage system, or other system you control. Connections carry the technology-specific configuration that allows them to communicate with these systems: typically hostnames, ports, authentication information, and other settings.

Once configured, a connection can be activated to stream data to or from the Decodable platform. Connections come in two flavors: source and sink. Source connectors read from an external system and write to a Decodable stream, while sink connectors read from a stream and write to an external system.
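
As an illustration, the configuration for a hypothetical Kafka source connection might include settings like the following. The property names shown are standard Kafka client settings, not necessarily the exact configuration keys Decodable uses:

```
# Illustrative only: the kinds of settings a Kafka source
# connection might carry. Property names are standard Kafka
# client settings, not necessarily Decodable's exact keys.
connector         = kafka
type              = source
bootstrap.servers = broker-1.example.com:9092
security.protocol = SASL_SSL
topic             = orders
```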

See Connections for more information.

Pipelines

A pipeline is a streaming SQL query that processes data from one or more input streams and writes the results to an output stream. Pipelines are versioned and must be activated to begin processing. Every pipeline maintains information, called pipeline state, that tracks what data has already been processed, intermediate operator state, and many other details. In most cases, you don't need to worry about this state: Decodable takes care of state management, consistent checkpointing, and recovery.
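
For example, a pipeline might look like the following sketch (the stream and field names are hypothetical): a continuous `INSERT INTO ... SELECT` that reads from one stream and writes results to another.

```sql
-- A pipeline sketch (stream and field names are hypothetical):
-- continuously aggregates records from `orders` into
-- `orders_by_customer`.
INSERT INTO orders_by_customer
SELECT
  customer_id,
  COUNT(*)    AS order_count,
  SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id
```

Because the query is continuous, the aggregates update as new records arrive; the running counts and sums per customer are exactly the kind of pipeline state that Decodable checkpoints and recovers for you.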

See Pipelines for more information.

Tasks

Both pipelines and connections are given a fixed number of tasks that determine the physical resources they receive for processing. Tasks run in parallel, allowing processing to scale out as needed. Decodable's stream processing engine allocates up to the number of tasks you specify, although it may allocate fewer if it determines a task would always be idle (e.g. there isn't enough work that can be performed in parallel). The throughput a task can sustain (measured in bytes and records per second) varies with the size of a record, the complexity of the query, and the speed of the source- and sink-connected systems. Typically, a task can process 2-8 MiB or 1,000-10,000 records per second.
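
As a rough sizing sketch using the figures above (which vary by workload): if a source produces about 20 MiB of data per second and you assume a mid-range task throughput of 4 MiB per second, you would provision 20 / 4 = 5 tasks, then adjust based on observed throughput.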

Resource IDs

All resources in Decodable have a generated resource ID that is guaranteed to be unique within your account and to never change. In many places, resource IDs are used instead of names, allowing you to freely rename resources without worrying about breaking your pipelines and other configuration. Resource IDs are short (typically 8-character) strings of letters and numbers, similar to Git SHAs. Just like SHAs, you should treat Decodable resource IDs as opaque UTF-8 strings.

