Postgres CDC

PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 30 years of active development on the core platform.

PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open source relational database of choice for many people and organizations.

The traditional approach to syncing PostgreSQL with complementary data stores is batch-based, where pipelines extract data from the PostgreSQL database system and send it to downstream data stores. Change data capture (CDC) is a modern alternative to inefficient bulk imports. CDC extracts change events (INSERTs, UPDATEs, and DELETEs) in real-time from data stores, such as PostgreSQL, and provides them to a data pipeline.

Getting Started

Connections come in two flavors: source and sink. Source connections read from an external system and write to a Decodable stream, while sink connections read from a stream and write to an external system. Postgres CDC connectors can only be used in the source role.

Configure As A Source

To create and configure a connector for Postgres, sign in to the Decodable Web Console, navigate to the Connections tab, click on New Connection, and follow the steps below. For examples using the command line tools or scripting, see the How To guides.

  1. The connector type will default to source, since that is the only option for Postgres CDC connectors.

  2. Specify the hostname to connect to your Postgres database. For example, postgres-server.

  3. Optionally specify the port number to use when connecting to the host. By default, the port number is 5432.

  4. Provide the name of the database for the connection.

  5. Optionally provide the name of the database schema that will be used. By default, the schema name is public.

  6. Specify the name of the database table for the connection.

  7. Provide the username of the database on whose behalf the connection is being made.

  8. Provide the database user's password.

  9. Optionally specify the plugin name that decodes Postgres's write-ahead log. By default, the plugin name is pgoutput.

There are multiple ways to implement change data capture (CDC) with Postgres. For more detailed information, see the Postgres Getting Started guide and related documentation.

Reference

Connector namepostgres-cdc
Typesource
Delivery guaranteeexactly once

Properties

The following properties are supported by the Postgres CDC connector.

PropertyRequiredDescription
hostnamerequiredHost where the Postgres DB can be reached
portoptionalPort where the DB can be reached (default: 5432)
database-namerequiredThe name of the database
schema-nameoptionalThe schema containing the table (default: public)
usernamerequiredUsername to use for authentication
passwordrequiredPassword to use for authentication
decoding.plugin.nameoptionalPlugin to use for reading DB data (default: pgoutput)
slot.nameoptionalThe unique name to use for the replication slot *

* can only start with a letter and contain lower-case letters, numbers, and underscore character

Postgres Connectivity
Currently the Postgres connector requires your database to be publicly accessible and uses username/password to authenticate. If you'd like to have support for other authentication models, please contact [email protected] or join our Slack community and let us know!