Postgres CDC
PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language combined with many features that safely store and scale the most complicated data workloads. The origins of PostgreSQL date back to 1986 as part of the POSTGRES project at the University of California at Berkeley and has more than 30 years of active development on the core platform.
PostgreSQL has earned a strong reputation for its proven architecture, reliability, data integrity, robust feature set, extensibility, and the dedication of the open source community behind the software to consistently deliver performant and innovative solutions. PostgreSQL runs on all major operating systems, has been ACID-compliant since 2001, and has powerful add-ons such as the popular PostGIS geospatial database extender. It is no surprise that PostgreSQL has become the open source relational database of choice for many people and organizations.
The traditional approach to syncing PostgreSQL with complementary data stores is batch-based, where pipelines extract data from the PostgreSQL database system and send it to downstream data stores. Change data capture (CDC) is a modern alternative to inefficient bulk imports. CDC extracts change events (INSERTs, UPDATEs, and DELETEs) in real-time from data stores, such as PostgreSQL, and provides them to a data pipeline.
Getting Started
Connections come in two flavors: source and sink. Source connections read from an external system and write to a Decodable stream, while sink connections read from a stream and write to an external system. Postgres CDC connectors can only be used in the source
role.
Configure As A Source
To create and configure a connector for Postgres, sign in to the Decodable Web Console, navigate to the Connections tab, click on New Connection
, and follow the steps below. For examples using the command line tools or scripting, see the How To guides.
-
The connector type will default to
source
, since that is the only option for Postgres CDC connectors. -
Specify the hostname to connect to your Postgres database. For example,
postgres-server
. -
Optionally specify the port number to use when connecting to the host. By default, the port number is
5432
. -
Provide the name of the database for the connection.
-
Optionally provide the name of the database schema that will be used. By default, the schema name is
public
. -
Specify the name of the database table for the connection.
-
Provide the username of the database on whose behalf the connection is being made.
-
Provide the database user's password.
-
Optionally specify the plugin name that decodes Postgres's write-ahead log. By default, the plugin name is
pgoutput
.
There are multiple ways to implement change data capture (CDC) with Postgres. For more detailed information, see the Postgres Getting Started guide and related documentation.
Reference
Connector name | postgres-cdc |
Type | source |
Delivery guarantee | exactly once |
Properties
The following properties are supported by the Postgres CDC connector.
Property | Required | Description |
---|---|---|
hostname | required | Host where the Postgres DB can be reached |
port | optional | Port where the DB can be reached (default: 5432) |
database-name | required | The name of the database |
schema-name | optional | The schema containing the table (default: public) |
username | required | Username to use for authentication |
password | required | Password to use for authentication |
decoding.plugin.name | optional | Plugin to use for reading DB data (default: pgoutput) |
slot.name | optional | The unique name to use for the replication slot * |
* can only start with a letter and contain lower-case letters, numbers, and underscore character
Postgres Connectivity
Currently the Postgres connector requires your database to be publicly accessible and uses username/password to authenticate. If you'd like to have support for other authentication models, please contact [email protected] or join our Slack community and let us know!
Updated 9 months ago