MySQL source connector

Use the MySQL CDC (Change Data Capture) Connector to get data from MySQL into Decodable. The MySQL CDC Connector is powered by Debezium and able to extract and send change events (INSERTS, UPDATES, and DELETES) through Decodable.

Features

Delivery guarantee

Exactly once

Write to multiple streams

Yes

Prerequisites

Before you can get data from MySQL, the following requirements must be met:

  • Your MySQL database must be accessible from the Decodable network. Contact Decodable support for more information about enabling network access to your infrastructure.

  • Your MySQL instance must be configured to send change records. See Setting up MySQL from the Debezium documentation.

    • If you are using Amazon RDS, Automated Backups must be enabled. Otherwise, binlog will be not be enabled regardless of any other settings. See How To: Set up a MySQL CDC connection on Amazon RDS for detailed instructions on how to configure MySQL on Amazon RDS.

  • Your MySQL user must have sufficient permissions for CDC replication.

Steps

If you want to use the Decodable CLI or API to create the connection, you can refer to the Property Name column for information about what the underlying property names are. The connector name is mysql-cdc.
  1. From the Connections page, select MySQL CDC and complete the following fields.

    UI Field Property Name Description

    Connection Type

    N/A

    Select Source to use this connector to get data into Decodable.

    Host

    hostname

    The host name for your MySQL database. For example, mysql-server.

    Port

    port

    Optional. The port number to use when connecting to the host.

    Defaults to 3306.

    Database

    database-name

    The name of the database containing the tables that you want to ingest data from. If empty, this connection connects to all existing databases.

    Username

    username

    The username to use to authenticate to MySQL.

    Password

    password

    The password associated with the username. This must be provided as a secret resource. If you are using the Decodable CLI, run decodable secret list to view available secrets or decodable secret --help for help with creating a new secret.

    Note: For security purposes, Decodable will never display secret values in plaintext. You can manage which users have permissions to create, delete, or modify secrets in the Access Control management view. See Roles, groups, and permissions for more information.

    Scan Startup Mode

    scan.startup.mode

    Optional. Specifies where in the collection to start reading data when the connection is first started, or when it’s restarted with the state discarded. Must be one of the following:

    • initial (default): At startup, takes an initial snapshot of monitored database tables, then continuously reads the latest binlog entries thereafter.

    • latest-offset: Avoids taking an initial snapshot of monitored database tables upon startup. Instead, reads changes from the end of the binlog, capturing only the modifications made since the connector was initiated or restarted.

    • timestamp: Bypasses the snapshot phase and initiates reading binlog events directly from a specified timestamp.

    Starting Timestamp

    scan.startup.timestamp-millis

    The timestamp to start reading data from, specified in milliseconds since Unix time.

    Only applicable if scan.startup.mode is timestamp.

    SSL Mode

    ssl-mode

    Optional. Specifies how the connector should handle secure connection to the database server. Must be one of the following:

    • DISABLED: Establish unencrypted connections.

    • PREFERRED: Establish encrypted connections if the server enabled them, otherwise fall back to unencrypted connections.

    • REQUIRED: Establish secure connections if the server enabled them, fail otherwise;

    • VERIFY_CA: Like REQUIRED but additionally verify the server TLS certificate against the configured Certificate Authority (CA) certificates.

    • VERIFY_IDENTITY: Like VERIFY_CA, but additionally verify that the server certificate matches the host to which the connection is attempted.

    Defaults to PREFERRED.

    Server Timezone

    server.timeZone

    Optional. The global time zone of the database server. This controls how time-related values are read from your database.

    This must be equivalent to the output of the following query on your database: SELECT @@global.time_zone;

    Defaults to UTC.

  2. Select Next. The next screen shows a list of tables that Decodable has detected from the MySQL host.

  3. Select the tables that you want to ingest data from. Decodable automatically creates a stream for each table selected. Each stream’s schema is automatically translated from MySQL, including the field names, the data types of those fields, and which fields to use as the primary key. See Data types mapping for how MySQL types map to Decodable types.

    1. By default, Decodable uses the same name for the stream as the input table. If a stream already exists with that name and the schema of the stream matches the schema of the input table, Decodable uses that stream instead.

  4. (Optional) If you want to rename the stream created, you can override the generated stream name.

    1. For example, if you want to limit access to certain streams, you might choose to prefix specific streams and use role-based access control to grant access to certain prefixes. See Roles, groups, and permissions.

  5. Select Next.

  6. Give the newly created connection a Name and Description. Then, select Save.

Binlog retention

If the connection is stopped or in a failed state for longer than the binlog’s retention period, the connection will fail when it’s restarted. This is because for CDC to work it needs a contiguous series of binlog entries.

If you want to restart the connection in this situation you must discard its current state. By doing this, the initial snapshot of the required tables will be taken again and then the binlog used for subsequent reads.

To do this do, one of the following:

  1. In the Decodable Web UI, select Start and under Starting State select Reset current state and start from the initial state

  2. In the Decodable CLI, do one of the following:

    1. Use connection activate and add the --force flag, for example:

      decodable connection activate cef0e708 --force

      or

    2. Use query with a suitable specifier for the connection (such as --name) and add the --operation reset-state argument, for example:

      decodable query --name customers-source --operation reset-state

Connector starting state and offsets

When you create a connection, or restart it and discard state, it will read from the database based on the configuration of the scan startup mode. By default this is initial and will therefore snapshot the set of monitored tables and read the binlog thereafter.

Learn more about starting state here.

Data types mapping

The following table shows the Decodable data types that are generated from the corresponding MySQL data types.

If you are creating this connection with Decodable Web, then you will see a preview of how the MySQL data types are translated to Decodable data types before your connection is created. This preview isn’t shown when you are using the Decodable CLI, so use this table as a reference for what your MySQL data types will look like in Decodable.

MySQL Type Decodable Type

TINYINT

TINYINT(>1)

TINYINT

SMALLINT

TINYINT

UNSIGNED

SMALLINT

INT

MEDIUMINT

SMALLINT

UNSIGNED

INT

BIGINT

INT

UNSIGNED

BIGINT

BIGINT

UNSIGNED

DECIMAL(20,0)

FLOAT

FLOAT

DOUBLE

DOUBLE

DECIMAL(p, s)

NUMERIC(p, s)

DEC(p, s)

FIXED(p, s)

DECIMAL(p,s)

BOOLEAN

TINYINT(1)

BIT(1)

BOOLEAN

BIT(>1)

BYTES

YEAR

INT

DATE

DATE

TIME[(p)]

TIME[(p)]

TIMESTAMP [(p)]

DATETIME [(p)]

TIMESTAMP[(p)]

CHAR(n)

CHAR(n)

VARCHAR(n)

VARCHAR(n)

BINARY(n)

BINARY(n)

VARBINARY(n)

VARBINARY(n)

TINYTEXT

TEXT

MEDIUMTEXT

LONGTEXT

STRING

TINYBLOB

BLOB

MEDIUMBLOB

LONGBLOB

BYTES

JSON

ENUM

SET

STRING

GEOMETRY

POINT

LINESTRING

POLYGON

MULTIPOINT

MULTILINESTRING

MULTIPOLYGON

GEOMETRYCOLLECTION

STRING

Note on MySQL Spatial Data Types

Types GEOMETRY, POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, and MULTIPOLYGON are converted to the following String representation:

{"srid": <> , "type": "<>", "coordinates": [<>, <>, …​]}

where:

  • srid: the Spatial Reference System (SRS) in which the geometry is defined (defaults to 0 if no srid is specified in the spatial value)

  • type: the spatial data type (GEOMETRY, LINESTRING, etc)

  • coordinates: the coordinates of the spatial data

Type GEOMETRYCOLLECTION is converted to the following String representation:

{"srid": <> , "type": "GeometryCollection", "geometries": [{"type":"<>","coordinates":[<>, <>, …​]}]}

Resource specifier keys

When using the connection scan command of the Decodable CLI to create a connection specification, the following resource specifier keys are available:

Name Description

database-name

The database name

table-name

The table name

Connection resource definition options

When defining connection resources using connection scan for the MySQL CDC source the following options can be specified with the --opt argument:

Name Description

with-metadata-fields

Inject columns with special metadata values. Will add _decodable_database, _decodable_table and _decodable_op_ts to the stream schemas when set to true. Useful when mapping multiple tables to a single Stream, and when tracking record update times. The _decodable_database and _decodable_table metadata fields are added to the Stream primary key as well. Note that _decodable_op_ts defaults to 1970-01-01 00:00:00.000 for all existing records in the upstream table, upon first starting the connection. All records inserted or updated following the initial connection activation will be populated with the UTC-based timestamp of the operation.

Example usage:

--opt with-metadata-fields=true