MongoDB source connector

Features

Connector name: mongodb-cdc
Delivery guarantee: Exactly once
Supported task sizes: M, L
Multiplex capability: A single instance of this connector can read from a single collection.
Supported stream types: Change stream

Configuration properties

Basic

| Property | Description | Required | Default |
|---|---|---|---|
| hosts | One or more comma-separated host names for your database in Standard Connection String Format. If you have a DNS Seed List URL, you must find the underlying MongoDB instance host names and add them here. For example: mongodb-1.tld:27017,mongodb-2.tld:27017,mongodb-3.tld:27017. If you are using MongoDB Atlas, the list of host names to include can be found on the Overview page for your database. | Yes | |
| database | The name of the database containing the collection. | Yes | |
| collection | The name of the collection that you want to capture CDC records from. | Yes | |
| username | The username used to authenticate to MongoDB. | Yes | |
| password | The password associated with the username. This must be provided as a secret resource. | Yes | |
| copy.existing | Whether to copy existing data from the collection. See Migrating existing collections for more information. | No | true |

Advanced

| Property | Description | Required | Default |
|---|---|---|---|
| scan.startup.mode | Specifies where in the collection to start reading data when the connection is first started, or when it's restarted with the state discarded. Must be one of the following. initial: at startup, takes an initial snapshot of the monitored collection, then continuously reads the latest oplog entries. latest-offset: skips the initial snapshot and reads changes from the end of the oplog, capturing only the modifications made after the connector was started or restarted. | No | initial |
| connection.options | Any additional configuration options needed to connect to the MongoDB cluster. See Connection String Options in the MongoDB documentation for a full list of connection string options. | No | |
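The property rules above can be checked before a connection is created. The following is a minimal sketch in Python, not part of Decodable: the helper names `parse_hosts` and `check_properties` are hypothetical, and the checks cover only what this page states (required properties, comma-separated `host:port` entries, and the two allowed startup modes). It assumes a missing port means MongoDB's default of 27017 and does not handle bracketed IPv6 literals.

```python
REQUIRED = ("hosts", "database", "collection", "username", "password")
STARTUP_MODES = ("initial", "latest-offset")
DEFAULT_PORT = 27017  # MongoDB's default port, assumed when an entry omits one


def parse_hosts(hosts: str) -> list[tuple[str, int]]:
    """Split a comma-separated hosts string into (host, port) pairs."""
    pairs = []
    for entry in hosts.split(","):
        entry = entry.strip()
        if not entry:
            raise ValueError("empty host entry")
        host, sep, port = entry.partition(":")
        if ":" in port:
            # Typical cause: two hosts joined with ':' instead of ','.
            raise ValueError(f"{entry!r} has more than one ':'; separate hosts with commas")
        if not host:
            raise ValueError(f"{entry!r} has no host name")
        if sep and not port.isdigit():
            raise ValueError(f"{entry!r} has a non-numeric port")
        pairs.append((host, int(port) if sep else DEFAULT_PORT))
    return pairs


def check_properties(props: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = [
        f"missing required property: {name}"
        for name in REQUIRED
        if not props.get(name)
    ]
    if props.get("hosts"):
        try:
            parse_hosts(props["hosts"])
        except ValueError as err:
            problems.append(f"hosts: {err}")
    # scan.startup.mode is optional and defaults to "initial".
    mode = props.get("scan.startup.mode", "initial")
    if mode not in STARTUP_MODES:
        problems.append(f"scan.startup.mode must be one of {STARTUP_MODES}, got {mode!r}")
    return problems
```

Calling `check_properties` with a dictionary keyed by the property names in the tables returns an empty list when nothing is wrong. A hosts value in which two entries are joined by a colon instead of a comma is reported as a `hosts:` problem.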
Prerequisites

- Your MongoDB instance must be publicly accessible. Decodable uses the username and password provided during connection creation to authenticate to the database.
- You must have a MongoDB user with the changeStream and read privileges.
- Your MongoDB instance must be configured for change stream replication. See Change Streams in the MongoDB documentation for more information.
- The incoming data must contain a field named _id, and that field must be specified as a primary key in the Decodable stream. To specify a primary key, you must first explicitly declare the field's type as non-nullable by entering <type> NOT NULL. For example: BIGINT NOT NULL.

Oplog retention

If the connection is stopped or in a failed state for longer than the oplog's retention period, the connection will fail when it's restarted. This is because CDC needs a contiguous series of oplog entries to work.

If you want to restart the connection in this situation, you must discard its current state. The initial snapshot of the collection is then taken again, and the oplog is used for subsequent reads. To do this, do one of the following:

- In the Decodable Web UI, select Start and, under Starting State, select Reset current state and start from the initial state.
- In the Decodable CLI, do one of the following:
  - Use connection activate and add the --force flag, for example:
        decodable connection activate cef0e708 --force
  - Use query with a suitable specifier for the connection (such as --name) and add the --operation reset-state argument, for example:
        decodable query --name customers-source --operation reset-state

Connector starting state and offsets

When you create a connection, or restart it and discard its state, it reads from the database according to the configured scan startup mode. By default this is initial, so the connection snapshots the monitored collection and reads the oplog thereafter. See the Decodable documentation on starting state for more information.
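The oplog retention rule described earlier comes down to comparing how long the connection was down with how far back the oplog reaches. The sketch below, in Python, illustrates that comparison only. The function name `needs_state_reset` is hypothetical, and the retention period is an input you would have to obtain from your own MongoDB deployment; nothing here queries Decodable or MongoDB.

```python
from datetime import datetime, timedelta


def needs_state_reset(
    stopped_at: datetime,
    restart_at: datetime,
    oplog_retention: timedelta,
) -> bool:
    """Return True when the outage outlasted the oplog retention period.

    In that case the oplog no longer holds a contiguous series of entries
    from the point where the connection stopped, so the saved state cannot
    be resumed and must be discarded before restarting.
    """
    if restart_at < stopped_at:
        raise ValueError("restart_at must not be earlier than stopped_at")
    return restart_at - stopped_at > oplog_retention
```

If the function returns True, the restart would need one of the state reset options listed under Oplog retention. If it returns False, a normal restart can continue from the saved state.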
Data types mapping

The following table shows the Decodable data types that are generated from the corresponding MongoDB data types.

| Decodable Type | MongoDB Type |
|---|---|
| INT | Int |
| BIGINT | Long |
| FLOAT | Long |
| DOUBLE | Double |
| DECIMAL | Decimal128 |
| BOOLEAN | Boolean |
| DATE | Date, Timestamp |
| TIME | Date, Timestamp |
| TIMESTAMP(3) | Date |
| TIMESTAMP_LTZ(3) | Date |
| TIMESTAMP(0) | Timestamp |
| TIMESTAMP_LTZ(0) | Timestamp |
| STRING | String |
| STRING | ObjectId |
| STRING | UUID |
| STRING | Symbol |
| STRING | MD5 |
| STRING | JavaScript |
| STRING | Regex |
| BYTES | BinData |
| ROW | Object |
| ARRAY | Array |
| `ROW<$ref STRING, $id STRING>` | DBPointer |
| (Point) `ROW<type STRING, coordinates ARRAY<DOUBLE>>` | GeoJson |
| (Line) `ROW<type STRING, coordinates ARRAY<DOUBLE>>` | GeoJson |