Connection starting state and offsets

Source connections

A source connection reads data, usually from an external system, and writes it into a Decodable stream.

Two things determine where in the source data the connector begins processing:

- The checkpoint, which records the processing already done. A new connection has no checkpoint.
- Configuration parameters within the connection itself, such as scan.startup.mode. These vary by connector and are not present in every connector. Consult the individual connector documentation for details.

New source connections

The point in the source system from which a new source connection starts reading data is defined by the specific connector. In some connectors you can override this behavior with a connector-specific configuration parameter such as scan.startup.mode. Consult the individual connector documentation for details.

Restarting a source connection

If a source connection is stopped or has failed, you can override the starting state when you restart it. By default, the source connection continues reading from the point it most recently stored in the checkpoint before the connection stopped.

You can also opt to discard the connection’s state and restart it afresh. In this case, the connection behaves like a new connection in terms of the position in the source system from which it reads data. Using this option risks duplicating or skipping data, as the source connection might read data that has already been read or miss data that hasn’t yet been read.

Sink connections

A sink connection reads data from a Decodable stream and writes it to an external system.

Two things determine where in the source stream a sink connection begins processing:

- The checkpoint, which records the processing already done. A new connection has no checkpoint.
- The specified offset in the Decodable stream, which can be Earliest or Latest. Earliest writes all of the existing data from the source stream to the external system, and then writes new data as it arrives on the source stream. Latest writes only new data to the external system as it arrives on the source stream; existing data on the stream is ignored.

New sink connections

A new sink connection starts reading from the Latest point in the source Decodable stream. This means that only data written to the stream after the connection has started is sent to the external system. When you start the connection, you can override this to Earliest if you want to send all the existing data on the source stream to the target system, along with all new data that arrives on the stream.

Restarting a sink connection

If a sink connection is stopped or has failed, you can override the starting state when you restart it. By default, the sink connection continues reading from the point it most recently stored in the checkpoint before the connection stopped.

You can also opt to discard the connection’s state and restart it afresh. As with a new connection, you can select Earliest or Latest to change whether existing data on the Decodable stream is written to the external system. Using this option risks duplicating or skipping data, as the sink connection might read data from the stream that has already been read or miss data that hasn’t yet been read.
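
The rules above for choosing a starting position can be summarized in a short sketch. This is an illustrative model only, not Decodable's implementation or API: the names StartOffset, StartDecision, resolve_source_start and resolve_sink_start are hypothetical, and the checkpoint is represented as a plain string. It assumes that a stored checkpoint takes precedence unless the connection's state is discarded.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StartOffset(Enum):
    """Hypothetical stand-in for the Earliest / Latest stream offsets."""
    EARLIEST = "earliest"  # send existing stream data, then new data
    LATEST = "latest"      # ignore existing stream data, send new data only


@dataclass(frozen=True)
class StartDecision:
    origin: str    # which rule decided the starting position
    position: str  # description of where reading begins


def resolve_source_start(checkpoint: Optional[str],
                         discard_state: bool = False,
                         startup_mode: Optional[str] = None) -> StartDecision:
    """Model of where a source connection starts reading (assumed precedence)."""
    # A stored checkpoint wins unless the state is explicitly discarded.
    if checkpoint is not None and not discard_state:
        return StartDecision("checkpoint", checkpoint)
    # Without a usable checkpoint the connection behaves like a new one:
    # a connector-specific setting such as scan.startup.mode applies if present.
    if startup_mode is not None:
        return StartDecision("connector-config", startup_mode)
    return StartDecision("connector-default", "defined by the connector")


def resolve_sink_start(checkpoint: Optional[str],
                       discard_state: bool = False,
                       offset: StartOffset = StartOffset.LATEST) -> StartDecision:
    """Model of where a sink connection starts reading from the stream."""
    if checkpoint is not None and not discard_state:
        return StartDecision("checkpoint", checkpoint)
    # New connection, or state discarded: the chosen offset applies,
    # with Latest as the default.
    return StartDecision("stream-offset", offset.value)


if __name__ == "__main__":
    # New sink connection with no override starts at Latest.
    print(resolve_sink_start(checkpoint=None))
    # Restarted sink connection resumes from its checkpoint by default.
    print(resolve_sink_start(checkpoint="position-1200"))
    # Restart with state discarded and Earliest selected replays the stream.
    print(resolve_sink_start("position-1200", discard_state=True,
                             offset=StartOffset.EARLIEST))
```

The last call shows the case that risks duplicates: the checkpoint is ignored, so data already written to the external system is sent again.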