Delta Lake

Connector name: delta-lake
Type: sink
Delivery guarantee: exactly once

The Delta Lake connector streams data in Delta Lake format to an S3 bucket in your AWS account. To use it, configure an AWS IAM Role as described below, with specific permissions to write to the bucket.

Properties

The following properties are supported by the Delta Lake connector.

Property | Disposition | Description
table-path | required | Path to the table in your S3 bucket, using the s3a scheme. Example: s3a://my-bucket/table_name
s3.role-arn | required | AWS ARN of the IAM Role configured as described below. Example: arn:aws:iam::111222333444:role/decodable-delta-access
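As a quick sanity check before creating a connection, the two required properties can be validated locally. The following is a minimal sketch, not part of any Decodable API: the helper name check_connection_properties and the ARN pattern are assumptions made for illustration (the pattern assumes the standard "aws" partition and a 12-digit account ID).

```python
import re
from urllib.parse import urlparse

# Assumed shape of an IAM Role ARN in the standard "aws" partition.
ROLE_ARN_PATTERN = re.compile(r"^arn:aws:iam::\d{12}:role/.+$")


def check_connection_properties(props):
    """Return a list of problems found in the two required properties.

    Hypothetical helper for illustration; an empty list means both
    values have the expected shape.
    """
    problems = []

    # table-path must use the s3a scheme and name a bucket.
    parsed = urlparse(props.get("table-path", ""))
    if parsed.scheme != "s3a" or not parsed.netloc:
        problems.append("table-path must look like s3a://<bucket>/<path>")

    # s3.role-arn must be the ARN of an IAM Role, not of a bucket.
    if not ROLE_ARN_PATTERN.match(props.get("s3.role-arn", "")):
        problems.append(
            "s3.role-arn must look like arn:aws:iam::<account-id>:role/<name>"
        )

    return problems


if __name__ == "__main__":
    example = {
        "table-path": "s3a://my-bucket/table_name",
        "s3.role-arn": "arn:aws:iam::111222333444:role/decodable-delta-access",
    }
    print(check_connection_properties(example))
```

This only checks the shape of the values. It does not confirm that the bucket or Role exists or that the Role grants the required permissions.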

IAM Role, permissions, and security

You, AWS, and Decodable each play a part in ensuring that only Delta Lake connections in your Decodable Account can write data to your S3 bucket.

How?

AWS IAM provides a mechanism called ExternalId for this purpose. You and Decodable use it, as described here, to ensure that Decodable accesses your bucket only on behalf of your Decodable Account:

  • You create and configure an IAM Role with two Policies:
    • A Trust Policy allowing access from Decodable's AWS account, but only with an ExternalId matching your (unique) Decodable account name.
    • A Permissions Policy with the needed permissions on your bucket.
  • You provide the ARN of this Role to Decodable via the s3.role-arn property of your Delta Lake connection.
  • Decodable's servers assume that Role using an ExternalId value matching your Decodable Account name, and never any other. They then use the Role's permissions to access your bucket.

Note that the values here are not treated as secret (by us, AWS, or you): not ExternalId (your account name), not the Role ARN, not the bucket name.
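To make the ExternalId check concrete, here is a deliberately simplified model of how a Trust Policy of this kind gates an AssumeRole request. It is a toy sketch for illustration only: the function trust_policy_allows is hypothetical, it is not how AWS implements policy evaluation, and it handles only the single policy shape used on this page.

```python
# Toy model of the ExternalId check. Not AWS's real policy evaluation.
EXAMPLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::671293015970:root"},
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {"sts:ExternalId": "my-decodable-account"}
            },
        }
    ],
}


def trust_policy_allows(policy, caller_account_id, external_id):
    """Return True if the caller and ExternalId match an Allow statement.

    Simplification: a statement without an sts:ExternalId condition is
    treated as not matching, because this page always requires one.
    """
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        if stmt.get("Action") != "sts:AssumeRole":
            continue
        # The caller must be the AWS account named as Principal.
        principal = stmt.get("Principal", {}).get("AWS", "")
        if principal != f"arn:aws:iam::{caller_account_id}:root":
            continue
        # The condition value may be one name or an array of names.
        expected = (
            stmt.get("Condition", {})
            .get("StringEquals", {})
            .get("sts:ExternalId")
        )
        allowed = [expected] if isinstance(expected, str) else list(expected or [])
        if external_id in allowed:
            return True
    return False


if __name__ == "__main__":
    print(trust_policy_allows(
        EXAMPLE_TRUST_POLICY, "671293015970", "my-decodable-account"))
    print(trust_policy_allows(
        EXAMPLE_TRUST_POLICY, "671293015970", "someone-elses-account"))
```

In this model, a request from Decodable's AWS account succeeds only when it carries your account name as the ExternalId. A request made on behalf of a different Decodable Account carries a different ExternalId and is refused.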

Specifically, the IAM Role identified by s3.role-arn must:

  • have an AssumeRole Trust Policy that:
    • names Decodable's AWS account ID (671293015970) as Principal.
    • has a Condition requiring sts:ExternalId to equal your Decodable Account name.
  • have a Permissions Policy allowing the needed operations on the bucket ARN (not the Role ARN) and the S3 key (path), which may contain wildcards.
    The Policy Actions are:
    • s3:GetObject
    • s3:PutObject
    • s3:DeleteObject
    • s3:ListBucket
    • s3:PutObjectAcl

For example

Here's an example IAM Trust Policy. Replace my-decodable-account with your Decodable Account name. Note that 671293015970 is Decodable's AWS account ID and must match exactly.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::671293015970:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "my-decodable-account"
        }
      }
    }
  ]
}

Note: to allow several Decodable Accounts (say, in different AWS Regions) to write to the same bucket, use an array of Account names for the ExternalId value:

{ "sts:ExternalId": ["my-acct-1", "my-acct-2"] }
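If you generate the Trust Policy from a script, the single-account and multi-account forms can come from one function. The sketch below assumes a hypothetical helper named build_trust_policy. Only the policy structure and Decodable's AWS account ID come from this page.

```python
import json

# Decodable's AWS account ID, as stated on this page.
DECODABLE_AWS_ACCOUNT_ID = "671293015970"


def build_trust_policy(account_names):
    """Build a Trust Policy document for one or more Decodable Accounts.

    Hypothetical helper: a single name is emitted as a plain string,
    several names as an array, matching the two forms shown above.
    """
    names = list(account_names)
    if not names:
        raise ValueError("at least one Decodable Account name is required")
    external_id = names[0] if len(names) == 1 else names
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{DECODABLE_AWS_ACCOUNT_ID}:root"
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Print a policy that admits two Decodable Accounts.
    print(json.dumps(build_trust_policy(["my-acct-1", "my-acct-2"]), indent=2))
```

The printed JSON can be pasted into the Role's trust relationship in the AWS console or saved to a file for use with your own tooling.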

Here's an example IAM Permissions Policy. Replace your-bucket (wherever it appears) and /some/dir with your own values. The path (here: /some/dir) can be left blank to write S3 objects at the root of the bucket, but the trailing /* is always required.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObjectAcl",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket/some/dir/*",
        "arn:aws:s3:::your-bucket"
      ]
    }
  ]
}
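The rule about the path and the trailing /* is easy to get wrong when editing by hand. As a minimal sketch, the object-level Resource ARN can be derived from a bucket name and an optional path. The function name object_resource_arn is an assumption made for illustration.

```python
def object_resource_arn(bucket, path=""):
    """Return the object-level Resource ARN for a bucket and optional path.

    Hypothetical helper: leading and trailing slashes on the path are
    ignored, and the trailing /* is always appended.
    """
    if not bucket:
        raise ValueError("bucket name is required")
    cleaned = path.strip("/")
    if cleaned:
        return f"arn:aws:s3:::{bucket}/{cleaned}/*"
    # Blank path: objects are written at the root of the bucket.
    return f"arn:aws:s3:::{bucket}/*"


if __name__ == "__main__":
    print(object_resource_arn("your-bucket", "/some/dir"))
    print(object_resource_arn("your-bucket"))
```

With a path of /some/dir this yields the ARN used in the example policy. With a blank path it yields arn:aws:s3:::your-bucket/*.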

Further reading from AWS

For AWS's full discussion of the security problem this solves, and of the AWS-recommended solution using ExternalId, we recommend reading "The confused deputy problem" in the AWS Identity and Access Management documentation.

Supported types

Only the following SQL data types are supported.

