# BYOC setup

These are the cloud resources required by Decodable Bring Your Own Cloud (BYOC) deployments on AWS. The recommended route is to use Terraform. Customers not using the reference Terraform modules can provision these resources manually and add the resulting ARNs to the Helm chart values.

## VPC

Decodable recommends a VPC with subnets in 3 Availability Zones. We recommend using a /20 for the VPC and allocating a /24 for each private and public AZ subnet. VPCs must have a NAT Gateway or Internet Gateway to allow access to the Internet. We recommend creating an S3 gateway endpoint in the VPC.

## EKS

Decodable software is deployed on Kubernetes using Helm charts. The EKS cluster must:

- Have an OIDC provider with IRSA (IAM Roles for Service Accounts) enabled
- Support Persistent Volumes using EBS
- Support LoadBalancer services and/or Ingresses
- Auto-scale using Karpenter or Cluster Autoscaler

Decodable expects a dedicated node group with the taint `flink-app:NoSchedule`, which is used to run Flink TaskManagers. We recommend using m7gd-class instances and putting EmptyDir volumes on the instance-local SSD for best performance.

## MSK

Decodable uses MSK to store internal topics with pipeline state. These are distinct from "external" topics, which are connected with the Decodable Apache Kafka connectors. Guidelines for the Decodable MSK cluster:

- Use Kafka 3.5.1 if possible
- Use ZooKeeper for state (don't enable KRaft)
- Use brokers of size kafka.m5.large or kafka.m7g.large (or larger)
- Enable volume auto-scaling for the brokers
- Ensure there is network connectivity from the EKS cluster to MSK
- Keep the EKS and MSK clusters in the same region to avoid excess latency
- Configure Security Groups to allow connectivity on ports 9098 (IAM) and 9096 (SASL/SCRAM)
- Enable both IAM and SASL/SCRAM authentication

## S3

Decodable uses S3 to store job state, crash data, and job logs. Create 3 distinct buckets for these purposes.
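As a quick illustration of the VPC sizing guidance above, Python's standard `ipaddress` module can sketch how a /20 accommodates one public and one private /24 per AZ with room to spare. The CIDR and AZ names here are hypothetical examples, not required values:

```python
import ipaddress

# Hypothetical VPC CIDR; any /20 works the same way.
vpc = ipaddress.ip_network("10.0.0.0/20")

# A /20 yields 2^(24-20) = 16 possible /24 subnets.
subnets = list(vpc.subnets(new_prefix=24))
assert len(subnets) == 16

# Allocate one public and one private /24 per Availability Zone,
# leaving 10 spare /24s for future growth.
azs = ["az-a", "az-b", "az-c"]
public = {az: str(net) for az, net in zip(azs, subnets[0:3])}
private = {az: str(net) for az, net in zip(azs, subnets[3:6])}

print("public subnets: ", public)
print("private subnets:", private)
```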
Guidelines for Decodable S3 buckets:

- Enable retention (lifecycle) policies on the crash data and job log buckets to limit bucket size
- Disable public access on all of these buckets
- Create the buckets in the same region as the Decodable deployment to reduce costs

## Secrets Manager

For Decodable to authenticate to MSK using SASL/SCRAM credentials, you must create a KMS key and use it to encrypt a secret in AWS Secrets Manager. This secret must be associated with the MSK cluster created above so that the username/password can be used for authentication. The secret should be of the form:

```json
{
  "username": "vector",
  "password": "<16 character random string>"
}
```

## IAM Policies

These policies reference the resources created above.

### decodable_kafka_data

Gives Decodable access to create topics, consumer groups, and transactions in MSK. This is necessary for Decodable to manage state stored in internal topics.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "kafka-cluster:DescribeClusterDynamicConfiguration",
        "kafka-cluster:DescribeCluster",
        "kafka-cluster:Connect"
      ],
      "Resource": "arn:aws:kafka:<region>:<account>:cluster/<cluster>"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "kafka-cluster:WriteData",
        "kafka-cluster:ReadData",
        "kafka-cluster:*Topic*"
      ],
      "Resource": "arn:aws:kafka:<region>:<account>:topic/<cluster>/*"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "kafka-cluster:DescribeGroup",
        "kafka-cluster:AlterGroup"
      ],
      "Resource": "arn:aws:kafka:<region>:<account>:group/<cluster>/*"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "kafka-cluster:DescribeTransactionalId",
        "kafka-cluster:AlterTransactionalId"
      ],
      "Resource": "arn:aws:kafka:<region>:<account>:transactional_id/<cluster>/*"
    }
  ]
}
```

### decodable_msk_sasl_scram

Gives Decodable access to Secrets Manager to retrieve SASL/SCRAM credentials. These are used to access MSK from components which don't support IAM authentication.
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:<region>:<account>:secret:<sasl_scram_secret_id>"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "kms:Decrypt",
      "Resource": "arn:aws:kms:<region>:<account>:key/<encryption_key>"
    }
  ]
}
```

### decodable_s3

Gives Decodable access to the S3 buckets where job state is stored and where heap dumps and job logs are uploaded.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": [
        "arn:aws:s3:::<state bucket>",
        "arn:aws:s3:::<debug bucket>",
        "arn:aws:s3:::<log bucket>"
      ]
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "s3:*Object*",
      "Resource": [
        "arn:aws:s3:::<state bucket>/*",
        "arn:aws:s3:::<debug bucket>/*",
        "arn:aws:s3:::<log bucket>/*"
      ]
    }
  ]
}
```

### decodable_secrets_manager

Decodable stores internal secrets (such as credentials for authenticating to sources) in AWS Secrets Manager.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "secretsmanager:*",
      "Resource": "arn:aws:secretsmanager:<region>:<account>:secret:decodable/user/account/*"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:List*",
        "secretsmanager:Get*",
        "secretsmanager:Describe*"
      ],
      "Resource": "arn:aws:secretsmanager:<region>:<account>:secret:decodable/system/*"
    }
  ]
}
```

## Roles

Decodable uses IAM Roles for Service Accounts (IRSA) to authenticate pods to AWS services. This table describes the roles and their required policies:

| AWS Role | Kubernetes Service Account | Policies |
|---|---|---|
| data_plane_api | `<namespace>:data-plane-api` | decodable_kafka, decodable_s3, decodable_secrets_manager |
| data_plane_controller | `<namespace>:data-plane-controller` | decodable_kafka, decodable_s3, decodable_secrets_manager, decodable_msk_sasl_scram |
| flink | `<namespace>:flink` | decodable_kafka, decodable_s3 |
| flink_cupi | `byoj-<account ID>-*:flink` | decodable_kafka, decodable_s3 |
| vector | `<namespace>:vector` | decodable_msk_sasl_scram, decodable_s3 |
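Each role above also needs a trust policy binding it to the EKS cluster's OIDC provider, so that only the listed service account can assume it. A minimal sketch for the data_plane_api role, following the standard IRSA trust-policy shape; `<oidc_id>` is a placeholder for your cluster's OIDC provider ID:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<account>:oidc-provider/oidc.eks.<region>.amazonaws.com/id/<oidc_id>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.<region>.amazonaws.com/id/<oidc_id>:sub": "system:serviceaccount:<namespace>:data-plane-api",
          "oidc.eks.<region>.amazonaws.com/id/<oidc_id>:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
```

The `sub` condition pins the role to one namespace and service account; repeat the pattern for each role/service-account pair in the table.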