Scale to zero

This feature is currently in Tech Preview. It may receive breaking changes and is not recommended for production use.

Scale to zero allows Decodable connection and pipeline jobs to pause and resume automatically based on a user-specified autoscale configuration.

Currently, scale to zero works for SQL pipelines and for connections that don’t use the Data Generator or REST connectors.

A valid scale to zero configuration requires two types of conditions:

  • Pause conditions - when a pause condition is met, the system automatically pauses a running job.

  • Resume conditions - when a resume condition is met, the system automatically resumes the paused job.

Pause conditions

Pause conditions are checked every 30 seconds. If multiple pause conditions exist, the job pauses if any condition is met. Two pause conditions are supported:

Max records read rate (max_records_read_rate)

Pause a job if it reads fewer than a specified number of records over a defined time window. If the job is backpressured, this condition is never met.

Settings:

  • record_count: The record threshold. The job pauses if it reads fewer than this number of records within the time window.

  • duration_seconds: The length of the time window, in seconds, over which records read are counted.

Example:

Pause the job when it reads fewer than 100 records in a 5-minute (300 seconds) period:

spec:
   […]
   autoscale_configuration:
       pause_conditions:
           max_records_read_rate:
               record_count: "100"
               duration_seconds: "300"
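
As a further illustration, the same two settings can express a stricter idle check. The sketch below assumes the field semantics described above and would pause the job only when it reads no records at all (fewer than 1) during a 10-minute (600 seconds) window. The values are illustrative only, not recommended defaults:

```yaml
spec:
   # Other settings elided, as in the example above.
   autoscale_configuration:
       pause_conditions:
           max_records_read_rate:
               # Illustrative value: "fewer than 1" means zero records were read.
               record_count: "1"
               # Illustrative value: 10 minutes, expressed in seconds.
               duration_seconds: "600"
```

A longer window with a lower threshold makes the job less likely to pause during short gaps in traffic, at the cost of running longer while idle.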

Max job runtime (max_job_runtime)

Pause a job after it has been in the running state for a defined duration.

Settings:

  • duration_seconds: How long, in seconds, the job can remain in the running state before it pauses.

Example:

Pause the job after it has been running for over 1 hour (3600 seconds):

spec:
   […]
   autoscale_configuration:
       pause_conditions:
           max_job_runtime:
               duration_seconds: "3600"

Resume conditions

Resume conditions are evaluated based on the configured schedule when a job is in the paused state.

Schedule (schedule)

Resume a paused job based on a cron schedule. The cron schedule is evaluated in UTC.

See https://crontab.guru/ for help structuring a cron expression.

Settings:

  • cron_expression: A cron expression that configures the schedule to resume a job.

Example:

Resume the job every morning at 8:00 AM UTC:

spec:
   […]
   autoscale_configuration:
       resume_conditions:
           schedule:
               cron_expression: "0 8 * * *"
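
As a variation, the schedule can be narrowed with the day-of-week field. The sketch below assumes standard five-field cron syntax (minute, hour, day of month, month, day of week), as described at https://crontab.guru/, and uses the autoscale_configuration key from the full pipeline example. It would resume the job at 8:00 AM UTC on Monday through Friday only:

```yaml
spec:
   # Other settings elided, as in the example above.
   autoscale_configuration:
       resume_conditions:
           schedule:
               # Minute 0, hour 8, any day of month, any month, Monday (1) to Friday (5).
               cron_expression: "0 8 * * 1-5"
```

Because the schedule is evaluated in UTC, the local time of each resume depends on your time zone and shifts with daylight saving time.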

Example pipeline configuration

The pipeline pauses under either of the following conditions:

  • Fewer than 100 records have been read in the last 5 minutes (300 seconds).

  • The job has been running for over 1 hour (3600 seconds).

If paused, the job automatically resumes each day at 8:00 AM UTC.

---
kind: pipeline
metadata:
   name: pipeline-with-autoscale-conditions
   description: A pipeline with autoscale conditions
spec_version: v2
spec:
   type: SQL
   sql: INSERT INTO metrics_copy SELECT * FROM _metrics
   autoscale_configuration:
       pause_conditions:
           max_records_read_rate:
               record_count: "100"
               duration_seconds: "300"
           max_job_runtime:
               duration_seconds: "3600"
       resume_conditions:
           schedule:
               cron_expression: "0 8 * * *"
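
The same settings can also be combined to approximate a periodic run window. The fragment below is a sketch under the assumptions already stated in this page, with illustrative values: it would resume the job at the top of every hour and pause it once it has been running for 15 minutes (900 seconds). Since pause conditions are checked every 30 seconds, the actual pause may come slightly after the 15-minute mark.

```yaml
spec:
   # Other settings (type, sql, and so on) elided.
   autoscale_configuration:
       pause_conditions:
           max_job_runtime:
               # Illustrative value: 15 minutes, expressed in seconds.
               duration_seconds: "900"
       resume_conditions:
           schedule:
               # Minute 0 of every hour, evaluated in UTC.
               cron_expression: "0 * * * *"
```

This pattern suits workloads that can tolerate up to an hour of latency and do not need a continuously running job.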