What automations do
Orchestrate tasks, run SQL, chain pipelines and Databricks jobs.
Where a pipeline moves data, an automation orchestrates work. It's how you say "every morning at 6am, run these two pipelines, then a SQL rollup, then rebuild a semantic model." Automations are the scheduler and coordinator for the rest of Databasin.
What an automation is
An automation is a named job that runs on a schedule and contains one or more tasks. Tasks are grouped into stages — stages run one after another, and the tasks inside a single stage run in parallel. (More on that in Stages and parallel tasks.)
The pieces you'll set:
- Name — a short identifier (3–20 characters; lowercase letters, digits, and hyphens).
- Schedule — a cron expression for when it runs. Leave it blank to run only on demand.
- Tasks — the work, organized into stages.
- Configuration — email notifications, plus advanced settings for cluster size, timeout, visibility, and tags.
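The naming rule above can be checked before you submit a name. The sketch below is a minimal illustration in Python, assuming the rule is exactly "3 to 20 characters from lowercase letters, digits, and hyphens"; the function name and pattern are hypothetical, and Databasin may apply further checks that are not described here.

```python
import re

# Assumed pattern derived from the stated rule: 3-20 characters drawn from
# lowercase letters, digits, and hyphens. Not taken from Databasin itself.
NAME_PATTERN = re.compile(r"[a-z0-9-]{3,20}")


def is_valid_automation_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rule."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_automation_name("morning-refresh"))  # True
print(is_valid_automation_name("Morning Refresh"))  # False (uppercase, space)
```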
Where automations live
Automations are a tab inside the Integrations hub, alongside Pipelines and Connectors. To build one, open Integrations and start from the Create integration chooser. See The Integrations hub for the full layout.
The task types
The task picker shows two groups — Databasin Tasks (work that runs on Databasin infrastructure) and Databricks Tasks (work that runs on your Databricks workspace). Eight task types are active today.
Databasin Tasks
| Task | What it does |
|---|---|
| Pipeline | Run an existing Databasin pipeline to ingest or transform data. |
| File Drop | Extract data from a file source and load it into a destination. |
| Unzip File | Extract a zip archive from a file source into a target file location. |
| SQL Script | Execute SQL queries on a target database connection. |
| Databasin Notebook (Beta) | Execute a saved Databasin notebook. |
| Semantic Model (Beta) | Build and deploy a semantic data model to your warehouse. |
Databricks Tasks
| Task | What it does |
|---|---|
| Databricks Job | Trigger an existing Databricks job. |
| Databricks Notebook | Execute a Databricks notebook. |
Databasin Notebook and Semantic Model carry a Beta badge in the picker. They work, but expect rougher edges than the rest.
Stages and parallel tasks
Tasks don't just run top to bottom. They sit in stages: every task in a stage runs at the same time, and stages execute in order with a barrier between them — the next stage waits for the previous one to finish. A stage holds up to 5 tasks. That's how you express "ingest from three sources at once, and only once they're all done, run the rollup." Full details in Stages and parallel tasks.
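To make the execution model concrete, here is a rough Python sketch of "parallel inside a stage, barrier between stages." It is not Databasin's implementation: tasks are modelled as plain no-argument callables, `run_stages` is a hypothetical name, and stopping on the first failed task is an assumption, since failure handling is not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_TASKS_PER_STAGE = 5  # per-stage limit stated in the text

Task = Callable[[], str]  # hypothetical task shape: a no-argument callable


def run_stages(stages: List[List[Task]]) -> List[List[str]]:
    """Run stages in order; run the tasks inside each stage concurrently."""
    # Validate everything up front so a bad stage fails before any work starts.
    for number, stage in enumerate(stages, start=1):
        if not 1 <= len(stage) <= MAX_TASKS_PER_STAGE:
            raise ValueError(
                f"stage {number} has {len(stage)} tasks; "
                f"expected 1 to {MAX_TASKS_PER_STAGE}"
            )

    results: List[List[str]] = []
    for stage in stages:
        with ThreadPoolExecutor(max_workers=len(stage)) as pool:
            futures = [pool.submit(task) for task in stage]
            # Barrier: collect every result before the next stage starts.
            # result() re-raises a task's exception, which stops later stages
            # (an assumption; the source does not describe failure handling).
            results.append([future.result() for future in futures])
    return results


def make_task(name: str) -> Task:
    """Build a stand-in task that just reports its own name."""
    return lambda: f"{name} done"


# Three ingests at once, then the rollup only after all three finish.
print(run_stages([
    [make_task("source_a"), make_task("source_b"), make_task("source_c")],
    [make_task("rollup")],
]))
```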
Scheduling
Automations use the same cron scheduler as pipelines — see Scheduling and triggers for the presets, the timezone picker, and the syntax. Leave the schedule blank if the automation should only run when you kick it off manually.
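As an illustration of what a schedule looks like, "daily at 06:00" would be `0 6 * * *` in standard five-field cron (minute, hour, day of month, month, day of week). Whether Databasin uses exactly this five-field dialect is an assumption; check Scheduling and triggers for the supported syntax. The Python sketch below only handles `*` and plain integers, and its helper names are hypothetical.

```python
from datetime import datetime

FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expr: str) -> dict:
    """Split a five-field cron expression into named fields."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def matches(expr: str, when: datetime) -> bool:
    """Check whether a timestamp matches the expression.

    Simplified: supports only '*' and single integers, and requires every
    field to match (no ranges, lists, or steps).
    """
    actual = {
        "minute": when.minute,
        "hour": when.hour,
        "day_of_month": when.day,
        "month": when.month,
        # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
        "day_of_week": (when.weekday() + 1) % 7,
    }
    fields = split_cron(expr)
    return all(v == "*" or int(v) == actual[k] for k, v in fields.items())


print(matches("0 6 * * *", datetime(2024, 1, 1, 6, 0)))  # True
print(matches("0 6 * * *", datetime(2024, 1, 1, 6, 1)))  # False
```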
Email notifications
Each automation has an Email Notifications list. Addresses on it receive the run result — success or failure — every time the automation fires. It's the main way to know a scheduled job finished or broke.
Advanced settings
Behind the Advanced section you'll find four more controls:
- Cluster size — the compute the automation runs on. The default is Medium; sizes run Small → Extra Large.
- Job timeout — a ceiling so a runaway job doesn't burn compute forever. The default is 12 hours; options are 1 hour, 5 hours, 12 hours, 1 day, 2 days, 7 days, or None.
- Visibility — Private, or Visible to project.
- Tags — free-form labels for filtering and grouping.
Bigger clusters run faster but cost more per run. Medium handles most automations — size up only if the job legitimately needs more.
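The four advanced controls can be pictured as a small settings record with a fixed set of allowed values. The Python sketch below is illustrative only: `AdvancedSettings` is a hypothetical type, "Large" is an assumed intermediate cluster size (the text names only Small, Medium, and Extra Large), and the default visibility is an assumption because none is stated here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# "Large" is assumed; the text gives only the range Small to Extra Large.
CLUSTER_SIZES = ("Small", "Medium", "Large", "Extra Large")
# Timeout options from the text, expressed in hours; None means no timeout.
TIMEOUT_HOURS = (1, 5, 12, 24, 48, 168, None)
VISIBILITY = ("Private", "Visible to project")


@dataclass
class AdvancedSettings:
    cluster_size: str = "Medium"        # documented default
    timeout_hours: Optional[int] = 12   # documented default
    visibility: str = "Private"         # assumed default, not stated
    tags: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the documented option lists.
        if self.cluster_size not in CLUSTER_SIZES:
            raise ValueError(f"unknown cluster size: {self.cluster_size!r}")
        if self.timeout_hours not in TIMEOUT_HOURS:
            raise ValueError(f"unsupported timeout: {self.timeout_hours!r}")
        if self.visibility not in VISIBILITY:
            raise ValueError(f"unknown visibility: {self.visibility!r}")


defaults = AdvancedSettings()
print(defaults.cluster_size, defaults.timeout_hours)  # Medium 12
```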
Example: "Morning data refresh"
A typical automation, organized into three stages:
- Schedule: Daily at 06:00.
- Stage 1 (in parallel): Pipeline — salesforce_to_warehouse, and Pipeline — stripe_to_warehouse.
- Stage 2: SQL Script — refresh a rollup table.
- Stage 3: Semantic Model — redeploy the "Finance" model.
- Notifications: data-team@example.com.
It runs on a Medium cluster, times out after 12 hours, and emails the team on success or failure.
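Written out as data, the same example might look like the Python sketch below. The dictionary keys, the automation name, and the cron string are all hypothetical: Databasin's real configuration format is not shown here, and the schedule assumes standard five-field cron.

```python
# Hypothetical representation of the "Morning data refresh" example.
morning_refresh = {
    "name": "morning-refresh",      # made-up name that fits the naming rule
    "schedule": "0 6 * * *",        # daily at 06:00, assuming five-field cron
    "stages": [
        [
            {"type": "Pipeline", "target": "salesforce_to_warehouse"},
            {"type": "Pipeline", "target": "stripe_to_warehouse"},
        ],
        [{"type": "SQL Script", "target": "refresh a rollup table"}],
        [{"type": "Semantic Model", "target": "Finance"}],
    ],
    "notifications": ["data-team@example.com"],
    "cluster_size": "Medium",
    "timeout_hours": 12,
}


def summarize(automation: dict) -> list:
    """Return one human-readable line per stage, in execution order."""
    lines = []
    for number, stage in enumerate(automation["stages"], start=1):
        mode = "in parallel" if len(stage) > 1 else "single task"
        tasks = ", ".join(f'{t["type"]}: {t["target"]}' for t in stage)
        lines.append(f"Stage {number} ({mode}): {tasks}")
    return lines


for line in summarize(morning_refresh):
    print(line)
```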