Build your first pipeline
Move data from source to destination on a schedule.
A pipeline moves data from one place to another. You pick a source connector and a target connector, choose which tables or datasets to sync, set how each one is loaded, and (optionally) put it on a schedule. Databasin handles the reading, the batching, the retries, and the monitoring.
This guide takes you through your first one, end to end.
Before you start
You need two connectors already saved — one for the source and one for the target. If you only have one so far, pause and build another. Common pairings:
- Prod Postgres → Lakehouse (classic analytics sync)
- Salesforce → Lakehouse (SaaS into your warehouse)
- S3 CSV drops → Lakehouse (ingesting vendor files)
Step 1 — Start from "Create integration"
Pipelines live in the Integrations hub. Open Integrations and click Create integration. A short chooser opens with three options:
- Pipeline — the full source-to-target flow.
- Automation — a multi-step workflow (a different tool; see Pipelines vs. Automations).
- Data sync — a guided, streamlined version of the pipeline flow.
Pick Pipeline (or Data sync if you'd rather be walked through it in guided mode).
Step 2 — Pick source and target together
On the first screen you choose both the source and the target connector, plus a name for the pipeline. (In older versions these were separate steps — now they're set side by side before you continue.)
Step 3 — Choose what to sync
Databasin introspects the source and walks you through it:
- Catalog / schema — narrow to the part of the source you care about.
- Artifacts — the tables, objects, endpoints, or files to bring across. Toggle each one on or off.
- Columns — keep all of them (the default) or trim to what you need.
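The three levels of selection above can be pictured as a small nested structure. The sketch below is purely illustrative: `SyncSelection`, `ArtifactSelection`, and `project` are hypothetical names made up for this example, not a Databasin API, and the wizard itself is driven from the UI.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArtifactSelection:
    """One table, object, endpoint, or file (hypothetical model)."""
    name: str
    enabled: bool = True                  # the on/off toggle
    columns: Optional[List[str]] = None   # None means "keep all columns"


@dataclass
class SyncSelection:
    """What to sync from one catalog/schema (hypothetical model)."""
    catalog: str
    schema: str
    artifacts: List[ArtifactSelection] = field(default_factory=list)

    def enabled_artifacts(self) -> List[ArtifactSelection]:
        # Only artifacts that are toggled on are brought across.
        return [a for a in self.artifacts if a.enabled]


def project(row: dict, artifact: ArtifactSelection) -> dict:
    """Trim a source row to the selected columns, or keep it whole."""
    if artifact.columns is None:
        return dict(row)
    return {col: row[col] for col in artifact.columns}


selection = SyncSelection(
    catalog="analytics",
    schema="public",
    artifacts=[
        ArtifactSelection("orders", columns=["id", "total"]),
        ArtifactSelection("audit_log", enabled=False),
    ],
)
trimmed = project({"id": 1, "total": 9.5, "note": "gift"}, selection.artifacts[0])
```

Here `orders` is synced with two columns, and `audit_log` is skipped entirely because it is toggled off.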
Step 4 — Set the ingestion mode
For each artifact, you choose how it's loaded. There are five modes:
| Mode | What it does |
|---|---|
| Snapshot | Full refresh — replaces the target table on every run. Best for small reference tables. |
| Delta | Incremental load that picks up new and changed rows each run using a watermark column, then merges them into the target on the keys you configured. |
| Historical | Append-only — each run adds new rows; existing rows are never touched. |
| CDC | Change data capture — reads inserts, updates, and deletes from the source's log. |
| Stored Procedure | Calls a stored procedure on the source to produce the extract. |
Snapshot and Delta cover most jobs. For the full picture of when to reach for each, see Ingestion modes.
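To make the difference between Snapshot and Delta concrete, here is a minimal sketch of the two read patterns against an in-memory table. It assumes a timestamp column named `updated_at` as the watermark; that column name, the sample rows, and both function names are invented for illustration and say nothing about how Databasin implements extraction internally.

```python
from datetime import datetime

# Hypothetical source table; updated_at plays the role of the watermark column.
source_rows = [
    {"id": 1, "name": "Ada", "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "name": "Grace", "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "name": "Edsger", "updated_at": datetime(2024, 1, 9)},
]


def snapshot_extract(rows):
    """Snapshot: read every row; the target is replaced wholesale."""
    return [dict(r) for r in rows]


def delta_extract(rows, last_watermark):
    """Delta: read only rows newer than the previous run's watermark.

    Returns the changed rows plus the watermark to store for the next run.
    """
    changed = [dict(r) for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max(
        (r["updated_at"] for r in changed), default=last_watermark
    )
    return changed, new_watermark


full = snapshot_extract(source_rows)
changed, watermark = delta_extract(source_rows, datetime(2024, 1, 4))
# A second Delta run with the stored watermark finds nothing new to read.
changed_again, _ = delta_extract(source_rows, watermark)
```

The trade-off shows up in the row counts: Snapshot reads all three rows every time, while Delta reads only the two that changed after the stored watermark and nothing at all on an immediate second run.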
The pipeline wizard lands your data as-is: it has no rename, cast, filter, or join stage. Reshaping, modeling, or joining data is a job for Automations. Databasin even points you there: "Use Databasin Automations to model and transform data."
Step 5 — Final config and schedule
The last screen confirms the configuration and lets you decide when the pipeline runs, via the cron scheduler:
- Manual only — runs when you click Run now or an automation triggers it.
- Presets — common cadences like hourly, daily, or weekly, one click each.
- Custom cron — a cron expression when the presets don't fit.
See Scheduling and triggers for the details. Save, and your pipeline is live.
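If you haven't written cron expressions before, the sketch below shows what hourly, daily, and weekly cadences look like in the common five-field syntax (minute, hour, day of month, month, day of week), along with a loose format check. Two assumptions to flag: the doc doesn't say which cron dialect Databasin accepts or which expressions sit behind its presets, so the `PRESETS` mapping and `is_valid_cron` helper are illustrative only.

```python
# Assumed preset expressions in standard five-field cron syntax.
PRESETS = {
    "hourly": "0 * * * *",   # minute 0 of every hour
    "daily": "0 0 * * *",    # midnight every day
    "weekly": "0 0 * * 0",   # midnight every Sunday
}

# Allowed numeric range per field: minute, hour, day of month, month, weekday.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]


def is_valid_cron(expr: str) -> bool:
    """Loose check: supports *, numbers, ranges, lists, and /steps.

    Names such as MON or JAN are not handled in this sketch.
    """
    fields = expr.split()
    if len(fields) != 5:
        return False
    for cron_field, (lo, hi) in zip(fields, FIELD_RANGES):
        for part in cron_field.split(","):
            base, slash, step = part.partition("/")
            if slash and not (step.isdecimal() and int(step) > 0):
                return False
            if base == "*":
                continue
            start, dash, end = base.partition("-")
            bounds = [start, end] if dash else [start]
            if not all(b.isdecimal() and lo <= int(b) <= hi for b in bounds):
                return False
            if dash and int(start) > int(end):
                return False
    return True


every_15_minutes = "*/15 * * * *"
weekdays_at_six = "0 6 * * 1-5"
```

The last two expressions are the kind of cadence a custom cron entry is for: every 15 minutes, and 06:00 on weekdays only.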
Step 6 — Run it and watch
Open the pipeline and click Run now. The detail page shows run status, run history, and per-run details; if a run fails you'll get a notification on the bell in the top bar. Monitoring and alerts covers what to watch.
Re-running is safe. A Delta pipeline merges on the keys you configured, so re-running a completed run won't create duplicates. A Snapshot pipeline replaces the whole table on each run, so it can't accumulate duplicates either.
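The reason a key-based merge is safe to re-run can be shown in a few lines. This is a minimal in-memory sketch of an upsert and a full replace; `merge_on_keys` and `snapshot_replace` are hypothetical helpers written for this example, not Databasin functions, and a real target would do the same work with its own merge logic.

```python
def merge_on_keys(target, batch, keys):
    """Upsert: update rows whose key already exists, insert the rest."""
    index = {tuple(row[k] for k in keys): i for i, row in enumerate(target)}
    for row in batch:
        key = tuple(row[k] for k in keys)
        if key in index:
            target[index[key]] = dict(row)   # existing key: overwrite in place
        else:
            index[key] = len(target)         # new key: append once
            target.append(dict(row))
    return target


def snapshot_replace(batch):
    """Full refresh: the new target is exactly the batch."""
    return [dict(row) for row in batch]


batch = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

# Delta-style load: the first run updates id 1 and inserts id 2.
target = [{"id": 1, "name": "A. Lovelace"}]
merge_on_keys(target, batch, keys=["id"])
after_first_run = [dict(row) for row in target]

# Re-running the same batch changes nothing.
merge_on_keys(target, batch, keys=["id"])

# Snapshot-style load: two runs of the same batch give the same table.
snapshot_twice = snapshot_replace(snapshot_replace(batch))
```

Because each incoming row either overwrites the row with the same key or is inserted once, applying the same batch a second time leaves the target unchanged.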