Cron-style scheduled jobs with explicit dependency graphs. Jobs wait for the jobs they depend on. No more guessing whether the customers sync finished before the orders sync started. Replaces Airflow for the 80% of teams who never needed a separate orchestration platform.
The animation above isn't a marketing mock-up of a concept — it's how the scheduler actually works. Here's the real dependency graph rendered live in the Datavor Web UI, showing four sync jobs and the three dependencies between them.
Live at localhost:3000/scheduler — see the full dashboard on the Web UI page.
The naive approach: schedule each sync at a fixed time and hope they don't collide. It works until your data grows, a sync runs long, and suddenly your reports are built on yesterday's half-loaded tables.
You schedule each job at a guessed offset and pray the upstream one finished first.
One slow day — a big batch, a locked table, a network blip — and the 30-minute buffer isn't enough. Orders syncs against half-loaded customers. The BI refresh runs on incomplete data. No error fires. The numbers are just quietly wrong.
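To make the failure mode concrete, here is a minimal sketch of the fixed-offset approach. The start times, durations, and job names are illustrative assumptions, not Datavor output. It only checks whether each job starts before its upstream job has finished.

```python
# Hypothetical fixed-offset schedule: minutes after 02:00 at which each job starts.
# The 30-minute gaps are the "buffer" you guessed when writing the crontab.
FIXED_START = {"customers": 0, "orders": 30, "refresh_bi": 60}

# Each job and the single upstream job it silently relies on.
# Listed upstream-first so a bad upstream is seen before its dependents.
UPSTREAM = {"orders": "customers", "refresh_bi": "orders"}


def tainted_jobs(durations):
    """Return jobs that ran on incomplete data for the given durations (minutes)."""
    tainted = []
    for job, upstream in UPSTREAM.items():
        upstream_finish = FIXED_START[upstream] + durations[upstream]
        started_too_early = FIXED_START[job] < upstream_finish
        # A job is also tainted if the job it reads from was tainted.
        if started_too_early or upstream in tainted:
            tainted.append(job)
    return tainted


normal_night = tainted_jobs({"customers": 20, "orders": 15, "refresh_bi": 5})
slow_night = tainted_jobs({"customers": 90, "orders": 15, "refresh_bi": 5})

print(normal_night)  # []
print(slow_night)    # ['orders', 'refresh_bi']
```

Nothing in plain cron performs this check for you, which is why the bad night produces wrong numbers instead of an error.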
You declare what depends on what. Datavor runs each job the moment its dependencies finish, not a guessed minute later.
Independent jobs run in parallel. Dependent jobs wait. If customers takes 90 minutes one night, orders simply starts at minute 91 — correct, just later. The BI refresh never sees half-loaded data.
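The idea can be sketched with Python's standard `graphlib` module. This is an illustration of dependency-driven start times under assumed durations, not Datavor's internal implementation; the `DEPENDS_ON` mapping and `plan_start_times` helper are hypothetical names.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each job maps to the set of jobs it waits for.
DEPENDS_ON = {
    "customers": set(),
    "products": set(),
    "orders": {"customers", "products"},
    "refresh_bi": {"orders"},
}


def plan_start_times(durations, depends_on=DEPENDS_ON):
    """Return {job: (start, finish)} in minutes after the trigger time.

    A job starts as soon as its slowest dependency has finished;
    jobs with no dependencies start immediately, in parallel.
    """
    timeline = {}
    # static_order() yields every job after all of its dependencies.
    for job in TopologicalSorter(depends_on).static_order():
        start = max((timeline[dep][1] for dep in depends_on[job]), default=0)
        timeline[job] = (start, start + durations[job])
    return timeline


# A slow night: customers takes 90 minutes instead of the usual 20.
slow_night = plan_start_times(
    {"customers": 90, "products": 10, "orders": 20, "refresh_bi": 5}
)
for job, (start, finish) in slow_night.items():
    print(f"{job}: starts at +{start} min, finishes at +{finish} min")
```

With these assumed durations, orders starts only once customers finishes at minute 90, and the BI refresh follows orders. Everything runs later than usual, but nothing runs on partial data.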
You never write cron expressions unless you want to. Tell your AI when, and it translates. Under the hood it's standard cron and interval syntax — visible and editable if you prefer.
A dependency-aware scheduler's real test is what happens when something breaks. A failed upstream job shouldn't silently let downstream jobs run on bad data — but it also shouldn't bring down unrelated branches.
The orders sync errors out — say the source DB was briefly unreachable. Datavor records the failure, captures the error in the ErrorLearner, and marks the job failed.
Anything depending on orders — the refresh_bi job — does not run. It moves to blocked state. Independent branches (a separate inventory sync) keep running normally. No false-positive reports.
Datavor retries the failed job on a backoff schedule. On success, blocked downstream jobs automatically release and run. Or your AI surfaces it: get_suggestions → "orders failed 3×, here's the error — retry or investigate?"
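The fail, block, and release sequence above can be sketched as a small state machine. This is a simplified model under stated assumptions: the state names, the `run_pending` helper, and the exponential backoff delays are illustrative, and Datavor's actual retry schedule may differ.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: one dependent chain plus an independent inventory branch.
DEPENDS_ON = {
    "customers": set(),
    "orders": {"customers"},
    "refresh_bi": {"orders"},
    "inventory": set(),
}


def run_pending(depends_on, run_job, states=None):
    """Run every job that has not succeeded yet, in dependency order.

    run_job(name) returns True on success and False on failure.
    A job whose dependencies have not all succeeded is marked "blocked".
    """
    states = dict(states or {})
    for job in TopologicalSorter(depends_on).static_order():
        if states.get(job) == "succeeded":
            continue  # finished in an earlier pass, do not rerun
        if all(states.get(dep) == "succeeded" for dep in depends_on[job]):
            states[job] = "succeeded" if run_job(job) else "failed"
        else:
            states[job] = "blocked"
    return states


def backoff_delays(attempts, base_seconds=60):
    """Assumed exponential backoff: 60 s, 120 s, 240 s, ..."""
    return [base_seconds * 2**i for i in range(attempts)]


# First pass: the orders source DB is unreachable.
first_pass = run_pending(DEPENDS_ON, lambda job: job != "orders")

# Retry pass: the source is back, so only unfinished jobs run.
ran_on_retry = []


def run_and_record(job):
    ran_on_retry.append(job)
    return True


second_pass = run_pending(DEPENDS_ON, run_and_record, states=first_pass)

print(first_pass)
print(ran_on_retry)
print(backoff_delays(3))
```

In the first pass, orders is failed, refresh_bi is blocked, and inventory still succeeds. On the retry, only orders and then refresh_bi run; customers and inventory are not repeated.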
Want to be told without asking? External Alerting fires a job_failure event to Slack or any webhook the instant a scheduled job fails.
The scheduler is exposed as eight MCP tools. Create, control, and visualize jobs entirely through conversation. Full reference in the docs.
| Tool | Purpose |
|---|---|
| scheduler_create_job | Save a sync recipe as a scheduled job, using a cron expression or an interval. |
| scheduler_list_jobs | List all jobs with schedule, last run, last result, run count, dependencies. |
| scheduler_add_dependency | Make one job wait for another to finish before running. Builds the DAG. |
| scheduler_show_graph | Render the full dependency graph: which jobs wait for which, and the execution order. |
| scheduler_run_job | Run a job once, immediately, outside its normal schedule. |
| scheduler_pause_job | Pause a job; the daemon skips it without deleting it. |
| scheduler_resume_job | Resume a previously paused job. |
| scheduler_delete_job | Delete a job permanently. Dependents are flagged, not silently orphaned. |
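
For a sense of what your AI client sends on your behalf, here is a rough sketch of the requests behind the nightly pipeline. The `tools/call` envelope follows the MCP JSON-RPC convention and the tool names come from the table above, but every argument name (`name`, `cron`, `job`, `depends_on`) is an assumption made for illustration. Check the docs for the real parameters.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(tool, **arguments):
    """Build an MCP `tools/call` JSON-RPC request for the given tool name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Argument names below are hypothetical; only the tool names are from the table.
calls = [
    tool_call("scheduler_create_job", name="customers", cron="0 2 * * *"),
    tool_call("scheduler_create_job", name="products", cron="0 2 * * *"),
    tool_call("scheduler_create_job", name="orders", cron="0 2 * * *"),
    tool_call("scheduler_create_job", name="refresh_bi", cron="0 2 * * *"),
    tool_call("scheduler_add_dependency", job="orders", depends_on="customers"),
    tool_call("scheduler_add_dependency", job="orders", depends_on="products"),
    tool_call("scheduler_add_dependency", job="refresh_bi", depends_on="orders"),
    tool_call("scheduler_show_graph"),
]

print(json.dumps(calls[4], indent=2))
```

The cron expression `0 2 * * *` is standard five-field syntax for 02:00 every day. In practice you would not write these requests by hand; the conversation generates them.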
Airflow and Dagster are powerful — and heavy. A separate service, a database, a scheduler process, a web server, DAGs written in Python. For full-blown data engineering, worth it. For "run these four syncs in order each night," wildly overkill.
| Capability | Datavor | Airflow | Dagster | plain cron |
|---|---|---|---|---|
| Dependency-aware (DAGs) | ✓ | ✓ | ✓ | — |
| Set up via natural language | ✓ | — | — | — |
| Zero extra infrastructure | ✓ | — | — | ✓ |
| No Python DAG files to write | ✓ | — | — | ✓ |
| Built-in sync / CDC / transform | ✓ | plugins | plugins | — |
| Visual dependency graph | ✓ | ✓ | ✓ | — |
| Runs locally, no account | ✓ | self-host | self-host | ✓ |
| Learns from past failures | ✓ | — | — | — |
Datavor's scheduler isn't trying to replace Airflow for a 200-DAG data platform. It's for the team that's been gluing together cron jobs and praying — and deserves dependency-awareness without standing up a whole orchestration stack.
"Sync customers and products at 2am, then orders after both, then refresh BI." Datavor builds the DAG, runs it in order, and shows you the graph.