Documentation — Datavor

Quick Start

Three steps to your first sync. Total time: 60 seconds.

1. Install

Nothing to install manually. Datavor runs through npx, which fetches the package on first use, so MCP clients launch it on demand.

Terminal (bash)

# Optional — pre-warm the cache so the first call is instant
npx -y datavor@latest --version

Requires Node.js 20+. Works on macOS, Linux, and Windows.

2. Add Datavor to your AI tool

The fastest path is Claude Desktop — open your config file and add the datavor server. See the AI Integrations page for Cursor, Cline, opencode, Gemini, and Hermes.

~/Library/Application Support/Claude/claude_desktop_config.json (json)

{
  "mcpServers": {
    "datavor": {
      "command": "npx",
      "args": ["-y", "datavor"]
    }
  }
}

Restart Claude Desktop. You should see 47 new tools in the tool list.
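If the config file already lists other MCP servers, editing it by hand risks overwriting them. The Python sketch below shows one way to merge the entry above into an existing config. The helper name add_datavor_server is hypothetical and not part of Datavor.

```python
import copy
import json

def add_datavor_server(config):
    """Return a copy of a Claude Desktop config with the datavor entry added."""
    merged = copy.deepcopy(config)
    # Keep any servers that are already configured.
    servers = merged.setdefault("mcpServers", {})
    servers["datavor"] = {"command": "npx", "args": ["-y", "datavor"]}
    return merged

# Illustrative input: a config that already has one (made-up) server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_datavor_server(existing), indent=2))
```

In practice you would load the real file with json.load, pass the result through the helper, and write it back with json.dump.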

3. Talk to it

Ask Claude to connect a database in plain English:

→ Try this

"Connect to my Postgres at localhost:5432 with user postgres, database postgres, then list the tables."

Claude will call connect_postgres and list_tables in sequence. You're done.

Configuration

Datavor resolves each setting from three sources, in order of precedence: environment variables first, then the config file at ~/.datavor/config.json, then built-in defaults.

Environment variables

Variable	Default	Purpose
`DATAVOR_HOME`	`~/.datavor`	Where Datavor stores its Context DB, license, and logs.
`DATAVOR_LOG_LEVEL`	`info`	One of `debug`, `info`, `warn`, `error`.
`DATAVOR_LICENSE`	—	Pro activation code. Same as running `datavor activate <CODE>`.
`DATAVOR_WEBUI_PORT`	`3000`	Port the Web UI binds to. Change if 3000 is taken.
`DATAVOR_HTTP_PORT`	`4747`	Port for the HTTP MCP transport when using `datavor serve`.
`DATAVOR_TELEMETRY`	`off` (Free), `on` (Pro)	License-validation heartbeat for Pro. See Privacy.
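The lookup order (environment, then config file, then defaults) can be sketched in Python as follows. This is an illustration, not Datavor's implementation. It assumes that the config file is a flat JSON object keyed by the same names as the environment variables, which this page does not confirm.

```python
import json
import os
from pathlib import Path

# Defaults copied from the table above (variables without a fixed default are omitted).
DEFAULTS = {
    "DATAVOR_HOME": "~/.datavor",
    "DATAVOR_LOG_LEVEL": "info",
    "DATAVOR_WEBUI_PORT": "3000",
    "DATAVOR_HTTP_PORT": "4747",
}

def resolve_setting(name, env=None, config_path="~/.datavor/config.json"):
    """Return the first value found: environment, then config file, then default."""
    env = os.environ if env is None else env
    if name in env:
        return env[name]
    path = Path(config_path).expanduser()
    if path.is_file():
        # Assumption: the file is a flat JSON object using the variable names as keys.
        file_values = json.loads(path.read_text())
        if name in file_values:
            return file_values[name]
    return DEFAULTS.get(name)
```

For example, with DATAVOR_HTTP_PORT=8080 in the environment the function returns "8080" regardless of what the file says.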

CLI flags

Command	Purpose
`datavor`	Start the MCP server on stdio (the default; MCP clients invoke this).
`datavor serve`	Start the HTTP MCP transport on `:4747`. See HTTP Transport.
`datavor webui`	Open the Web UI at `localhost:3000` in your browser.
`datavor activate <CODE>`	Activate Pro license with a code from `register_pro`.
`datavor version`	Print current version. Equivalent to `npx datavor@latest --version`.

The 47 MCP Tools

Every Datavor capability is exposed as an MCP tool. Grouped here by purpose. AI agents calling these through API keys are billed per tool tier — see Agent Pricing.

Connections (7)

Open and close database connections. One connect_* tool per supported engine, all returning a connection ID that subsequent calls reference.

Tool	What it does
`connect_mysql`	MySQL 5.7+, MariaDB. Standard connection params.
`connect_postgres`	PostgreSQL 12+. Also works with Supabase, Neon, RDS.
`connect_sqlserver`	SQL Server 2017+. Azure SQL supported.
`connect_sqlite`	Local SQLite file. Just pass the path.
`connect_snowflake`	Snowflake. Account, warehouse, and role required.
`list_connections`	Show all active connections with their IDs.
`close_connection`	Close one connection by ID.

Example — connect_postgres

tool input (json)

{
  "host": "db.acme.com",
  "port": 5432,
  "database": "production",
  "user": "datavor",
  "password": "********",
  "name": "prod-pg"
}

↪ Returns

connection_id — pass this to every subsequent call that operates on this database (e.g. list_tables, execute_query, sync_table). Connections are kept open until close_connection is called or Datavor restarts.
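To avoid typing credentials into a chat, the tool input can be assembled from environment variables. The sketch below is one way to do that in Python. The PG* names are the usual libpq conventions, not something Datavor is documented to read, and postgres_input_from_env is a hypothetical helper.

```python
import os

def postgres_input_from_env(name, env=None):
    """Build a connect_postgres input dict from libpq-style environment variables."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),
        # These three have no sensible default, so a missing one raises KeyError.
        "database": env["PGDATABASE"],
        "user": env["PGUSER"],
        "password": env["PGPASSWORD"],
        "name": name,
    }
```

The returned dict has the same fields as the example input above.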

Schema & Discovery (7)

Inspect what's in a database. These are read-only metadata tools — they don't read row data, just structure.

Tool	What it does
`list_tables`	All tables in a database with row counts.
`describe_table`	Columns, types, nullable, primary key, foreign keys.
`compare_table_schemas`	Side-by-side column comparison between two databases.
`show_database_tree`	Hierarchical view of tables and columns.
`analyze_schema_diff`	All differences between two schemas (missing tables, type changes, etc.).
`recommend_sync_order`	AI-suggested table sync priority based on size, FKs, change frequency.
`explain_database`	Natural-language summary of what a database contains.

Sync (6)

Move data between connected databases. Sync calls are the only tools that actually transfer rows.

Tool	What it does
`execute_query`	Run arbitrary SQL on a connected database.
`get_table_data`	Fetch rows from a table with optional filtering.
`sync_table`	Full sync between source and target table.
`sync_table_partial`	Sync rows matching a WHERE clause.
`sync_table_incremental`	Sync only new/updated rows using a timestamp column.
`sync_table_with_transforms`	Apply column-level transforms while syncing.
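The idea behind timestamp-based incremental sync can be sketched as a watermark: remember the newest timestamp already synced and pick up only rows newer than it. The Python below illustrates that general technique with hypothetical helper names. It is not a description of how sync_table_incremental is implemented.

```python
from datetime import datetime

def rows_to_sync(rows, timestamp_column, watermark):
    """Select rows whose timestamp is strictly newer than the watermark."""
    return [row for row in rows if row[timestamp_column] > watermark]

def advance_watermark(synced_rows, timestamp_column, watermark):
    """Move the watermark to the newest synced timestamp, or keep it if nothing synced."""
    return max((row[timestamp_column] for row in synced_rows), default=watermark)

rows = [
    {"id": 1, "updated_at": datetime(2026, 5, 18, 9, 0)},
    {"id": 2, "updated_at": datetime(2026, 5, 19, 9, 0)},
]
watermark = datetime(2026, 5, 18, 12, 0)
pending = rows_to_sync(rows, "updated_at", watermark)
watermark = advance_watermark(pending, "updated_at", watermark)
```

A strict comparison can miss rows that share the watermark's exact timestamp. Using >= with an idempotent write on the target is a common alternative.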

Per-record fault tolerance

Sync errors don't abort the run. A failing row is logged to the Context Engine with its row identifier and reason, and the sync continues. After completion, you get a single report: {success: 12847, failed: 3, error_breakdown: {...}}.
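The pattern can be sketched as a loop that catches errors per row and tallies them. The Python below is a minimal illustration. The report keys mirror the ones shown above, while sync_rows, demo_writer, and the "failures" list are assumptions made for the example.

```python
def sync_rows(rows, write_row, key="id"):
    """Write each row, recording failures instead of aborting the run."""
    report = {"success": 0, "failed": 0, "error_breakdown": {}, "failures": []}
    for row in rows:
        try:
            write_row(row)
            report["success"] += 1
        except Exception as exc:
            reason = type(exc).__name__
            report["failed"] += 1
            report["error_breakdown"][reason] = report["error_breakdown"].get(reason, 0) + 1
            # Keep the row identifier and the reason for later inspection.
            report["failures"].append({"row": row.get(key), "reason": str(exc)})
    return report

def demo_writer(row):
    # Stand-in for a real target write; rejects negative amounts.
    if row["amount"] < 0:
        raise ValueError("amount must be non-negative")

report = sync_rows(
    [{"id": 1, "amount": 5}, {"id": 2, "amount": -1}, {"id": 3, "amount": 7}],
    demo_writer,
)
```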

Change Data Capture (3)

Real-time replication driven by Postgres WAL or MySQL binlog. CDC tools start a persistent stream — the connection stays open and emits changes as they happen at the source.

Tool	What it does
`start_cdc`	Begin a CDC stream from source tables to a target.
`stop_cdc`	End a running CDC stream cleanly, flushing pending events first.
`cdc_status`	Inspect a stream: events processed, current lag, last event time.

CDC requires source configuration

Postgres needs wal_level = logical and a replication slot. MySQL needs binlog_format = ROW and the REPLICATION SLAVE privilege. start_cdc returns a clear error if the required source configuration is missing.
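A pre-flight check for the two server settings named above might look like the following Python sketch. It assumes you have already read the values yourself (for example with SHOW wal_level on Postgres or SHOW VARIABLES LIKE 'binlog_format' on MySQL). It does not check the replication slot or grants, and cdc_setting_problems is a hypothetical helper.

```python
# Required values taken from the paragraph above.
REQUIRED = {
    "postgres": {"wal_level": "logical"},
    "mysql": {"binlog_format": "ROW"},
}

def cdc_setting_problems(engine, settings):
    """Return a list of human-readable problems; an empty list means the settings look fine."""
    problems = []
    for name, expected in REQUIRED[engine].items():
        actual = settings.get(name)
        # Compare case-insensitively, since servers may report values in either case.
        if actual is None or actual.lower() != expected.lower():
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    return problems
```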

Scheduler (8)

Save sync configurations as recurring jobs with optional dependencies. Jobs run on a cron schedule and can wait for upstream jobs to finish first.

Tool	What it does
`scheduler_create_job`	Save a sync recipe as a scheduled job (cron or interval).
`scheduler_list_jobs`	List all jobs with schedule, last run, run count.
`scheduler_pause_job`	Pause a job without deleting it.
`scheduler_resume_job`	Resume a paused job.
`scheduler_run_job`	Run a job once, immediately, outside its schedule.
`scheduler_delete_job`	Remove a job permanently.
`scheduler_add_dependency`	Make one job wait for another to finish (DAGs).
`scheduler_show_graph`	Render dependency graph showing which jobs wait for which.
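Job dependencies form a directed acyclic graph, and a valid run order is a topological sort of it. The Python sketch below shows the idea with the standard library's graphlib. The job names are made up, and this is not a description of Datavor's scheduler internals.

```python
from graphlib import CycleError, TopologicalSorter

def run_order(dependencies):
    """dependencies maps each job to the set of jobs it waits for."""
    return list(TopologicalSorter(dependencies).static_order())

def has_cycle(dependencies):
    """True if the dependencies can never be satisfied."""
    try:
        run_order(dependencies)
    except CycleError:
        return True
    return False

jobs = {
    "load_customers": set(),
    "load_orders": {"load_customers"},
    "build_report": {"load_orders"},
}
order = run_order(jobs)
```

Here order lists load_customers first and build_report last, because each job appears only after everything it waits for.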

Context Engine (11)

The persistent local SQLite knowledge store that makes Datavor smart over time. Rules, recipes, suggestions, and learned errors all live here.

Tool	What it does
`get_context`	Everything Datavor knows: databases, rules, relationships, recipes.
`add_rule`	Save a business rule ("never sync rows with status=test").
`update_rule`	Modify an existing rule.
`remove_rule`	Delete a rule.
`save_recipe`	Save a transform configuration as a reusable named recipe.
`apply_recipe`	Use a saved recipe in a new sync.
`list_recipes`	List all saved recipes with their tags.
`get_suggestions`	Get pending suggestions from the Suggestion Engine.
`accept_suggestion`	Apply a suggestion automatically.
`dismiss_suggestion`	Reject a suggestion (it won't be re-surfaced).
`transform_preview`	See what a transform will produce before running it.

Dashboard (3)

Aggregate views for the Web UI and for AI-driven status checks.

Tool	What it does
`dashboard_summary`	Overview: success rate, 7-day chart, totals, active CDC.
`dashboard_table_history`	Run log for a specific table with timing and row counts.
`dashboard_failures`	All failed runs with error messages.

Pro Licensing (2)

New in v3.1. Lets users register for and activate a Pro license from inside their AI tool — no separate website signup. See Pro Licensing for details.

Tool	What it does
`register_pro`	Email-based Pro signup. Returns an activation code by email.
`activate_pro`	Apply an activation code to unlock Pro features.

Web UI v3.0+

A local React dashboard at localhost:3000 that shows everything Datavor knows. Read-only — Claude still drives writes via MCP.

Start it with:

Terminal (bash)

datavor webui

Or visit http://localhost:3000 directly if Datavor is already running.

What's inside

Section	What you'll see
`/` Dashboard	7-day sync activity chart, success rate, connection count, active CDC streams, pending suggestions. Auto-refreshes every 30s.
`/connections`	Card grid of every connected database with host, tables, last connected.
`/context-graph`	Visual graph of tables and relationships across all connected databases.
`/cdc`	Live monitor for CDC streams — events processed, lag, last event. Refreshes every 5 s.
`/scheduler`	All scheduled jobs with status, schedule, last result, dependency graph.
`/settings`	License status, telemetry preferences, port configuration.

Read-only by design

The Web UI never writes to your databases. It reads from the same Context Engine that Claude uses, so what you see is what Claude sees. To change anything, talk to Claude.

HTTP Transport v3.0+

Run Datavor as a long-lived HTTP MCP server instead of stdio. Useful for remote MCP clients, containerized deploys, or sharing one Datavor instance across multiple machines.

Terminal (bash)

# Start the HTTP transport on :4747 (default)
datavor serve

# Or specify a port
DATAVOR_HTTP_PORT=8080 datavor serve

The HTTP transport uses Streamable HTTP MCP with bearer-token auth, session management, and SSE for tool results. Connect by pointing your MCP client at http://<host>:4747/mcp.

Claude.ai web doesn't accept localhost URLs

Claude.ai's web connectors reject http://localhost. If you want the HTTP transport with Claude.ai, run Datavor on a publicly reachable host with HTTPS, or use a tunnel like Cloudflare Tunnel. For local use, stdio remains the simpler path.

Pro Licensing v3.1+

Pro adds a commercial-use license and priority support. Every feature in Datavor is available on Free — Pro is licensing, not feature gating. See Pricing.

register_pro

Initiates Pro signup from inside your AI tool. The tool returns no visible result; the activation code is sent to the email address you provide.

tool input (json)

{
  "email": "dev@acme.com",
  "plan": "annual"
}

The plan field accepts "annual" or "monthly".

You'll be redirected to Stripe checkout. After payment, the activation code arrives by email within seconds.

activate_pro

Apply the activation code to unlock Pro features on this machine and any other you install Datavor on with the same code.

tool input (json)

{
  "code": "dvpro-XXXXXX-XXXXXX-XXXXXX"
}

Equivalent to the CLI: datavor activate dvpro-XXXXXX-XXXXXX-XXXXXX.

Per-developer, not per-machine

Your activation code is tied to you, not to a specific machine. Install Datavor on your laptop, home server, and prod box — one code unlocks them all. License terms restrict use to a single human developer; agent workloads need Agent API keys.

Agent Billing API v3.2

For autonomous AI agents calling Datavor through API keys. Metered per tool call, billed in USDC or Stripe. See Agent Pricing for tier rates and the cost model.

Authentication

Every request needs an Authorization: Bearer dvpro_agent_* header. Get an API key at datavor.ai/agent.

Example request (bash)

curl -X POST https://datavor.ai/mcp/call \
  -H "Authorization: Bearer dvpro_agent_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "sync_table",
    "input": {
      "source": "prod-mysql",
      "target": "analytics-pg",
      "table": "orders"
    }
  }'

Response format

Every tool response includes a billing block alongside the tool result.

Response (json)

{
  "result": { "rows_synced": 12847 },
  "billing": {
    "tier": "heavy",
    "charged_usd": 0.05,
    "balance_usd": 9.32
  }
}
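An agent can use the billing block to estimate how many more calls of the same cost the balance covers. The Python sketch below assumes only the field names shown in the response above. calls_left_at_this_rate is a hypothetical helper.

```python
import json
from decimal import Decimal

def calls_left_at_this_rate(response_text):
    """Estimate how many more calls of the same cost the current balance covers."""
    billing = json.loads(response_text)["billing"]
    # Go through str() so the float values become exact decimal amounts.
    charged = Decimal(str(billing["charged_usd"]))
    balance = Decimal(str(billing["balance_usd"]))
    if charged <= 0:
        return None  # Free call: no meaningful estimate.
    return int(balance // charged)

sample = (
    '{"result": {"rows_synced": 12847}, '
    '"billing": {"tier": "heavy", "charged_usd": 0.05, "balance_usd": 9.32}}'
)
remaining = calls_left_at_this_rate(sample)
```

With the sample response, remaining is 186.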

Billing endpoints

Endpoint	Purpose
`GET /api/billing`	Current balance, recent usage, tier breakdown.
`POST /api/billing/topup`	Add credits via Stripe or USDC. Returns checkout/wallet URL.
`GET /api/billing/limits`	Get current spend caps.
`POST /api/billing/limits`	Set a monthly spend cap. Once the cap is reached, calls return 402.
`GET /api/billing/keys`	List all API keys on your account.
`POST /api/billing/keys`	Create a new API key. Optionally isolate balance per key.

Error handling

Code	Meaning
`401`	Missing or invalid API key.
`402`	Insufficient balance, or monthly spend cap reached. Top up or enable auto-recharge.
`429`	Per-key rate limit exceeded (default 100 calls/sec; contact support to raise it).
`503`	Downstream database unavailable. Not charged.
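A client might map these codes to actions roughly as in the Python sketch below. The action names and the exponential backoff parameters are assumptions chosen for illustration; this page does not prescribe a retry policy.

```python
def next_action(status, attempt, base_delay=0.5, max_delay=30.0):
    """Return (action, delay_seconds) for an HTTP status and a zero-based attempt number."""
    if 200 <= status < 300:
        return ("ok", 0.0)
    if status == 402:
        return ("top_up", 0.0)  # Retrying cannot succeed until credits are added.
    if status in (429, 503):
        # Transient: back off exponentially, capped at max_delay.
        return ("retry", min(max_delay, base_delay * 2 ** attempt))
    # 401 and anything unexpected: retrying will not help.
    return ("fail", 0.0)
```

For instance, next_action(429, 3) yields ("retry", 4.0), while next_action(401, 0) yields ("fail", 0.0).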

Agent traffic requires Agent API keys

Pro licenses are for human developers and include fair-use limits. Sustained machine-pattern traffic on a Pro license will be flagged and the account suspended. See the FAQ on the pricing page for the why.

External Alerting v3.0+

Push pipeline events to Slack or any webhook endpoint. Configure on the Web UI Settings page; delivery is live the moment you save.

Events

Six event types fire alerts. Each carries the relevant context — job name, connection, error detail — in its payload.

Event	Fires when
`sync_failure`	A sync job fails (after per-record fault tolerance — i.e. the whole job, not a single quarantined row).
`cdc_error`	A running CDC stream hits an apply error or source problem.
`cdc_stopped`	A CDC stream stops — cleanly via `stop_cdc`, or because it was interrupted.
`schema_change`	A source schema change is detected (new column, type change, dropped table).
`suggestion_new`	The Suggestion Engine surfaces a new suggestion.
`job_failure`	A scheduled job fails — distinct from `sync_failure` in that it covers any job type, including query and transform jobs.

Slack delivery

Paste a Slack incoming-webhook URL and Datavor formats events as Slack Block Kit messages — structured, readable, with the event type, affected resource, and detail laid out as blocks rather than a wall of JSON.

Burst protection

Slack rate-limits incoming webhooks to roughly 1 message per second. Datavor applies per-URL serial throttling — events to the same Slack URL are queued and delivered in series, so a burst of schema changes or sync failures won't trip Slack's limit and drop your alerts.
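Per-URL serial throttling can be sketched as tracking, for each URL, the earliest time the next message may go out. The Python below illustrates the technique with a hypothetical class. The one-second interval mirrors the Slack limit mentioned above and is not a documented Datavor setting.

```python
from collections import defaultdict

class PerUrlThrottle:
    """Space out deliveries to the same URL by at least min_interval seconds."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._next_free = defaultdict(float)  # url -> earliest allowed send time

    def schedule(self, url, now):
        """Return the time at which an event for this URL may be sent."""
        send_at = max(now, self._next_free[url])
        self._next_free[url] = send_at + self.min_interval
        return send_at

throttle = PerUrlThrottle()
burst = [throttle.schedule("https://hooks.example/a", 10.0) for _ in range(3)]
other = throttle.schedule("https://hooks.example/b", 10.0)
```

The three burst events to the same URL are spaced one second apart, while the event to a different URL is not delayed.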

Generic webhooks

Any non-Slack URL receives a stable JSON envelope — the same shape for every event type, so you can route it to PagerDuty, a custom handler, or your own logging pipeline without special-casing each event.

Webhook payload envelope (json)

{
  "event": "sync_failure",
  "timestamp": "2026-05-19T02:14:08Z",
  "datavor_version": "3.1.0",
  "data": {
    // event-specific fields: job name,
    // connection, error detail, etc.
  }
}
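Because the envelope shape is the same for every event, a receiver can route on the event field alone. The Python sketch below shows one possible handler. The split into "page" and "log" destinations is an arbitrary example, and route_event is a hypothetical name.

```python
import json

# The six event types listed in the Events table.
KNOWN_EVENTS = {
    "sync_failure", "cdc_error", "cdc_stopped",
    "schema_change", "suggestion_new", "job_failure",
}
# Illustrative choice of which events should page someone.
PAGE_EVENTS = {"sync_failure", "cdc_error", "job_failure"}

def route_event(raw_body):
    """Decide what to do with one webhook delivery: 'page', 'log', or 'ignore'."""
    envelope = json.loads(raw_body)
    event = envelope.get("event")
    if event not in KNOWN_EVENTS:
        return "ignore"
    return "page" if event in PAGE_EVENTS else "log"
```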

Configure in the Web UI

Open Settings in the Web UI, toggle Enable alerting, and add one or more webhook URLs. Saved webhooks fire immediately — no restart needed.

Telemetry & Privacy

Datavor handles your data. We take that seriously and try to be transparent about exactly what leaves your machine.

Free tier — 100% local

The Free tier sends nothing to Datavor servers. No telemetry, no analytics, no usage pings. The Context Engine SQLite database lives at ~/.datavor/context.db and you can inspect, copy, or delete it any time.

Verify it yourself

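Free Datavor makes zero outbound HTTPS calls. Verify with lsof -a -i -p "$(pgrep -d, -f datavor)" or your firewall of choice; you'll only see connections to your own databases. The -a flag makes lsof combine the two filters, so only the Datavor process's network connections are listed.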

Pro tier — daily license-validation ping

Pro installations send one HTTPS request per day to https://datavor.ai/api/license/heartbeat. The payload contains:

Heartbeat payload, the entire thing (json)

{
  "activation_id": "dvpro-XXXXXX",
  "day": "2026-05-19",
  "fingerprint_hash": "sha256:a1b2...",
  "tool_calls": {
    "light": 47,
    "standard": 12,
    "heavy": 3
  }
}

That's the entire payload. Specifically not included:

Database hostnames, names, schemas, or table names
Query content (SQL statements, parameter values, results)
Row data of any kind
Connection credentials
Tool inputs or outputs beyond the per-tier count
Personally identifiable information beyond the activation_id and a machine fingerprint hash
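If you capture the heartbeat yourself (for example through a proxy you control, which is an assumption about your setup), the Python sketch below shows one way to confirm a payload carries only the documented fields. unexpected_fields is a hypothetical helper.

```python
# Field names taken from the heartbeat payload shown above.
ALLOWED_KEYS = {"activation_id", "day", "fingerprint_hash", "tool_calls"}
ALLOWED_TIERS = {"light", "standard", "heavy"}

def unexpected_fields(payload):
    """Return the names of any fields that are not part of the documented payload."""
    extra = sorted(set(payload) - ALLOWED_KEYS)
    tiers = payload.get("tool_calls", {})
    extra += sorted(f"tool_calls.{tier}" for tier in set(tiers) - ALLOWED_TIERS)
    return extra

documented = {
    "activation_id": "dvpro-XXXXXX",
    "day": "2026-05-19",
    "fingerprint_hash": "sha256:a1b2...",
    "tool_calls": {"light": 47, "standard": 12, "heavy": 3},
}
```

An empty list means the payload matches the documented shape.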

Why the heartbeat exists

Two reasons:

License enforcement. Pro is licensed per-developer. Tool-call patterns help us distinguish legitimate developer workflows from agent traffic that should be on Agent API keys instead. We can't enforce the license without seeing some kind of aggregate signal.
Capacity planning. Knowing tier distribution across the user base helps us prioritize what to optimize. A user base hitting Heavy tools 10× more than expected tells us where to invest engineering time.

Opt-out

Set DATAVOR_TELEMETRY=off in your environment to disable the heartbeat. Note: disabling the heartbeat on Pro violates the license by default. If you have a privacy-sensitive deployment (air-gapped, defense, healthcare) and need to run Pro without telemetry, email support@datavor.ai for an offline-licensed variant.

Fail-open

The heartbeat is informational only. If it fails (network down, our server down), Pro keeps working locally. No "trial expired" lockouts, no online-required activation. Once you've activated, you're activated.

Troubleshooting

The handful of things that go wrong most often, and how to fix them.

"Claude doesn't see Datavor tools"

Verify the config file path. On macOS it's ~/Library/Application Support/Claude/claude_desktop_config.json, not ~/.claude/.
Validate the JSON. A missing comma or bracket silently disables MCP. Run jq . claude_desktop_config.json to check.
Restart Claude Desktop completely (Cmd+Q on macOS, not just close the window).
Check Claude's developer logs: ~/Library/Logs/Claude/mcp-server-datavor.log.
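If jq is not installed, the JSON check from the list above can be done with a few lines of Python. This sketch uses only the standard library; json_problem is a hypothetical helper name.

```python
import json

def json_problem(text):
    """Return a description of the first JSON syntax error, or None if the text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None
```

Pass it the contents of claude_desktop_config.json; a return value of None means the file parses.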

"MySQL connection refused on localhost"

On macOS, connecting to MySQL through localhost often fails because of socket-versus-TCP defaults. Try connecting with host: "127.0.0.1" explicitly.

"Postgres connection hangs"

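Check that your pg_hba.conf allows the connection. If a stale postmaster.pid is left over from a crashed server, first confirm that no postgres process is still running, then remove postmaster.pid and restart Postgres. Removing the file while a server is running can corrupt data.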

"Web UI shows blank page"

Port 3000 may be taken by another service. Either stop that service or set DATAVOR_WEBUI_PORT=3001 and try localhost:3001.
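To check whether something is already listening on a port before changing DATAVOR_WEBUI_PORT, a quick probe like the Python sketch below can help. port_in_use is a hypothetical helper and only tests TCP on the given host.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising.
        return probe.connect_ex((host, port)) == 0
```

Calling port_in_use(3000) before starting the Web UI tells you whether another service already holds the default port.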

"CDC stream stops emitting events"

Most commonly, the source stopped retaining the changes the stream needs. On Postgres, a replication slot holds WAL until the consumer confirms it; if max_slot_wal_keep_size is set and your CDC consumer falls too far behind, the slot is invalidated. Restart with start_cdc to recreate it. For MySQL, check that binlog_expire_logs_seconds isn't purging binlogs before you can consume them.

FAQ

Does Datavor store my database credentials?

Connection credentials are held in memory for the duration of the MCP session and discarded when Datavor exits. Datavor does not write them to the Context Engine, the license file, or anywhere else on disk. If you want connections to persist across sessions, supply them yourself through environment variables or a ~/.datavor/connections.json file (read at startup, never transmitted).

Can I run Datavor without an internet connection?

Yes — Free tier is fully offline. npx datavor downloads once and runs from cache after that. Pro requires intermittent connectivity for the daily heartbeat, but tolerates extended offline periods (the heartbeat is fail-open). For fully air-gapped Pro use, contact support for an offline-licensed variant.

What happens to my Context Engine data when I uninstall?

Nothing — it stays at ~/.datavor/context.db until you delete it. Reinstalling Datavor picks up where you left off. To wipe completely, rm -rf ~/.datavor.

Can I use Datavor in an enterprise with strict data-residency rules?

Free tier yes — nothing leaves your network. Pro's daily heartbeat goes to datavor.ai servers (US, EU regions). For environments where even the heartbeat is a problem, email support for an enterprise offline-licensed variant.

Does Datavor support more than 5 databases?

The current 5 are MySQL, PostgreSQL, SQL Server, SQLite, and Snowflake. BigQuery, MongoDB, DuckDB, and Redshift are on the v3.x roadmap. For other engines, open an issue on GitHub; Pro Annual subscribers can also vote on the roadmap.

How do I report a bug?

Email support@datavor.ai with steps to reproduce, your datavor version output, and (if relevant) the Claude MCP log at ~/Library/Logs/Claude/mcp-server-datavor.log. Pro users get a 24h SLA.
