Datavor Documentation.
Install, connect databases, and call all 47 MCP tools from Claude, Cursor, or any MCP client. Covers the v3.0 Web UI, v3.1 Pro licensing, and v3.2 Agent Billing API.
Quick Start
Three steps to your first sync. Total time: 60 seconds.
Install
No download. Datavor runs through npx, so MCP clients launch it on demand.
# Optional — pre-warm the cache so the first call is instant
npx -y datavor@latest --version
Requires Node.js 20+. Works on macOS, Linux, and Windows.
Add Datavor to your AI tool
The fastest path is Claude Desktop — open your config file and add the datavor server. See the AI Integrations page for Cursor, Cline, opencode, Gemini, and Hermes.
{
"mcpServers": {
"datavor": {
"command": "npx",
"args": ["-y", "datavor"]
}
}
}
Restart Claude Desktop. You should see 47 new tools in the tool list.
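If you would rather not edit the file by hand, the merge can be scripted. The following is a minimal sketch, not part of Datavor: `add_datavor_server` is a hypothetical helper that adds the entry shown above while leaving any other servers and settings untouched.

```python
import json
from pathlib import Path


def add_datavor_server(config_path: Path) -> dict:
    """Merge the datavor entry into an MCP client config, keeping other keys."""
    config = {}
    if config_path.exists() and config_path.read_text().strip():
        # json.loads raises ValueError on a missing comma or bracket,
        # so a broken file is reported here instead of being overwritten.
        config = json.loads(config_path.read_text())
    servers = config.setdefault("mcpServers", {})
    servers["datavor"] = {"command": "npx", "args": ["-y", "datavor"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example call for the macOS Claude Desktop location (adjust per platform):
# add_datavor_server(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
# )
```

Restart Claude Desktop after running it, exactly as with a manual edit.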
Talk to it
Ask Claude to connect a database in plain English:
"Connect to my Postgres at localhost:5432 with user postgres, database postgres, then list the tables."
Claude will call connect_postgres and list_tables in sequence. You're done.
Configuration
Datavor reads three things in order: environment variables, then a config file at ~/.datavor/config.json, then defaults.
Environment variables
| Variable | Default | Purpose |
|---|---|---|
| DATAVOR_HOME | ~/.datavor | Where Datavor stores its Context DB, license, and logs. |
| DATAVOR_LOG_LEVEL | info | One of debug, info, warn, error. |
| DATAVOR_LICENSE | (none) | Pro activation code. Same as running datavor activate <CODE>. |
| DATAVOR_WEBUI_PORT | 3000 | Port the Web UI binds to. Change if 3000 is taken. |
| DATAVOR_HTTP_PORT | 4747 | Port for the HTTP MCP transport when using datavor serve. |
| DATAVOR_TELEMETRY | off (Free), on (Pro) | License-validation heartbeat for Pro. See Privacy. |
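The lookup order described above can be pictured with a short sketch. It rests on two assumptions that this page does not confirm: that an earlier source wins over a later one, and that the config file stores values under the same names as the environment variables. `resolve_setting` is a hypothetical helper, not a Datavor API.

```python
import json
import os
from pathlib import Path

# Defaults taken from the table above.
DEFAULTS = {
    "DATAVOR_LOG_LEVEL": "info",
    "DATAVOR_WEBUI_PORT": "3000",
    "DATAVOR_HTTP_PORT": "4747",
}


def resolve_setting(name, env=None, config_file=None):
    """Return a setting from env, then the config file, then defaults."""
    env = os.environ if env is None else env
    if name in env:
        return env[name]
    file_values = {}
    if config_file is not None and Path(config_file).exists():
        # Assumption: the file is a flat JSON object keyed by variable name.
        file_values = json.loads(Path(config_file).read_text())
    if name in file_values:
        return str(file_values[name])
    return DEFAULTS.get(name)
```

For example, `resolve_setting("DATAVOR_WEBUI_PORT", env={})` falls through to the default of `"3000"`.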
CLI flags
| Command | Purpose |
|---|---|
| datavor | Start the MCP server on stdio (the default; MCP clients invoke this). |
| datavor serve | Start the HTTP MCP transport on :4747. See HTTP Transport. |
| datavor webui | Open the Web UI at localhost:3000 in your browser. |
| datavor activate <CODE> | Activate Pro license with a code from register_pro. |
| datavor version | Print current version. Equivalent to npx datavor@latest --version. |
The 47 MCP Tools
Every Datavor capability is exposed as an MCP tool. Grouped here by purpose. AI agents calling these through API keys are billed per tool tier — see Agent Pricing.
Connections (7)
Open and close database connections. One connect_* tool per supported engine, all returning a connection ID that subsequent calls reference.
| Tool | What it does |
|---|---|
| connect_mysql | MySQL 5.7+, MariaDB. Standard connection params. |
| connect_postgres | PostgreSQL 12+. Also works with Supabase, Neon, RDS. |
| connect_sqlserver | SQL Server 2017+. Azure SQL supported. |
| connect_sqlite | Local SQLite file. Just pass the path. |
| connect_snowflake | Snowflake. Account, warehouse, and role required. |
| list_connections | Show all active connections with their IDs. |
| close_connection | Close one connection by ID. |
Example — connect_postgres
{
"host": "db.acme.com",
"port": 5432,
"database": "production",
"user": "datavor",
"password": "********",
"name": "prod-pg"
}
The call returns a connection_id. Pass this to every subsequent call that operates on this database (e.g. list_tables, execute_query, sync_table). Connections are kept open until close_connection is called or Datavor restarts.
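To make the hand-off concrete, here is a rough sketch of how a client might thread the ID through later calls using MCP's JSON-RPC `tools/call` method. The argument key `connection_id` on the follow-up tools is an assumption based on the wording above, and `tool_call` and `follow_up_calls` are hypothetical helpers.

```python
from itertools import count

_request_ids = count(1)


def tool_call(name, arguments):
    """Build an MCP JSON-RPC 2.0 `tools/call` request for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def follow_up_calls(connection_id):
    """Requests that reuse the ID returned by an earlier connect_* call."""
    # Assumption: follow-up tools accept the ID under the key "connection_id".
    return [
        tool_call("list_tables", {"connection_id": connection_id}),
        tool_call("close_connection", {"connection_id": connection_id}),
    ]
```

In normal use the AI client builds these requests for you; the sketch only shows where the ID travels.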
Schema & Discovery (7)
Inspect what's in a database. These are read-only metadata tools — they don't read row data, just structure.
| Tool | What it does |
|---|---|
| list_tables | All tables in a database with row counts. |
| describe_table | Columns, types, nullable, primary key, foreign keys. |
| compare_table_schemas | Side-by-side column comparison between two databases. |
| show_database_tree | Hierarchical view of tables and columns. |
| analyze_schema_diff | All differences between two schemas (missing tables, type changes, etc.). |
| recommend_sync_order | AI-suggested table sync priority based on size, FKs, change frequency. |
| explain_database | Natural-language summary of what a database contains. |
Sync (6)
Move data between connected databases. Sync calls are the only tools that actually transfer rows.
| Tool | What it does |
|---|---|
| execute_query | Run arbitrary SQL on a connected database. |
| get_table_data | Fetch rows from a table with optional filtering. |
| sync_table | Full sync between source and target table. |
| sync_table_partial | Sync rows matching a WHERE clause. |
| sync_table_incremental | Sync only new/updated rows using a timestamp column. |
| sync_table_with_transforms | Apply column-level transforms while syncing. |
Sync calls report per-run results in the form {success: 12,847, failed: 3, error_breakdown: {...}}.
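A caller can turn that result into a simple pass/fail signal. The sketch below assumes `error_breakdown` maps an error label to a count, which this page does not spell out; `summarize_sync` and the 1% threshold are illustrative choices, not Datavor behaviour.

```python
def summarize_sync(result, max_failure_rate=0.01):
    """Reduce a sync result to totals, a failure rate, and the top errors."""
    success = int(result.get("success", 0))
    failed = int(result.get("failed", 0))
    total = success + failed
    rate = failed / total if total else 0.0
    # Assumption: error_breakdown looks like {"error label": count, ...}.
    breakdown = result.get("error_breakdown", {})
    top_errors = sorted(breakdown.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "total": total,
        "failure_rate": rate,
        "needs_attention": rate > max_failure_rate,
        "top_errors": top_errors,
    }
```

With the numbers shown above (12,847 succeeded, 3 failed) the failure rate is well under 1%, so `needs_attention` comes back false.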
Change Data Capture (3)
Real-time replication driven by Postgres WAL or MySQL binlog. CDC tools start a persistent stream — the connection stays open and emits changes as they happen at the source.
| Tool | What it does |
|---|---|
| start_cdc | Begin a CDC stream from source tables to a target. |
| stop_cdc | End a running CDC stream cleanly, flush pending events. |
| cdc_status | Inspect a stream: events processed, current lag, last event time. |
Postgres needs wal_level = logical and a replication slot. MySQL needs binlog_format = ROW and REPLICATION SLAVE grants. Datavor's start_cdc will error clearly if either isn't set up.
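These prerequisites can also be checked before calling start_cdc. The following is a sketch of such a pre-flight check, not Datavor's own validation: `cdc_prerequisite_problems` and the keys of its `settings` dictionary are hypothetical, and you would fill them from queries such as `SHOW wal_level;` on Postgres or `SHOW VARIABLES LIKE 'binlog_format';` and `SHOW GRANTS;` on MySQL.

```python
def cdc_prerequisite_problems(engine, settings):
    """Return a list of unmet CDC prerequisites for the given source engine."""
    problems = []
    if engine == "postgres":
        if settings.get("wal_level") != "logical":
            problems.append("wal_level must be 'logical'")
        if not settings.get("replication_slot"):
            problems.append("no replication slot available")
    elif engine == "mysql":
        if str(settings.get("binlog_format", "")).upper() != "ROW":
            problems.append("binlog_format must be ROW")
        # Plain substring match: a user holding ALL PRIVILEGES would be
        # flagged here even though that grant includes replication rights.
        grants = " ".join(settings.get("grants", [])).upper()
        if "REPLICATION SLAVE" not in grants:
            problems.append("user lacks the REPLICATION SLAVE grant")
    else:
        raise ValueError(f"unsupported CDC engine: {engine}")
    return problems
```

An empty list means the settings you passed in satisfy the requirements listed above.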
Scheduler (8)
Save sync configurations as recurring jobs with optional dependencies. Jobs run on a cron schedule and can wait for upstream jobs to finish first.
| Tool | What it does |
|---|---|
| scheduler_create_job | Save a sync recipe as a scheduled job (cron or interval). |
| scheduler_list_jobs | List all jobs with schedule, last run, run count. |
| scheduler_pause_job | Pause a job without deleting it. |
| scheduler_resume_job | Resume a paused job. |
| scheduler_run_job | Run a job once, immediately, outside its schedule. |
| scheduler_delete_job | Remove a job permanently. |
| scheduler_add_dependency | Make one job wait for another to finish (DAGs). |
| scheduler_show_graph | Render dependency graph showing which jobs wait for which. |
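Job dependencies form a directed acyclic graph, and a valid run order is any topological ordering of it. The sketch below illustrates the idea with Python's standard `graphlib` module (Python 3.9+); it is not Datavor's scheduler, and the job names in the usage note are made up.

```python
from graphlib import CycleError, TopologicalSorter


def run_order(dependencies):
    """Order jobs so each one runs after every job it waits for.

    `dependencies` maps a job name to the set of jobs it depends on.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def has_cycle(dependencies):
    """Report whether a dependency map could never be scheduled."""
    try:
        run_order(dependencies)
    except CycleError:
        return True
    return False
```

For instance, if `build_report` waits for `load_orders`, and `load_orders` waits for `load_customers`, then `run_order` places `load_customers` first and `build_report` last.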
Context Engine (11)
The persistent local SQLite knowledge store that makes Datavor smart over time. Rules, recipes, suggestions, and learned errors all live here.
| Tool | What it does |
|---|---|
| get_context | Everything Datavor knows: databases, rules, relationships, recipes. |
| add_rule | Save a business rule ("never sync rows with status=test"). |
| update_rule | Modify an existing rule. |
| remove_rule | Delete a rule. |
| save_recipe | Save a transform configuration as a reusable named recipe. |
| apply_recipe | Use a saved recipe in a new sync. |
| list_recipes | List all saved recipes with their tags. |
| get_suggestions | Get pending suggestions from the Suggestion Engine. |
| accept_suggestion | Apply a suggestion automatically. |
| dismiss_suggestion | Reject a suggestion (it won't be re-surfaced). |
| transform_preview | See what a transform will produce before running it. |
Dashboard (3)
Aggregate views for the Web UI and for AI-driven status checks.
| Tool | What it does |
|---|---|
| dashboard_summary | Overview: success rate, 7-day chart, totals, active CDC. |
| dashboard_table_history | Run log for a specific table with timing and row counts. |
| dashboard_failures | All failed runs with error messages. |
Pro Licensing (2)
New in v3.1. Lets users register for and activate a Pro license from inside their AI tool — no separate website signup. See Pro Licensing for details.
| Tool | What it does |
|---|---|
| register_pro | Email-based Pro signup. Returns an activation code by email. |
| activate_pro | Apply an activation code to unlock Pro features. |
Web UI v3.0+
A local React dashboard at localhost:3000 that shows everything Datavor knows. Read-only — Claude still drives writes via MCP.
Start it with:
datavor webui
Or visit http://localhost:3000 directly if Datavor is already running.
What's inside
| Section | What you'll see |
|---|---|
| / Dashboard | 7-day sync activity chart, success rate, connection count, active CDC streams, pending suggestions. Auto-refreshes every 30s. |
| /connections | Card grid of every connected database with host, tables, last connected. |
| /context-graph | Visual graph of tables and relationships across all connected databases. |
| /cdc | Live monitor for CDC streams: events processed, lag, last event. Refreshes every 5s. |
| /scheduler | All scheduled jobs with status, schedule, last result, dependency graph. |
| /settings | License status, telemetry preferences, port configuration. |
HTTP Transport v3.0+
Run Datavor as a long-lived HTTP MCP server instead of stdio. Useful for remote MCP clients, containerized deploys, or sharing one Datavor instance across multiple machines.
# Start the HTTP transport on :4747 (default)
datavor serve
# Or specify a port
DATAVOR_HTTP_PORT=8080 datavor serve
The HTTP transport uses Streamable HTTP MCP with bearer-token auth, session management, and SSE for tool results. Connect by pointing your MCP client at http://<host>:4747/mcp.
Hosted clients such as Claude.ai cannot reach http://localhost. If you want the HTTP transport with Claude.ai, run Datavor on a publicly reachable host with HTTPS, or use a tunnel like Cloudflare Tunnel. For local use, stdio remains the simpler path.
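Most MCP clients handle the handshake for you, but it can help to see what the first request to the /mcp endpoint looks like. This is a rough sketch based on the MCP Streamable HTTP transport in general, not on Datavor's source: the protocol version string, the client name, and where the bearer token comes from are all assumptions, and `build_initialize_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_initialize_request(host, token, port=4747):
    """Build (but do not send) an MCP `initialize` POST for the HTTP transport."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumption: a protocol revision that includes Streamable HTTP.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }
    return urllib.request.Request(
        url=f"http://{host}:{port}/mcp",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a running server.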
Pro Licensing v3.1+
Pro adds a commercial-use license and priority support. Every feature in Datavor is available on Free — Pro is licensing, not feature gating. See Pricing.
register_pro
Start Pro signup from inside your AI tool. The tool returns no visible output; the activation code is sent to the email address you provide.
{
"email": "dev@acme.com",
"plan": "annual" // or "monthly"
}
You'll be redirected to Stripe checkout. After payment, the activation code arrives by email within seconds.
activate_pro
Apply the activation code to unlock Pro features on this machine and any other you install Datavor on with the same code.
{
"code": "dvpro-XXXXXX-XXXXXX-XXXXXX"
}
Equivalent to the CLI: datavor activate dvpro-XXXXXX-XXXXXX-XXXXXX.
Agent Billing API v3.2
For autonomous AI agents calling Datavor through API keys. Metered per tool call, billed in USDC or Stripe. See Agent Pricing for tier rates and the cost model.
Authentication
Every request needs an Authorization: Bearer dvpro_agent_* header. Get an API key at datavor.ai/agent.
curl -X POST https://datavor.ai/mcp/call \
  -H "Authorization: Bearer dvpro_agent_a1b2c3d4..." \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "sync_table",
    "input": {
      "source": "prod-mysql",
      "target": "analytics-pg",
      "table": "orders"
    }
  }'
Response format
Every tool response includes a billing block alongside the tool result.
{
"result": { "rows_synced": 12847 },
"billing": {
"tier": "heavy",
"charged_usd": 0.05,
"balance_usd": 9.32
}
}
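An agent can read the billing block after each call to keep an eye on spend. The sketch below parses the response shape shown above; `read_billing` and the low-balance threshold are hypothetical, and amounts are parsed as `Decimal` so that money values are not subject to float rounding.

```python
import json
from decimal import Decimal


def read_billing(response_text, low_balance_usd="1.00"):
    """Extract the billing block and flag a balance below the threshold."""
    # parse_float=Decimal keeps values such as 0.05 exact.
    payload = json.loads(response_text, parse_float=Decimal)
    billing = payload["billing"]
    balance = Decimal(str(billing["balance_usd"]))
    return {
        "tier": billing["tier"],
        "charged_usd": Decimal(str(billing["charged_usd"])),
        "balance_usd": balance,
        "low_balance": balance < Decimal(low_balance_usd),
    }
```

A caller might top up through POST /api/billing/topup once `low_balance` turns true, rather than waiting for a 402.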
Billing endpoints
| Endpoint | Purpose |
|---|---|
| GET /api/billing | Current balance, recent usage, tier breakdown. |
| POST /api/billing/topup | Add credits via Stripe or USDC. Returns checkout/wallet URL. |
| GET /api/billing/limits | Get current spend caps. |
| POST /api/billing/limits | Set a monthly spend cap. Once the cap is reached, calls return 402. |
| GET /api/billing/keys | List all API keys on your account. |
| POST /api/billing/keys | Create a new API key. Optionally isolate balance per key. |
Error handling
| Code | Meaning |
|---|---|
| 401 | Missing or invalid API key. |
| 402 | Insufficient balance. Top up or enable auto-recharge. |
| 429 | Per-key rate limit exceeded (default 100 calls/sec; contact support to raise it). |
| 503 | Downstream database unavailable. Not charged. |
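One way a client might react to these codes is sketched below. The retry policy (exponential backoff with jitter, capped at 30 seconds and five attempts) is an illustrative choice rather than anything Datavor prescribes, and `next_action` is a hypothetical helper.

```python
import random


def next_action(status, attempt, max_attempts=5):
    """Map an HTTP status to ("ok" | "retry" | "top_up" | "fail", delay_seconds)."""
    if 200 <= status < 300:
        return ("ok", 0.0)
    if status == 401:
        # Retrying with the same bad key cannot succeed.
        return ("fail", 0.0)
    if status == 402:
        # Out of balance: add credits before calling again.
        return ("top_up", 0.0)
    if status in (429, 503):
        if attempt >= max_attempts:
            return ("fail", 0.0)
        base = min(30.0, 0.5 * 2 ** attempt)
        # Jitter spreads retries out so clients do not retry in lockstep.
        return ("retry", base + random.uniform(0, base / 2))
    return ("fail", 0.0)
```

Since 503 responses are not charged, retrying them costs nothing beyond the wait.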
External Alerting v3.0+
Push pipeline events to Slack or any webhook endpoint. Configure on the Web UI Settings page; delivery is live the moment you save.
Events
Six event types fire alerts. Each carries the relevant context — job name, connection, error detail — in its payload.
| Event | Fires when |
|---|---|
| sync_failure | A sync job fails (after per-record fault tolerance, i.e. the whole job, not a single quarantined row). |
| cdc_error | A running CDC stream hits an apply error or source problem. |
| cdc_stopped | A CDC stream stops, either cleanly via stop_cdc or because it was interrupted. |
| schema_change | A source schema change is detected (new column, type change, dropped table). |
| suggestion_new | The Suggestion Engine surfaces a new suggestion. |
| job_failure | A scheduled job fails. Unlike sync_failure, this covers any job type, including query and transform jobs. |
Slack delivery
Paste a Slack incoming-webhook URL and Datavor formats events as Slack Block Kit messages — structured, readable, with the event type, affected resource, and detail laid out as blocks rather than a wall of JSON.
Generic webhooks
Any non-Slack URL receives a stable JSON envelope — the same shape for every event type, so you can route it to PagerDuty, a custom handler, or your own logging pipeline without special-casing each event.
{
"event": "sync_failure",
"timestamp": "2026-05-19T02:14:08Z",
"datavor_version": "3.1.0",
"data": {
// event-specific fields: job name,
// connection, error detail, etc.
}
}
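Because the envelope is the same for every event, a receiver can validate it once and route on the event field alone. The sketch below shows one possible handler core; `route_event` is hypothetical, and the split between events that page someone and events that are only logged is an example policy, not a Datavor recommendation.

```python
import json

# The six event types listed above.
KNOWN_EVENTS = {
    "sync_failure", "cdc_error", "cdc_stopped",
    "schema_change", "suggestion_new", "job_failure",
}
# Hypothetical routing choice: which events should page someone.
PAGE_EVENTS = {"sync_failure", "cdc_error", "job_failure"}
REQUIRED_KEYS = {"event", "timestamp", "datavor_version", "data"}


def route_event(raw_body):
    """Validate a webhook envelope and return "page", "log", or "ignore"."""
    envelope = json.loads(raw_body)
    missing = REQUIRED_KEYS - envelope.keys()
    if missing:
        raise ValueError(f"envelope is missing keys: {sorted(missing)}")
    event = envelope["event"]
    if event not in KNOWN_EVENTS:
        # Unknown types are skipped so newer senders do not break the handler.
        return "ignore"
    return "page" if event in PAGE_EVENTS else "log"
```

The contents of `data` differ per event, so anything that reads it should treat individual fields as optional.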
Telemetry & Privacy
Datavor handles your data. We take that seriously and try to be transparent about exactly what leaves your machine.
Free tier — 100% local
The Free tier sends nothing to Datavor servers. No telemetry, no analytics, no usage pings. The Context Engine SQLite database lives at ~/.datavor/context.db and you can inspect, copy, or delete it any time.
You can verify this with lsof -a -i -p "$(pgrep -d, -f datavor)" or your firewall of choice; you'll only see connections to your own databases.
Pro tier — daily license-validation ping
Pro installations send one HTTPS request per day to https://datavor.ai/api/license/heartbeat. The payload contains:
{
"activation_id": "dvpro-XXXXXX",
"day": "2026-05-19",
"fingerprint_hash": "sha256:a1b2...",
"tool_calls": {
"light": 47,
"standard": 12,
"heavy": 3
}
}
That's the entire payload. Specifically not included:
- Database hostnames, names, schemas, or table names
- Query content (SQL statements, parameter values, results)
- Row data of any kind
- Connection credentials
- Tool inputs or outputs beyond the per-tier count
- Personally identifiable information beyond the activation_id and a machine fingerprint hash
Why the heartbeat exists
Two reasons:
- License enforcement. Pro is licensed per-developer. Tool-call patterns help us distinguish legitimate developer workflows from agent traffic that should be on Agent API keys instead. We can't enforce the license without seeing some kind of aggregate signal.
- Capacity planning. Knowing tier distribution across the user base helps us prioritize what to optimize. A user base hitting Heavy tools 10× more than expected tells us where to invest engineering time.
Opt-out
Set DATAVOR_TELEMETRY=off in your environment to disable the heartbeat. Note: disabling telemetry on Pro is a license violation by default. If you have a privacy-sensitive deployment (air-gapped, defense, healthcare) and need to run Pro without telemetry, email support@datavor.ai for an offline-licensed variant.
Troubleshooting
The handful of things that go wrong most often, and how to fix them.
"Claude doesn't see Datavor tools"
- Verify the config file path. On macOS it's ~/Library/Application Support/Claude/claude_desktop_config.json, not ~/.claude/.
- Validate the JSON. A missing comma or bracket silently disables MCP. Use jq . config.json to check.
- Restart Claude Desktop completely (Cmd+Q on macOS, not just close the window).
- Check Claude's developer logs: ~/Library/Logs/Claude/mcp-server-datavor.log.
"MySQL connection refused on localhost"
On macOS, MySQL requires 127.0.0.1 rather than localhost due to socket vs TCP defaults. Try connecting with host: "127.0.0.1" explicitly.
"Postgres connection hangs"
Check that your pg_hba.conf allows the connection. If Postgres crashed and left a stale postmaster.pid behind, first confirm that no postgres process is still running, then remove postmaster.pid and restart Postgres.
"Web UI shows blank page"
Port 3000 may be taken by another service. Either stop that service or set DATAVOR_WEBUI_PORT=3001 and try localhost:3001.
"CDC stream stops emitting events"
Most commonly, the source's replication slot is no longer usable. If Postgres caps retained WAL (max_slot_wal_keep_size) and your CDC consumer falls too far behind, the slot is invalidated. Restart with start_cdc to recreate it. For MySQL, check that binlog_expire_logs_seconds isn't rotating binlogs before you can consume them.
FAQ
- ... ~/.datavor/connections.json file (read at startup, never transmitted).
- npx datavor downloads once and runs from cache after that. Pro requires intermittent connectivity for the daily heartbeat, but tolerates extended offline periods (the heartbeat is fail-open). For fully air-gapped Pro use, contact support for an offline-licensed variant.
- ... ~/.datavor/context.db until you delete it. Reinstalling Datavor picks up where you left off. To wipe completely, rm -rf ~/.datavor.
- ... datavor.ai servers (US, EU regions). For environments where even the heartbeat is a problem, email support for an enterprise offline-licensed variant.
- ... datavor version output, and (if relevant) the Claude MCP log at ~/Library/Logs/Claude/mcp-server-datavor.log. Pro users get a 24h SLA.