⬡ Our moat · The reason Datavor compounds

The Context Engine.
Datavor that gets smarter.

A persistent local SQLite store of everything Datavor learns about your data — relationships, business rules, transform recipes, error patterns, accepted suggestions. Every session starts smarter than the last. No competitor has this.

⬡ EVERY OTHER ETL TOOL

You explain the same things, every time.

Fivetran, Airbyte, integrate.io — they're all stateless toolchains. The agent or operator comes in fresh each session. Your rules ("never sync test rows"), your conventions ("orders use created_at, not timestamp"), your past mistakes — none of it persists. You re-explain them, every time, to every agent.

⬡ DATAVOR

Datavor remembers. Forever, locally.

A single SQLite file at ~/.datavor/context.db accumulates everything the AI learns. Schemas, rules, recipes, errors and how you fixed them, suggestions you accepted, suggestions you dismissed. Next session — even months later — your agent picks up exactly where the last one left off.

Four engines, one memory.

The Context Engine isn't a single monolithic component. It's four cooperating subsystems sharing one store, each accumulating a different kind of knowledge.

RuleStore

business rules / data quality / policy

Persistent rules the AI applies to every sync, every query, every suggestion. "Never sync rows where status = 'test'." "Always exclude PII columns from analytics warehouse." Rules are scoped to tables, databases, or globally.
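
As a rough illustration of how scoping could work, the sketch below matches a rule's scope string against the table being synced. The scope shapes ('global', 'database:X', 'table:X.Y') are the ones listed in the simplified schema on this page; the helper name `rule_applies`, the sample rules, and the matching logic are assumptions, not Datavor's actual implementation.

```python
def rule_applies(scope: str, database: str, table: str) -> bool:
    """Return True if a rule with this scope covers database.table.

    Hypothetical helper: scope strings follow the 'global',
    'database:X', 'table:X.Y' shapes from the simplified schema.
    """
    if scope == "global":
        return True
    kind, _, target = scope.partition(":")
    if kind == "database":
        return target == database
    if kind == "table":
        return target == f"{database}.{table}"
    return False  # unknown scope kinds never match


# Illustrative rules only; the database names are made up.
rules = [
    ("global", "exclude PII columns"),
    ("table:prod.orders", "status != 'test'"),
    ("database:staging", "skip soft-deleted rows"),
]
active = [desc for scope, desc in rules if rule_applies(scope, "prod", "orders")]
print(active)
```

Here only the global rule and the orders-specific rule are selected for `prod.orders`; the staging rule is ignored.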

RecipeManager

named, versioned transforms

Save any transform configuration as a named recipe — "trim_whitespace_lowercase," "decimal_to_cents," "anonymize_emails." Apply by name in future syncs. Recipes version themselves; if you change one, old sync jobs keep using the version they were created with.
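
To make the version-pinning idea concrete, here is a minimal in-memory sketch. The `RecipeBook` class and the transform strings are hypothetical stand-ins for illustration; the only behavior taken from the description above is that saving a changed recipe creates a new version while existing jobs keep the version they were created with.

```python
from dataclasses import dataclass, field


@dataclass
class RecipeBook:
    """Hypothetical in-memory stand-in for versioned recipes."""
    versions: dict = field(default_factory=dict)  # name -> list of transform lists

    def save(self, name: str, transforms: list) -> int:
        """Append a new version and return its 1-based version number."""
        self.versions.setdefault(name, []).append(list(transforms))
        return len(self.versions[name])

    def get(self, name: str, version: int) -> list:
        """Fetch the exact version a job was pinned to."""
        return self.versions[name][version - 1]


book = RecipeBook()
v1 = book.save("decimal_to_cents", ["multiply:100", "cast:int"])
job = {"table": "orders", "recipe": ("decimal_to_cents", v1)}  # job pins v1

# Changing the recipe later creates version 2; the job still resolves to v1.
v2 = book.save("decimal_to_cents", ["multiply:100", "round", "cast:int"])
pinned = book.get(*job["recipe"])
print(v1, v2, pinned)
```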

ErrorLearner

past failures + fixes

Every sync error gets logged with its full context — what was attempted, why it failed, what fixed it. Next time a similar error shows up, Datavor surfaces the past fix proactively. Time-zone bugs, encoding mismatches, lock conflicts — all things you fix once.
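
One plausible way to recognize "a similar error" is to hash the message after masking the parts that change between runs. The sketch below assumes that approach; the normalization rules, the `pattern_hash` function, and the sample messages are illustrative guesses rather than Datavor's documented matching logic.

```python
import hashlib
import re


def pattern_hash(error_message: str) -> str:
    """Hash an error message after masking the parts that vary per run.

    Assumed normalization: quoted values and numbers become placeholders
    so recurring failures map to the same hash.
    """
    masked = re.sub(r"'[^']*'", "<str>", error_message)
    masked = re.sub(r'"[^"]*"', "<str>", masked)
    masked = re.sub(r"\d+", "<n>", masked)
    masked = " ".join(masked.lower().split())
    return hashlib.sha256(masked.encode("utf-8")).hexdigest()[:16]


known_fixes = {}
first = 'lock timeout on table "orders" after 30s (pid 4121)'
known_fixes[pattern_hash(first)] = "retry with backoff"

# A later failure with different table, duration, and pid hits the same pattern.
later = 'Lock timeout on table "customers" after 45s (pid 977)'
recalled = known_fixes.get(pattern_hash(later))
print(recalled)
```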

SuggestionEngine

proactive what-ifs

Watches your data and pipelines, surfaces ideas. "New column tier appeared in customers — add it to the warehouse sync?" "Orders sync has been failing for 3 days — restart with checkpoint?" Accept, dismiss, or modify. Accepted suggestions become rules.
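
A schema-drift suggestion like the "new column" example could be produced by comparing a stored fingerprint with the current one. The sketch below is one way to do that under stated assumptions: the fingerprint is a hash of a column-to-type mapping, and the payload keys are invented for illustration. Only the 'schema_change' type and 'pending' status come from the simplified schema on this page.

```python
import hashlib
import json


def fingerprint(columns: dict) -> str:
    """Stable hash of a {column_name: type} mapping (assumed fingerprint shape)."""
    canonical = json.dumps(columns, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def new_column_suggestions(table: str, old: dict, new: dict) -> list:
    """Return one pending suggestion per column that appeared since the last run."""
    if fingerprint(old) == fingerprint(new):
        return []  # nothing changed, nothing to suggest
    return [
        {"type": "schema_change", "status": "pending",
         "payload": {"table": table, "column": name, "column_type": new[name]}}
        for name in sorted(new.keys() - old.keys())
    ]


before = {"id": "integer", "email": "text"}
after = {"id": "integer", "email": "text", "tier": "varchar(20)"}
suggestions = new_column_suggestions("customers", before, after)
print(suggestions)
```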

How it compounds.

Three sessions across one month — same user, same database, same AI. Watch what Datavor learns and how each session starts smarter.

SESSION 01 Day 1

First contact. Datavor learns the schema.

You: "Sync the orders table from production Postgres to my analytics warehouse, daily at 2am."
Claude: Calls describe_table, then asks: "I see orders has created_at and updated_at — which should drive incremental sync?"
You: "Use updated_at. And skip rows where status = 'test' — those are QA fixtures."
Claude: Calls scheduler_create_job to schedule the sync, then add_rule to save the test-row exclusion.
⬡ STORED schema fingerprint · incremental column choice (updated_at) · rule: orders.status != 'test'

SESSION 07 Day 8

You add a new sync. Datavor doesn't have to ask.

You: "Add the customers table to the nightly sync too."
Claude: Reads the context: same source, same target, same nightly schedule pattern, your test-row preference. Suggests: "I'll mirror the orders setup — updated_at for incremental, excluding test rows. Run after the orders sync. OK?"
You: "Yes." — No re-asking. No re-explaining the test-row rule.
⬡ STORED customers schema · sync recipe applied: "production_to_warehouse_nightly" · dependency added: customers → orders

SESSION 23 Day 30

Datavor catches a problem you'd have missed.

Claude: "Two things from overnight:
1. Postgres schema changed — orders now has a tier column (varchar(20)). Want me to mirror it to the warehouse? Looks like enum data — I'd suggest a CHECK constraint.
2. The orders sync failure on Day 12 is your only failure in 30 days. Root cause was a timezone mismatch on updated_at, the incremental column. I've added a rule to coerce timestamps to UTC before incremental comparison, which should prevent recurrence." Both were pulled from the SuggestionEngine and ErrorLearner. No prompt needed.
⬡ STORED new column suggestion · timezone fix codified as rule · error pattern: PG-WH-TZ-mismatch
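
The timezone fix described in this session, coercing timestamps to UTC before the incremental comparison, might look roughly like the sketch below. The function names, the dates, and the UTC-5 offset are illustrative assumptions; the point is only that comparing wall-clock values across zones can skip rows that a UTC comparison would catch.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts: datetime, assume_tz: timezone = timezone.utc) -> datetime:
    """Coerce a timestamp to UTC; naive values are assumed to be in assume_tz."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=assume_tz)
    return ts.astimezone(timezone.utc)


def is_newer(row_ts: datetime, checkpoint: datetime) -> bool:
    """Incremental check: compare both sides in UTC."""
    return to_utc(row_ts) > to_utc(checkpoint)


# Checkpoint recorded in UTC; row timestamp reported at UTC-5 (made-up values).
checkpoint = datetime(2024, 1, 12, 2, 0, tzinfo=timezone.utc)
row_ts = datetime(2024, 1, 11, 22, 30, tzinfo=timezone(timedelta(hours=-5)))

# Comparing wall-clock values ignores the offset and misses the row.
naive_result = row_ts.replace(tzinfo=None) > checkpoint.replace(tzinfo=None)
utc_result = is_newer(row_ts, checkpoint)
print(naive_result, utc_result)
```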

What's actually in the context.db.

The Context Engine is a single SQLite file. No proprietary format, no encryption layer, no service to call — just SQL tables you can sqlite3 into and read directly. Here's the schema, simplified.

~/.datavor/context.db (simplified schema, SQLite 3)
-- Schemas Datavor has seen, fingerprinted for change detection
CREATE TABLE schemas (
  id TEXT PRIMARY KEY,           -- connection_id + table
  schema_json JSON,              -- columns, types, FKs
  fingerprint TEXT,              -- hash for diff detection
  first_seen TIMESTAMP,
  last_seen TIMESTAMP
);

-- Business rules — applied automatically by relevant tools
CREATE TABLE rules (
  id TEXT PRIMARY KEY,
  scope TEXT,                    -- 'global', 'database:X', 'table:X.Y'
  predicate TEXT,                -- SQL or DSL
  description TEXT,
  created_at TIMESTAMP,
  source TEXT                    -- 'user' or 'accepted_suggestion'
);

-- Named, versioned transform recipes
CREATE TABLE recipes (
  id TEXT PRIMARY KEY,
  name TEXT UNIQUE,
  version INTEGER,
  transforms_json JSON,
  tags TEXT,
  created_at TIMESTAMP
);

-- Errors with their fixes, for proactive recall
CREATE TABLE errors (
  id TEXT PRIMARY KEY,
  pattern_hash TEXT,             -- for similarity matching
  context_json JSON,             -- what was attempted
  error_message TEXT,
  resolution_json JSON,          -- what fixed it
  occurred_at TIMESTAMP
);

-- Suggestions surfaced to user, with their disposition
CREATE TABLE suggestions (
  id TEXT PRIMARY KEY,
  type TEXT,                     -- 'schema_change', 'sync_recovery', etc.
  payload_json JSON,
  status TEXT,                   -- 'pending', 'accepted', 'dismissed'
  created_at TIMESTAMP,
  resolved_at TIMESTAMP
);

Run sqlite3 ~/.datavor/context.db .schema on any Datavor install to see your actual tables — these and a few internal ones for indexing.
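
Because the store is plain SQLite, it can also be read from a script. The sketch below uses Python's standard sqlite3 module against an in-memory copy of the simplified `rules` table shown above; the helper `rules_for_scope`, the sample row, and the home-directory path in the comment are assumptions for illustration, and a real install may have additional columns.

```python
import sqlite3


def rules_for_scope(conn: sqlite3.Connection, scope: str) -> list:
    """Return (description, predicate, source) for rules with the given scope."""
    cur = conn.execute(
        "SELECT description, predicate, source FROM rules "
        "WHERE scope = ? ORDER BY created_at",
        (scope,),
    )
    return cur.fetchall()


# Demo against an in-memory copy of the simplified `rules` table.
# On a real install you might instead open the file read-only, e.g.:
#   sqlite3.connect("file:/home/you/.datavor/context.db?mode=ro", uri=True)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rules (id TEXT PRIMARY KEY, scope TEXT, predicate TEXT, "
    "description TEXT, created_at TIMESTAMP, source TEXT)"
)
conn.execute(
    "INSERT INTO rules VALUES (?, ?, ?, ?, ?, ?)",
    ("r1", "table:prod.orders", "status != 'test'",
     "Skip QA fixtures", "2024-01-01 02:00:00", "user"),
)
rows = rules_for_scope(conn, "table:prod.orders")
print(rows)
```

Opening the file read-only is a cautious default for inspection scripts, since it avoids accidentally modifying state that Datavor itself manages.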

What the Context Engine stores. And what it doesn't.

The Context Engine is on your machine. It stays on your machine. The Free tier is 100% local: nothing leaves. Pro sends a tiny daily license-validation heartbeat (aggregate counts, no content). Here's exactly where the line is drawn:

Stored

  • Database schemas — table names, column names, types, foreign keys
  • Rules you've defined or accepted (predicates only, not data)
  • Recipe definitions — transform configurations by name
  • Error patterns — what failed, why, how it was fixed
  • Connection metadata — host (hashed), database name, last connected
  • Job history — what synced, when, how many rows, success/fail
  • Suggestion log — what was suggested, what you accepted/dismissed

Never stored

  • Row data. Never. Not in errors, not in logs, not in suggestions.
  • SQL parameter values. Queries are recorded as templates, not with bound values.
  • Database passwords. Credentials never touch disk inside Datavor.
  • PII columns. Even column names matching PII patterns get hashed.
  • External keys — API keys, OAuth tokens, secrets in env vars.
  • Personal data of any kind beyond what's needed for the schema fingerprint.
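
To show what "recorded as templates, not with bound values" can mean in practice, here is a deliberately naive sketch that strips literal values from a SQL string. The `to_template` function and its regular expressions are assumptions for illustration; a production implementation would more likely rely on a SQL parser or on parameterized queries from the start.

```python
import re

_STRING = re.compile(r"'(?:[^']|'')*'")      # single-quoted SQL string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone integer or decimal literals


def to_template(sql: str) -> str:
    """Replace literal values with '?' so only the query shape is kept.

    Naive, regex-based sketch; it does not understand comments,
    dollar-quoted strings, or dialect-specific literal syntax.
    """
    without_strings = _STRING.sub("?", sql)
    return _NUMBER.sub("?", without_strings)


query = "SELECT id FROM orders WHERE email = 'ana@example.com' AND total > 99.50"
template = to_template(query)
print(template)
```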

The 11 MCP tools that talk to the Context Engine.

Your AI tool reads and writes the Context Engine through MCP, using these 11 tools. Full reference in the docs.

| Tool | Purpose | Component |
|---|---|---|
| get_context | Everything Datavor knows: databases, rules, relationships, recipes, recent suggestions. | All |
| add_rule | Save a business rule with scope and predicate. | Rules |
| update_rule | Modify an existing rule's predicate or scope. | Rules |
| remove_rule | Delete a rule. Past job runs that used it remain unaffected. | Rules |
| save_recipe | Save a transform configuration as a named, versioned recipe. | Recipes |
| apply_recipe | Apply a saved recipe to a new sync configuration by name. | Recipes |
| list_recipes | List saved recipes, optionally filtered by connection, table, or tags. | Recipes |
| get_suggestions | Get pending suggestions for review: schema changes, sync recoveries, optimizations. | Suggest |
| accept_suggestion | Apply a suggestion. May silently create rules or recipes, or schedule jobs. | Suggest |
| dismiss_suggestion | Reject a suggestion. It won't be re-surfaced for the same pattern. | Suggest |
| transform_preview | Preview what transforms will produce on sample data before running. | Recipes |
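
For readers curious what a call to one of these tools looks like on the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds such a request for add_rule. The argument names (scope, predicate, description) are guesses modeled on the columns of the `rules` table, not a documented Datavor signature, and the values are made up.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are assumptions modeled on the `rules` table columns.
request = tool_call(1, "add_rule", {
    "scope": "table:prod.orders",
    "predicate": "status != 'test'",
    "description": "Skip QA fixtures",
})
print(json.dumps(request, indent=2))
```

In normal use your AI client constructs these requests for you; the sketch is only meant to show that the tools are ordinary, inspectable calls.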

Why no other tool has this.

Capability comparison: Datavor vs. Fivetran, Airbyte, and integrate.io

  • Persistent rules that apply across syncs
  • Named, versioned, reusable transform recipes (one tool marked "partial")
  • Error patterns surfaced from past failures
  • Proactive suggestions based on schema drift
  • State lives entirely on your machine
  • Inspectable as a plain SQL file
  • AI tool can read it without permissions

Competitors that have some of these features keep them in their cloud, behind their UI. None expose them as a flat SQLite file your AI can query as freely as it queries your databases. That gap is the Context Engine.

Start with fresh memory.
End the month with hindsight.

The Context Engine ships with every Datavor install — Free or Pro. No setup, no configuration. The file starts empty and fills itself.