A persistent local SQLite store of everything Datavor learns about your data — relationships, business rules, transform recipes, error patterns, accepted suggestions. Every session starts smarter than the last. No competitor has this.
Fivetran, Airbyte, integrate.io — they're all stateless toolchains. The agent or operator comes in fresh each session. Your rules ("never sync test rows"), your conventions ("orders use created_at, not timestamp"), your past mistakes — none of it persists. You re-explain them, every time, to every agent.
A single SQLite file at ~/.datavor/context.db accumulates everything the AI learns. Schemas, rules, recipes, errors and how you fixed them, suggestions you accepted, suggestions you dismissed. Next session — even months later — your agent picks up exactly where the last one left off.
The Context Engine isn't a single store — it's four cooperating subsystems, each accumulating a different kind of knowledge.
Persistent rules the AI applies to every sync, every query, every suggestion. "Never sync rows where status = 'test'." "Always exclude PII columns from analytics warehouse." Rules are scoped to tables, databases, or globally.
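To make the scoping concrete, here is a minimal sketch of how a scope string could be matched against a table. It assumes the three scope formats the schema lists ('global', 'database:X', 'table:X.Y') and reads X.Y as database.table. The `Rule` class, the `applies_to` helper, and the example predicates are illustrative, not Datavor's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    scope: str      # 'global', 'database:X', or 'table:X.Y'
    predicate: str  # SQL fragment the rule enforces


def applies_to(rule: Rule, database: str, table: str) -> bool:
    """Return True if the rule's scope covers database.table."""
    if rule.scope == "global":
        return True
    kind, _, target = rule.scope.partition(":")
    if kind == "database":
        return target == database
    if kind == "table":
        return target == f"{database}.{table}"
    return False  # unknown scope kinds never match


def rules_for(rules: list[Rule], database: str, table: str) -> list[Rule]:
    """Keep only the rules whose scope covers the given table."""
    return [r for r in rules if applies_to(r, database, table)]


# Made-up rules for illustration only.
rules = [
    Rule("global", "deleted_at IS NULL"),
    Rule("database:shop", "tenant_id IS NOT NULL"),
    Rule("table:shop.orders", "status != 'test'"),
    Rule("table:shop.customers", "email IS NOT NULL"),
]

matched = rules_for(rules, "shop", "orders")
for rule in matched:
    print(rule.scope, "->", rule.predicate)
```

A sync of shop.orders picks up the global rule, the database-wide rule, and the table rule; the customers rule is left out.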
Save any transform configuration as a named recipe — "trim_whitespace_lowercase," "decimal_to_cents," "anonymize_emails." Apply by name in future syncs. Recipes version themselves; if you change one, old sync jobs keep using the version they were created with.
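Version pinning is easier to see in code. The sketch below is one way it could work, assuming a hypothetical layout with one row per (name, version) and a jobs table that records the version each job was created with. The table names, columns, and helper functions are assumptions for illustration; the real store may be organized differently.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: one row per (name, version) so old versions survive.
    CREATE TABLE recipe_versions (
        name TEXT NOT NULL,
        version INTEGER NOT NULL,
        transforms_json TEXT NOT NULL,
        PRIMARY KEY (name, version)
    );
    -- Each job records the recipe version it was created with.
    CREATE TABLE jobs (
        id TEXT PRIMARY KEY,
        recipe_name TEXT NOT NULL,
        recipe_version INTEGER NOT NULL
    );
""")


def save_recipe(name: str, transforms: list[str]) -> int:
    """Store transforms as the next version of `name`; return that version."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM recipe_versions WHERE name = ?",
        (name,),
    ).fetchone()
    version = latest + 1
    conn.execute(
        "INSERT INTO recipe_versions VALUES (?, ?, ?)",
        (name, version, json.dumps(transforms)),
    )
    return version


def create_job(job_id: str, recipe_name: str) -> None:
    """Pin the job to whichever version is latest at creation time."""
    (latest,) = conn.execute(
        "SELECT MAX(version) FROM recipe_versions WHERE name = ?",
        (recipe_name,),
    ).fetchone()
    conn.execute("INSERT INTO jobs VALUES (?, ?, ?)", (job_id, recipe_name, latest))


def transforms_for_job(job_id: str) -> list[str]:
    """Load the transforms for the exact version the job was pinned to."""
    (payload,) = conn.execute(
        """SELECT rv.transforms_json
           FROM jobs j
           JOIN recipe_versions rv
             ON rv.name = j.recipe_name AND rv.version = j.recipe_version
           WHERE j.id = ?""",
        (job_id,),
    ).fetchone()
    return json.loads(payload)


save_recipe("trim_whitespace_lowercase", ["trim", "lowercase"])
create_job("job_a", "trim_whitespace_lowercase")
save_recipe("trim_whitespace_lowercase", ["trim", "lowercase", "collapse_spaces"])
create_job("job_b", "trim_whitespace_lowercase")

print(transforms_for_job("job_a"))  # still version 1
print(transforms_for_job("job_b"))  # version 2
```

Because the job stores a version number rather than just a name, editing the recipe never changes what an existing job runs.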
Every sync error gets logged with its full context — what was attempted, why it failed, what fixed it. Next time a similar error shows up, Datavor surfaces the past fix proactively. Time-zone bugs, encoding mismatches, lock conflicts — all things you fix once.
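Matching "a similar error" needs some notion of similarity. One simple approach, sketched here as an assumption rather than a description of Datavor's matcher, is to mask the parts of a message that change between runs (quoted names, numbers) and hash what is left. The function names and the example messages are hypothetical.

```python
import hashlib
import re


def pattern_hash(error_message: str) -> str:
    """Hash an error message after masking the parts that vary between runs."""
    text = error_message.lower()
    text = re.sub(r"'[^']*'", "'?'", text)    # quoted literals such as table names
    text = re.sub(r"\d+", "#", text)          # numbers, ids, durations
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# In-memory stand-in for an errors table keyed by pattern hash.
past_fixes: dict[str, str] = {}


def record_fix(error_message: str, resolution: str) -> None:
    """Remember what resolved this class of error."""
    past_fixes[pattern_hash(error_message)] = resolution


def recall_fix(error_message: str) -> "str | None":
    """Return the stored resolution for a similar past error, if any."""
    return past_fixes.get(pattern_hash(error_message))


record_fix(
    "Lock wait timeout exceeded on table 'orders' after 30s",
    "retry with a smaller batch size",
)

# Different table and duration, same underlying pattern.
print(recall_fix("Lock wait timeout exceeded on table 'customers' after 45s"))
# An unrelated error has no stored fix.
print(recall_fix("invalid byte sequence for encoding UTF8"))
```

Exact hashing only catches errors that differ in the masked parts. Anything fuzzier would need a different technique, such as token overlap or embeddings.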
Watches your data and pipelines, surfaces ideas. "New column tier appeared in customers — add it to the warehouse sync?" "Orders sync has been failing for 3 days — restart with checkpoint?" Accept, dismiss, or modify. Accepted suggestions become rules.
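The "accepted suggestions become rules" step can be pictured as two writes in one transaction: flip the suggestion's status and insert a rule marked with where it came from. The sketch below uses cut-down versions of the rules and suggestions tables from the schema. The payload fields (scope, predicate, description) and the `accept_suggestion` function body are assumptions for illustration.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rules (
        id TEXT PRIMARY KEY, scope TEXT, predicate TEXT,
        description TEXT, created_at TEXT, source TEXT
    );
    CREATE TABLE suggestions (
        id TEXT PRIMARY KEY, type TEXT, payload_json TEXT,
        status TEXT, created_at TEXT, resolved_at TEXT
    );
""")


def accept_suggestion(suggestion_id: str) -> str:
    """Mark a pending suggestion accepted and store its rule; return the rule id."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: both writes commit, or neither does
        row = conn.execute(
            "SELECT payload_json FROM suggestions WHERE id = ? AND status = 'pending'",
            (suggestion_id,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no pending suggestion {suggestion_id!r}")
        payload = json.loads(row[0])  # hypothetical payload shape
        rule_id = str(uuid.uuid4())
        conn.execute(
            "INSERT INTO rules VALUES (?, ?, ?, ?, ?, 'accepted_suggestion')",
            (rule_id, payload["scope"], payload["predicate"],
             payload["description"], now),
        )
        conn.execute(
            "UPDATE suggestions SET status = 'accepted', resolved_at = ? WHERE id = ?",
            (now, suggestion_id),
        )
    return rule_id


# Made-up pending suggestion for the demo.
conn.execute(
    "INSERT INTO suggestions VALUES (?, ?, ?, 'pending', ?, NULL)",
    ("s1", "sync_recovery",
     json.dumps({"scope": "table:shop.orders",
                 "predicate": "status != 'test'",
                 "description": "Skip QA fixture rows"}),
     datetime.now(timezone.utc).isoformat()),
)

rule_id = accept_suggestion("s1")
print(conn.execute("SELECT status FROM suggestions WHERE id = 's1'").fetchone())
print(conn.execute(
    "SELECT scope, predicate, source FROM rules WHERE id = ?", (rule_id,)
).fetchone())
```

Accepting the same suggestion twice raises `LookupError`, since it is no longer pending.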
Three sessions across one month — same user, same database, same AI. Watch what Datavor learns and how each session starts smarter.
AI: […] describe_table, then asks: "I see orders has created_at and updated_at. Which should drive incremental sync?"
User: "updated_at. And skip rows where status = 'test'; those are QA fixtures."
AI: […] scheduler_create_job, adds add_rule.
Stored: […] updated_at) · rule: orders.status != 'test'
AI: "[…] updated_at for incremental, excluding test rows. Run after the orders sync. OK?"
AI: "[…] tier column (varchar(20)). Want me to mirror it to the warehouse? Looks like enum data, so I'd suggest a CHECK constraint."
AI: "[…] created_at. I've added a rule to coerce timestamps to UTC before incremental comparison, which should prevent recurrence."
Both pulled from the SuggestionEngine + ErrorLearner. No prompt needed.
The Context Engine is a single SQLite file. No proprietary format, no encryption layer, no service to call — just SQL tables you can sqlite3 into and read directly. Here's the schema, simplified.
    -- Schemas Datavor has seen, fingerprinted for change detection
    CREATE TABLE schemas (
      id          TEXT PRIMARY KEY,  -- connection_id + table
      schema_json JSON,              -- columns, types, FKs
      fingerprint TEXT,              -- hash for diff detection
      first_seen  TIMESTAMP,
      last_seen   TIMESTAMP
    );

    -- Business rules, applied automatically by relevant tools
    CREATE TABLE rules (
      id          TEXT PRIMARY KEY,
      scope       TEXT,              -- 'global', 'database:X', 'table:X.Y'
      predicate   TEXT,              -- SQL or DSL
      description TEXT,
      created_at  TIMESTAMP,
      source      TEXT               -- 'user' or 'accepted_suggestion'
    );

    -- Named, versioned transform recipes
    CREATE TABLE recipes (
      id              TEXT PRIMARY KEY,
      name            TEXT UNIQUE,
      version         INTEGER,
      transforms_json JSON,
      tags            TEXT,
      created_at      TIMESTAMP
    );

    -- Errors with their fixes, for proactive recall
    CREATE TABLE errors (
      id              TEXT PRIMARY KEY,
      pattern_hash    TEXT,          -- for similarity matching
      context_json    JSON,          -- what was attempted
      error_message   TEXT,
      resolution_json JSON,          -- what fixed it
      occurred_at     TIMESTAMP
    );

    -- Suggestions surfaced to user, with their disposition
    CREATE TABLE suggestions (
      id           TEXT PRIMARY KEY,
      type         TEXT,             -- 'schema_change', 'sync_recovery', etc.
      payload_json JSON,
      status       TEXT,             -- 'pending', 'accepted', 'dismissed'
      created_at   TIMESTAMP,
      resolved_at  TIMESTAMP
    );
Run sqlite3 ~/.datavor/context.db .schema on any Datavor install to see your actual tables — these and a few internal ones for indexing.
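Since it is an ordinary SQLite file, any SQLite client can read it, not just the sqlite3 shell. Here is a small sketch using Python's standard library, opening the file read-only so that inspection can never modify it. To keep the example runnable anywhere, it builds a throwaway file containing only the rules table from the simplified schema; the sample row is made up, and on a real install you would pass `Path.home() / ".datavor" / "context.db"` instead.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_readonly(path: Path) -> sqlite3.Connection:
    """Open a SQLite file read-only via a file: URI with mode=ro."""
    return sqlite3.connect(f"{path.resolve().as_uri()}?mode=ro", uri=True)


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List user tables, the same information .schema would show."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def rules_for_scope(conn: sqlite3.Connection, scope: str) -> list[tuple]:
    """Return (predicate, source) pairs for rules with exactly this scope."""
    return conn.execute(
        "SELECT predicate, source FROM rules WHERE scope = ? ORDER BY id",
        (scope,),
    ).fetchall()


# Throwaway stand-in for context.db, with one made-up rule.
workdir = tempfile.TemporaryDirectory()
db_path = Path(workdir.name) / "context.db"
writer = sqlite3.connect(db_path)
writer.execute(
    "CREATE TABLE rules (id TEXT PRIMARY KEY, scope TEXT, predicate TEXT, "
    "description TEXT, created_at TIMESTAMP, source TEXT)"
)
writer.execute(
    "INSERT INTO rules VALUES (?, ?, ?, ?, ?, ?)",
    ("r1", "table:shop.orders", "status != 'test'",
     "Skip QA fixtures", "2024-01-01T00:00:00Z", "user"),
)
writer.commit()
writer.close()

reader = open_readonly(db_path)
print(table_names(reader))
print(rules_for_scope(reader, "table:shop.orders"))
```

Any write attempted through `reader` fails with `sqlite3.OperationalError`, which makes read-only mode a safe default for poking around.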
The Context Engine is on your machine. It stays on your machine. Free tier is 100% local: nothing leaves. Pro sends a tiny daily license-validation heartbeat (aggregate counts, no content).
The Context Engine is exposed entirely through MCP — your AI tool reads and writes it through these 11 tools. Full reference in the docs.
| Tool | Purpose | Component |
|---|---|---|
| get_context | Everything Datavor knows: databases, rules, relationships, recipes, recent suggestions. | All |
| add_rule | Save a business rule with scope and predicate. | Rules |
| update_rule | Modify an existing rule's predicate or scope. | Rules |
| remove_rule | Delete a rule. Past job runs that used it remain unaffected. | Rules |
| save_recipe | Save a transform configuration as a named, versioned recipe. | Recipes |
| apply_recipe | Apply a saved recipe to a new sync configuration by name. | Recipes |
| list_recipes | List saved recipes, optionally filtered by connection, table, or tags. | Recipes |
| get_suggestions | Get pending suggestions for review: schema changes, sync recoveries, optimizations. | Suggest |
| accept_suggestion | Apply a suggestion. May silently create rules, recipes, or schedule jobs. | Suggest |
| dismiss_suggestion | Reject a suggestion. It won't be re-surfaced for the same pattern. | Suggest |
| transform_preview | Preview what transforms will produce on sample data before running. | Recipes |
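For a sense of what a call to one of these tools looks like on the wire: MCP is built on JSON-RPC 2.0, and a client invokes a tool with the `tools/call` method, passing the tool name and an arguments object. The sketch below builds such a request for add_rule. The argument names (scope, predicate, description) are assumptions modeled on the rules table, not the documented parameter list, so check the docs for the real one.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are hypothetical.
request = tool_call(
    "add_rule",
    {
        "scope": "table:shop.orders",
        "predicate": "status != 'test'",
        "description": "Skip QA fixture rows",
    },
)

print(json.dumps(request, indent=2))
```

In practice your AI tool's MCP client builds and sends these messages for you; the sketch only shows the shape of the message.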
| Capability | Datavor | Fivetran | Airbyte | integrate.io |
|---|---|---|---|---|
| Persistent rules that apply across syncs | ✓ | — | — | — |
| Named, versioned, reusable transform recipes | ✓ | partial | — | ✓ |
| Error patterns surfaced from past failures | ✓ | — | — | — |
| Proactive suggestions based on schema drift | ✓ | — | — | — |
| State lives entirely on your machine | ✓ | — | — | — |
| Inspectable as a plain SQL file | ✓ | — | — | — |
| AI tool can read it without permissions | ✓ | — | — | — |
Competitors that have some of these features keep them in their cloud, behind their UI. None expose them as a flat SQLite file your AI can query as freely as it queries your databases. That gap is the Context Engine.
The Context Engine ships with every Datavor install — Free or Pro. No setup, no configuration. The file starts empty and fills itself.