Skip to main content

Memory

Wren Engine includes a memory layer — a LanceDB-backed semantic index that gives AI agents the context they need to write accurate SQL. Instead of sending the entire schema to an LLM on every question, the memory layer provides targeted context: relevant tables, columns, and past query examples.

Why memory matters

Without memory, an AI agent must either:

  • Receive the full schema in every prompt — works for small databases but quickly exceeds context limits
  • Guess which tables are relevant — leads to hallucinated column names and wrong joins

The memory layer solves both problems by indexing the MDL schema and storing confirmed NL-SQL pairs. At query time it retrieves only what the agent needs.

What gets indexed

The memory layer manages two collections:

CollectionContentsSourceRebuildable?
schema_itemsModels, columns, relationships, views, instructionsMDL manifest + instructions.mdYes — wren memory index
query_historyNatural-language → SQL pairsStored after successful queriesNo — built up over time

Both collections live in <project>/.wren/memory/ (or ~/.wren/memory/ outside a project).

Installation

The memory system requires the memory extra:

pip install "wren-engine[memory]"

Indexing the schema

After creating or updating your MDL project, index the schema:

wren memory index

This parses the compiled target/mdl.json, generates local embeddings for every schema item, and stores them in LanceDB. Re-index whenever you change models, columns, relationships, or instructions:

wren context build
wren memory index

Check the index status:

wren memory status
# Path: /Users/you/my-project/.wren/memory
# schema_items: 47 rows
# query_history: 12 rows

Fetching schema context

wren memory fetch is the primary way agents get schema context. It automatically picks the best retrieval strategy based on schema size:

Schema sizeStrategyWhat the agent sees
Below 30,000 chars (~8K tokens)Full textComplete schema with all model-column relationships, join paths, and primary keys
Above 30,000 charsEmbedding searchTop-k most relevant fragments for the query
wren memory fetch -q "customer order price"
wren memory fetch -q "revenue" --type column --model orders
wren memory fetch -q "日期" --threshold 50000 --output json

Why hybrid?

Small schemas give better results as full text — the LLM sees the complete structure rather than isolated fragments. Large schemas don't fit in a single prompt, so embedding search retrieves only what's relevant.

The threshold is measured in characters (not tokens) because character counting is free. The 4:1 chars-to-tokens ratio holds for English; CJK text compresses less (~1.5:1), so CJK-heavy schemas switch to search sooner — the conservative direction.

Override with --threshold:

wren memory fetch -q "revenue" --threshold 50000   # raise for larger context windows

wren memory describe prints the entire schema as structured plain text — no embeddings or LanceDB required:

wren memory describe

Storing and recalling queries

Every successful query can be stored as a natural-language → SQL pair. These pairs serve as few-shot examples for future questions — the more you store, the better the agent gets at writing SQL for your domain.

Storing a query

wren memory store \
--nl "top 5 customers by revenue last quarter" \
--sql "SELECT c.first_name, SUM(o.amount) AS revenue FROM customers c JOIN orders o ON c.customer_id = o.customer_id WHERE o.order_date >= '2024-10-01' GROUP BY 1 ORDER BY 2 DESC LIMIT 5" \
--datasource duckdb

When to store:

  • Query executed successfully and the result is correct
  • There is a clear natural-language question behind the query

When NOT to store:

  • The query failed or returned wrong results
  • The query is exploratory / throwaway (SELECT * FROM orders LIMIT 5)
  • There is no natural-language question — just raw SQL

Recalling similar queries

Before writing new SQL, search for similar past queries:

wren memory recall -q "best customers"
wren memory recall -q "月度營收" --datasource mysql --limit 5 --output json

Results are returned ranked by semantic similarity. Use them as few-shot examples — adapt the SQL pattern to the current question.

Agent workflow

The memory layer fits into the agent's query workflow like this:

User asks a question

├── 1. wren memory recall -q "..." → find similar past queries (few-shot examples)
├── 2. wren memory fetch -q "..." → get relevant schema context
├── 3. Write SQL using examples + context
├── 4. wren --sql "..." → execute

└── 5. wren memory store --nl "..." --sql "..." → save for future recall

Each stored query improves future recall accuracy — the system learns from usage.

Housekeeping

wren memory status              # show index stats
wren memory reset # drop all tables (prompts for confirmation)
wren memory reset --force # drop without confirmation

Storage and version control

Memory files are binary (LanceDB format) and stored in <project>/.wren/memory/. By default this directory is gitignored.

  • schema_items — fully rebuildable from wren memory index, safe to delete
  • query_history — accumulated NL-SQL pairs from usage, not rebuildable

If your team wants to share confirmed query history as few-shot examples, you can commit .wren/memory/ — but be aware that LanceDB files may produce merge conflicts when multiple people store concurrently.