Build MDL from an OSI semantic model

Open Semantic Interchange (OSI) is a vendor-neutral specification for semantic models, backed by Snowflake, Salesforce, dbt Labs, Databricks, Cube, AtScale, and others. If your team already publishes one OSI YAML as the single source of truth for analytics definitions, Wren can read it directly and build target/mdl.json without forking the model into a parallel wren project.

wren context build --from-osi semantic_model.yaml --data-source postgres
# → ./target/mdl.json

OSI project vs. wren project

	Wren project (native)	OSI project, kept as source	OSI project, migrated once
Source files	`wren_project.yml` + `models/<name>/metadata.yml` + `views/` + `relationships.yml`	A single `*.yaml` OSI file	After migration: wren project layout
Author	You (or the `generate-mdl` agent guide)	OSI tooling / vendor / external team	You take ownership after `init`
Wren commands	`wren context init` → edit → `build`	`wren context build --from-osi <file>`	`wren context init --from-osi <file>` → edit → `build`
Editable inside Wren?	Yes — that's the point	No — Wren reads the file as-is	Yes after migration
Where wren-specific hints live	Each model's YAML	OSI's `custom_extensions[vendor_name=WREN]` block	Each model's YAML (lifted from OSI at migration time)
Stays in sync with OSI?	n/a	Yes — every build re-reads OSI	No — one-way snapshot

Three places you might land:

You own the semantic model from day one → use the native wren project flow.
The OSI file is shared with other tools (Snowflake Cortex, Salesforce, Cube, …) and you want Wren to be one of many consumers without forking → use wren context build --from-osi. Every build re-reads the OSI file.
You started with an OSI file but now want full control inside Wren (cubes, views, RLAC, custom calculations OSI doesn't model) → use wren context init --from-osi to migrate once, then edit the generated YAML directly. See Migrate to a native wren project.

All three paths produce the same target/mdl.json shape — Wren doesn't care how you got there.

Quick start

Suppose you have this OSI file (excerpted from a TPC-DS retail example):

# semantic_model.yaml
version: "0.2.0"
semantic_model:
  - name: tpcds_retail
    ai_context:
      instructions: "Retail analytics. Revenue is in USD."
    datasets:
      - name: store_sales
        source: tpcds.public.store_sales
        primary_key: [ss_item_sk, ss_ticket_number]
        fields:
          - name: ss_item_sk
            expression: ss_item_sk
          - name: ss_ext_sales_price
            expression: ss_ext_sales_price
      - name: customer
        source: tpcds.public.customer
        primary_key: [c_customer_sk]
        fields:
          - name: c_customer_sk
            expression: c_customer_sk
          - name: full_name
            expression:
              dialects:
                - dialect: ANSI_SQL
                  expression: c_first_name || ' ' || c_last_name
    relationships:
      - name: sales_to_customer
        from: store_sales
        to: customer
        from_columns: [ss_item_sk]
        to_columns: [c_customer_sk]

Run:

wren context build --from-osi semantic_model.yaml --data-source postgres

Wren reads the file, converts each dataset to a model, synthesizes relationship join conditions, packs OSI's ai_context.instructions into the manifest, and writes ./target/mdl.json — ready for wren --sql or any agent skill.

You'll also see warnings telling you what's missing for an ideal manifest — see What needs a WREN block below.

The `WREN` extension block

OSI has no file-system convention for vendor extensions. Its only sanctioned extension mechanism is the in-document custom_extensions: [{vendor_name, data}] field, present at every level of the document. Wren reads vendor_name: WREN entries and ignores everything else.

A typical extension block at the OSI document root:

custom_extensions:
  - vendor_name: WREN
    data: |
      {
        "dialect": "SNOWFLAKE",
        "metrics": "note"
      }

Per-dataset overrides, for the things OSI cannot express (most importantly: column types):

datasets:
  - name: store_sales
    source: tpcds.public.store_sales
    custom_extensions:
      - vendor_name: WREN
        data: |
          {
            "column_types": {
              "ss_sold_date_sk": "DATE",
              "ss_ext_sales_price": "DECIMAL(18,2)",
              "ss_quantity": "INTEGER"
            },
            "primary_key": "ss_ticket_number"
          }
    fields:
      - name: amount_eur
        expression: { dialects: [{dialect: ANSI_SQL, expression: amount * 0.92}] }
        custom_extensions:
          - vendor_name: WREN
            data: '{"type": "DECIMAL(18,2)", "not_null": true}'

The data field is per spec a JSON string. Wren also tolerates a raw YAML dict for hand-authored files.

Supported keys

Key	Scope	Effect
`dialect`	root, semantic_model	Which OSI dialect to extract from `expression.dialects[]`. Default: `ANSI_SQL`, with auto-inference for `snowflake` / `databricks` data sources.
`metrics`	root, semantic_model	How to handle OSI top-level `metrics`: `note` (default, append as instructions) or `skip`.
`default_semantic_model`	root	When the file has multiple `semantic_model[]` entries, which one to build.
`column_types`	semantic_model: `{dataset: {field: type}}` dataset: `{field: type}`	Column types — OSI has no native type system.
`primary_key`	semantic_model: `{dataset: column}` dataset: `column`	Pick one column when OSI defines a composite primary key.
`type`, `not_null`	field	Per-field overrides.

Precedence

Resolution order, highest to lowest:

CLI flag (--data-source, --semantic-model, --dialect, …)
semantic_model[i].custom_extensions[vendor_name=WREN]
custom_extensions[vendor_name=WREN] at the document root
Dataset / field custom_extensions[vendor_name=WREN] (for per-dataset / per-field keys only)
Built-in defaults

Two-level support (root + semantic_model) means you can write a global default once and only add overrides on individual semantic_model entries that need to differ — for example, when the file contains multiple semantic models.

Mapping summary

OSI	Wren MDL
`dataset.source: a.b.c`	`model.table_reference: {catalog: a, schema: b, table: c}`
`dataset.source` containing `SELECT`/`FROM`/newlines	`model.ref_sql`
`dataset.fields[*].name` + `expression`	`model.columns[]` — dialect-picked expression, type from WREN block or `is_time` hint
`dataset.primary_key: [x]`	`model.primary_key: x` (composite arrays warn + take the first column unless overridden)
`relationships[]` (always many-to-one in OSI)	wren `relationships` with synthesized `condition` and `MANY_TO_ONE`
`field.dimension.is_time: true`	`column.type: TIMESTAMP` (when no explicit type override)
`semantic_model.ai_context.instructions`	MDL `_instructions`
`metrics[]` (top-level OSI)	Rendered as markdown notes appended to instructions
`unique_keys`	Not converted — Wren has no equivalent
`custom_extensions` for other vendors	Ignored

What needs a `WREN` block

For each of these cases, validate --from-osi emits a warning with a copy-pasteable YAML snippet you can paste straight into your OSI file.

Untyped fields

OSI doesn't carry column types. Without a WREN override, every field defaults to VARCHAR (or TIMESTAMP if dimension.is_time: true). Validate prints, per dataset:

[WARNING] dataset 'store_sales': 8 field(s) have no type — defaulted to VARCHAR.
  Add to dataset 'store_sales' in the OSI file:

    custom_extensions:
      - vendor_name: WREN
        data: |
          {
            "column_types": {
              "ss_sold_date_sk": "DATE",
              "ss_ext_sales_price": "DECIMAL(18,2)",
              ...
            }
          }

Composite primary keys

OSI allows primary_key: [col_a, col_b]; Wren MDL takes one column. Wren picks the first column and emits a snippet so you can override:

[WARNING] dataset 'store_sales': composite primary_key
  ['ss_item_sk', 'ss_ticket_number'] — Wren MDL takes one column.
  Wren picked 'ss_item_sk'. To override, add to dataset 'store_sales':

    custom_extensions:
      - vendor_name: WREN
        data: '{"primary_key": "<one of: ss_item_sk, ss_ticket_number>"}'

Cross-dataset metrics

Wren cubes are bound to a single base object, so OSI top-level metrics that aggregate across multiple datasets cannot become first-class cubes. They are rendered as markdown notes appended to instructions, available to the LLM but not as queryable measures:

[WARNING] metric 'customer_lifetime_value': expression references 2 datasets
  (customer, store_sales) — emitted as instruction note only.

Metrics that reference a single dataset get the same treatment by default (metrics: note); set metrics: skip in the WREN block to drop them entirely.

Multiple `semantic_model` entries

OSI allows multiple semantic_model[] entries in one file. Wren builds one MDL per invocation, so an ambiguous file is a hard error:

[ERROR] OSI file has 2 semantic_models: model_a, model_b.
  Pass --semantic-model <name> or add at the OSI document root:

    custom_extensions:
      - vendor_name: WREN
        data: '{"default_semantic_model": "<name>"}'

CLI commands

# Build (writes ./target/mdl.json by default)
wren context build --from-osi semantic_model.yaml --data-source postgres
wren context build --from-osi semantic_model.yaml --data-source snowflake \
  --output build/mdl.json

# Migrate to a native wren project (one-way)
wren context init --from-osi semantic_model.yaml --data-source postgres \
  --path my_project

# Validate — lints the conversion and prints actionable snippets
wren context validate --from-osi semantic_model.yaml --data-source postgres
wren context validate --from-osi semantic_model.yaml --data-source postgres --strict
wren context validate --from-osi semantic_model.yaml --data-source postgres --verbose

# Show — preview the resulting manifest without writing it
wren context show --from-osi semantic_model.yaml --data-source postgres
wren context show --from-osi semantic_model.yaml --data-source postgres --output json
wren context show --from-osi semantic_model.yaml --data-source postgres --output yaml

--data-source is required because OSI deliberately does not carry connection or dialect environment information. Pair --from-osi with --semantic-model if the file contains more than one model.

Migrate to a native wren project

When OSI's surface area isn't enough — you need cubes, views, RLAC/CLAC, or calculated columns OSI doesn't model — convert the OSI file into a native wren project once and edit the YAML from there:

wren context init --from-osi semantic_model.yaml --data-source postgres --path my_project

This reuses the same converter as build --from-osi, then scaffolds the standard wren layout:

my_project/
├── wren_project.yml          # name lifted from OSI semantic_model.name
├── models/
│   ├── customers/metadata.yml
│   └── orders/metadata.yml
├── relationships.yml
├── instructions.md           # OSI ai_context.instructions + metrics notes
└── AGENTS.md

After migration:

The OSI file is no longer referenced. You can delete it, archive it, or keep it for diffing — Wren never reads it again.
Edit the generated YAML directly. Add cubes under cubes/, views under views/, RLAC under each model's properties, etc.
Use the standard flow: wren context validate → wren context build.

Any warnings the OSI conversion would normally emit (untyped fields, composite primary keys, cross-dataset metrics) are printed once during init so you know which spots in the generated YAML need a human review.

When not to migrate

If your OSI file is updated by an upstream team and you want Wren to stay in sync, do not migrate — use build --from-osi instead. A migrated project is a snapshot; later OSI edits won't reach Wren without manually merging or re-running init --from-osi --force (which loses your post-migration edits).

Limitations

No round-trip. Wren reads OSI but never writes back. Edits to the OSI file are made in your OSI tooling; Wren re-reads on the next build.
No cubes / measures. OSI metrics map to instruction notes, not Wren cubes. If you need first-class cubes, model them in a native wren project instead.
No views. OSI has no view concept; the generated MDL has an empty views array.
No unique_keys, no row/column-level access controls. OSI 0.2.x doesn't model these; if you need them, either author them in a sidecar wren project or wait for an upcoming OSI spec version.

For features OSI doesn't yet cover, the suggested pattern is to use the OSI file for what it does well (datasets, fields, relationships, instructions) and add a small wren-native overlay on top for the gaps. This is currently a manual merge — there is no automatic union of two sources.

OSI project vs. wren project​

Quick start​

The WREN extension block​

Supported keys​

Precedence​

Mapping summary​

What needs a WREN block​

Untyped fields​

Composite primary keys​

Cross-dataset metrics​

Multiple semantic_model entries​

CLI commands​

Migrate to a native wren project​

When not to migrate​

Limitations​

See also​