Skip to main content

Advanced Tutorial

What to expect from this tutorial?

In this tutorial, you will use many convenient features in the Modeling Definition Language (MDL).

NOTE: This tutorial assumes you have access to a running Wren Engine. If not, check out the QuickStart and Initialize.

Before we use Wren Engine, let us show you the most important part, Modeling Definition Language (MDL). We use the MDL to define references and relationships, it provides many convenient features. Learn more at Modeling Overview

Prepare MDL file

You must create a file name mdl.json into the path etc/mdl.

├── etc
│ ├── mdl
│ │ └── mdl.json
│ ├── config.properties

Set the property wren.directory with etc/mdl in the etc/config.properties.

wren.directory=etc/mdl

All MDL settings will be in the etc/mdl/mdl.json. Let’s modify the mdl.json for you now.

What is Modeling Definition Language (MDL)

What is Modeling Definition Language (MDL)?

The basic elements of MDL must have these fields.

  • catalog: You can define a new catalog name to abstract the original catalog.
  • schema: You can define a new schema name to abstract the original schema.
  • models: Define a new model to reference real data entities.
{
"catalog": "my_catalog",
"schema": "my_schema",
"models": [
{
"name": "Orders",
"properties": {},
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"expression": "o_orderkey",
"type": "integer"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
}
],
"primaryKey": "orderkey"
},
{
"name": "Customer",
"refSql": "select * from tpch.customer",
"columns": [
{
"name": "custkey",
"expression": "c_custkey",
"type": "integer"
},
{
"name": "name",
"expression": "c_name",
"type": "varchar"
}
],
"primaryKey": "custkey"
}
]
}

What is Model

Model is a logical layer for your table, allowing you to enhance its semantics without impacting the real data. You can only expose the specific data you intend to.

The model is a reference to your physical table. Wren Engine has some attributes used to describe a model. Typically, we can define a model as a JSON object:

More details about fields you can find them in the Model

Query Models

Let’s query your data through Wren Engine.

All datasets (model, metric, or view…) will be placed into the search path my_catalog.my_schema. Then, we can query the defined model with Wren Engine SQL:

SELECT orderkey FROM my_catalog.my_schema.Orders

You can query them just as you would with traditional RDBMS. The Wren Engine will then translate your SQL into executable SQL for your specific data source.

Add more different dataset

We provide many types of dataset

  • Model
  • Relationship
  • Metric
  • View

We will add Relationship in the mdl.json. Other data modeling descriptions you can find them in the Data Modeling - Overview

What is the Relationship

These relationships describe how to link two models together and enable seamless data integration and analysis across both entities. We will auto-join two models by joinType and condition when you query the related tables. More detail in the Relationship.

"relationships": [
{
"name": "OrdersCustomer",
"models": [ "Orders", "Customer" ],
"joinType": "MANY_TO_ONE",
"condition": "Orders.custkey = Customer.custkey"
}
]

After appending relationships, you can add a new field with relationship to relate another table in the model Orders.

{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
}
],
"primaryKey": "orderkey"
}

Calculated field

And you should add new field with isCalculated to let you can query the field of the another table.

{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
},
{
"name": "customer_name",
"type": "varhcar",
"isCalculated": true,
"expression": "customer.name"
}
],
"primaryKey": "orderkey"
}

You will build a MDL to query calculated field like below.

{
"catalog": "my_catalog",
"schema": "my_schema",
"models": [
{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
},
{
"name": "customer_name",
"type": "varhcar",
"isCalculated": true,
"expression": "customer.name"
}
],
"primaryKey": "orderkey"
},
{
"name": "Customer",
"refSql": "select * from tpch.customer",
"columns": [
{
"name": "custkey",
"expression": "c_custkey",
"type": "integer"
},
{
"name": "name",
"expression": "c_name",
"type": "varchar"
}
],
"primaryKey": "custkey"
}
],
"relationships": [
{
"name": "OrdersCustomer",
"models": [ "Orders", "Customer" ],
"joinType": "MANY_TO_ONE",
"condition": "Orders.custkey = Customer.custkey"
}
]
}

Now, you can directly query the name of the Customer that related to Orders

SELECT customer_name FROM Orders;
   customer_name
--------------------
Customer#000000370
Customer#000000781
Customer#000001234
Customer#000001369
Customer#000000557
Customer#000000392
...

In the original database, the SQL that you want to get the name of the Customer via o_custkey of Orders may be like the below:

SELECT
Customer.c_name
FROM
tpch.orders AS Orders
JOIN
tpch.customer AS Customer ON Orders.o_custkey = Customer.c_custkey;

Via the calculated field of the Wren Engine, you can query related data more intuitively.

What’s Next?