Skip to main content

Dynamic Query

Overview

The Wren engine provides many ways to define your calculations, such as calculated fields or metrics. However, by default, all datasets in the Wren engine are static. They are generated automatically regardless of whether you use them or not.

In this section, we will introduce a query mode called dynamic query, which will make the Wren Engine smarter by generating only what you have used.

Enable Dynamic Query mode

Simply enable the specific config in the config file, then you can query what you want.

wren.experimental-enable-dynamic-fields=true

Dynamic Query the Calculated field in Model

For model, wren engine will generate the calculated field you used in query only. Let’s see an example quickly. Given two model, Orders and Customer .

{
"name": "Customer",
"refSql": "select * from tpch.customer",
"columns": [
{
"name": "custkey",
"type": "integer",
"expression": "c_orderkey"
},
{
"name": "name",
"type": "varchar",
"expression": "c_name"
},
{..}
{
"name": "count_orders",
"type": "int",
"isCalculated": true,
"expression": "sum(orders.orderkey)"
},
],
"primaryKey": "custkey"
},
{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "totalprice",
"type": "integer",
"expression": "o_totalprice"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
},
{
"name": "customer_name",
"type": "varchar",
"isCalculated": true,
"expression": "customer.name"
},
{
"name": "customer_name_lowercase",
"type": "varchar",
"isCalculated": true,
"expression": "lower(customer.name)"
}
],
"primaryKey": "orderkey"
}

By default, if you query Orders and select specific columns like

select customer_name from Orders

You would get the SQL like

WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"',count_orders
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
, lower("Customer"."name") "customer_name_lowercase"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders

You can find the output of Orders model in generated SQL:

"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM

It includes all the columns, whether you've used them or not.

This may cause some issues:

  • It's easy to encounter cycling dependency errors.
  • Additional optimization effort is required for the query engine.
    • If the data source can't optimize the unused projection well, it may incur additional computing costs.

How about enable dynamic query?

After enable the config, submit the same query:

select customer_name from Orders

We can get the generated SQL

WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders

You can find the output of Orders is different.

"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM

It only includes the columns you need. This solves the problem of being unable to query the model when using relationships on both the left and right sides.

Also Work for View

In dynamic query mode, the SQL stored in a view also follows this behavior, whether the dynamic query is enabled or not when you create this view. The behavior of the view is according to the runtime configuration. You can expect it to be the same as querying the view statement directly.