Dynamic Query
Overview
The Wren engine provides many ways to define your calculations, such as calculated fields or metrics. However, by default, all datasets in the Wren engine are static. They are generated automatically regardless of whether you use them or not.
In this section, we will introduce a query mode called dynamic query, which will make the Wren Engine smarter by generating only what you have used.
Enable Dynamic Query mode
Simply enable the specific config in the config file, then you can query what you want.
wren.experimental-enable-dynamic-fields=true
Dynamic Query the Calculated field in Model
For model, wren engine will generate the calculated field you used in query only. Let’s see an example quickly. Given two model, Orders
and Customer
.
{
"name": "Customer",
"refSql": "select * from tpch.customer",
"columns": [
{
"name": "custkey",
"type": "integer",
"expression": "c_orderkey"
},
{
"name": "name",
"type": "varchar",
"expression": "c_name"
},
{..}
{
"name": "count_orders",
"type": "int",
"isCalculated": true,
"expression": "sum(orders.orderkey)"
},
],
"primaryKey": "custkey"
},
{
"name": "Orders",
"refSql": "select * from tpch.orders",
"columns": [
{
"name": "orderkey",
"type": "integer",
"expression": "o_orderkey"
},
{
"name": "custkey",
"type": "integer",
"expression": "o_custkey"
},
{
"name": "totalprice",
"type": "integer",
"expression": "o_totalprice"
},
{
"name": "customer",
"type": "Customer",
"relationship": "CustomerOrders"
},
{
"name": "customer_name",
"type": "varchar",
"isCalculated": true,
"expression": "customer.name"
},
{
"name": "customer_name_lowercase",
"type": "varchar",
"isCalculated": true,
"expression": "lower(customer.name)"
}
],
"primaryKey": "orderkey"
}
By default, if you query Orders
and select specific columns like
select customer_name from Orders
You would get the SQL like
WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"',count_orders
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
, lower("Customer"."name") "customer_name_lowercase"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders
You can find the output of Orders model in generated SQL:
"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
, "Orders_relationsub"."customer_name_lowercase" "customer_name_lowercase"
FROM
It includes all the columns, whether you've used them or not.
This may cause some issues:
- It's easy to encounter cycling dependency errors.
- Additional optimization effort is required for the query engine.
- If the data source can't optimize the unused projection well, it may incur additional computing costs.
How about enable dynamic query?
After setting up the config, submit the same query:
select customer_name from Orders
We can get the generated SQL
WITH
"Customer" AS (
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
"Customer"."custkey" "custkey"
, "Customer"."name" "name"
FROM
(
SELECT
c_orderkey "custkey"
, c_name "name"
FROM
(
SELECT *
FROM
tpch.customer
) "Customer"
) "Customer"
) "Customer"
)
, "Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM
((
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
FROM
(
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
) "Orders"
LEFT JOIN (
SELECT
"Orders"."orderkey"
, "Customer"."name" "customer_name"
FROM
((
SELECT
o_orderkey "orderkey"
, o_custkey "custkey"
, o_totalprice "totalprice"
FROM
(
SELECT *
FROM
tpch.orders
) "Orders"
) "Orders"
LEFT JOIN "Customer" ON ("Customer"."custkey" = "Orders"."custkey"))
) "Orders_relationsub" ON ("Orders"."orderkey" = "Orders_relationsub"."orderkey"))
)
SELECT customer_name
FROM
Orders
You can find the output of Orders is different.
"Orders" AS (
SELECT
"Orders"."orderkey" "orderkey"
, "Orders"."custkey" "custkey"
, "Orders"."totalprice" "totalprice"
, "Orders_relationsub"."customer_name" "customer_name"
FROM
It only includes the columns you need. This solves the problem of being unable to query the model when using relationships on both the left and right sides.
Dynamic Query Also Work in View
In dynamic query mode, the SQL stored in a view also follows this behavior, whether the dynamic query is enabled or not when you create this view. The behavior of the view is according to the runtime configuration. You can expect it to be the same as querying the view statement directly.