Skip to main content

Use TO_MANY without Aggregation

Let’s check a case in the traditional RDBMS to know what happened if used TO_MANY without aggregation.

D select c_custkey, o_orderkey, o_totalprice 
from tpch.customer join tpch.orders
on customer.c_custkey = orders.o_custkey
order by 1,2
limit 5;
┌───────────┬────────────┬───────────────┐
│ c_custkey │ o_orderkey │ o_totalprice │
│ int32 │ int32 │ decimal(15,2)
├───────────┼────────────┼───────────────┤
19154357345.46
11465628599.83
124322231040.44
131653152411.41
13401989230.03
└───────────┴────────────┴───────────────┘

D select count(*) from tpch.customer;
┌──────────────┐
│ count_star()
│ int64 │
├──────────────┤
1500
└──────────────┘

D select count(*) from tpch.customer
join tpch.orders on customer.c_custkey = orders.o_custkey;
┌──────────────┐
│ count_star()
│ int64 │
├──────────────┤
15000
└──────────────┘

It’s easy to notice that a TO_MANY relationship will increase the number of rows. For example, the row count of Customer is 1500. However, if we join it with Orders, the row count will increase to 15000. It makes sense in a traditional RDBMS. However, as a column in a model, it shouldn’t impact the row count of a model.