AQLearn 2.5 / Concept
Pipe (2) and Source Table
Source lesson: AQLearn 2.5: Pipe (2). Tangible win: see the pipe as "first-argument sugar", and predict how the source table changes a metric's meaning.
Purpose of this lesson
Have you ever tried to answer "average product price among ordered items in 2022" and gotten a different number than expected?
You hit these pain points:
- Two queries look almost identical (
products | avg(products.price)vsorder_items | avg(products.price)), but the numbers differ. You cannot explain why on a call. - You read a long pipeline and lose track of which model AQL is iterating over.
- You want to read deeply nested AQL but the nesting is intimidating; the pipe and the function-call form look like different languages.
This lesson solves it: pipe is just first-argument sugar, and the left-side input is the source table that defines which rows the function iterates over. Same expression, different source, different number, by design.
Pipe is first-argument sugar
The pipe character does one thing: it takes the value on its left and uses it as the first argument of the function on its right.
left | f(arg2, arg3)
// is exactly the same as
f(left, arg2, arg3)
De-sugaring exercise
Practice rewriting pipe form as function form.
Pipe form
products
| select(double: products.price * 2)
| select(quadruple: double * 2)
Function form
select(
select(products, double: products.price * 2),
quadruple: double * 2,
)
Read right-to-left: the outermost call is the last step in the pipeline. Each previous step is the first argument of the next.
The deeper idea: source table semantics
The left-of-pipe value is more than a parameter. It is the source table. The function:
- iterates over the rows of the source table,
- can reach fields in other models through many-to-one relationships from the source,
- and aggregates with the source as the row counter.
Worked comparison: products vs order_items as source
Both pipelines compute avg(products.price) after filtering to 2022. The numbers will differ. Why?
Source = order_items
order_items
| filter(orders.created_at matches @2022)
| avg(products.price)
Iterates over each order item. Looks up the product's price for each item. Average gives extra weight to products that were ordered more often.
Reading: "average product price weighted by how often each product appears in order items in 2022".
Source = products
products
| filter(orders.created_at matches @2022)
| avg(products.price)
Iterates over each distinct product (filtered to those with 2022 orders). Each product contributes once. Pure unweighted average across the distinct ordered products.
Reading: "average price of distinct products ordered in 2022".
SQL parallel
In SQL the source table is whatever is in FROM. The same observations apply: counts, averages, and sums are weighted by the row count of the source.
-- order_items source
select avg(products.price)
from order_items
left join orders on order_items.order_id = orders.id
left join products on order_items.product_id = products.id
where date_trunc('year', orders.created_at) = '2022-01-01'
-- products source
select avg(products.price)
from products
where exists (
select 1
from order_items
left join orders on order_items.order_id = orders.id
where order_items.product_id = products.id
and date_trunc('year', orders.created_at) = '2022-01-01'
)
Cross-model aggregation: products | sum(order_items.quantity)
The source is products. The function reaches into order_items through the relationship order_items.product_id > products.id. AQL iterates per product, summing the quantity across that product's items.
explore {
dimensions {
products.name,
}
measures {
items_sold: products | sum(order_items.quantity),
}
}
One row per product, total quantity ordered for that product. Switching the source to order_items would change "iterate per item" semantics; the dimension grain dictates the rest.
Dataset diagram
Source-table choice determines which model is the "iterator" and which fields are reachable.
order_items) iterates per item, reaching into products and orders via N:1. Source B (products) iterates per distinct product. Same aggregate body, different number.
Retrieval practice
Do in AQLearn
- Open AQLearn Pipe (2).
- Rewrite the starter pipe expression as a single nested function call. Confirm the result is identical.
- Compute average product price across order items in 2022. Note the value.
- Compute average product price across products in 2022. Compare.
- Add
products | sum(order_items.quantity)as a measure and observe one row per product.
Next local lesson
Next: Function: select & group (2). The most common AQL pitfall and how to fix it with nested aggregations.