AQLearn 2.6 / Pitfall
select() and group() (2)
Source lesson: AQLearn 2.6: Function: select & group (2). Tangible win: avoid the most common AQL beginner mistake and replace it with a proper nested-aggregation pattern.
Purpose of this lesson
Have you ever tried to compute "average age per continent, but only for countries" and gotten the error "column may refer to multiple values"?
You hit these pain points:
- You wrote
users | group(countries.name) | avg(users.age)because it reads naturally. AQL refused. - The error message says "not grouped and may refer to multiple values", but you cannot map it back to the AQL you wrote.
- You do not know if you want "average across the whole continent" or "highest average age among countries". Without naming that intent, you cannot pick the right syntax.
This lesson solves it: name the rule (group() and select() always travel together), then choose between flattening the group or doing a real nested aggregation. Both options are short, both are unambiguous.
The trap, in one line
// Invalid. AQL will reject this.
users | group(countries.name) | avg(users.age)
Reads like "group by country, then average age". The problem: avg() is a Scalar-output aggregate (see Lesson 0012). It must produce exactly one value per row. But group(countries.name) packs many user ages into each country row. The aggregate cannot decide which set of ages to average.
The column users.age is not grouped and may refer to multiple values. AQL is asking you "which slice of users.age do you want me to average?"
Visualize the conflict
The rule
group() and select() go hand-in-hand. After grouping, the next step must be select() with explicit aggregates that say "for each group, give me this single value".
If you do not have a select() after group(), you are not building a derived table; you are confusing AQL.
Two valid alternatives
Pick based on what you actually want.
Option A: drop the group
If you want a single number across the whole filtered set (one average per explore dimension), just remove group():
users | avg(users.age)
Result: average age across all users in the current explore context. AQL respects the explore dimensions (for example, countries.continent_name) so you still get one average per continent.
Option B: nested aggregation
If you want "highest average age among countries in each continent", build the inner table with group + select, then aggregate again:
users
| group(countries.name)
| select(
countries.name,
avg_age: avg(users.age),
)
| max(avg_age)
Read bottom-up: average age per country, then take the max across countries. The explore dimension (continent) shapes the outer max.
How to read your own intent
| Intent (plain English) | AQL pattern |
|---|---|
| Average across the whole group set as defined by the explore | users | avg(users.age) |
| An aggregate of per-group aggregates (max of averages, average of counts, etc.) | users | group(X) | select(metric: avg(users.age)) | max(metric) |
| A derived table you want to keep using | users | group(X) | select(...) (Table out) |
SQL parallel
SQL emits the same warning, with different words. If you write a SELECT with non-aggregated columns that are not in GROUP BY, the database refuses. AQL is enforcing the same shape rule at metric-expression time.
-- SQL equivalent of Option B
select
countries.continent_name
, max(country_avg_age) as max_avg_age
from (
select
countries.continent_name
, countries.name as country_name
, avg(users.age) as country_avg_age
from users
left join cities on users.city_id = cities.id
left join countries on cities.country_code = countries.code
group by countries.continent_name, countries.name
) t
group by countries.continent_name
Dataset diagram
This lesson groups by countries.name while keeping countries.continent_name as the outer explore dimension. Both are reached from users via the relationship chain.
users. Inner group: countries.name (one row per country). Outer dimension: countries.continent_name (set by the explore). Inner aggregate: avg(users.age). Outer aggregate: max(avg_age).
Retrieval practice
Do in AQLearn
- Open AQLearn select & group (2).
- Run the starter expression and read the "may refer to multiple values" error.
- Replace
| avg(users.age)with| select(countries.name, avg_age: avg(users.age)). Run. - Append
| max(avg_age)to get the highest country average per continent. - Now try the simpler fix: remove
group()entirely and run. Compare.
Next local lesson
Next: Nested aggregations (1). Apply the group + select + outer aggregate pattern to real questions.