GROUP BY
The GROUP BY
clause groups rows that have the same values in specified columns into summary rows. This is typically used with the collect()
function to create collections of values from grouped data.
Elements
column
-
A column name or expression by which to group the results. Multiple columns can be specified, separated by commas.
Features
- Grouping
-
Groups rows with the same values in the specified columns into a single row in the result.
- Collection Creation
-
When combined with the
collect()
function, creates arrays or collections of values from the grouped rows. A collection will also be implicitly derived when a wildcard*
is aliased to a list field in the result object (for example,* AS someList
).
Examples
SELECT category, collect(*) AS products
FROM products
GROUP BY category
SELECT category, collect(name) AS productNames
FROM products
GROUP BY category
SELECT manufacturer, category, collect(*) AS products
FROM products
GROUP BY manufacturer, category
SELECT category, collect(distinct name) AS uniqueProductNames
FROM products
GROUP BY category
SELECT
category,
(
collect(name) AS productNames,
collect((price, manufacturer) AS productDetails) AS products,
collect(distinct manufacturer) AS uniqueManufacturers
) AS categoryData
FROM products
GROUP BY category
Notes
-
The
GROUP BY
clause groups rows with identical values in the specified columns -
All non-aggregated columns in the
SELECT
clause should appear in theGROUP BY
clause -
When using
GROUP BY
, each column in theSELECT
list should either be included in theGROUP BY
clause or be used with thecollect()
function -
The
collect()
function is the primary means to create collections of values from grouped rows -
Use
collect(distinct …)
to eliminate duplicate values in the resulting collections
Result Transformation
When using GROUP BY
with collect()
, the results are transformed into a hierarchical structure. The grouping columns become fields in the result objects, and the collected values become arrays or nested objects.
This transformation is particularly useful for representing one-to-many relationships in your query results, allowing you to construct complex data structures directly from your query.
Current Limitations
-
Currently, the only aggregation supported with
GROUP BY
is collection creation using thecollect()
function -
Traditional aggregation functions like
count()
,sum()
,avg()
, etc. are not yet supported within grouped results -
The
total_count()
function can be used to get a count of all matching rows, but not per group
Related Features
-
SELECT clause - Specifies what data to retrieve
-
collect() function - Creates collections from grouped rows
-
DISTINCT keyword - Eliminates duplicate values
-
total_count() function - Counts total matching rows