Aggregate
This transformation calculates aggregates for the selected columns, grouped by another set of columns.
Available aggregations:
- Sum
- Count
- Count only distinct values
- Average
- Min
- Max
- Any
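As a rough illustration of what each aggregation computes for a single group of values, the sketch below maps the names in the list to plain Python functions. The `AGGREGATIONS` dictionary and the sample `values` are hypothetical and not part of the tool. How the transformation itself treats empty or non-numeric cells is not covered here.

```python
from statistics import mean

# Hypothetical mapping from the aggregation names above to Python functions.
# Each function takes the list of values belonging to one group.
AGGREGATIONS = {
    "Sum": sum,
    "Count": len,
    "Count only distinct values": lambda values: len(set(values)),
    "Average": mean,
    "Min": min,
    "Max": max,
    # "Any" may return any member of the group; this sketch takes the first.
    "Any": lambda values: values[0],
}

values = [3, 5, 5, 7]  # sample group, made up for illustration
results = {name: fn(values) for name, fn in AGGREGATIONS.items()}

for name, value in results.items():
    print(f"{name}: {value}")
```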
Example
Source table: The longest rivers in the world
River | Length(km) | Continent |
---|---|---|
Nile | 6650 | Africa |
Amazon | 6400 | South America |
Yangtze | 6300 | Asia |
Mississippi | 6275 | North America |
Yenisei | 5539 | Asia |
Yellow River | 5464 | Asia |
Ob | 5410 | Asia |
Paraná | 4880 | South America |
Objective: Find the longest river on each continent.
Transformation parameters:
- Calculate: Length(km)
- Aggregation: Max
- Group by: Continent
Output table:
River | Continent | Max of Length(km) |
---|---|---|
Nile | Africa | 6650 |
Yangtze | Asia | 6300 |
Mississippi | North America | 6275 |
Amazon | South America | 6400 |
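The grouping in this example can be sketched in plain Python. This is only an approximation of the idea, not how the transformation is implemented: rows are bucketed by Continent, then the row with the largest Length(km) is kept for each bucket. Keeping the whole row (rather than only the maximum length) is an assumption made here so that the River name can be shown next to the result. The names `RIVERS` and `longest_by_continent` are hypothetical.

```python
from collections import defaultdict

# Source table from the example: (River, Length(km), Continent)
RIVERS = [
    ("Nile", 6650, "Africa"),
    ("Amazon", 6400, "South America"),
    ("Yangtze", 6300, "Asia"),
    ("Mississippi", 6275, "North America"),
    ("Yenisei", 5539, "Asia"),
    ("Yellow River", 5464, "Asia"),
    ("Ob", 5410, "Asia"),
    ("Paraná", 4880, "South America"),
]


def longest_by_continent(rows):
    """Group rows by continent and keep the (river, length) pair with the max length."""
    groups = defaultdict(list)
    for river, length_km, continent in rows:
        groups[continent].append((river, length_km))

    result = {}
    for continent in sorted(groups):
        # "Max" aggregation on Length(km) within the group
        result[continent] = max(groups[continent], key=lambda pair: pair[1])
    return result


output = longest_by_continent(RIVERS)
for continent, (river, length_km) in output.items():
    print(f"{river} | {continent} | {length_km}")
```

If only the maximum length per continent is needed, `max(length for _, length in groups[continent])` would be enough and the river name could be dropped.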
Notes
The "Any" aggregation picks a single arbitrary (random) value from each group of values and discards the rest.
It is typically used for non-numeric values where all values in a group are known to be the same. For example, "Any" applied to "ABC", "ABC", "ABC" returns "ABC".
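The behaviour of "Any" can be pictured with a small Python sketch. The helper `any_value` is hypothetical, and using `random.choice` is an assumption made for illustration; the tool may select the value differently. The point is that the result is only predictable when every value in the group is identical.

```python
import random


def any_value(values):
    """Return one arbitrary member of the group (illustrative stand-in for "Any")."""
    return random.choice(values)


# All values equal: the result is always the same.
same = any_value(["ABC", "ABC", "ABC"])

# Values differ: the result is one of them, but which one is not guaranteed.
mixed = any_value(["Nile", "Amazon"])

print(same)
print(mixed)
```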
transformations/aggregate.1581351705.txt.gz · Last modified: 2020/02/10 11:21 by dmitry