Aggregate

This transformation calculates aggregates for the selected columns, grouped by a separate set of columns.

Available aggregations:

  • Sum
  • Count
  • Count only distinct values
  • Average
  • Min
  • Max
  • Any
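As a rough illustration of what each aggregation computes for a single group, the sketch below maps the names in the list above to plain Python equivalents. The `AGGREGATIONS` mapping and the variable names are hypothetical and are not part of the tool. Taking the first element for "Any" is only a stand-in for "one arbitrary value". How the transformation itself treats empty or non-numeric cells is not covered here.

```python
from statistics import mean

# Hypothetical mapping from aggregation name to a Python equivalent.
AGGREGATIONS = {
    "Sum": sum,
    "Count": len,
    "Count only distinct values": lambda values: len(set(values)),
    "Average": mean,
    "Min": min,
    "Max": max,
    # Stand-in for "one arbitrary value": the first member of the group.
    "Any": lambda values: values[0],
}

# One group of values: the lengths (km) of the Asian rivers from the example table.
asia_lengths = [6300, 5539, 5464, 5410]

results = {name: func(asia_lengths) for name, func in AGGREGATIONS.items()}

for name, value in results.items():
    print(f"{name}: {value}")
```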

EXAMPLE

Source table: The longest rivers in the world

River         Length(km)  Continent
Nile          6650        Africa
Amazon        6400        South America
Yangtze       6300        Asia
Mississippi   6275        North America
Yenisei       5539        Asia
Yellow River  5464        Asia
Ob            5410        Asia
Paraná        4880        South America

Objective: Find the longest river on each continent.

Transformation parameters:

  • Calculate: Length(km)
  • Aggregation: Max
  • Group by: Continent

Output table:

River        Continent      Max of Length(km)
Nile         Africa         6650
Yangtze      Asia           6300
Mississippi  North America  6275
Amazon       South America  6400
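The same result can be reproduced outside the tool with a minimal Python sketch. It groups the source rows by continent and keeps, for each group, the row with the greatest length. Keeping the whole row is an assumption made here so that the River column can appear in the output. The variable names are illustrative only.

```python
# Source table rows: (River, Length(km), Continent).
rivers = [
    ("Nile", 6650, "Africa"),
    ("Amazon", 6400, "South America"),
    ("Yangtze", 6300, "Asia"),
    ("Mississippi", 6275, "North America"),
    ("Yenisei", 5539, "Asia"),
    ("Yellow River", 5464, "Asia"),
    ("Ob", 5410, "Asia"),
    ("Paraná", 4880, "South America"),
]

# Group by Continent, tracking the row with the maximum Length(km) per group.
longest = {}
for river, length_km, continent in rivers:
    best = longest.get(continent)
    if best is None or length_km > best[1]:
        longest[continent] = (river, length_km)

# Output rows: (River, Continent, Max of Length(km)), ordered by continent name.
output = [
    (river, continent, length_km)
    for continent, (river, length_km) in sorted(longest.items())
]

for row in output:
    print(row)
```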

Notes

The “Any” aggregation picks a single arbitrary (random) value from each group of values and discards the rest.

The “Any” aggregation is typically used for non-numeric columns where all values in a group are known to be the same. For example, “Any” applied to “ABC”, “ABC”, “ABC” returns “ABC”.
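The behavior described above can be modeled with a short Python sketch. The `any_value` helper is hypothetical, and `random.choice` is used only to mimic an arbitrary pick. It does not reflect how the tool actually selects the value.

```python
import random


def any_value(values):
    """Return one arbitrary member of a non-empty group, discarding the rest."""
    return random.choice(values)


# All values in the group are identical, so the result is always "ABC".
same = any_value(["ABC", "ABC", "ABC"])
print(same)

# With differing values, the result is some member of the group,
# but which one is not determined.
mixed = any_value(["ABC", "DEF", "GHI"])
print(mixed in ["ABC", "DEF", "GHI"])
```

This is why “Any” is a safe choice only when the group is known to hold a single repeated value. Otherwise the output can differ from run to run.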

transformations/aggregate.txt · Last modified: 2020/02/10 16:23 by dmitry