KEEP DUPLICATES
Category: Transform / Filters
Description
This action keeps all duplicate rows and removes all unique rows. Duplicates can be detected across all columns or only in selected columns. In the latter case, values in the unselected columns are ignored when deciding whether rows are duplicates.
Use cases
In datasets where every record should be unique, this action helps with cleanup: it isolates the duplicate values or records so they can be reviewed and then removed or corrected as needed.
Action settings
Setting | Description |
---|---|
Apply to | Choose whether to check for duplicates across all columns or only in specified columns. Options: All columns, or Selected columns (then pick the columns to check). |
Examples
Example: Find duplicates in column "Continent".
Source table: The longest rivers in the world
River | Length (km) | Continent |
---|---|---|
Nile | 6650 | Africa |
Amazon | 6400 | South America |
Yangtze | 6300 | Eurasia |
Yellow River (Huang He) | 5464 | Eurasia |
Action parameters:
Apply to "Selected columns"
Columns: "Continent"
Result table:
River | Length (km) | Continent |
---|---|---|
Yangtze | 6300 | Eurasia |
Yellow River (Huang He) | 5464 | Eurasia |
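For readers who think in code, the example above can be approximated with a short Python sketch. This is only an illustration of the behavior described on this page, not EasyMorph's implementation: the function name `keep_duplicates` and its `columns` parameter are hypothetical, and preserving the original row order is an assumption based on the result table above.

```python
from collections import Counter


def keep_duplicates(rows, columns=None):
    """Keep rows whose compared values occur more than once.

    rows    -- list of dicts, one dict per table row
    columns -- column names to compare; None compares all columns
    """
    def key(row):
        # Build the comparison key from the selected (or all) columns.
        names = columns if columns is not None else sorted(row)
        return tuple(row[name] for name in names)

    # Count how often each key occurs, then keep rows with a repeated key.
    counts = Counter(key(row) for row in rows)
    return [row for row in rows if counts[key(row)] > 1]


rivers = [
    {"River": "Nile", "Length (km)": 6650, "Continent": "Africa"},
    {"River": "Amazon", "Length (km)": 6400, "Continent": "South America"},
    {"River": "Yangtze", "Length (km)": 6300, "Continent": "Eurasia"},
    {"River": "Yellow River (Huang He)", "Length (km)": 5464, "Continent": "Eurasia"},
]

# Apply to "Selected columns", Columns: "Continent"
selected = keep_duplicates(rivers, columns=["Continent"])
print([r["River"] for r in selected])  # ['Yangtze', 'Yellow River (Huang He)']

# Apply to "All columns": no two rows match in every column, so nothing is kept.
all_cols = keep_duplicates(rivers)
print(all_cols)  # []
```

Comparing on "Continent" alone keeps both Eurasia rows even though their other values differ, while comparing on all columns returns an empty table because every full row is unique.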
Community examples
- Compare two Excel sheets with EasyMorph (Project; Module: Main; Group: Compare data; Table: Matching rows; Action position: 3)
transformations/keepduplicates.txt · Last modified: 2021/07/19 02:13 by craigt