This is an old revision of the document!
Table of Contents
SPLIT DELIMITED TEXT INTO COLUMNS
Category: Transform / Advanced
Description
This action splits text values that are separated by delimiting characters into multiple columns.
Use cases
Use this action on any text-based data files using non-standard delimiters (e.g. pipe, tilde) that Easymorph doesn't automatically detect, and imports all values into a single column.
Action settings
Setting | Description |
---|---|
Column | Select the column containing the delimited text to be split. |
Separator | Select the delimiting character to split the text by. Options: Comma, Semicolon, Tab, Pipe, Tilde, Space, Custom (and select the character from the dialog that displays), or No separator. |
Decimal | Select the character to represent the decimal point for columns determined to be containing numeric values, after splitting. Options: Point or Comma. |
Ignore quoting | When selected, delimiting characters contained within paired quotation marks will be used for splitting text. |
Keep original column | When selected, the source column of the text to be split is kept in the dataset. |
Columns | For each column of data you expect there to be, based on the number of delimiters within the source text, create a column, and provide a column name, to hold that portion of the data. |
Text beyond Nth seperator | For any "overflow" text that falls after the last column defined above, determine how it will be handled. Options: Ignore (overflow text is dropped) or New column (enter the column name to hold the remaining overflow text) |
Remarks
It can be difficult to know how many columns will be needed to capture all of the split values, especially with large values. You can use either a formula in a Calculate new column(s) action, or use multiple Calculate new column(s) to:
- Calculate the full length of the source column values,
- Remove the delimiters using either a formula ("Replace") or action,
- Calculate the length of the modified values,
- Subtract the modified length from the source length to determine the number of delimiters in the source text,
- Find the maximum value across all rows (using Analysis View, column profile/filter, or Table metadata), and add one.
If you only need to capture values to a specific point, insert enough columns to capture the data you need, and either let the rest drop ("Ignore" text beyond [#] separator) or capture the overflow for possible later use ("New column" for text beyond the [#] separator).
Examples
Example 1: Separate the following input into individual columns based on the "-" (dash) character.
Input table:
Date |
---|
2017-May-12 |
2018-Sep-30 |
Action parameters:
Delimiter: "-" (dash)(Custom character 2d)
Result:
Year | Month | Day |
---|---|---|
2017 | May | 12 |
2018 | Sep | 30 |
Example 2: Separate the following input into individual columns based on the "line break" character.
Input table:
Location |
---|
Mississauga Ontario Canada North America |
Cambridge Massachusetts United States North America |
Action parameters:
Delimiter: line break (Custom character 10)
Result:
City/town | State/Province | Country | Continent |
---|---|---|---|
Mississauga | Ontario | Canada | North America |
Cambridge | Massachusetts | United States | North America |