User Tools

Site Tools


transformations:splitdelimitedtext

This is an old revision of the document!


SPLIT DELIMITED TEXT INTO COLUMNS

Category: Transform / Advanced


Description

This action splits text values that are separated by delimiting characters into multiple columns.


Use cases

Use this action on any text-based data files using non-standard delimiters (e.g. pipe, tilde) that Easymorph doesn't automatically detect, and imports all values into a single column.


Action settings

SettingDescription
ColumnSelect the column containing the delimited text to be split.
SeparatorSelect the delimiting character to split the text by. Options: Comma, Semicolon, Tab, Pipe, Tilde, Space, Custom (and select the character from the dialog that displays), or No separator.
DecimalSelect the character to represent the decimal point for columns determined to be containing numeric values, after splitting. Options: Point or Comma.
Ignore quoting When selected, delimiting characters contained within paired quotation marks will be used for splitting text.
Keep original columnWhen selected, the source column of the text to be split is kept in the dataset.
ColumnsFor each column of data you expect there to be, based on the number of delimiters within the source text, create a column, and provide a column name, to hold that portion of the data.
Text beyond Nth seperatorFor any "overflow" text that falls after the last column defined above, determine how it will be handled.
Options: Ignore (overflow text is dropped) or New column (enter the column name to hold the remaining overflow text)


Remarks

It can be difficult to know how many columns will be needed to capture all of the split values, especially with large values. You can use either a formula in a Calculate new column(s) action, or use multiple Calculate new column(s) to:

  1. Calculate the full length of the source column values,
  2. Remove the delimiters using either a formula ("Replace") or action,
  3. Calculate the length of the modified values,
  4. Subtract the modified length from the source length to determine the number of delimiters in the source text,
  5. Find the maximum value across all rows (using Analysis View, column profile/filter, or Table metadata), and add one.


If you only need to capture values to a specific point, insert enough columns to capture the data you need, and either let the rest drop ("Ignore" text beyond [#] separator) or capture the overflow for possible later use ("New column" for text beyond the [#] separator).


Examples

Example 1: Separate the following input into individual columns based on the "-" (dash) character.

Input table:

Date
2017-May-12
2018-Sep-30


Action parameters:

Delimiter: "-" (dash)(Custom character 2d)


Result:

Year Month Day
2017May 12
2018Sep 30



Example 2: Separate the following input into individual columns based on the "line break" character.

Input table:

Location
Mississauga
Ontario
Canada
North America
Cambridge
Massachusetts
United States
North America


Action parameters:

Delimiter: line break (Custom character 10)


Result:

City/town State/Province Country Continent
Mississauga Ontario Canada North America
Cambridge Massachusetts United States North America


See also

transformations/splitdelimitedtext.1619030150.txt.gz · Last modified: 2021/04/21 14:35 by dmitry

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki