User Tools

Site Tools


transformations:regexp

REGULAR EXPRESSION

Category: Transform / Advanced


Description

This action creates a column with values that match a pattern specified using a regular expression.


Use cases

A Regular Expression ("RegEx") is typically used to match values based on patterns or combinations of characters, numbers, symbols, and spaces that common "find" or "replace" functions (which use static values with wildcard characters) can't work with.

Confirm the existence of values in a strict, predetermined format: phone numbers, zip codes, email addresses, etc.

Validate that values fall within specific formatting parameters, e.g., a specific number of letters, numbers, and symbols in a required order.

Check for alternate spellings of keywords, e.g., "color" vs. "colour".

RegEx can be used to replace a pattern of characters with another value or pattern.


Action settings

SettingDescription
Source columnSelect the column to evaluate the values in.
Regular expression*Select the regular expression pattern to match to.
Case insensitiveWhen checked, text case is not taken into account when evaluating matches.
New column nameEnter the name of the new column that will hold the match results.
ModeSelect whether to return the first match only, or all matches. See the "Mode options" table, below.
If not matchedSelect what value is returned when a match is not made. Options: Keep original value, Make empty, or
Default to (and enter either a static value or expression to be returned - select whether It's an expression or
It's a text or number.)


Mode options

Option Description
First match onlyOnly the first match to the pattern will be returned.
All matches (may create new lines)All matches found will be returned. This will create a new row for each match.
Capture mode This option appears when All matches (above) is selected. Select the method of capturing
the matched values. Options: Matches only ( ), Groups only ( ), Matches and Groups ( ).


Remarks

EasyMorph uses the .NET regex language.

While most "find" and "replace" functions can use wildcard characters to allow for a single values (?) or any number of values (*), they can't differentiate letters, numbers, spaces and symbols like RegEx can. Use RegEx when not only the number of characters is important, but also the types of characters.

RegEx can be set to detect one or more occurrences of the search pattern within a given value.


Examples

Example: Find phone numbers that match pattern (XXX)XXX-XXXX where X is any number.

Source table:

Phone number
1234567890
(123)456-7890
123-4567890
123-456-7890
(123)4567890


Action parameters:

Source column is "Phone number"
Regular expression is ^\(\d{3}\)\d{3}-\d{4}
New column name is "Matches"
Mode is "First match only"
If not matched "Make empty"


Result table:

Phone number Matches
1234567890
(123)456-7890 (123)456-7890
123-4567890
123-456-7890
(123)4567890


See also

transformations/regexp.txt · Last modified: 2021/07/19 02:22 by craigt