Category: Transform / Advanced
This action creates a column with values that match a pattern specified using a regular expression.
A Regular Expression ("RegEx") is typically used to match values based on patterns or combinations of characters, numbers, symbols, and spaces that common "find" or "replace" functions (which use static values with wildcard characters) can't work with.
Confirm the existence of values in a strict, predetermined format: phone numbers, zip codes, email addresses, etc.
Validate that values fall within specific formatting parameters, e.g., a specific number of letters, numbers, and symbols in a required order.
Check for alternate spellings of keywords, e.g., "color" vs. "colour".
RegEx can be used to replace a pattern of characters with another value or pattern.
Setting | Description |
---|---|
Source column | Select the column to evaluate the values in. |
Regular expression* | Select the regular expression pattern to match to. |
Case insensitive | When checked, text case is not taken into account when evaluating matches. |
New column name | Enter the name of the new column that will hold the match results. |
Mode | Select whether to return the first match only, or all matches. See the "Mode options" table, below. |
If not matched | Select what value is returned when a match is not made. Options: Keep original value, Make empty, or Default to (and enter either a static value or expression to be returned - select whether It's an expression or It's a text or number.) |
Option | Description |
---|---|
First match only | Only the first match to the pattern will be returned. |
All matches (may create new lines) | All matches found will be returned. This will create a new row for each match. |
Capture mode | This option appears when All matches (above) is selected. Select the method of capturing the matched values. Options: Matches only ( ), Groups only ( ), Matches and Groups ( ). |
EasyMorph uses the .NET regex language.
While most "find" and "replace" functions can use wildcard characters to allow for a single values (?) or any number of values (*), they can't differentiate letters, numbers, spaces and symbols like RegEx can. Use RegEx when not only the number of characters is important, but also the types of characters.
RegEx can be set to detect one or more occurrences of the search pattern within a given value.
Example: Find phone numbers that match pattern (XXX)XXX-XXXX
where X is any number.
Source table:
Phone number |
---|
1234567890 |
(123)456-7890 |
123-4567890 |
123-456-7890 |
(123)4567890 |
Action parameters:
Source column is "Phone number"
Regular expression is^\(\d{3}\)\d{3}-\d{4}
New column name is "Matches"
Mode is "First match only"
If not matched "Make empty"
Result table:
Phone number | Matches |
---|---|
1234567890 | |
(123)456-7890 | (123)456-7890 |
123-4567890 | |
123-456-7890 | |
(123)4567890 |