transformations:sanitize
Differences
This shows you the differences between two versions of the page.
transformations:sanitize [2018/08/15 08:34] – dmitry | transformations:sanitize [2021/07/19 02:18] (current) – [Examples] craigt | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Sanitize text ===== | + | {{ transformations: |
+ | ====== | ||
+ | Category: Transform / Advanced\\ | ||
- | This action removes from text values | + | \\ |
+ | =====Description===== | ||
+ | This action removes | ||
* Hidden system characters | * Hidden system characters | ||
Line 10: | Line 14: | ||
* Repeating spaces | * Repeating spaces | ||
- | Non-text values are not affected by this action. | + | Non-text values |
- | ==See also== | + | \\ |
- | * [[syntax:functions:sanitize]] | + | =====Use cases===== |
- | * [[syntax:functions:compact]] | + | * Use //Sanitize text// on text columns to be used in a merge action, just prior to performing the merge, to ensure " |
- | * [[syntax:functions:trim]] | + | * Remove markup tags from XML- or HTML-based files, leaving the plain text for downstream processing. |
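A minimal Python sketch of why cleaning key columns just before a merge can matter, assuming the keys differ only by stray whitespace. The function names (`normalize_key`, `inner_join`) and the sample data are hypothetical and are not part of the action itself.

```python
def normalize_key(key: str) -> str:
    """Trim both ends and collapse runs of whitespace to a single space."""
    return " ".join(key.split())


def inner_join(left: dict, right: dict, clean: bool = False) -> list:
    """Join two key->value mappings on their keys, optionally cleaning keys first."""
    fix = normalize_key if clean else (lambda k: k)
    right_index = {fix(k): v for k, v in right.items()}
    return [
        (fix(k), v, right_index[fix(k)])
        for k, v in left.items()
        if fix(k) in right_index
    ]


# Hypothetical data: "ACME " carries a trailing space on one side only.
orders = {"ACME ": 100, "Globex": 250}
cities = {"ACME": "Berlin", "Globex": "Oslo"}

raw_matches = inner_join(orders, cities)                # "ACME " fails to match
clean_matches = inner_join(orders, cities, clean=True)  # both keys match
```

Without cleaning, the row keyed `"ACME "` silently drops out of the join; with cleaning, both rows match.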
+ | |||
+ | \\ | ||
+ | =====Action settings===== | ||
+ | ^Setting^Description^ | ||
+ | |Remove system characters|When checked, ASCII characters 0-31 are removed, except for //tab//, //carriage return// and //line feed// characters.| | ||
+ | |Tabs|Select how tab characters embedded in the text will be handled. | ||
+ | |Line breaks|Select how line breaks embedded in the text will be handled. | ||
+ | |Remove ASCII FE-FF|When checked, the characters with ASCII codes 0xFE (hexadecimal, | ||
+ | |Trim leading spaces|When checked, whitespace occurring at the start of text will be removed.| | ||
+ | |Trim trailing spaces|When checked, whitespace occurring at the end of text will be removed.| | ||
+ | |Remove repeating spaces|When checked, each run of two or more adjacent spaces is replaced with a single space.| | ||
+ | |Remove XML/HTML tags|When checked, all XML and HTML markup tags will be removed.| | ||
+ | |Sanitize columns|Select whether to sanitize | ||
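The checkbox settings in the table above can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name, parameter names and defaults are hypothetical, tag removal is done with a simple regular expression rather than a real parser, and the tab, line-break and FE-FF settings are left out.

```python
import re

# ASCII codes 0-31 except tab (9), line feed (10) and carriage return (13).
_SYSTEM_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")
# Naive tag pattern (assumption): anything between "<" and the next ">".
_TAGS = re.compile(r"<[^>]+>")
_REPEATED_SPACES = re.compile(r" {2,}")


def sanitize_text(value, remove_system_chars=True, trim_leading=True,
                  trim_trailing=True, remove_repeating_spaces=True,
                  remove_tags=False):
    """Hypothetical approximation of the settings; non-text values pass through."""
    if not isinstance(value, str):
        return value
    if remove_system_chars:
        value = _SYSTEM_CHARS.sub("", value)
    if remove_tags:
        value = _TAGS.sub("", value)
    if remove_repeating_spaces:
        value = _REPEATED_SPACES.sub(" ", value)
    if trim_leading:
        value = value.lstrip()
    if trim_trailing:
        value = value.rstrip()
    return value


# Hypothetical inputs, one per setting.
rows = ["  2 Leading spaces", "2 Trailing spaces  ", "<b>Bold HTML tags</b>", "a  b   c"]
cleaned = [sanitize_text(r, remove_tags=True) for r in rows]
```

Each option is applied independently, so a caller can enable only the steps a given column needs.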
+ | |||
+ | \\ | ||
+ | =====Remarks===== | ||
+ | The //Remove repeating spaces// option removes repeating spaces from // | ||
+ | |||
+ | \\ | ||
+ | =====Examples===== | ||
+ | |||
+ | **Example: | ||
+ | \\ | ||
+ | **Source data:** (raw text shown for clarity) | ||
+ | < | ||
+ | Sample Text | ||
+ | " | ||
+ | "2 Trailing spaces | ||
+ | "< | ||
+ | "2 spaces here-> | ||
+ | </ | ||
+ | |||
+ | \\ | ||
+ | **Action parameters: | ||
+ | >Row 1 requires //Trim leading spaces// | ||
+ | >Row 2 requires //Trim trailing spaces// | ||
+ | >Row 3 requires //Remove XML/HTML tags// | ||
+ | >Row 4 requires //Remove repeating spaces// | ||
+ | > | ||
+ | |||
+ | \\ | ||
+ | **Result table:** | ||
+ | ^Sample Text^ | ||
+ | |2 Leading spaces| | ||
+ | |2 Trailing spaces| | ||
+ | |Bold HTML tags| | ||
+ | |2 spaces here-> and 3 spaces here-> .| | ||
+ | |||
+ | \\ | ||
+ | ====Community examples==== | ||
+ | * [[https:// | ||
+ | * [[https:// | ||
+ | |||
+ | \\ |
transformations/sanitize.1534336490.txt.gz · Last modified: 2018/08/15 08:34 by dmitry