transformations:sanitize

Differences

This shows you the differences between two versions of the page.

transformations:sanitize [2018/08/15 08:34] dmitry
transformations:sanitize [2021/07/19 02:18] (current) – [Examples] craigt
Line 1: Line 1:
-===== Sanitize text =====
 +{{ transformations:SanitizeAction.png}} 
 +====== SANITIZE TEXT ====== 
 +Category: Transform / Advanced\\
  
-This action removes from text values invisible characters that are frequently unwanted because they may lead to mismatches and wrong merges:
 +\\ 
 +=====Description===== 
 +This action removes "invisible" characters from text values. Such characters are frequently unwanted because they can cause mismatches and incorrect merges:
  
   * Hidden system characters
Line 10: Line 14:
   * Repeating spaces
  
-Non-text values are not affected by this action.
 +Non-text values (numbers, symbols, etc.) are not affected by this action.\\
  
-==See also== 
-  * [[syntax:functions:sanitize]] 
-  * [[syntax:functions:compact]] 
-  * [[syntax:functions:trim]]
 +\\ 
 +=====Use cases===== 
 +  *Use //Sanitize text// on text columns to be used in a merge action, just prior to performing the merge, to ensure "hidden" characters don't prevent proper matches. 
 +  *Remove markup tags from XML- or HTML-based files, leaving the plain text for downstream processing. 
 + 
 +\\  
 +=====Action settings===== 
 +^Setting^Description^ 
 +|Remove system characters|When checked, ASCII characters 0-31 are removed, except for //tab//, //carriage return// and //line feed// characters.| 
 +|Tabs|Select how tab characters embedded in the text will be handled.  Options: //Do nothing//, //Remove//, //Remove repeating//,\\ and //Replace with spaces//.| 
 +|Line breaks|Select how line breaks embedded in the text will be handled.  Options: //Do nothing//, //Remove//, //Remove repeating//,\\ and //Replace with spaces//.| 
 +|Remove ASCII FE-FF|When checked, the characters with ASCII codes 0xFE (hexadecimal, 254 decimal) and 0xFF (hexadecimal, 255 decimal) will be removed.| 
 +|Trim leading spaces|When checked, whitespace occurring at the start of text will be removed.| 
 +|Trim trailing spaces|When checked, whitespace occurring at the end of text will be removed.| 
 +|Remove repeating spaces|When checked, each run of two or more adjacent spaces will be replaced with a single space.| 
 +|Remove XML/HTML tags|When checked, all XML and HTML markup tags will be removed.| 
 +|Sanitize columns|Select whether to sanitize all columns, or selected columns.  Options: //Sanitize all columns// or //Sanitize only\\ selected columns// (and select which columns to process).| 
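
The character-level settings in the table above can be approximated outside EasyMorph. The following is a minimal Python sketch, not EasyMorph's implementation: the function name, the option constants, and the reading of "ASCII FE-FF" as code points 254 and 255 are all assumptions made for illustration.

```python
import re

# Hypothetical option names mirroring the settings table; not EasyMorph identifiers.
DO_NOTHING, REMOVE, REMOVE_REPEATING, REPLACE_WITH_SPACES = (
    "do_nothing", "remove", "remove_repeating", "replace_with_spaces")

# ASCII 0-31 except tab (9), line feed (10) and carriage return (13).
SYSTEM_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def _handle(text, pattern, mode):
    """Apply one of the four handling modes to text matching `pattern`."""
    if mode == REMOVE:
        return re.sub(pattern, "", text)
    if mode == REMOVE_REPEATING:
        # Assumption: a run of repeats collapses to its first occurrence.
        return re.sub(f"({pattern})(?:{pattern})+", r"\1", text)
    if mode == REPLACE_WITH_SPACES:
        return re.sub(pattern, " ", text)
    return text  # DO_NOTHING


def strip_invisible(value, remove_system=True, tabs=DO_NOTHING,
                    line_breaks=DO_NOTHING, remove_fe_ff=True):
    if not isinstance(value, str):
        return value  # non-text values are left untouched
    if remove_system:
        value = SYSTEM_CHARS.sub("", value)
    value = _handle(value, r"\t", tabs)
    value = _handle(value, r"\r\n|\r|\n", line_breaks)
    if remove_fe_ff:
        # Assumption: "ASCII FE-FF" means code points 254 and 255.
        value = re.sub("[\u00fe\u00ff]", "", value)
    return value
```

For instance, `strip_invisible("a\t\t\tb", tabs=REMOVE_REPEATING)` keeps a single tab between `a` and `b`, while a numeric value such as `42` is returned unchanged.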
 + 
 +\\  
 +=====Remarks===== 
 +The //Remove repeating spaces// option collapses repeating spaces //anywhere// within the text, including runs of leading and trailing spaces.  Every run found within a text value is replaced, so a value containing several runs is fully cleaned.\\ 
 + 
 +\\  
 +=====Examples===== 
 + 
 +**Example:** Clean out all unneeded text characters.\\ 
 +\\   
 +**Source data:** (raw text shown for clarity) 
 +<code> 
 +Sample Text 
 +"  2 Leading spaces" 
 +"2 Trailing spaces  " 
 +"<b>Bold HTML tags</b>" 
 +"2 spaces here->  and 3 spaces here->   ." 
 +</code> 
 + 
 +\\  
 +**Action parameters:** 
 +>Row 1 requires //Trim leading spaces// 
 +>Row 2 requires //Trim trailing spaces// 
 +>Row 3 requires //Remove XML/HTML tags// 
 +>Row 4 requires //Remove repeating spaces// 
 +>Sanitize all columns. 
 + 
 +\\  
 +**Result table:** 
 +^Sample Text^ 
 +|2 Leading spaces| 
 +|2 Trailing spaces| 
 +|Bold HTML tags| 
 +|2 spaces here-> and 3 spaces here-> .| 
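
The example can be reproduced with a short Python sketch. This is an illustration only, not how EasyMorph works internally: the function name is hypothetical, a "tag" is assumed to be anything between `<` and the next `>`, and the first two rows are assumed to carry two leading and two trailing spaces, as their labels indicate.

```python
import re


def sanitize_spaces_and_tags(text):
    # Assumption: a markup tag is anything between "<" and the next ">".
    text = re.sub(r"<[^>]+>", "", text)
    # Remove repeating spaces: runs of two or more spaces become one.
    text = re.sub(r" {2,}", " ", text)
    # Trim leading and trailing whitespace.
    return text.strip()


rows = [
    "  2 Leading spaces",
    "2 Trailing spaces  ",
    "<b>Bold HTML tags</b>",
    "2 spaces here->  and 3 spaces here->   .",
]
result = [sanitize_spaces_and_tags(r) for r in rows]
```

With these inputs, `result` holds the four values shown in the result table.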
 + 
 +\\  
 +====Community examples==== 
 +  * [[https://community.easymorph.com/t//2008/2|“Printed” Text File - Could EasyMorph import this?]] ([[https://community.easymorph.com/uploads/short-url/kIb1qqOJb9WK6D1N1jFZdnHzC46.morph|Project]]; Module: //Parse Group//; Group: //Tab 1//; Table: //Header (3)//; Action position: //5//) 
 +  * [[https://community.easymorph.com/t//2160/1|How to pull data from web APIs with pagination]] ([[https://community.easymorph.com/uploads/short-url/dvCSpcEDXYZ8aB0B2gtnt7qulTF.morph|Project]]; Module: //Main//; Group: //Group 1//; Table: //Query API with pagination//;\\ Action position: //5//) 
 + 
 +\\ 
transformations/sanitize.1534336490.txt.gz · Last modified: 2018/08/15 08:34 by dmitry
