GEN EXPRESS Data Cleanser
Description
The GEN EXPRESS cleanser modifies the data imported for a data point before it enters the sourcing pipeline. It does this by executing a general expression, typically written as a Groovy script. This is particularly useful when a raw value must be altered so that it conforms to the expected data type and format.
Config Location
To configure the GEN EXPRESS cleanser, define it within the config section of the data feed. This involves specifying the cleanser name and entity.
Supported Data Types
- STRING
Config Requirements
Config (
Example Config
Below is an example of how to configure the GEN EXPRESS cleanser in XML format. This example demonstrates a cleanser that removes the "%" character from the "percentage" data point, allowing the data to be cast into a DOUBLE type.
Here is the GEN EXPRESS cleanser, showing only its core structure:
Example Result
- The GEN EXPRESS cleanser executes the specified Groovy script on the "percentage" data point, removing the "%" character. The data point then sources "15.5", which can be cast into a DOUBLE type.
- One or many cleansers can be added to the array. The example above includes only one, but a data point may contain multiple cleansers.
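The transformation described above can be tried outside the pipeline. The sketch below is plain Java rather than the actual Groovy script (a Groovy script would call the same `java.lang.String` methods). The class and method names are hypothetical, and how the pipeline hands the data point value to the script is an assumption, since this page does not show it.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a GEN EXPRESS cleanser; not the product's API.
public class PercentCleanserSketch {

    // Assumed cleanser expression: drop every '%' and trim surrounding whitespace.
    public static String stripPercent(String raw) {
        return raw.replace("%", "").trim();
    }

    // Applies cleansers in list order, mirroring a data point with several cleansers.
    public static String applyAll(String raw, List<UnaryOperator<String>> cleansers) {
        String value = raw;
        for (UnaryOperator<String> cleanser : cleansers) {
            value = cleanser.apply(value);
        }
        return value;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> cleansers = List.of(PercentCleanserSketch::stripPercent);
        String cleaned = applyAll("15.5%", cleansers);
        // Parsing the raw "15.5%" would throw NumberFormatException; the cleaned value parses.
        double asDouble = Double.parseDouble(cleaned);
        System.out.println(cleaned + " parsed as " + asDouble);
    }
}
```

Keeping each cleanser as a small string-to-string function makes the ordering explicit when a data point carries more than one.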
Config Parameters
name | acceptable values | comment |
---|---|---|
root | A Groovy script that defines the operation to be performed on the data point. | The script can include arithmetic operations, data manipulation, and more. |
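
The comment column above mentions arithmetic operations. As a rough illustration of that kind of expression, the sketch below converts a percentage string into a fraction. It is written in plain Java with hypothetical names; whether the pipeline expects the script to return a string or a number is not stated on this page, so the return type here is an assumption.

```java
// Hypothetical illustration of an arithmetic cleanser expression; not the product's API.
public class PercentToFractionSketch {

    // Removes '%', parses the remainder, and divides by 100 (e.g. "50%" becomes 0.5).
    public static double percentToFraction(String raw) {
        String digits = raw.replace("%", "").trim();
        return Double.parseDouble(digits) / 100.0;
    }

    public static void main(String[] args) {
        System.out.println(percentToFraction("50%"));
    }
}
```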
Common Mistakes
- Ensure that the cleanser name and entity are correctly defined in the configuration so that the cleanser is identified and applied during the data processing pipeline.
- Verify that the Groovy script within the <config> tag is correctly written and performs the intended operation on the data point.
- Remember that the configuration for the GEN EXPRESS cleanser is required. If no configuration or an improper configuration is provided, a block violation will be thrown.