
REPROCESS SCHEMA

Description

The REPROCESS SCHEMA processor reruns the consolidation pipeline for the data blocks that match the provided criteria. It is useful when the initial consolidation validation did not capture all necessary data points, or when the data schema has changed and the data must be reevaluated through the consolidation pipeline. It supports all data types, including BLOBs.


Config Location

To configure the REPROCESS SCHEMA processor, define it in the consDPProcessors section of a data point configuration within your schema, specifying the processor name and entity.


Supported Data Types

  • Json
  • String
  • BLOB
  • Boolean
  • Integer
  • Decimal
  • Double

Config Requirements

Config ({}) options are required for the REPROCESS SCHEMA processor. If the configuration is missing or improperly set up, a block violation is thrown. For more information on Apiro violations and how they appear in the Apiro UI or logs, refer to the violations section.


Example Config

Below is an example of how to configure the REPROCESS SCHEMA processor in XML format:

<dataPoint name="NAME" dataType="STRING">
    <consDPProcessors>
        <consDPProcessor name="REPROCESS_WITH_NEW_DATA" entity="REPROCESS_SCHEMA">
            <config>
                <![CDATA[
                    {
                        "schema" : "MY_SCHEMA",
                        "ddMatches" : {
                            "MY_DD" : "value1",
                            "MY_DD2" : "value2"
                        },
                        "inheritContext" : false
                    }
                ]]>
            </config>
        </consDPProcessor>
    </consDPProcessors>
</dataPoint>

Example Result

Upon successful configuration and execution, the REPROCESS SCHEMA processor will rerun the consolidation pipeline for the specified block matches, allowing for a reevaluation of the data based on the updated schema or other changes.


Config Parameters

| name | acceptable values | comment |
|---|---|---|
| schema | A string representing the schema or structure to be used for reprocessing. | This defines the schema for the reprocessing. |
| ddMatches | A JSON object specifying the data dictionary matches for reprocessing. | This includes the data dictionary matches that trigger the reprocessing. |
| inheritContext | A boolean value indicating whether to inherit the context from the previous processing. | If set to false, the context will not be inherited. |
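
As a rough illustration of how the three parameters fit together, the fragment below varies the example config: it matches on a single data dictionary entry and sets inheritContext to true. The processor name REPROCESS_SINGLE_MATCH is a hypothetical label, MY_SCHEMA and MY_DD are the placeholder names from the example config, and the reading that true carries the context over from the previous processing is inferred from the parameter description rather than confirmed.

```xml
<dataPoint name="NAME" dataType="STRING">
    <consDPProcessors>
        <!-- name is a hypothetical label; entity selects the REPROCESS_SCHEMA processor -->
        <consDPProcessor name="REPROCESS_SINGLE_MATCH" entity="REPROCESS_SCHEMA">
            <!-- schema: placeholder schema name to reprocess against
                 ddMatches: one data dictionary match (placeholder key and value)
                 inheritContext: true is assumed to inherit the previous context -->
            <config>
                <![CDATA[
                    {
                        "schema" : "MY_SCHEMA",
                        "ddMatches" : {
                            "MY_DD" : "value1"
                        },
                        "inheritContext" : true
                    }
                ]]>
            </config>
        </consDPProcessor>
    </consDPProcessors>
</dataPoint>
```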

Common Mistakes

  • Ensure that the processor name and entity are correctly defined so that the processor is identified and applied during the data processing pipeline.
  • Verify that the configuration parameters within the <config> tag are correctly written and match the expected format and values.
  • Remember that a configuration is required for the REPROCESS SCHEMA processor. If the configuration is missing or improper, a block violation will be thrown.
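
The points above can be mapped onto a config as in the annotated sketch below. It reuses the placeholder names from the example config (the processor name CHECKED_REPROCESS is hypothetical), and it assumes the config body is parsed as strict JSON, which this page does not state explicitly.

```xml
<dataPoint name="NAME" dataType="STRING">
    <consDPProcessors>
        <!-- Check 1: both name and entity are present; the entity value
             follows the example config (REPROCESS_SCHEMA) -->
        <consDPProcessor name="CHECKED_REPROCESS" entity="REPROCESS_SCHEMA">
            <!-- Check 2: under strict JSON (an assumption), every key is
                 double-quoted, entries are comma-separated, and the boolean
                 is written without quotes -->
            <!-- Check 3: the config element is neither omitted nor left
                 empty; otherwise a block violation is thrown -->
            <config>
                <![CDATA[
                    {
                        "schema" : "MY_SCHEMA",
                        "ddMatches" : {
                            "MY_DD" : "value1",
                            "MY_DD2" : "value2"
                        },
                        "inheritContext" : false
                    }
                ]]>
            </config>
        </consDPProcessor>
    </consDPProcessors>
</dataPoint>
```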