HARD MATCH SYNTH

Description

The HARD MATCH SYNTH processor performs hard entity matching during the consolidation phase of data processing. It matches entities against predefined criteria, so that data is aligned according to the matching logic specified in the configuration. It supports all data types, including BLOBs.


Configuration

To configure the HARD MATCH SYNTH processor, define it in the consDPProcessors section of the data point configuration in your schema, specifying the processor name and entity.


Supported Data Types

  • Decimal
  • Double
  • Integer
  • String
  • Boolean

Config Requirements

Config ({}) options are required for the HARD MATCH SYNTH processor. If no configuration is provided or if the configuration is improperly set up, a block violation will be thrown. For more information on Apiro violations and their appearance in the Apiro UI or logs, refer to the violations section.


Example Config

Below is an example of how to configure the HARD MATCH SYNTH processor in XML format:

<dataPoint name="NAME" dataType="STRING">
    <consDPProcessors>
        <consDPProcessor name="MATCH_DATA" entity="HARD_MATCH_SYNTH">
            <config>
                <![CDATA[
                    {
                        "matchSch":"COMPANY_DETAILS",
                        "matchCrit":{},
                        "copyFromDP":{},
                        "copyToDP":{},
                        "fuzzyInfoDD":"fuzzy_info",
                        "synthSrcDD":"synth_src"
                    }
                ]]>
            </config>
        </consDPProcessor>
    </consDPProcessors>
</dataPoint>

Example Result

Once configured, the HARD MATCH SYNTH processor runs the specified matching logic on the data point during consolidation and matches entities according to the criteria defined in the configuration.


Config Parameters

| name | acceptable values | comment |
|---|---|---|
| matchSch | A string representing the schema or criteria for matching entities. | This defines the basis for the matching process. |
| matchCrit | A JSON object defining the specific criteria for matching entities. | This includes any conditions or rules that must be met for a match. |
| copyFromDP | A JSON object specifying the data points to copy from during the match. | This defines the source of data for the match process. |
| copyToDP | A JSON object specifying the data points to copy to after a successful match. | This defines the destination for matched data. |
| fuzzyInfoDD | A string representing the data dictionary for fuzzy matching information. | This is used for storing and retrieving fuzzy matching data. |
| synthSrcDD | A string representing the data dictionary for synthetic source data. | This is used for storing and retrieving synthetic source data. |
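
Because a malformed config leads to a block violation, it can help to check the JSON before deploying it. The following is a minimal Python sketch, not part of Apiro: the function name `check_config` is hypothetical, the expected types are read off the parameter table above, and treating all six keys as mandatory is an assumption that the table does not confirm. It also reports repeated keys, which a JSON parser would otherwise silently collapse.

```python
import json

# Expected JSON type for each parameter, taken from the parameter table.
# Assumption: every key is mandatory; the documentation does not say so.
EXPECTED_TYPES = {
    "matchSch": str,
    "matchCrit": dict,
    "copyFromDP": dict,
    "copyToDP": dict,
    "fuzzyInfoDD": str,
    "synthSrcDD": str,
}


def check_config(raw_json):
    """Return a list of human-readable problems found in the config JSON."""
    problems = []

    def collect_pairs(pairs):
        # json.loads keeps only the last value of a repeated key, so
        # duplicates are recorded here before the dict is built.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                problems.append(f"duplicate key: {key}")
            seen.add(key)
        return dict(pairs)

    try:
        config = json.loads(raw_json, object_pairs_hook=collect_pairs)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(config, dict):
        return ["config must be a JSON object"]

    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected):
            problems.append(f"{key} should be a {expected.__name__}")
    return problems
```

An empty list means the sketch found nothing to report; it does not guarantee that Apiro will accept the config, since the contents of matchCrit, copyFromDP, and copyToDP are not inspected.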

Common Mistakes

  • Ensure that the processor name and entity are correctly defined in the configuration, so that the processor is identified and applied during the data processing pipeline.
  • Verify that the configuration parameters within the <config> tag are correctly written and match the expected format and values.
  • Remember that configuration is required for the HARD MATCH SYNTH processor. If the configuration is missing or improperly set up, a block violation will be thrown.
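
The mistakes listed above can be caught with a small structural check before the schema is loaded. This is a hedged Python sketch using only the standard library: the function name `find_processor_problems` is hypothetical, the entity value `HARD_MATCH_SYNTH` is taken from the example config, and the check covers only the XML structure and JSON syntax, not Apiro's own validation rules.

```python
import json
import xml.etree.ElementTree as ET


def find_processor_problems(data_point_xml, entity="HARD_MATCH_SYNTH"):
    """Return a list of structural problems in a dataPoint XML snippet."""
    problems = []
    root = ET.fromstring(data_point_xml)

    # Collect every consDPProcessor whose entity attribute matches exactly.
    processors = [
        proc for proc in root.iter("consDPProcessor")
        if proc.get("entity") == entity
    ]
    if not processors:
        problems.append(f"no consDPProcessor with entity {entity}")

    for proc in processors:
        label = proc.get("name") or "<unnamed>"
        if not proc.get("name"):
            problems.append("processor is missing a name attribute")

        # CDATA content is exposed as ordinary element text by ElementTree.
        config = proc.find("config")
        text = (config.text or "").strip() if config is not None else ""
        if not text:
            problems.append(f"{label}: config is missing or empty")
            continue

        try:
            json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append(f"{label}: config is not valid JSON ({exc.msg})")
    return problems
```

Passing the example dataPoint definition as a string returns an empty list; a misspelled entity or an empty <config> element produces one message per problem.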