FUZZY MATCH SYNTH

Description

The FUZZY MATCH SYNTH processor performs fuzzy entity matching in data processing pipelines. It uses string matching techniques to identify and synthesize matches between entities based on their string representations, even when the values contain minor discrepancies or variations.
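
To make the idea of fuzzy string matching concrete, here is a minimal Python sketch. It is not the processor's implementation: the matching algorithm FUZZY MATCH SYNTH uses is not documented here, so this example substitutes the standard library's difflib.SequenceMatcher. The function names, the normalization step, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] after light normalization.

    Normalization (lowercasing, collapsing whitespace) is an assumption;
    the processor may treat case and spacing differently.
    """
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def best_match(value: str, candidates: List[str],
               threshold: float = 0.8) -> Optional[str]:
    """Return the candidate most similar to value, or None if no
    candidate reaches the (hypothetical) threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: similarity(value, c))
    return best if similarity(value, best) >= threshold else None


if __name__ == "__main__":
    known = ["Acme Corporation", "Apex Industries"]
    # A misspelled name still resolves to the closest known entity.
    print(best_match("Acme Corporaton", known))
    # An unrelated name falls below the threshold and yields None.
    print(best_match("Zebra", known))
```

The threshold trades recall for precision: a lower value tolerates larger discrepancies but increases the risk of matching unrelated entities.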


Config Location

To configure the FUZZY MATCH SYNTH processor, define it in the consDPProcessors section of the data point configuration in your schema, specifying the processor name and entity.


Supported Data Types

  • String

The FUZZY MATCH SYNTH processor works only with string data types because it relies on string matching algorithms [0].


Config Requirements

The FUZZY MATCH SYNTH processor requires no config options; the config can be left as an empty object ({}). This makes the processor straightforward to add to existing data processing pipelines.


Example Config

Below is an example of how to configure the FUZZY MATCH SYNTH processor in XML format:

<dataPoint name="NAME" dataType="STRING">
    <consDPProcessors>
        <consDPProcessor name="MATCH_DATA" entity="FUZZY_MATCH_SYNTH">
            <config>{}</config>
        </consDPProcessor>
    </consDPProcessors>
</dataPoint>

This configuration is minimal, as no specific configuration options are required for this processor.


Config Parameters

Since config options are not required for the FUZZY MATCH SYNTH processor, there are no specific configuration parameters to list.


Common Mistakes

  • Define the processor name and entity correctly in the configuration so that the processor is identified and applied during pipeline processing.
  • The FUZZY MATCH SYNTH processor is intended for string data types. Confirm that the data item you apply it to has the correct data type.
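
Both mistakes above can be caught before deployment with a small pre-flight check. The sketch below is a hypothetical helper, not part of the product: it assumes the data point is available as a standalone XML snippet with dataPoint as the root element, as in the example config, and it uses only Python's standard xml.etree.ElementTree module. The function name and the problem messages are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import List

# The example configuration from this page.
EXAMPLE = """
<dataPoint name="NAME" dataType="STRING">
    <consDPProcessors>
        <consDPProcessor name="MATCH_DATA" entity="FUZZY_MATCH_SYNTH">
            <config>{}</config>
        </consDPProcessor>
    </consDPProcessors>
</dataPoint>
"""


def check_fuzzy_match_config(xml_text: str) -> List[str]:
    """Return a list of detected problems; an empty list means none found.

    Assumes the root element is <dataPoint> (hypothetical standalone input).
    """
    problems: List[str] = []
    data_point = ET.fromstring(xml_text)
    processors = data_point.findall("./consDPProcessors/consDPProcessor")
    fuzzy = [p for p in processors if p.get("entity") == "FUZZY_MATCH_SYNTH"]

    # Mistake 1: entity missing or misspelled, or processor left unnamed.
    if not fuzzy:
        problems.append("no consDPProcessor with entity='FUZZY_MATCH_SYNTH'")
    for proc in fuzzy:
        if not proc.get("name"):
            problems.append("processor is missing a name attribute")

    # Mistake 2: processor attached to a non-string data point.
    if fuzzy and data_point.get("dataType") != "STRING":
        problems.append(
            "dataType must be STRING, got %r" % data_point.get("dataType")
        )
    return problems


if __name__ == "__main__":
    print(check_fuzzy_match_config(EXAMPLE))
```

Running the check against the example config on this page reports no problems; changing the dataType attribute or misspelling the entity produces one message each.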