ANONYMISE JSON SINK
Description
The ANONYMISE JSON SINK processor handles the anonymisation of data sent to JSON data sinks within a data processing pipeline. It is particularly useful when sensitive data needs to be transformed into JSON and sent to a specified sink, such as a Hazelcast cluster queue, while ensuring the privacy of the data.
Config Location
To configure the ANONYMISE JSON SINK processor, define it within the schemaAppliedProcessors section of your schema configuration. This involves specifying the processor name, entity, and configuration details such as the JSON payload structure and the data sink target.
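As a rough sketch of where the processor entry sits, assuming an XML layout along the lines of the example further down (the element names and the entity value are illustrative assumptions, not the authoritative Apiro schema):

```xml
<!-- Sketch only: element names and the entity value are assumptions. -->
<schemaAppliedProcessors>
  <processor>
    <name>ANONYMISE JSON SINK</name>
    <entity>CUSTOMER</entity> <!-- hypothetical entity -->
    <config>
      <!-- jsonPayload, mimeType and dataSink parameters go here; see Example Config below -->
    </config>
  </processor>
</schemaAppliedProcessors>
```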
Supported Data Types
- String
- Double
- Decimal
- Integer
- Json
- BLOB
- Boolean
Config Requirements
The configuration for the ANONYMISE JSON SINK processor is required. If no configuration is provided, or if the configuration is improperly set up, a block violation will be thrown. For more information on Apiro violations and how they appear in the Apiro UI or logs, refer to the violations section.
Example Config
Below is an example of how to configure the ANONYMISE JSON SINK processor in XML format:
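The sketch below is an illustrative reconstruction based on the Example Result and Config Parameters sections, assuming an XML layout; the element names, the CTX reference syntax, and the dataSink name and entity values are assumptions rather than exact Apiro syntax:

```xml
<!-- Illustrative sketch only: element names, the ${CTX.*} reference syntax,
     and the dataSink "name"/"entity" values are assumptions; jsonPayload,
     mimeType, dataSink and queueName come from the Config Parameters below. -->
<schemaAppliedProcessors>
  <processor>
    <name>ANONYMISE JSON SINK</name>
    <entity>CUSTOMER</entity> <!-- hypothetical entity -->
    <config>
      <jsonPayload>
        {
          "name":     "${CTX.name}",
          "surname":  "${CTX.surname}",
          "address":  "${CTX.address}",
          "phone":    "${CTX.phone}",
          "postcode": "${CTX.postcode}",
          "email":    "${CTX.email}"
        }
      </jsonPayload>
      <mimeType>application/octet-stream</mimeType>
      <dataSink>
        {
          "name": "customerDetailsSink",
          "entity": "HAZELCAST_QUEUE_SINK",
          "config": { "queueName": "CUSTOMER_DETAILS" }
        }
      </dataSink>
    </config>
  </processor>
</schemaAppliedProcessors>
```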
Example Result
- The configuration defines a JSON payload with fields for name, surname, address, phone, postcode, and email.
- These fields are dynamically populated from the context (CTX).
- The data is then sent to a Hazelcast cluster queue named CUSTOMER_DETAILS, ensuring the privacy of the data.
Config Parameters
name | acceptable values | comment |
---|---|---|
jsonPayload | custom JSON object containing one or many fields | These can be sourced from CTX |
mimeType | mimeType of the dataSink input, usually "application/octet-stream" | |
dataSink | JSON object representing a data sink entity: contains the required fields "name", "entity" and "config"; "config" contains the field "queueName", which holds the name of the Hazelcast queue | see data sinks for more |
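For instance, the dataSink value described in the last row could take roughly this shape (the "name" and "entity" values here are illustrative placeholders; only the field names are documented above):

```xml
<dataSink>
  {
    "name": "customerDetailsSink",
    "entity": "HAZELCAST_QUEUE_SINK",
    "config": { "queueName": "CUSTOMER_DETAILS" }
  }
</dataSink>
```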
Common Mistakes
- Ensure that the jsonPayload structure matches the expected format for the data being processed. Mismatches in the payload structure can lead to processing errors.
- Verify that the dataSink configuration correctly specifies the target entity and any necessary configuration parameters, such as the queue name for a Hazelcast cluster queue.
- Remember that the processor name and entity must be correctly defined in the configuration so that the processor is identified and applied correctly during the data processing pipeline.
- Config ({}) options are required. If no configuration, or an improper configuration, is provided, a block violation will be thrown. See violations for more on Apiro violations and what they look like in the Apiro UI or Apiro logs.