Transformer - ChatGPT

Description

The ChatGPT transformer integrates OpenAI GPT models into Apiro feeds. It uses prompts written so that the model returns a JSON structure, which the feed can then map to individual fields, as the `explicitMappings` section of the example below does.

Config

Parameters

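Parameter	Type	Default	Description
`apiKey`	String	N/A	Your OpenAI API key, used to access the GPT model.
`model`	String	`gpt-4`	The GPT model version to use. Set inside the `request` object in the example below.
`messages`	Array	N/A	An array of message objects, each containing a `role` and `content`. Set inside the `request` object in the example below.
`role`	String	`user`	Field of each message object: the role of the message sender, typically `user` or `system`.
`content`	String	N/A	Field of each message object: the content of the message.
`substitutions`	Object	N/A	An object containing key-value pairs for dynamic content substitution. In the example below, the key `CERTAINTY_FACTOR` replaces `[CERTAINTY_FACTOR]` in the prompt.
`CERTAINTY_FACTOR`	Number	`68`	A key of `substitutions`: the certainty threshold (a percentage) below which a field triggers manual review. The example below passes it as the string `"68"`.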
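
The effect of `substitutions` can be pictured with a short Python sketch. This is not Apiro's implementation; it only assumes that each `[KEY]` placeholder in the prompt is replaced by the matching value from `substitutions`. The function name `apply_substitutions` is hypothetical.

```python
import re


def apply_substitutions(prompt: str, substitutions: dict) -> str:
    """Replace each [KEY] placeholder with its value from `substitutions`.

    Assumption: placeholders use square brackets, as in the example prompt.
    Placeholders with no matching key are left untouched so they are easy to spot.
    """
    def replace(match: re.Match) -> str:
        key = match.group(1)
        return str(substitutions.get(key, match.group(0)))

    return re.sub(r"\[([A-Za-z0-9_]+)\]", replace, prompt)


prompt = "make manual_review true if any certainty is less than [CERTAINTY_FACTOR]"
print(apply_substitutions(prompt, {"CERTAINTY_FACTOR": "68"}))
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a missing substitution visible in the final prompt.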

Example

<?xml version="1.0" encoding="UTF-8"?>
<apiroConf version="1" xmlns="http://apiro.com/apiro/v1/root">
    <loadOrder>20</loadOrder>
    <envProperties>
        <envProperty>
            <name>CHATGPT_ENERGYBILL_PROMPT</name>
            <value>
                <![CDATA[
prepare a json message of the following format { firstname, lastname, service_address, commodity being the product being sold, kw_hr being kilowatt-hours of energy consumed, kilojoules being the kilojoule equivalent of energy consumed as the calculated number, bill_days as an integer, due_date in format yyyy/mm/dd, delivery_line_diameter in cm, energy_company as a string being the company issuing the bill, energy_company_abn as the Australian business number of the issuing company }

if you cannot find a field, set its value to null

for each field place another field with name <field>_certainty with the value being your estimate of extraction accuracy as a percentage.

place another boolean field in the json called manual_review and make it true if any of the certainty fields other than those related to delivery_line_diameter are less than [CERTAINTY_FACTOR], otherwise make it false

place another string field in the json called "review_reason" outlining the reason the manual review field was set to true. set it to an empty string if not utilised.

finally add one more string field called gpt_comment with any information you regard as important to convey regarding the extraction of the information

do not output anything other than the json

                ]]>
            </value>
            <jsonEscape>true</jsonEscape>
        </envProperty>
    </envProperties>
    <dataFeeds>
        <dataFeed definition="EXPR_JSON_FEED2" name="ENERGYBILL_CHATGPT">
            <execPriority>10</execPriority>
            <enabled>true</enabled>
            <push>false</push>
            <pull>true</pull>
            <schema>ENERGYBILL</schema>
            <config><![CDATA[
{
  "dataSource": {
    "entity": "GIT",
    "config": {
      "password": "${SYS:TESTFEED_GIT_PASSWORD}",
      "gitURL": "https://github.com/redapiro/apiro_engine_test_feeds.git",
      "branch": "rudtest",
      "pathPrefix": "/rudtest/energybills/petros.pdf",
      "username": "apirobot",
      "transformers": [
        {
          "name": "CALL_CHATGPT",
          "entity": "CHATGPT",
          "config": {
            "apiKey": "${SYS:OPENAI_API_KEY}",
            "substitutions" : {
                "CERTAINTY_FACTOR":"68"
            },
            "request": {
              "model": "gpt-4",
              "messages": [
                {
                  "role": "user",
                  "content": "${SYS:CHATGPT_ENERGYBILL_PROMPT}"
                }
              ]
            }
          }
        }
      ]
    }
  },
  "explicitMappings": [
    {
      "dictionary": "full_json",
      "value": "#{PAYLOAD.resolve('$')}"
    },
    {
      "dictionary": "manual_review",
      "value": "#{PAYLOAD.resolve('$.manual_review')}"
    },
    {
      "dictionary": "gpt_comment",
      "value": "#{PAYLOAD.resolve('$.gpt_comment')}"
    },
    {
      "dictionary": "energy_company_abn",
      "value": "#{PAYLOAD.resolve('$.energy_company_abn')}"
    }
  ]
}
]]>
            </config>
        </dataFeed>
    </dataFeeds>
</apiroConf>
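
Because `manual_review` is computed by the model itself, a downstream consumer may want to recompute it from the `_certainty` fields. The Python sketch below does that under stated assumptions: certainties arrive as numbers or strings such as `"95%"`, and an unreadable certainty counts as needing review. The helper names and the sample response are hypothetical and not part of Apiro.

```python
def to_percent(value):
    """Convert 95, 95.0, "95" or "95%" to a float; return None if not parseable."""
    if isinstance(value, bool) or value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    try:
        return float(str(value).strip().rstrip("%"))
    except ValueError:
        return None


def needs_manual_review(payload, threshold, excluded=("delivery_line_diameter",)):
    """Return True if any non-excluded *_certainty field is below `threshold`."""
    for key, value in payload.items():
        if not key.endswith("_certainty"):
            continue
        # The example prompt exempts delivery_line_diameter from the check.
        if any(key.startswith(name) for name in excluded):
            continue
        certainty = to_percent(value)
        # Assumption: an unreadable certainty is treated as needing review.
        if certainty is None or certainty < threshold:
            return True
    return False


# Made-up response fragment, for illustration only.
response = {
    "energy_company_abn": None,
    "energy_company_abn_certainty": 40,
    "delivery_line_diameter": None,
    "delivery_line_diameter_certainty": 5,
    "manual_review": True,
}
expected = needs_manual_review(response, threshold=68)
if expected != response["manual_review"]:
    print("manual_review flag disagrees with the certainty fields")
```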

The following excerpt from the XML example above shows only the transformer definition:

{
    "transformers": [
        {
          "name": "CALL_CHATGPT",
          "entity": "CHATGPT",
          "config": {
            "apiKey": "${SYS:OPENAI_API_KEY}",
            "substitutions" : {
                "CERTAINTY_FACTOR":"68"
            },
            "request": {
              "model": "gpt-4",
              "messages": [
                {
                  "role": "user",
                  "content": "${SYS:CHATGPT_ENERGYBILL_PROMPT}"
                }
              ]
            }
          }
        }
    ]
}
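
If the fragment is generated rather than written by hand, a small Python helper can assemble the same structure. This is only a sketch: `build_chatgpt_transformer` is a hypothetical name, and the optional `system` message relies on the `role` values listed in the parameters table, not on anything shown in the example.

```python
import json


def build_chatgpt_transformer(prompt_ref, certainty_factor, system_prompt=None, model="gpt-4"):
    """Assemble a transformer entry shaped like the excerpt above."""
    messages = []
    if system_prompt:
        # Assumption: a system message may precede the user message.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt_ref})
    return {
        "name": "CALL_CHATGPT",
        "entity": "CHATGPT",
        "config": {
            "apiKey": "${SYS:OPENAI_API_KEY}",
            # The example passes the threshold as a string, so do the same here.
            "substitutions": {"CERTAINTY_FACTOR": str(certainty_factor)},
            "request": {"model": model, "messages": messages},
        },
    }


transformer = build_chatgpt_transformer("${SYS:CHATGPT_ENERGYBILL_PROMPT}", 68)
print(json.dumps({"transformers": [transformer]}, indent=2))
```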

Common Mistakes

Incorrect API Key: Ensure that the `apiKey` parameter is set to a valid API key. The example reads it from `${SYS:OPENAI_API_KEY}` rather than hard-coding it.
Model Version Mismatch: Verify that the `model` parameter names the GPT model version you intend to use.
Incorrect Message Format: Ensure that the `messages` array is correctly formatted, with each message object containing both a `role` and a `content`.
Missing Substitutions: If the prompt contains placeholders such as `[CERTAINTY_FACTOR]`, ensure that each one has a matching, correctly formatted key in `substitutions`.
CERTAINTY_FACTOR Misuse: `CERTAINTY_FACTOR` should be a number representing the certainty threshold, as a percentage, below which a field triggers manual review. Set it according to your application's needs.
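
The checks above can be automated before a feed is deployed. The Python sketch below is one possible pre-flight check, not an Apiro feature: `check_transformer_config` is a hypothetical helper, it assumes the `config` layout shown in the example, and it assumes placeholders are upper-case names in square brackets. It cannot tell whether an API key is actually valid, only whether one is present.

```python
import re


def check_transformer_config(config: dict, prompt_text: str = "") -> list:
    """Return a list of problems found in a CHATGPT transformer `config` object."""
    problems = []
    if not config.get("apiKey"):
        problems.append("apiKey is missing or empty")

    request = config.get("request", {})
    if not request.get("model"):
        problems.append("request.model is missing or empty")

    messages = request.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("request.messages must be a non-empty array")
    else:
        for i, message in enumerate(messages):
            if not isinstance(message, dict):
                problems.append(f"request.messages[{i}] is not an object")
                continue
            for field in ("role", "content"):
                if not message.get(field):
                    problems.append(f"request.messages[{i}] is missing {field}")

    substitutions = config.get("substitutions", {})
    # Every [KEY] placeholder in the prompt needs a matching substitution.
    for key in sorted(set(re.findall(r"\[([A-Z0-9_]+)\]", prompt_text))):
        if key not in substitutions:
            problems.append(f"no substitution defined for [{key}]")

    factor = substitutions.get("CERTAINTY_FACTOR")
    if factor is not None:
        try:
            value = float(factor)
        except (TypeError, ValueError):
            problems.append("CERTAINTY_FACTOR is not numeric")
        else:
            if not 0 <= value <= 100:
                problems.append("CERTAINTY_FACTOR should be between 0 and 100")
    return problems


config = {
    "apiKey": "${SYS:OPENAI_API_KEY}",
    "substitutions": {"CERTAINTY_FACTOR": "68"},
    "request": {"model": "gpt-4", "messages": [{"role": "user", "content": "..."}]},
}
print(check_transformer_config(config, "less than [CERTAINTY_FACTOR]"))
```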