Section 3 - Create a second GIT Data Source (Excel) manually, using IntelliJ

In this section we will:

  • Source the Excel file customers_b.xlsx from GIT.
  • Create data schema, data source, and data feed config files.
  • Create data pipelines.
  • Push all the files to GIT in line with the GitOps framework.

Walk through

Generate a second feed for the existing schema CUSTOMERS
  1. Create a new configuration file called FEED_CUSTOMERS_B_XLSX.xml.
  2. Copy the configuration shown below and paste it into the new file. Note: This is the actual file produced in Section 1 (GIT via UI), where it was called FEED_CUSTOMERS_A_XLSX.xml.
  3. In line 7 below, replace CUSTOMERS_A_XLSX with CUSTOMERS_B_XLSX in the name attribute of <dataFeed>, but keep <schema>CUSTOMERS</schema> at line 91 unchanged. Note: This associates a second feed with the same schema, CUSTOMERS, so that the values sourced from both feeds are aggregated.
  4. In line 25 below, replace customers_a.xlsx with customers_b.xlsx in pathPrefix. Note: This tells the feed defined in FEED_CUSTOMERS_B_XLSX.xml to source its data from the Excel file customers_b.xlsx.
            <?xml version="1.0" encoding="UTF-8"?>
    
            <apiroConf version="1" xmlns="http://apiro.com/apiro/v1/root">
              <groups/>  
              <loadOrder>30</loadOrder>  
              <dataFeeds> 
                <dataFeed definition="EXPR_EXCEL_FEED2" name="CUSTOMERS_A_XLSX">
                  <groupTags> 
                    <groupTag>DEFAULT</groupTag> 
                  </groupTags>  
                  <metaData/>  
                  <abstract>false</abstract>  
                  <inheritable>true</inheritable>  
                  <description/>  
                  <execPriority>50</execPriority>  
                  <execPredicate>#GRV{true}</execPredicate>  
                    <config>
                        <![CDATA[{
                            "dataSource": {
                              "config": {
                                "username": "${SYS:APIRO_GITHUB_USERNAME}",
                                "password": "${SYS:APIRO_GITHUB_PW}",
                                "gitURL": "https://github.com/redapiro/apiro_examples.git",
                                "branch": "main",
                                "pathPrefix": "/artifacts/source_files/customers_a.xlsx"
                              },
                              "entity": "GIT"
                            },
                            "dataAtLine": 2,
                            "sheet": "data",
                            "itemLimit": 20,
                            "explicitMappings": [
                              {
                                "dictionary": "BAC",
                                "value": "#GRV{PAYLOAD.resolve('A')}"
                              },
                              {
                                "dictionary": "FIRST_NAME",
                                "value": "#GRV{PAYLOAD.resolve('B')}"
                              },
                              {
                                "dictionary": "LAST_NAME",
                                "value": "#GRV{PAYLOAD.resolve('C')}"
                              },
                              {
                                "dictionary": "ADDRESS",
                                "value": "#GRV{PAYLOAD.resolve('D')}"
                              },
                              {
                                "dictionary": "PHONE_NUMBER",
                                "value": "#GRV{PAYLOAD.resolve('E')}"
                              },
                              {
                                "dictionary": "AGE",
                                "value": "#GRV{PAYLOAD.resolve('F')}"
                              },
                              {
                                "dictionary": "YEARLY_INCOME",
                                "value": "#GRV{PAYLOAD.resolve('G')}"
                              },
                              {
                                "dictionary": "TFN",
                                "value": "#GRV{PAYLOAD.resolve('H')}"
                              },
                              {
                                "dictionary": "PORTFOLIO_VALUE",
                                "value": "#GRV{PAYLOAD.resolve('I')}"
                              },
                              {
                                "dictionary": "COMPANY_NAME",
                                "value": "#GRV{PAYLOAD.resolve('J')}"
                              },
                              {
                                "dictionary": "COMPANY_ADDRESS",
                                "value": "#GRV{PAYLOAD.resolve('K')}"
                              },
                              {
                                "dictionary": "COMPANY_WEBSITE",
                                "value": "#GRV{PAYLOAD.resolve('L')}"
                              },
                              {
                                "dictionary": "PROFILE_IMAGE",
                                "value": "#GRV{PAYLOAD.resolve('M')}"
                              }
                            ]
                          }
                        ]]>
                      </config>
                  <longLived>true</longLived>  
                  <enabled>false</enabled>  
                  <schema>CUSTOMERS</schema>
                  <push>true</push>  
                  <pull>true</pull>  
                  <cronTriggers> 
                    <cronTrigger> 
                      <description>Every day at 6pm (18:00)</description>  
                      <cron>0 0 18 ? * * *</cron> 
                    </cronTrigger> 
                  </cronTriggers> 
                </dataFeed> 
              </dataFeeds>  
              <dataSinks/> 
            </apiroConf>
    
  5. The contents of the resulting file are shown below. Note: In addition to the two changes above, this listing maps two more columns, N and O, to the dictionaries XML_ROOT_DOC and JSON_ROOT_DOC.
            <?xml version="1.0" encoding="UTF-8"?>
    
            <apiroConf version="1" xmlns="http://apiro.com/apiro/v1/root">
              <groups/>  
              <loadOrder>30</loadOrder>  
              <dataFeeds> 
                <dataFeed definition="EXPR_EXCEL_FEED2" name="CUSTOMERS_B_XLSX">
                  <groupTags> 
                    <groupTag>DEFAULT</groupTag> 
                  </groupTags>  
                  <metaData/>  
                  <abstract>false</abstract>  
                  <inheritable>true</inheritable>  
                  <description/>  
                  <execPriority>50</execPriority>  
                  <execPredicate>#GRV{true}</execPredicate>  
                    <config>
                        <![CDATA[{
                            "dataSource": {
                              "config": {
                                "username": "${SYS:APIRO_GITHUB_USERNAME}",
                                "password": "${SYS:APIRO_GITHUB_PW}",
                                "gitURL": "https://github.com/redapiro/apiro_examples.git",
                                "branch": "main",
                                "pathPrefix": "/artifacts/source_files/customers_b.xlsx"
                              },
                              "entity": "GIT"
                            },
                            "dataAtLine": 2,
                            "sheet": "data",
                            "itemLimit": 20,
                            "explicitMappings": [
                              {
                                "dictionary": "BAC",
                                "value": "#GRV{PAYLOAD.resolve('A')}"
                              },
                              {
                                "dictionary": "FIRST_NAME",
                                "value": "#GRV{PAYLOAD.resolve('B')}"
                              },
                              {
                                "dictionary": "LAST_NAME",
                                "value": "#GRV{PAYLOAD.resolve('C')}"
                              },
                              {
                                "dictionary": "ADDRESS",
                                "value": "#GRV{PAYLOAD.resolve('D')}"
                              },
                              {
                                "dictionary": "PHONE_NUMBER",
                                "value": "#GRV{PAYLOAD.resolve('E')}"
                              },
                              {
                                "dictionary": "AGE",
                                "value": "#GRV{PAYLOAD.resolve('F')}"
                              },
                              {
                                "dictionary": "YEARLY_INCOME",
                                "value": "#GRV{PAYLOAD.resolve('G')}"
                              },
                              {
                                "dictionary": "TFN",
                                "value": "#GRV{PAYLOAD.resolve('H')}"
                              },
                              {
                                "dictionary": "PORTFOLIO_VALUE",
                                "value": "#GRV{PAYLOAD.resolve('I')}"
                              },
                              {
                                "dictionary": "COMPANY_NAME",
                                "value": "#GRV{PAYLOAD.resolve('J')}"
                              },
                              {
                                "dictionary": "COMPANY_ADDRESS",
                                "value": "#GRV{PAYLOAD.resolve('K')}"
                              },
                              {
                                "dictionary": "COMPANY_WEBSITE",
                                "value": "#GRV{PAYLOAD.resolve('L')}"
                              },
                              {
                                "dictionary": "PROFILE_IMAGE",
                                "value": "#GRV{PAYLOAD.resolve('M')}"
                              },
                              {
                                "dictionary": "XML_ROOT_DOC",
                                "value": "#GRV{PAYLOAD.resolve('N')}"
                              },
                              {
                                "dictionary": "JSON_ROOT_DOC",
                                "value": "#GRV{PAYLOAD.resolve('O')}"
                              }
                            ]
                          }
                        ]]>
                      </config>
                  <longLived>true</longLived>  
                  <enabled>false</enabled>  
                  <schema>CUSTOMERS</schema>
                  <push>true</push>  
                  <pull>true</pull>  
                  <cronTriggers> 
                    <cronTrigger> 
                      <description>Every day at 6pm (18:00)</description>  
                      <cron>0 0 18 ? * * *</cron> 
                    </cronTrigger> 
                  </cronTriggers> 
                </dataFeed> 
              </dataFeeds>  
              <dataSinks/> 
            </apiroConf>
    
  6. Save and push your updates to GIT, then reload the configuration files as explained at the bottom of the page.
  7. The screenshots below show that we are now sourcing data from two files and using the default data aggregators to consolidate it.
    [Screenshot: raw_data]
    • Notice that Bob's yearly income is 98,000 in one feed and 100,000 in the other.
    [Screenshot: aggregated_data]
    • Notice that Bob's aggregated yearly income is now 99,000, because the default aggregation algorithm is MEAN_AVERAGE.
    • We will see later in the guide how to customize the default behaviour.
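The 99,000 figure in step 7 is the arithmetic mean of the two feed values. As a quick check of that arithmetic, here is a minimal Python sketch. The function name mean_average is hypothetical and only mirrors the name of the aggregator; it is not Apiro's implementation.

```python
def mean_average(values):
    """Arithmetic mean of the values reported by each feed (illustrative helper)."""
    values = list(values)
    if not values:
        raise ValueError("at least one feed value is required")
    return sum(values) / len(values)


# Bob's yearly income as reported by the two feeds in the walkthrough.
bob_yearly_income = [98_000, 100_000]

print(mean_average(bob_yearly_income))  # 99000.0
```

With two feeds, the mean sits halfway between the two source values, which is why neither feed's original figure appears in the aggregated record.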
Deploy config files
  • Follow the steps in Config Deployment to deploy and start using your configuration files.
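Before deploying, the edits from steps 3 and 4 can be cross-checked with a short script. The sketch below is plain Python and not part of Apiro. The helper names derive_feed_b and summarize_feed, and the trimmed-down FEED_A sample, are assumptions made for illustration. It applies the two substitutions, parses the result, and reads back the feed name, schema, and pathPrefix.

```python
import json
import xml.etree.ElementTree as ET

# Namespace declared on the <apiroConf> root element of the feed config.
NS = {"a": "http://apiro.com/apiro/v1/root"}

# Trimmed-down stand-in for FEED_CUSTOMERS_A_XLSX.xml (illustration only).
FEED_A = """<?xml version="1.0" encoding="UTF-8"?>
<apiroConf version="1" xmlns="http://apiro.com/apiro/v1/root">
  <dataFeeds>
    <dataFeed definition="EXPR_EXCEL_FEED2" name="CUSTOMERS_A_XLSX">
      <config><![CDATA[{
        "dataSource": {
          "config": {"pathPrefix": "/artifacts/source_files/customers_a.xlsx"},
          "entity": "GIT"
        }
      }]]></config>
      <schema>CUSTOMERS</schema>
    </dataFeed>
  </dataFeeds>
</apiroConf>
"""


def derive_feed_b(feed_a_xml: str) -> str:
    """Apply the two substitutions from steps 3 and 4 (hypothetical helper)."""
    return (
        feed_a_xml
        .replace("CUSTOMERS_A_XLSX", "CUSTOMERS_B_XLSX")
        .replace("customers_a.xlsx", "customers_b.xlsx")
    )


def summarize_feed(xml_text: str) -> dict:
    """Return the feed name, schema, and source path found in a feed config."""
    # Strip surrounding whitespace so the XML declaration is the first thing parsed.
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    feed = root.find("a:dataFeeds/a:dataFeed", NS)
    # The <config> element carries a JSON document inside a CDATA section.
    config = json.loads(feed.find("a:config", NS).text)
    return {
        "name": feed.get("name"),
        "schema": feed.find("a:schema", NS).text.strip(),
        "pathPrefix": config["dataSource"]["config"]["pathPrefix"],
    }


summary = summarize_feed(derive_feed_b(FEED_A))
print(summary)
```

To check a real file instead of the sample, read the text of FEED_CUSTOMERS_B_XLSX.xml and pass it to summarize_feed. The name should read CUSTOMERS_B_XLSX, the schema should still be CUSTOMERS, and pathPrefix should end in customers_b.xlsx.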