Shell: Combine Two CSV Files

I have two CSV files, nape_mappings.csv and ipe_mappings.csv.

Each row looks like this: https://url.com/{entity}_{element}, {schema_version}

Not all entities and elements appear in each .csv file. I want to create a new .csv file that combines the two files into one. The structure of nape_mappings and ipe_mappings is as follows:

IPE Mappings:

Element URI, IPE Mapped

https://uri.net/BRFS/Account_DFRevision, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ShortName, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_AccountType, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_AffiliateCashAccountNumber, 20220303.1-SNAPSHOT

NAPE Mappings:

Element URI, NAPE Mapped

https://uri.net/BRFS/Account_DFRevision, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ShortName, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_LongName, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ExtraLongName, 20220303.1-SNAPSHOT

As you can see, both files describe the same entity with overlapping elements, but they are mapped by two different processing engines (NAPE and IPE).

I want to combine these two .csv files into a consolidated file that looks something like this:

Element URI, NAPE Mapped, IPE Mapped

https://uri.net/BRFS/Account_DFRevision, 20220303.1-SNAPSHOT, 20220303.1-SNAPSHOT

https://uri.net/BRFS/Account_ShortName, 20220303.1-SNAPSHOT, 20220303.1-SNAPSHOT

https://uri.net/BRFS/Account_AccountType, blank, 20220303.1-SNAPSHOT

https://uri.net/BRFS/Account_AffiliateCashAccountNumber, blank, 20220303.1-SNAPSHOT

Notice that the DFRevision and ShortName elements are in both files for the entity Account, but LongName is only in NAPE, so the IPE column for that row in the consolidated file (CSV 3) should just be left empty.

Right now I have it generating the two files correctly, but consolidating is where I am having trouble. I need to go through each element_uri, compare it against the other CSV file to determine whether it exists in both files or just one, and write the element_uri along with when it was mapped into the new consolidated file. Note that the NAPE and IPE mapping files can be fairly large, depending on how many entities there are and how many elements each entity contains.

Here is my code thus far:

# Collect every mapping file that contains <DFName> (recursive, case-insensitive).
NAPE_MAPPING_FILES=$(grep -Ril "<DFName>" "$NAPE_MAPPINGS_DIR")
IPE_MAPPING_FILES=$(grep -Ril "<DFName>" "$IPE_MAPPINGS_DIR")

# Write the header with ">" so a rerun starts from a fresh file instead of appending.
echo "Element URI, NAPE Mapped" > nape_mappings.csv
# Left unquoted on purpose: relies on word splitting, so paths must not contain spaces.
for mapping_file in $NAPE_MAPPING_FILES; do

  entity=$(sed -n 's%.*<DFName>\(.*\)</DFName>.*%\1%p' "$mapping_file")
  file_elements=$(sed -n 's%.*<DFKey>\(.*\)</DFKey>.*%\1%p' "$mapping_file")
  stringarray=($file_elements)

  # Emit one "URI,version" row per <DFKey> element.
  for element in "${stringarray[@]}"; do
    uri="https://url.net/BRFS/${entity}_${element},$DF_BOM_VERSION"
    echo "$uri" >> nape_mappings.csv
  done

done

# Same procedure for the IPE mapping files.
echo "Element URI, IPE Mapped" > ipe_mappings.csv
for mapping_file in $IPE_MAPPING_FILES; do

  entity=$(sed -n 's%.*<DFName>\(.*\)</DFName>.*%\1%p' "$mapping_file")
  file_elements=$(sed -n 's%.*<DFKey>\(.*\)</DFKey>.*%\1%p' "$mapping_file")
  stringarray=($file_elements)

  for element in "${stringarray[@]}"; do
    uri="https://url.net/BRFS/${entity}_${element},$DF_BOM_VERSION"
    echo "$uri" >> ipe_mappings.csv
  done

done

#TODO: Combine nape_mappings.csv and ipe_mappings.csv for each entity and element and write to a single file
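
One possible way to fill in the TODO is a full outer join on the URI column done in a single awk pass, which avoids rescanning one file for every row of the other. The following is only a sketch under some assumptions: the function name merge_mappings is hypothetical, each input file starts with a header row, and neither the URI nor the version contains an embedded comma. Both version columns are held in memory, keyed by URI.

```shell
#!/usr/bin/env bash
# Sketch: full outer join of two "URI, version" CSV files on the URI column.
# merge_mappings is a hypothetical helper name, not part of the original script.
merge_mappings() {
  local nape_file=$1 ipe_file=$2

  echo "Element URI, NAPE Mapped, IPE Mapped"
  awk -F',' '
    # Strip leading spaces and trailing spaces or carriage returns from a field.
    function trim(s) { gsub(/^[ \t]+|[ \t\r]+$/, "", s); return s }

    FNR == 1 { next }   # skip the header row of each input file
    NF < 2   { next }   # skip blank or malformed lines

    {
      uri = trim($1)
      # Remember the order in which each URI was first seen.
      if (!(uri in seen)) { seen[uri] = 1; order[++n] = uri }
    }

    # Rows from the first file fill the NAPE column, all others the IPE column.
    FILENAME == ARGV[1] { nape[uri] = trim($2); next }
    { ipe[uri] = trim($2) }

    END {
      # A URI missing from one file gets an empty value in that column.
      for (i = 1; i <= n; i++) {
        uri = order[i]
        printf "%s, %s, %s\n", uri, nape[uri], ipe[uri]
      }
    }
  ' "$nape_file" "$ipe_file"
}
```

It could be called as merge_mappings nape_mappings.csv ipe_mappings.csv > consolidated_mappings.csv, where consolidated_mappings.csv is an assumed output name. Rows come out in first-seen order: every NAPE URI first, then the URIs that exist only in IPE. If the files are too large to hold in memory, sorting both on the URI and using join with its -a 1 -a 2 options is an alternative worth considering.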


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
