Shell: Combine Two CSV Files
I have two CSV files, nape_mappings.csv and ipe_mappings.csv.
Each row has the form https://url.com/{entity}_{element}, {schema_version}
Not every entity and element appears in both .csv files. I want to create a new .csv file that combines the two files into one. The nape_mappings and ipe_mappings files are structured like so:
IPE Mappings:
Element URI, IPE Mapped
https://uri.net/BRFS/Account_DFRevision, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ShortName, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_AccountType, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_AffiliateCashAccountNumber, 20220303.1-SNAPSHOT
NAPE Mappings:
Element URI, NAPE Mapped
https://uri.net/BRFS/Account_DFRevision, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ShortName, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_LongName, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ExtraLongName, 20220303.1-SNAPSHOT
As you can see, there are two similar entities with the same elements, but they are mapped from two different processing engines (NAPE and IPE).
I want to combine these two .csv files into a consolidated file that looks something like this:
Element URI, NAPE Mapped, IPE Mapped
https://uri.net/BRFS/Account_DFRevision, 20220303.1-SNAPSHOT, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_ShortName, 20220303.1-SNAPSHOT, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_AccountType, blank, 20220303.1-SNAPSHOT
https://uri.net/BRFS/Account_AffiliateCashAccountNumber, blank, 20220303.1-SNAPSHOT
Notice that the DFRevision and ShortName elements are in both files for the entity Account, but LongName is only in NAPE, so its IPE column in the consolidated file (the third CSV) should just be left empty.
Right now I have it generating the two files correctly, but consolidating is what I am having trouble with. I need to go through each element_uri, compare it against the other csv file, determine whether it exists in both files or just the one, and write the element_uri and the version it was mapped in to the new consolidated file. It is worth noting that the NAPE and IPE mapping files can be fairly large, depending on how many entities there are and how many elements each entity contains.
Here is my code thus far:
# Find every mapping file that declares an entity name
NAPE_MAPPING_FILES=$(grep -Ril "<DFName>" "$NAPE_MAPPINGS_DIR")
IPE_MAPPING_FILES=$(grep -Ril "<DFName>" "$IPE_MAPPINGS_DIR")

# Write one "URI,version" row per element mapped by NAPE
echo "Element URI, NAPE Mapped" >> nape_mappings.csv
for mapping_file in $NAPE_MAPPING_FILES; do
    entity=$(sed -n 's%.*<DFName>\(.*\)</DFName>.*%\1%p' "$mapping_file")
    file_elements=$(sed -n 's%.*<DFKey>\(.*\)</DFKey>.*%\1%p' "$mapping_file")
    stringarray=($file_elements)
    for i in "${stringarray[@]}"; do
        element=$i
        uri="https://url.net/BRFS/${entity}_${element},$DF_BOM_VERSION"
        echo "$uri" >> nape_mappings.csv
    done
done

# Same again for the elements mapped by IPE
echo "Element URI, IPE Mapped" >> ipe_mappings.csv
for mapping_file in $IPE_MAPPING_FILES; do
    entity=$(sed -n 's%.*<DFName>\(.*\)</DFName>.*%\1%p' "$mapping_file")
    file_elements=$(sed -n 's%.*<DFKey>\(.*\)</DFKey>.*%\1%p' "$mapping_file")
    stringarray=($file_elements)
    for i in "${stringarray[@]}"; do
        element=$i
        uri="https://url.net/BRFS/${entity}_${element},$DF_BOM_VERSION"
        echo "$uri" >> ipe_mappings.csv
    done
done

#TODO: Combine nape_mappings.csv and ipe_mappings.csv for each entity and element and write to a single file
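
One possible way to fill in that TODO is a single awk pass over both files, keyed on the element URI. This is only a sketch under a few assumptions: each file has exactly one header line, no URI contains a comma, and the function name combine_mappings, its optional third argument, and the output file name consolidated_mappings.csv are made up for illustration. Rows come out in the order the URIs are first seen (NAPE first, then anything only in IPE).

```shell
# Hypothetical helper: merge two "URI,version" CSVs into one three-column CSV.
# $1 = NAPE file, $2 = IPE file, $3 = placeholder for a missing value
# (defaults to the literal word "blank", as in the sample output above).
combine_mappings() {
    echo "Element URI, NAPE Mapped, IPE Mapped"
    awk -F',' -v missing="${3-blank}" '
        FNR == 1 { next }                       # skip the header of each file
        {
            uri = $1; version = $2
            gsub(/^[ \t\r]+|[ \t\r]+$/, "", uri)       # trim surrounding whitespace
            gsub(/^[ \t\r]+|[ \t\r]+$/, "", version)
            if (uri == "") next                 # ignore empty lines
            if (!(uri in seen)) { seen[uri] = 1; order[++n] = uri }
        }
        FILENAME == ARGV[1] { nape[uri] = version; next }   # row came from the first file
        { ipe[uri] = version }                              # row came from the second file
        END {
            for (i = 1; i <= n; i++) {
                u = order[i]
                printf "%s, %s, %s\n", u,
                    ((u in nape) ? nape[u] : missing),
                    ((u in ipe)  ? ipe[u]  : missing)
            }
        }
    ' "$1" "$2"
}

# Example call (output file name is an assumption):
#   combine_mappings nape_mappings.csv ipe_mappings.csv > consolidated_mappings.csv
```

Passing an empty string as the third argument leaves the missing column empty instead of writing "blank". Because every URI is held in awk arrays, memory use grows with the number of distinct elements; for very large inputs, sorting both files and using join may be worth considering instead.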
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
