How to merge 2 fields and find unique records from a JSON file using Python

I have a JSON file which contains duplicate records, and I need to send only the unique records to another application. To identify the unique records, I first need to merge the JRNAL_NO and JRNAL_LINE fields, separated by a hyphen (e.g. 655-1), and then use that combined value as the key. Is it possible to do this with a Python script?

(Source and target JSON samples were attached as images in the original post.)

Thank you, John



Solution 1:[1]

I managed to get the expected results using the script below (thanks to some old posts on SO). I am new to Python, so if there are better ways of doing this, kindly suggest. Thanks.

import json

# input_var is the JSON string supplied by the calling application
source = json.loads(input_var)
target = []
seen = set()

for record in source:
    # Composite key, e.g. "655-1"; the hyphen keeps pairs like
    # ("65", "51") and ("655", "1") from colliding as "6551"
    key = record['JRNAL_NO'] + '-' + record['JRNAL_LINE']
    if key not in seen:
        seen.add(key)
        target.append(record)

output_var = json.dumps(target)
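A runnable end-to-end sketch of the same approach, with `input_var` stubbed out as sample data (the record values here are illustrative, modeled on the example in the question):

```python
import json

# Stand-in for input_var; in the real script this comes from the caller
input_var = json.dumps([
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "1", "D_C": "C"},
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"},
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"},
])

source = json.loads(input_var)
target = []
seen = set()

for record in source:
    # Composite key such as "655-1"
    key = record['JRNAL_NO'] + '-' + record['JRNAL_LINE']
    if key not in seen:
        seen.add(key)
        target.append(record)

output_var = json.dumps(target)
print(output_var)  # two unique records remain (keys 655-1 and 655-3)
```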

Solution 2:[2]

In order to solve this problem, we can gather the dictionaries in a list and then use a list comprehension to eliminate the duplicates:

dictA = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "1", "D_C": "C"}
dictB = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}
dictC = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}
dictD = {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"}

list_of_dicts = [dictA, dictB, dictC, dictD]

result = []
[result.append(x) for x in list_of_dicts if x not in result]

print(result)

This will return a list of the non-repeated dictionaries:

[{'PERIOD': '2022007', 'JRNAL_NO': '655', 'JRNAL_LINE': '1', 'D_C': 'C'}, {'PERIOD': '2022007', 'JRNAL_NO': '655', 'JRNAL_LINE': '3', 'D_C': 'C'}]
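One caveat: `x not in result` scans the whole list on every iteration, so this approach is O(n²) for large files. A variant (my own sketch, not part of the original answer) that deduplicates in a single pass by keying a dict on the combined JRNAL_NO-JRNAL_LINE value, as the question describes:

```python
list_of_dicts = [
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "1", "D_C": "C"},
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"},
    {"PERIOD": "2022007", "JRNAL_NO": "655", "JRNAL_LINE": "3", "D_C": "C"},
]

# Keyed by "JRNAL_NO-JRNAL_LINE"; the first record seen for each key wins
unique = {}
for d in list_of_dicts:
    key = f"{d['JRNAL_NO']}-{d['JRNAL_LINE']}"
    unique.setdefault(key, d)

result = list(unique.values())
print(result)  # records with keys 655-1 and 655-3
```

Dict lookups are O(1) on average, so the whole pass is O(n), and since Python 3.7 dicts preserve insertion order, the output keeps the original record order.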

Extra relevant links:

List Comprehension

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 John
Solution 2 san99tiago