Combine dictionaries with the same value for a specific key

I have a list of dictionaries that follow two patterns, such as {"Id": 1, "title":"example"} and {"Id": 1, "location":"city"}. I want to combine each matching pair into {"Id": 1, "title":"example", "location":"city"} for all dictionaries whose Ids match. In this case the list has 200 items: 100 titles and 100 locations, with Ids from 0 to 99. I want to return a list of 100 combined dictionaries.

Maybe something like the following:

def ResultHandler(extractedResult: list):
    jsonObj = {}
    jsonList = []
    for result in extractedResult:
        for key, val in result.items():
            # this works only if val is hardcoded to a number...
            if key == "Id" and val == 1:
                jsonObj.update(result)
    jsonList.append(jsonObj)
    return jsonList


Solution 1:[1]

A more functional approach (slightly less efficient, because of the sort), where data is the list of dictionaries:

from itertools import groupby
from functools import reduce
from operator import itemgetter


new_data = []
for _, g in groupby(sorted(data, key=itemgetter("Id")), key=itemgetter("Id")):
    new_data.append(reduce(lambda d1, d2: {**d1, **d2}, g))
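
To try the snippet above end to end, here is a minimal self-contained sketch. The sample `data` list and the `merge_by_id` function name are assumptions made for illustration and are not part of the original answer. The sort is needed because `groupby` only groups consecutive items with equal keys.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter

# Hypothetical sample input, shaped like the dictionaries in the question.
data = [
    {"Id": 0, "title": "example0"},
    {"Id": 1, "title": "example1"},
    {"Id": 0, "location": "city0"},
    {"Id": 1, "location": "city1"},
]


def merge_by_id(records):
    """Merge all dictionaries that share the same "Id" (illustrative helper)."""
    get_id = itemgetter("Id")
    merged = []
    # groupby only groups adjacent items, so sort by "Id" first.
    for _, group in groupby(sorted(records, key=get_id), key=get_id):
        # {**d1, **d2} builds a new dict for each merge,
        # so the input dictionaries are not modified.
        merged.append(reduce(lambda d1, d2: {**d1, **d2}, group))
    return merged


print(merge_by_id(data))
# [{'Id': 0, 'title': 'example0', 'location': 'city0'}, {'Id': 1, 'title': 'example1', 'location': 'city1'}]
```

Because `sorted` is stable, dictionaries with the same Id keep their input order, so later entries override earlier ones if they happen to share a key.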

Solution 2:[2]

This function loops over the list of dictionaries. For each dictionary, it scans the result list to check whether a dictionary with the same id is already there. If not, it appends the current dictionary to the result list. If so, it updates the matching dictionary in the result list with the contents of the current one.

lst = [
    {"id": 1, "fname": "John"},
    {"id": 2, "name": "Bob"},
    {"id": 1, "lname": "Mary"},
]
def combine_dicts(lst):
    res = []
    for d in lst:
        if d.get("id") not in [x.get("id") for x in res]:
            res.append(d)
        else:
            for r in res:
                if r.get("id") == d.get("id"):
                    r.update(d)
    return res


print(combine_dicts(lst))
# output: [{'id': 1, 'fname': 'John', 'lname': 'Mary'}, {'id': 2, 'name': 'Bob'}]
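
The function above rescans `res` for every input dictionary, so its running time grows quadratically in the worst case. As a sketch of one possible alternative, a dictionary keyed by `id` lets each record be merged with a single lookup. The name `combine_dicts_indexed` is hypothetical, and copying with `dict(d)` is an added assumption so that the input list is left unchanged.

```python
def combine_dicts_indexed(lst):
    """Merge dictionaries sharing an "id", keeping first-seen order (sketch)."""
    by_id = {}  # maps each id to its merged dictionary
    for d in lst:
        key = d.get("id")
        if key not in by_id:
            # Copy so that later updates do not modify the caller's dictionary.
            by_id[key] = dict(d)
        else:
            by_id[key].update(d)
    return list(by_id.values())


lst = [
    {"id": 1, "fname": "John"},
    {"id": 2, "name": "Bob"},
    {"id": 1, "lname": "Mary"},
]
print(combine_dicts_indexed(lst))
# [{'id': 1, 'fname': 'John', 'lname': 'Mary'}, {'id': 2, 'name': 'Bob'}]
```

This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.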

Solution 3:[3]

The following code should work:

def resultHandler(extractedResult):
    jsonList = []
    # Assumes the Ids run from 0 to len(extractedResult) // 2 - 1.
    for i in range(len(extractedResult) // 2):
        jsonList.append({"Id": i})
    for i in range(len(extractedResult)):
        for j in range(len(jsonList)):
            if jsonList[j]["Id"] == extractedResult[i]["Id"]:
                # Copy whichever field this dictionary carries.
                if "title" in extractedResult[i]:
                    jsonList[j]["title"] = extractedResult[i]["title"]
                else:
                    jsonList[j]["location"] = extractedResult[i]["location"]
    return jsonList

extractedResult = [{"Id": 0, "title":"example1"}, {"Id": 1, "title":"example2"}, {"Id": 0, "location":"example3"}, {"Id": 1, "location":"example4"}]

jsonList = resultHandler(extractedResult)

print(jsonList)

Output:

[{'Id': 0, 'title': 'example1', 'location': 'example3'}, {'Id': 1, 'title': 'example2', 'location': 'example4'}]

This code works by first filling jsonList with dictionaries whose Id values run from 0 up to (but not including) half the length of extractedResult, which is the number of distinct Ids.

Then, for every dictionary in extractedResult, we find the dictionary in jsonList with the matching Id. If the dictionary from extractedResult contains the key "title", we copy that value into the matching dictionary in jsonList. Otherwise, we copy its "location" value.

I hope this helps answer your question! Please let me know if you need any further clarification or details :)

Solution 4:[4]

This code solves the problem in linear time, i.e., O(n), where n is the number of dictionaries in the input list. It considers only those Ids that have both a title and a location and ignores the rest.

from collections import Counter

data = [{"Id": 1, "title":"example1"},
        {"Id": 2, "title":"example2"},
        {"Id": 3, "title":"example3"},
        {"Id": 4, "title":"example4"},
        {"Id": 1, "location":"city1"},
        {"Id": 2, "location":"city2"},
        {"Id": 4, "location":"city4"},
        {"Id": 5, "location":"city5"}]

paired_ids = set([key for key, val in dict(Counter([item["Id"] for item in data])).items() if val == 2]) # O(n)

def combine_dict(data):
    result = {key: [] for key in paired_ids} # O(m), m: number of paired ids (m <= n/2)
    for item in data: # O(n)
        items = list(item.items())
        id, tl, val = items[0][1], items[1][0], items[1][1]

        if id in paired_ids: # O(1), as paired_ids is a set lookup takes O(1)
            result[id].append({tl: val})

    return [{"Id": id, "title": lst[0]["title"], "location": lst[1]["location"]} for id, lst in result.items()] # O(n)


print(*combine_dict(data), sep="\n")

Output:

{'Id': 1, 'title': 'example1', 'location': 'city1'}
{'Id': 2, 'title': 'example2', 'location': 'city2'}
{'Id': 4, 'title': 'example4', 'location': 'city4'}
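
The function above reads each dictionary by position, so it relies on "Id" being the first key and on every title appearing before its matching location in the input. The following is a hedged sketch that relaxes both assumptions while keeping the same filtering rule (only Ids with both fields survive). The name `combine_complete` and its `required` parameter are hypothetical, introduced here for illustration.

```python
from collections import defaultdict


def combine_complete(records, required=("title", "location")):
    """Merge by "Id" and keep only Ids that have every required key (sketch)."""
    merged = defaultdict(dict)
    for record in records:
        # Fields are looked up by name, so key order inside a record
        # and the order of records in the list do not matter.
        merged[record["Id"]].update(record)
    return [d for d in merged.values() if all(k in d for k in required)]


# Hypothetical sample input: Id 1 is complete, Ids 3 and 5 are not.
data = [
    {"Id": 1, "title": "example1"},
    {"Id": 3, "title": "example3"},
    {"location": "city1", "Id": 1},
    {"Id": 5, "location": "city5"},
]
print(combine_complete(data))
# [{'Id': 1, 'title': 'example1', 'location': 'city1'}]
```

This still makes a single pass over the input plus a single pass over the merged Ids, so it stays linear in the number of dictionaries.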

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Parvesh Kumar
Solution 3 Ani M
Solution 4