Finding different values for the same key in different dictionaries in Python

I have a list that contains many dictionaries. For example:

totalList = [
{'id': 1111, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'}, 
{'id': 1412, 'source': 'user_2', 'count_id': 5, 'description': 'bbbb'}, 
{'id': 5123, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'}, 
{'id': 1982, 'source': 'user_3', 'count_id': 7, 'description': 'bbbb'},
{'id': 3198, 'source': 'user_3', 'count_id': 7, 'description': 'bbbb'},
{'id': 1082, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'}
]
  • The id's are always different.
  • All keys are the same.

I want to find the ids of the entries that share the same source, count_id and description values. I only need the ids. Expected output for this example:

1111, 5123, 1082 same
1982, 3198 same

How can I achieve this?

Thanks.



Solution 1:[1]

I'd reformat the data into a dictionary of items, where each key is a tuple of the three values you care about. Then you can iterate through the dictionary and efficiently find duplicates.

# Original data
totalList = [
    {'id': 1111, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'}, 
    {'id': 1412, 'source': 'user_2', 'count_id': 5, 'description': 'bbbb'}, 
    {'id': 5123, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'}, 
    {'id': 1982, 'source': 'user_3', 'count_id': 7, 'description': 'bbbb'},
    {'id': 3198, 'source': 'user_3', 'count_id': 7, 'description': 'bbbb'},
    {'id': 1082, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'}
]

# Detect duplicates
from collections import defaultdict

def get_key(item):
    return (item['source'], item['count_id'], item['description'])

ids_by_source_count_and_desc = defaultdict(list)
for item in totalList:
    ids_by_source_count_and_desc[get_key(item)].append(item['id'])

# Report only the keys shared by more than one id
for key, ids in ids_by_source_count_and_desc.items():
    if len(ids) > 1:
        print(key, "same", ids)

I also use defaultdict to avoid having to check whether the dictionary I'm inserting into already contains a list for that key.

Output:

('user_1', 10, 'aaaa') same [1111, 5123, 1082]
('user_3', 7, 'bbbb') same [1982, 3198]
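
If only the ids are needed, in the format shown in the question, the same grouping idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name find_shared_ids, its fields parameter, and the sample variable are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def find_shared_ids(records, fields=("source", "count_id", "description")):
    """Group record ids by the values of `fields`; keep groups with 2+ ids.

    `find_shared_ids` and `fields` are hypothetical names for this sketch.
    """
    ids_by_key = defaultdict(list)
    for record in records:
        # Tuples are hashable, so they can serve as dictionary keys.
        key = tuple(record[field] for field in fields)
        ids_by_key[key].append(record["id"])
    # Dictionaries preserve insertion order (Python 3.7+), so groups come
    # back in the order their first member appeared in `records`.
    return [ids for ids in ids_by_key.values() if len(ids) > 1]


# Same data as in the question
sample = [
    {'id': 1111, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'},
    {'id': 1412, 'source': 'user_2', 'count_id': 5, 'description': 'bbbb'},
    {'id': 5123, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'},
    {'id': 1982, 'source': 'user_3', 'count_id': 7, 'description': 'bbbb'},
    {'id': 3198, 'source': 'user_3', 'count_id': 7, 'description': 'bbbb'},
    {'id': 1082, 'source': 'user_1', 'count_id': 10, 'description': 'aaaa'},
]

# str.join needs strings, so each id is converted first.
lines = [", ".join(map(str, ids)) + " same" for ids in find_shared_ids(sample)]
print("\n".join(lines))
# 1111, 5123, 1082 same
# 1982, 3198 same
```

Passing a different fields tuple, for example ("source",), would group on fewer keys without changing the rest of the code.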

Solution 2:[2]

Personally, I find that pandas often makes this kind of task faster and simpler to write. Here is what I came up with:

import pandas as pd
df = pd.DataFrame(totalList)
result = {}
groups = df.groupby(by=["source", "count_id", "description"])["id"]
for name, group in groups:
    # Convert each group's ids to a list once and reuse it
    ids = group.tolist()
    if len(ids) > 1:
        result[name] = ids
result

Output

{('user_1', 10, 'aaaa'): [1111, 5123, 1082],
 ('user_3', 7, 'bbbb'): [1982, 3198]}

To print the ids in a format close to the one in your question, loop over the result variable and call join on each list:

for key, value in result.items():
  print(",".join(str(v) for v in value) + " same")

Final Output

1111,5123,1082 same
1982,3198 same

Note that we need str(v) for v in value inside join because the lists contain integers, not strings, and join only accepts strings.
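
To illustrate why the conversion is needed, here is a small sketch using one of the id lists from the sample data (which holds integers). The variable names ids, join_error and joined are hypothetical, chosen only for this example.

```python
ids = [1111, 5123, 1082]

# str.join raises TypeError when the iterable contains non-string items.
try:
    ",".join(ids)
    join_error = None
except TypeError as exc:
    join_error = exc

print(type(join_error).__name__)  # TypeError

# Converting each id to a string first makes the call succeed.
joined = ",".join(map(str, ids))
print(joined + " same")  # 1111,5123,1082 same
```

Using map(str, ids) is equivalent here to the generator expression str(v) for v in value; which one to use is a matter of taste.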

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Nick ODell
Solution 2