Create a unique list of dictionaries using dict keys
I have a list of dictionaries and want a new list of dictionaries in which each combination of two keys, City and Country, appears only once.
list = [
{ City: "Gujranwala", Country: "Pakistan", other_columns },
{ City: "Gujrwanala", Country: "India", other_columns },
{ City: "Gujranwala", Country: "Pakistan", other_columns }
]
The output should be:
list = [
{ City: "Gujranwala", Country: "Pakistan", other_columns },
{ City: "Gujrwanala", Country: "India", other_columns }
]
Solution 1:[1]
You can first extract the key-value pairs from the dicts and then remove duplicates by using a set. Note that this compares whole dictionaries, so it only removes entries in which every key and value match (not just City and Country), it requires all values to be hashable, and a set does not preserve the original order. So you can do something like this:
- Convert the dicts into a list of item tuples:
dict_items = [tuple(d.items()) for d in lst]  # tuples are hashable, which a set requires
- Deduplicate:
deduplicated = set(dict_items)
- Convert the item tuples back to dicts:
back_to_dicts = [dict(i) for i in deduplicated]
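One caveat of the steps above is that `tuple(d.items())` depends on key insertion order, so two dicts with the same content but a different key order would not be treated as duplicates. A minimal sketch of a variant that sorts the items first, assuming all values are hashable (the name `records` and the sample data are illustrative only):

```python
# Illustrative data; the third entry repeats the first with its keys in a different order.
records = [
    {"City": "Gujranwala", "Country": "Pakistan"},
    {"City": "Gujrwanala", "Country": "India"},
    {"Country": "Pakistan", "City": "Gujranwala"},
]

# Sorting the items makes each tuple independent of key insertion order.
as_tuples = [tuple(sorted(d.items())) for d in records]

# A set drops exact duplicates, but it does not keep the original order.
deduplicated = set(as_tuples)

back_to_dicts = [dict(t) for t in deduplicated]
print(len(back_to_dicts))  # 2
```

Like the original steps, this still compares entire dictionaries rather than only City and Country.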
Solution 2:[2]
One way to do this reduction is to build a dictionary with a unique key for every City/Country combination. In my case I've just concatenated both values to form the key, which is a simple working solution as long as two different pairs cannot concatenate to the same string.
We are using a dictionary here because a dictionary lookup takes constant time on average, so the whole algorithm runs in O(n).
lst = [
    {"City": "Gujranwala", "Country": "Pakistan"},
    {"City": "Gujrwanala", "Country": "India"},
    {"City": "Gujranwala", "Country": "Pakistan"}
]
unique = {}
for item in lst:
    # concatenate City and Country to form the lookup key
    key = f"{item['City']}{item['Country']}"
    # only add the item if we do not already have one with this key
    if key not in unique:
        unique[key] = item
# get the dictionary values (we don't care about the keys now)
result = list(unique.values())
print(result)
Expected output:
[{'City': 'Gujranwala', 'Country': 'Pakistan'}, {'City': 'Gujrwanala', 'Country': 'India'}]
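Concatenating the two values can, in principle, map different pairs to the same key (for example "ab" + "c" and "a" + "bc"). A hedged variant of the same idea uses a tuple as the dictionary key instead. The made-up rows below exist only to demonstrate the difference, and the names `rows` and `unique_by_pair` are illustrative:

```python
# Made-up rows chosen so that plain concatenation would collide ("ab" + "c" == "a" + "bc").
rows = [
    {"City": "ab", "Country": "c"},
    {"City": "a", "Country": "bc"},
    {"City": "ab", "Country": "c"},
]

unique_by_pair = {}
for row in rows:
    # A tuple keeps the two values separate, so distinct pairs never merge.
    pair = (row["City"], row["Country"])
    # setdefault stores the row only if this pair has not been seen yet.
    unique_by_pair.setdefault(pair, row)

result = list(unique_by_pair.values())
print(result)
# [{'City': 'ab', 'Country': 'c'}, {'City': 'a', 'Country': 'bc'}]
```

Because dictionaries keep insertion order (Python 3.7+), the first occurrence of each pair is retained in its original position.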
Solution 3:[3]
I'm sure there are many other and probably better approaches to this problem, but you can use:
l = [
    {"City": "Gujranwala", "Country": "Pakistan"},
    {"City": "Gujrwanala", "Country": "India"},
    {"City": "Gujranwala", "Country": "Pakistan"}
]
# ll collects the unique dicts; v holds the City+Country keys seen so far
ll, v = [], set()
for d in l:
    k = d["City"] + d["Country"]
    if k not in v:
        v.add(k)
        ll.append(d)
print(ll)
# [{'City': 'Gujranwala', 'Country': 'Pakistan'}, {'City': 'Gujrwanala', 'Country': 'India'}]
We basically keep a set of the City and Country combinations seen so far and use it to check whether a dictionary with the same pair has already been added to the final list.
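The same "seen set" idea can be wrapped in a small reusable function. This is only a sketch: `unique_by` is a hypothetical helper name, not a standard-library function, and it assumes every record contains the requested keys with hashable values.

```python
from operator import itemgetter


def unique_by(records, *keys):
    """Return the first record seen for each combination of the given keys."""
    get_key = itemgetter(*keys)  # yields a tuple when more than one key is given
    seen = set()
    result = []
    for record in records:
        k = get_key(record)
        if k not in seen:
            seen.add(k)
            result.append(record)
    return result


cities = [
    {"City": "Gujranwala", "Country": "Pakistan"},
    {"City": "Gujrwanala", "Country": "India"},
    {"City": "Gujranwala", "Country": "Pakistan"},
]
print(unique_by(cities, "City", "Country"))
# [{'City': 'Gujranwala', 'Country': 'Pakistan'}, {'City': 'Gujrwanala', 'Country': 'India'}]
```

Using a tuple of values as the key, rather than a concatenated string, avoids accidental collisions between different pairs, and appending to a list preserves the original order.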
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Mushroomator |
| Solution 3 | |
