Separate a dictionary of lists into chunks in Python
How can I split a dictionary of lists into chunks of a given size in Python? I can chunk a list. I can chunk a dictionary. But I can't quite work out how to chunk a dictionary of lists (efficiently).
my_dict = {
    "key1": ["a","b","c","d"],
    "key2": ["e","f","g","h"],
}
How can I chunk it so that each chunk has no more than 3 values:
{
    "key1": ["a","b","c"]
}
{
    "key1": ["d"],
    "key2": ["e","f"],
}
{
    "key2": ["g","h"],
}
Notice how the 2nd dictionary spans 2 keys.
Solution 1:[1]
This requires a single pass, but isn't the most elegant solution by any means:
def chunk_dict(in_dict, chunk_size):
    chunked = [{}]
    items_left = chunk_size  # free slots in the current chunk
    for key in in_dict:
        for el in in_dict[key]:
            # Current chunk is full: start a new one
            if items_left == 0:
                chunked.append({})
                items_left = chunk_size
            target_dict = chunked[-1]
            if key not in target_dict:
                target_dict[key] = []
            target_dict[key].append(el)
            items_left -= 1
    return chunked
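As a quick check, the function above can be run on the dictionary from the question. The sketch below repeats the definition so that it runs on its own. The remark about empty input is an observation about this code, not something the answer states.

```python
def chunk_dict(in_dict, chunk_size):
    chunked = [{}]
    items_left = chunk_size
    for key in in_dict:
        for el in in_dict[key]:
            if items_left == 0:
                chunked.append({})
                items_left = chunk_size
            target_dict = chunked[-1]
            if key not in target_dict:
                target_dict[key] = []
            target_dict[key].append(el)
            items_left -= 1
    return chunked


my_dict = {
    "key1": ["a", "b", "c", "d"],
    "key2": ["e", "f", "g", "h"],
}

print(chunk_dict(my_dict, 3))
# [{'key1': ['a', 'b', 'c']}, {'key1': ['d'], 'key2': ['e', 'f']}, {'key2': ['g', 'h']}]

# Because the result list starts as [{}], an empty input yields one empty chunk.
print(chunk_dict({}, 3))
# [{}]
```

If an empty input should give an empty list instead, the caller can filter out empty dictionaries from the result.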
Solution 2:[2]
I see that there are already answers and they're great (and I'm slow). I do it a bit differently, though: instead of adding items to the list one by one, I slice them off in batches. You can use yield to turn this into a generator if you need to.
def split(_dict, limit=3):
    result = []
    room = 0  # free slots left in the current chunk
    for key, val in _dict.items():
        cur_val = val
        while cur_val:
            # No room left: open a new chunk
            if room < 1:
                result.append({})
                room = limit
            # Take as many values as still fit
            cut = cur_val[:room]
            room -= len(cut)
            result[-1][key] = cut
            cur_val = cur_val[len(cut):]
    return result
split(my_dict, 3)
> [{'key1': ['a', 'b', 'c']}, {'key1': ['d'], 'key2': ['e', 'f']}, {'key2': ['g', 'h']}]
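The generator variant mentioned above could look roughly like the sketch below. The name split_iter and the guard on limit are illustrative additions, not part of the original answer. It keeps the same batching idea but yields each chunk as soon as it is full, so the full result list is never built.

```python
def split_iter(_dict, limit=3):
    """Yield chunks lazily; each chunk holds at most `limit` values in total."""
    if limit < 1:
        # Assumption: a non-positive limit is treated as an error,
        # since no value could ever fit into a chunk.
        raise ValueError("limit must be at least 1")
    chunk = {}
    room = limit  # free slots left in the current chunk
    for key, val in _dict.items():
        start = 0
        while start < len(val):
            # Take as many values as still fit in the current chunk
            cut = val[start:start + room]
            chunk[key] = cut
            start += len(cut)
            room -= len(cut)
            if room == 0:
                yield chunk
                chunk = {}
                room = limit
    # Emit the last, partially filled chunk if there is one
    if chunk:
        yield chunk


my_dict = {
    "key1": ["a", "b", "c", "d"],
    "key2": ["e", "f", "g", "h"],
}

for part in split_iter(my_dict, 3):
    print(part)
# {'key1': ['a', 'b', 'c']}
# {'key1': ['d'], 'key2': ['e', 'f']}
# {'key2': ['g', 'h']}
```

Because this is a generator, the ValueError for a bad limit is raised when iteration starts, not when split_iter is called.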
Solution 3:[3]
You can first "flatten" the dict of lists into pairs of (key, value_from_list). Then you can simply iterate a list in chunks. The tricky part is just making the chunks back into a dict of lists (turn [("key1", "a"), ("key1", "b")] into {"key1": ["a", "b"]}). For that we will use a defaultdict and iterate over the chunks:
from collections import defaultdict

def chunker(d, chunk_size):
    # Flatten into (key, value) pairs, keeping the original order
    flat = [(key, value) for key, l in d.items() for value in l]
    for pos in range(0, len(flat), chunk_size):
        # Regroup each slice of pairs into a dict of lists
        chunk = defaultdict(list)
        for key, value in flat[pos:pos + chunk_size]:
            chunk[key].append(value)
        yield dict(chunk)
And running it as:
my_dict = {
    "key1": ["a","b","c","d"],
    "key2": ["e","f","g","h"],
}
for chunk in chunker(my_dict, 3):
    print(chunk)
Will give:
{'key1': ['a', 'b', 'c']}
{'key1': ['d'], 'key2': ['e', 'f']}
{'key2': ['g', 'h']}
If you want to go the extra mile and avoid creating the flat list, you can make it a generator instead (flat = ((key, value) for key, l in d.items() for value in l)) and then chunk it as described in the question "Iterate an iterator by chunks (of n) in Python?".
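One possible way to do that is sketched below, using itertools.islice to pull chunk_size pairs at a time from the generator. The name lazy_chunker is illustrative, and this is only one of several approaches covered by that question.

```python
from collections import defaultdict
from itertools import islice


def lazy_chunker(d, chunk_size):
    # Generator expression: pairs are produced on demand, no flat list is built
    flat = ((key, value) for key, values in d.items() for value in values)
    while True:
        # Pull at most chunk_size pairs from the shared iterator
        batch = list(islice(flat, chunk_size))
        if not batch:
            return  # iterator exhausted
        chunk = defaultdict(list)
        for key, value in batch:
            chunk[key].append(value)
        yield dict(chunk)


my_dict = {
    "key1": ["a", "b", "c", "d"],
    "key2": ["e", "f", "g", "h"],
}

for chunk in lazy_chunker(my_dict, 3):
    print(chunk)
# {'key1': ['a', 'b', 'c']}
# {'key1': ['d'], 'key2': ['e', 'f']}
# {'key2': ['g', 'h']}
```

Only one batch of pairs is held in memory at a time, which matters mainly when the lists are large.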
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jeremy |
| Solution 2 | Alexander B. |
| Solution 3 | Tomerikoo |
