Separate a dictionary of lists into chunks in Python

How can I split a dictionary of lists into chunks of a given size in Python? I can chunk a list. I can chunk a dictionary. But I can't quite work out how to chunk a dictionary of lists (efficiently).

my_dict = {
  "key1": ["a","b","c","d"],
  "key2": ["e","f","g","h"],
}

How can I chunk it so that each chunk holds no more than 3 values in total, counted across all of its keys:

{
  "key1": ["a","b","c"]
}
{
  "key1": ["d"],
  "key2": ["e","f"],
}
{
  "key2": ["g","h"],
}

Notice how the 2nd dictionary spans 2 keys.



Solution 1:[1]

This requires a single pass, but isn't the most elegant solution by any means:

def chunk_dict(in_dict, chunk_size):
    chunked = [{}]
    items_left = chunk_size
    for key in in_dict:
        for el in in_dict[key]:
            if items_left == 0:
                chunked.append({})
                items_left = chunk_size
            target_dict = chunked[-1]
            if key not in target_dict:
                target_dict[key] = []
            target_dict[key].append(el)
            items_left -= 1
    return chunked
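
As a quick check, here is a minimal sketch that runs the function above on the dictionary from the question. The function body is repeated only so the snippet runs on its own; the variable name chunks is just for illustration.

```python
def chunk_dict(in_dict, chunk_size):
    chunked = [{}]
    items_left = chunk_size
    for key in in_dict:
        for el in in_dict[key]:
            # Start a new chunk once the current one is full.
            if items_left == 0:
                chunked.append({})
                items_left = chunk_size
            target_dict = chunked[-1]
            if key not in target_dict:
                target_dict[key] = []
            target_dict[key].append(el)
            items_left -= 1
    return chunked


my_dict = {
    "key1": ["a", "b", "c", "d"],
    "key2": ["e", "f", "g", "h"],
}

chunks = chunk_dict(my_dict, 3)
for chunk in chunks:
    print(chunk)
# {'key1': ['a', 'b', 'c']}
# {'key1': ['d'], 'key2': ['e', 'f']}
# {'key2': ['g', 'h']}
```

One thing to be aware of: because the result list starts out as [{}], an empty input dictionary returns [{}] rather than an empty list.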

Solution 2:[2]

I see there are already answers, and they're great (I'm just slow). I do it a bit differently, though: rather than appending items to the list one by one, I copy them in batches by slicing. You can swap in yield to turn it into a generator if you need one.

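def split(_dict, limit=3):
    result = []
    room = 0  # free slots left in the current chunk
    for key, val in _dict.items():
        cur_val = val  # part of this key's list not yet placed
        while cur_val:
            # Current chunk is full (or none exists yet): open a new one.
            if room < 1:
                result.append({})
                room = limit
            # Take as many values as still fit into the current chunk.
            cut = cur_val[:room]
            room -= len(cut)
            result[-1][key] = cut
            cur_val = cur_val[len(cut):]
    return result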

split(my_dict, 3)
> [{'key1': ['a', 'b', 'c']}, {'key1': ['d'], 'key2': ['e', 'f']}, {'key2': ['g', 'h']}]
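
The generator variant mentioned in this answer could look roughly like the sketch below. The name split_lazy, the ValueError guard, and the use of a start index instead of re-slicing the remainder are assumptions added for illustration; they are not part of the original answer.

```python
def split_lazy(_dict, limit=3):
    """Yield chunks one at a time instead of building a list of them.

    split_lazy is a hypothetical name; the guard below is an added assumption.
    """
    if limit < 1:
        # Without this guard a limit of 0 would never make progress.
        raise ValueError("limit must be at least 1")
    current = {}
    room = limit  # free slots left in the current chunk
    for key, val in _dict.items():
        start = 0  # index of the first value of this key not yet placed
        while start < len(val):
            # Take as many values as still fit into the current chunk.
            cut = val[start:start + room]
            current[key] = cut
            room -= len(cut)
            start += len(cut)
            if room == 0:
                # Chunk is full: hand it out and start a fresh one.
                yield current
                current = {}
                room = limit
    if current:
        # Emit the last, partially filled chunk.
        yield current


my_dict = {
    "key1": ["a", "b", "c", "d"],
    "key2": ["e", "f", "g", "h"],
}

for chunk in split_lazy(my_dict, 3):
    print(chunk)
# {'key1': ['a', 'b', 'c']}
# {'key1': ['d'], 'key2': ['e', 'f']}
# {'key2': ['g', 'h']}
```

Unlike the list-building version, this only holds one chunk in memory at a time, which may matter if the chunks are consumed and discarded as they arrive.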

Solution 3:[3]

You can first "flatten" the dict of lists into pairs of (key, value_from_list). Then you can simply iterate a list in chunks. The tricky part is just making the chunks back into a dict of lists (turn [("key1", "a"), ("key1", "b")] into {"key1": ["a", "b"]}). For that we will use a defaultdict and iterate over the chunks:

from collections import defaultdict

def chunker(d, chunk_size):
    flat = [(key, value) for key, l in d.items() for value in l]
    for pos in range(0, len(flat), chunk_size):
        chunk = defaultdict(list)
        for key, value in flat[pos:pos + chunk_size]:
            chunk[key].append(value)
        yield dict(chunk)

And running it as:

my_dict = {
  "key1": ["a","b","c","d"],
  "key2": ["e","f","g","h"],
}

for chunk in chunker(my_dict, 3):
    print(chunk)

Will give:

{'key1': ['a', 'b', 'c']}
{'key1': ['d'], 'key2': ['e', 'f']}
{'key2': ['g', 'h']}

If you want to go the extra mile and avoid building the flat list, you can make it a generator instead (flat = ((key, value) for key, l in d.items() for value in l)) and then chunk that iterator, as described in the Stack Overflow question "Iterate an iterator by chunks (of n) in Python?".
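
A minimal sketch of that lazy variant, assuming itertools.islice is used to pull chunk_size pairs at a time from the generator (the name chunker_lazy is hypothetical, and this is only one of several ways to chunk an iterator):

```python
from collections import defaultdict
from itertools import islice


def chunker_lazy(d, chunk_size):
    # Generator of (key, value) pairs; no intermediate flat list is built.
    flat = ((key, value) for key, values in d.items() for value in values)
    while True:
        # Pull at most chunk_size pairs from the shared generator.
        batch = list(islice(flat, chunk_size))
        if not batch:
            return  # generator exhausted
        chunk = defaultdict(list)
        for key, value in batch:
            chunk[key].append(value)
        yield dict(chunk)


my_dict = {
    "key1": ["a", "b", "c", "d"],
    "key2": ["e", "f", "g", "h"],
}

for chunk in chunker_lazy(my_dict, 3):
    print(chunk)
# {'key1': ['a', 'b', 'c']}
# {'key1': ['d'], 'key2': ['e', 'f']}
# {'key2': ['g', 'h']}
```

Only one batch of pairs is materialized at a time, so memory use depends on chunk_size rather than on the total number of values.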

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Jeremy
Solution 2 Alexander B.
Solution 3 Tomerikoo