Equally distribute and zip two lists of different length in Python

I have two lists:

A = ["a","b","c","d","e","f","g","h","i"]
B = [1,2,3]

A is 3 times longer than B and so I'd like to match them together using that, as below:

C = [("a",1),("b",1),("c",1),
     ("d",2),("e",2),("f",2),
     ("g",3),("h",3),("i",3)]

So the first 3 elements of A are matched with the first element of B, the next 3 elements of A are matched with the second element of B and so on.

Furthermore, this is a very simple example. I'd also like to know how best to distribute the elements fairly when the longer list's length is not a whole multiple of the shorter one's. For example, my two lists are 10001 and 511 elements long, so the first is ~19.57 times longer than the second. Preferably I'd like to use every element in both lists.



Solution 1:[1]

I will assume here that the first list is the longer one.

Here is a simple way:

rep = len(A) // len(B)
ia = iter(A)
C = [(next(ia), b) for b in B for i in range(rep)]
C.extend((a, B[-1]) for a in ia)         # in case len(A) is not an exact multiple of len(B)
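
To see what the last line does when the lengths do not divide evenly, the snippet above can be wrapped in a function. This is only a sketch: the name zip_blocks is made up for illustration, and it assumes A is at least as long as B and that B is not empty.

```python
def zip_blocks(A, B):
    """Pair each element of B with len(A) // len(B) consecutive elements of A.

    Hypothetical helper; assumes len(A) >= len(B) > 0.
    """
    rep = len(A) // len(B)
    ia = iter(A)
    C = [(next(ia), b) for b in B for _ in range(rep)]
    # Any leftover elements of A are all paired with the last element of B.
    C.extend((a, B[-1]) for a in ia)
    return C


print(zip_blocks(list("abcdefgh"), [1, 2, 3]))
# [('a', 1), ('b', 1), ('c', 2), ('d', 2), ('e', 3), ('f', 3), ('g', 3), ('h', 3)]
```

Every element of both lists is used, but the whole remainder lands on the last element of B, so the final block can be noticeably longer than the others.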

Solution 2:[2]

Assuming the length of A is a multiple of the length of B, you can simply do

>>> scale = len(A) // len(B)
>>> [(a, B[i // scale]) for i, a in enumerate(A)] 
[('a', 1),
 ('b', 1),
 ('c', 1),
 ('d', 2),
 ('e', 2),
 ('f', 2),
 ('g', 3),
 ('h', 3),
 ('i', 3)]

How it works:

  1. Compute k = len(A) // len(B) (called scale in the code), so that len(A) == k * len(B)
  2. Iterate over A and integer-divide each index i by k; the quotient i // k is the index of the element of B to pair with

If len(A) is not a multiple of len(B), this raises

IndexError: list index out of range

You can circumvent this by computing scale to be

scale = len(A) // len(B) * len(B)

For example,

A = ["a", "b", "c", "d", "e", "f", "g", "h"]
B = [1, 2, 3]

>>> scale = len(A) // len(B) * len(B)
>>> [(a, B[i // scale]) for i, a in enumerate(A)] 
[('a', 1),
 ('b', 1),
 ('c', 1),
 ('d', 1),
 ('e', 1),
 ('f', 1),
 ('g', 2),
 ('h', 2)]
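
The output above avoids the IndexError but is lopsided: 1 is used six times and 3 never appears. One way to spread the elements more evenly, sketched here under the assumption that A is at least as long as B, is to map index i of A to index i * len(B) // len(A) of B. The function name zip_proportional is invented for this example and is not part of the original answer.

```python
from collections import Counter


def zip_proportional(A, B):
    """Pair A[i] with B[i * len(B) // len(A)] (hypothetical helper)."""
    n, m = len(A), len(B)
    return [(a, B[i * m // n]) for i, a in enumerate(A)]


print(zip_proportional(list("abcdefgh"), [1, 2, 3]))
# [('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3)]

# The sizes from the question: 10001 and 511 elements.
counts = Counter(b for _, b in zip_proportional(range(10001), range(511)))
print(len(counts), sorted(set(counts.values())))
# 511 [19, 20]
```

With the lengths from the question, every element of the shorter list is used either 19 or 20 times, and no element of either list is dropped.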

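Here's a functional approach using itertools.repeat and chain.from_iterable, shown with the original nine-element A and scale = len(A) // len(B).

>>> from itertools import repeat, chain
>>> A = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
>>> scale = len(A) // len(B)
>>> list(zip(A, chain.from_iterable(zip(*repeat(B, scale)))))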
[('a', 1),
 ('b', 1),
 ('c', 1),
 ('d', 2),
 ('e', 2),
 ('f', 2),
 ('g', 3),
 ('h', 3),
 ('i', 3)]
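
The same repetition can be written without the zip(*...) transposition by repeating each element of B directly. This is a sketch of an equivalent formulation rather than part of the original answer, and it assumes len(A) is a multiple of len(B).

```python
from itertools import chain, repeat

A = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
B = [1, 2, 3]
scale = len(A) // len(B)

# repeat(b, scale) yields b exactly `scale` times;
# chain.from_iterable flattens those runs into one stream.
stretched = chain.from_iterable(repeat(b, scale) for b in B)
result = list(zip(A, stretched))
print(result)
# [('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3), ('i', 3)]
```

Note that zip stops at the shorter input, so if len(A) is not a multiple of len(B) the trailing elements of A are silently dropped instead of raising an error.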

Solution 3:[3]

You can use the grouper recipe from the itertools docs (or import it from more_itertools).

Recipe:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

Application (the chunk size is len(A) // len(B), so that A is split into one group per element of B):

>>> from more_itertools import grouper
>>> A = ["a","b","c","d","e","f","g","h","i"]
>>> B = [1,2,3]
>>> [(x, i) for vals, i in zip(grouper(A, len(A) // len(B)), B) for x in vals]
[('a', 1),
 ('b', 1),
 ('c', 1),
 ('d', 2),
 ('e', 2),
 ('f', 2),
 ('g', 3),
 ('h', 3),
 ('i', 3)]

Solution 4:[4]

You can also use zip() and repeat the elements of B with a generator expression:

>>> A = ["a","b","c","d","e","f","g","h","i"]
>>> B = [1,2,3]
>>> scale = len(A) // len(B)
>>> list(zip(A, (y for x in B for y in scale * [x])))
[('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3), ('i', 3)]

Avoid hardcoding the repeat count; computing scale = len(A) // len(B) gives the proportional distribution, as shown in @coldspeed's answer.

Solution 5:[5]

You can also slice A into sublists, one per element of B, each len(A) // len(B) elements long:

A = ["a","b","c","d","e","f","g","h","i"]
B = [1,2,3]
size = len(A) // len(B)  # chunk length; assumes len(A) is a multiple of len(B)
new_a = [A[i:i+size] for i in range(0, len(A), size)]
final_result = [(c, i) for a, i in zip(new_a, B) for c in a]

Output:

[('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3), ('i', 3)]
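
When len(A) is not a multiple of len(B), the slices can be given sizes that differ by at most one by spreading the remainder over the first few chunks. A minimal sketch, assuming len(A) >= len(B) > 0; the name zip_chunks is hypothetical and not from the original answer.

```python
def zip_chunks(A, B):
    """Slice A into len(B) contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(A), len(B))
    result = []
    start = 0
    for k, b in enumerate(B):
        # The first `extra` chunks take one additional element.
        stop = start + size + (1 if k < extra else 0)
        result.extend((a, b) for a in A[start:stop])
        start = stop
    return result


print(zip_chunks(list("abcdefgh"), [1, 2, 3]))
# [('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3)]
```

Unlike an index-based mapping, this puts all the longer chunks at the front; whether that matters depends on how the pairs are used afterwards.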

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Serge Ballesta
Solution 2
Solution 3 timgeb
Solution 4
Solution 5 Ajax1234