Equally distribute and zip two lists of different length in Python
I have two lists:
A = ["a","b","c","d","e","f","g","h","i"]
B = [1,2,3]
A is 3 times as long as B, so I'd like to pair them up proportionally, as below:
C = [("a",1),("b",1),("c",1),
("d",2),("e",2),("f",2),
("g",3),("h",3),("i",3)]
So the first 3 elements of A are matched with the first element of B, the next 3 elements of A are matched with the second element of B and so on.
Furthermore, this is a very simple example. I'd also like to know how best to distribute the elements fairly when the ratio of the two lengths is not a whole number. For example, my two lists are 10001 and 511 elements long, so the first is ~19.57 times as long as the second. Preferably I'd like to use every element in both lists.
Solution 1:[1]
I will assume here that the first list is the longer one.
Here is a simple way:
rep = len(A) // len(B)
ia = iter(A)
C = [(next(ia), b) for b in B for i in range(rep)]
C.extend((a, B[-1]) for a in ia) # in case len(A) is not an exact multiple of len(B)
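As a rough illustration of the remainder handling, the sketch below wraps the same logic in a function and runs it on an eight-element list. The name pair_with_remainder is hypothetical (it is not from the answer), and the sketch assumes len(A) >= len(B) > 0.

```python
def pair_with_remainder(A, B):
    # Hypothetical helper wrapping the approach above.
    rep = len(A) // len(B)            # whole number of A items per B item
    ia = iter(A)
    C = [(next(ia), b) for b in B for _ in range(rep)]
    C.extend((a, B[-1]) for a in ia)  # leftovers all go to the last element of B
    return C

result = pair_with_remainder(["a", "b", "c", "d", "e", "f", "g", "h"], [1, 2, 3])
print(result)
# [('a', 1), ('b', 1), ('c', 2), ('d', 2), ('e', 3), ('f', 3), ('g', 3), ('h', 3)]
```

Every element of both lists is used, but the last element of B absorbs the whole remainder, so the split here is 2/2/4 rather than as even as possible.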
Solution 2:[2]
Assuming the length of A is a multiple of the length of B, you can simply do
>>> scale = len(A) // len(B)
>>> [(a, B[i // scale]) for i, a in enumerate(A)]
[('a', 1),
('b', 1),
('c', 1),
('d', 2),
('e', 2),
('f', 2),
('g', 3),
('h', 3),
('i', 3)]
How it works:
- Determine the value of k such that len(A) == k * len(B)
- Iterate over A, and use k to determine which value of B to pick by dividing the current index by it
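To make the index arithmetic concrete, this small sketch lists which index of B each position of A maps to. The variable name index_map is only for illustration.

```python
A = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
B = [1, 2, 3]

scale = len(A) // len(B)  # the k described above; 3 for these lists
# Integer-dividing each position in A by scale gives the index into B.
index_map = [i // scale for i in range(len(A))]
print(index_map)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```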
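If len(A) is not a multiple of len(B), this raises
IndexError: list index out of range
You can avoid the error by computing scale as
scale = len(A) // len(B) * len(B)
Note that this only suppresses the error; the distribution is no longer even, and in the example below the last element of B is never used.
For example,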
A = ["a", "b", "c", "d", "e", "f", "g", "h"]
B = [1, 2, 3]
>>> scale = len(A) // len(B) * len(B)
>>> [(a, B[i // scale]) for i, a in enumerate(A)]
[('a', 1),
('b', 1),
('c', 1),
('d', 1),
('e', 1),
('f', 1),
('g', 2),
('h', 2)]
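For lengths that do not divide evenly (such as the 10001 and 511 from the question), one possible alternative is to scale the index rather than divide by a fixed block size. This is a sketch, not part of the original answer; the function name spread is hypothetical, and it assumes len(A) >= len(B) > 0.

```python
from collections import Counter

def spread(A, B):
    # Hypothetical helper: map position i in A to index i * len(B) // len(A) in B.
    # Assumes len(A) >= len(B) > 0, so every element of B is used at least once.
    n, m = len(A), len(B)
    return [(a, B[i * m // n]) for i, a in enumerate(A)]

pairs = spread(["a", "b", "c", "d", "e", "f", "g", "h"], [1, 2, 3])
print(pairs)
# [('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3)]

# With the sizes from the question, each element of B gets 19 or 20 partners.
counts = Counter(b for _, b in spread(range(10001), range(511)))
print(sorted(set(counts.values())))  # [19, 20]
```

Because the largest index computed is (len(A) - 1) * len(B) // len(A), which is always below len(B), this cannot raise an IndexError, and group sizes differ by at most one.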
Here's a functional approach using itertools.repeat and chain.from_iterable (shown with the original nine-element A and scale = len(A) // len(B)).
>>> from itertools import repeat, chain
>>> list(zip(A, chain.from_iterable(zip(*repeat(B, scale)))))
[('a', 1),
('b', 1),
('c', 1),
('d', 2),
('e', 2),
('f', 2),
('g', 3),
('h', 3),
('i', 3)]
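The one-liner is easier to follow when its intermediate values are spelled out. The sketch below does that; the names columns and flat are only for illustration.

```python
from itertools import chain, repeat

A = ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
B = [1, 2, 3]
scale = len(A) // len(B)  # 3

# repeat(B, scale) yields B three times; zip(*...) transposes those copies,
# so each element of B ends up grouped with its own duplicates.
columns = list(zip(*repeat(B, scale)))
print(columns)  # [(1, 1, 1), (2, 2, 2), (3, 3, 3)]

# chain.from_iterable flattens the groups into one sequence aligned with A.
flat = list(chain.from_iterable(columns))
print(flat)  # [1, 1, 1, 2, 2, 2, 3, 3, 3]

pairs = list(zip(A, flat))
```

Since zip stops at the shorter input, any elements of A beyond scale * len(B) would be silently dropped by this approach.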
Solution 3:[3]
You can use the grouper recipe from the itertools docs (or import it from more_itertools).
Recipe:
from itertools import zip_longest
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
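As a quick check of what the recipe produces, this sketch runs it on the string from its own comment. It redefines grouper locally so the block is self-contained.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # n references to the same iterator, so zip_longest pulls n consecutive items per tuple
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

chunks = list(grouper("ABCDEFG", 3, "x"))
print(chunks)  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```

Note that the last chunk is padded with the fill value, so when the input length is not a multiple of n, the padding values would also show up in any pairs built from the chunks unless they are filtered out.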
Application:
>>> from more_itertools import grouper
>>> A = ["a","b","c","d","e","f","g","h","i"]
>>> B = [1,2,3]
>>> [(x, i) for vals, i in zip(grouper(A, len(A) // len(B)), B) for x in vals]
[('a', 1),
('b', 1),
('c', 1),
('d', 2),
('e', 2),
('f', 2),
('g', 3),
('h', 3),
('i', 3)]
Solution 4:[4]
You can also use zip() and repeat each element of B with a generator expression:
>>> A = ["a","b","c","d","e","f","g","h","i"]
>>> B = [1,2,3]
>>> list(zip(A, (y for x in B for y in len(B) * [x])))
[('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3), ('i', 3)]
This only works because len(A) // len(B) happens to equal len(B) here. It's recommended not to hardcode the repeat count as len(B) or 3, but to use scale = len(A) // len(B) to get the proportional distribution, as shown in @coldspeed's answer.
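A minimal sketch of the scale-based variant, using a twelve-element list so that the repeat count (4) differs from len(B) (3). It assumes len(A) is a multiple of len(B).

```python
A = list("abcdefghijkl")  # 12 elements
B = [1, 2, 3]

scale = len(A) // len(B)  # 4 elements of A per element of B
# Repeat each element of B scale times, then pair the result with A.
pairs = list(zip(A, (y for x in B for y in scale * [x])))
print([b for _, b in pairs])  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
```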
Solution 5:[5]
You can also group A into a list of sublists, each of length len(A) // len(B):
A = ["a","b","c","d","e","f","g","h","i"]
B = [1,2,3]
size = len(A) // len(B)  # number of elements of A per element of B
new_a = [A[i:i+size] for i in range(0, len(A), size)]
final_result = [(c, i) for a, i in zip(new_a, B) for c in a]
Output:
[('a', 1), ('b', 1), ('c', 1), ('d', 2), ('e', 2), ('f', 2), ('g', 3), ('h', 3), ('i', 3)]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Serge Ballesta |
| Solution 2 | |
| Solution 3 | timgeb |
| Solution 4 | |
| Solution 5 | Ajax1234 |
