'How to convert this nested loop to numpy broadcast?

I want to rearrange my data (two even-length 1d arrays):

cs = [w x y z]
rs = [a b c d e f]

to make a result like this:

[[a b w x]
 [c d w x]
 [e f w x]
 [a b y z]
 [c d y z]
 [e f y z]]

This is what I have tried (it works):

ls = []
for c in range(0,len(cs),2):
  for r in range(0,len(rs),2):
    item = [rs[r], rs[r+1], cs[c], cs[c+1]]
    ls.append(item)

But I want to get the same result using reshaping/broadcasting or other numpy functions. What is the idiomatic way to do this task in numpy?



Solution 1:[1]

An alternative to using tile/repeat, is to generate repeated row indices.

Make the two arrays - reshaped as they will be combined:

In [106]: rs=np.reshape(list('abcdef'),(3,2))
In [107]: cs=np.reshape(list('wxyz'),(2,2))
In [108]: rs
Out[108]: 
array([['a', 'b'],
       ['c', 'd'],
       ['e', 'f']], dtype='<U1')
In [109]: cs
Out[109]: 
array([['w', 'x'],
       ['y', 'z']], dtype='<U1')

Make 'meshgrid' like indices (itertools.product could also be used)

In [110]: IJ = np.indices((3,2))
In [111]: IJ
Out[111]: 
array([[[0, 0],
        [1, 1],
        [2, 2]],

       [[0, 1],
        [0, 1],
        [0, 1]]])

reshape with order gives two 1d arrays:

In [112]: I,J=IJ.reshape(2,6,order='F')
In [113]: I,J
Out[113]: (array([0, 1, 2, 0, 1, 2]), array([0, 0, 0, 1, 1, 1]))

Then just index the rs and cs and combine them with hstack:

In [114]: np.hstack((rs[I],cs[J]))
Out[114]: 
array([['a', 'b', 'w', 'x'],
       ['c', 'd', 'w', 'x'],
       ['e', 'f', 'w', 'x'],
       ['a', 'b', 'y', 'z'],
       ['c', 'd', 'y', 'z'],
       ['e', 'f', 'y', 'z']], dtype='<U1')

edit

Here's another way of looking this - a bit more advanced. With sliding_window_view we can get a "block" view of that Out[114] result:

In [130]: np.lib.stride_tricks.sliding_window_view(_114,(3,2))[::3,::2,:,:]
Out[130]: 
array([[[['a', 'b'],
         ['c', 'd'],
         ['e', 'f']],

        [['w', 'x'],
         ['w', 'x'],
         ['w', 'x']]],


       [[['a', 'b'],
         ['c', 'd'],
         ['e', 'f']],

        [['y', 'z'],
         ['y', 'z'],
         ['y', 'z']]]], dtype='<U1')

With a bit more reverse engineering, I find I can create Out[114] with:

In [147]: res = np.zeros((6,4),'U1')
In [148]: res1 = np.lib.stride_tricks.sliding_window_view(res,(3,2),writeable=True)[::3,::2,:,:]
In [149]: res1[:,0,:,:] = rs
In [150]: res1[:,1,:,:] = cs[:,None,:]
In [151]: res
Out[151]: 
array([['a', 'b', 'w', 'x'],
       ['c', 'd', 'w', 'x'],
       ['e', 'f', 'w', 'x'],
       ['a', 'b', 'y', 'z'],
       ['c', 'd', 'y', 'z'],
       ['e', 'f', 'y', 'z']], dtype='<U1')

I can't say that either of these is superior, but they show there are various ways of "vectorizing" this kind of array layout.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1