Python: applying a function to each column of an array

I need to apply a function to each column of a NumPy array. The function cannot be applied element by element; it has to operate on whole columns, because each column taken as a unit represents one piece of information.

import numpy as np
C = np.random.normal(0, 1, (500, 30))

Is this the most efficient way to do it? (For illustration I am using np.sum.)

C2 = [ np.sum( C[ :, i ] )  for i in range( 0, 30) ]

In my actual problem, C is 500x4000 and the function applied to each column is time-consuming.



Solution 1:[1]

Iterating over the transpose appears to take roughly 80% of the time of the indexed version, going by the timings below:

[ np.sum(row) for row in C.T ]

It is also more Pythonic: each row of C.T is a column of C, so no explicit index is needed. For reference, these are the timeit results:

>>> from timeit import timeit
>>> setup = 'import numpy as np; C = np.random.normal(0, 1, (500, 30))'
>>> timeit('[ np.sum( C[ :, i ] )  for i in range( 0, 30) ]', setup=setup, number=1000)
0.418906474798
>>> timeit('[ np.sum(row) for row in C.T ]', setup=setup, number=1000)
0.345153254432
>>> timeit('np.apply_along_axis(np.sum, 0, C)', setup=setup, number=1000)
0.732931300891
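
As a rough check that the three variants are interchangeable for a function other than np.sum, here is a minimal sketch. The function column_range is a hypothetical stand-in for the expensive per-column computation in the question, and the seeded generator is only there to make the run repeatable; neither comes from the original post.

```python
import numpy as np

def column_range(col):
    # Hypothetical stand-in for the expensive per-column function:
    # spread between the largest and smallest value of a 1-D column.
    return col.max() - col.min()

rng = np.random.default_rng(0)      # seeded only for repeatability
C = rng.normal(0, 1, (500, 30))

# Variant 1: index each column explicitly.
by_index = [column_range(C[:, i]) for i in range(C.shape[1])]

# Variant 2: iterate over the transpose; each item is one column of C.
by_transpose = [column_range(col) for col in C.T]

# Variant 3: let NumPy drive the loop along axis 0.
by_apply = np.apply_along_axis(column_range, 0, C)

# All three should agree up to floating-point noise.
assert np.allclose(by_index, by_transpose)
assert np.allclose(by_transpose, by_apply)
```

When the per-column function itself is slow, the looping overhead measured above is likely to be a small part of the total, so the choice between these variants may matter less than the timings with np.sum suggest.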

Solution 2:[2]

You can try np.apply_along_axis:

In [21]: A = np.array([[1,2,3],[4,5,6]])

In [22]: A
Out[22]: 
array([[1, 2, 3],
       [4, 5, 6]])

In [23]: np.apply_along_axis(np.sum, 0, A)
Out[23]: array([5, 7, 9])

In [24]: np.apply_along_axis(np.sum, 1, A)
Out[24]: array([ 6, 15])
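
The same call works with a user-defined function, which is probably closer to the case in the question. A minimal sketch, assuming a function that takes a 1-D slice and returns a scalar; the name spread is hypothetical and only used for illustration:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

def spread(v):
    # Hypothetical per-slice function: max minus min of a 1-D slice.
    return v.max() - v.min()

# axis=0: the function receives each column A[:, j] in turn.
per_column = np.apply_along_axis(spread, 0, A)

# axis=1: the function receives each row A[i, :] in turn.
per_row = np.apply_along_axis(spread, 1, A)

print(per_column)  # [3 3 3]
print(per_row)     # [2 2]
```

For np.sum specifically, A.sum(axis=0) returns the same column sums without a Python-level loop, so np.apply_along_axis is mainly useful when the function has no axis argument of its own.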

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Jared Goguen
Solution 2    ChaimG