'Vectorized operations in Pandas with fixed columns/rows/values

I would like to perform operations on Pandas dataframes using fixed columns, rows, or values.

For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':(1,2,3), 'b':(4,5,6), 'c':(7,8,9), 'd':(10,11,12),
                  'e':(13,14,15)})

df
Out[57]: 
   a  b  c   d   e
0  1  4  7  10  13
1  2  5  8  11  14
2  3  6  9  12  15

I want to use the values in columns 'a' and 'b' as fixed values.


# It's easy enough to perform the operation I want on one column at a time:
df.loc[:,'f'] = df.loc[:,'c'] + df.loc[:,'a'] + df.loc[:,'b']

# It gets cumbersome if there are many columns to perform the operation on though:
df.loc[:,'g'] = df.loc[:,'d'] / df.loc[:,'a'] * df.loc[:,'b']
df.loc[:,'h'] = df.loc[:,'e'] / df.loc[:,'a'] * df.loc[:,'b']
# etc.

# This returns columns with all NaN values.
df.loc[:,('f','g','h')] = df.loc[:,'c':'e'] / df.loc[:'a']

Is there an optimal way to do what I want in Pandas? I could not find working solutions in the Pandas documentation or this SO thread. I don't think I can use .map() or .applymap(), because I'm under the impression they can only be using for simple equations (one input value). Thanks for reading.



Solution 1:[1]

As @Corralien pointed out, its better to use Pandas dataframe operations such as .div(), but I also figured out that the usage of .loc[] is important.

# Doesn't work:
df.loc[:,['f','g','h']] = df.loc[:,'c':'e'].div(df.loc[:'a'], axis=0)

# Doesn't work:
df[['f','g','h']] = df.loc[:,'c':'e'].div(df.loc[:'a'], axis=0)

# Now works.
df[['f','g','h']] = df.loc[:,'c':'e'].div(df['a'], axis=0)

At the moment, I'm not exactly sure why this is. Any insight would be helpful, thanks.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 LabRat01010