Pandas DataFrame Arithmetic ignoring column index
DataFrame arithmetic always aligns on both the index and the column names. If I have two DataFrames with the same number of columns but different column names, it seems I can't perform arithmetic operations between them:
import numpy as np
import pandas as pd
length = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['length1','length2'])
length
Out[2]:
length1 length2
0 -0.430872 1.087211
1 -0.788218 -0.440801
2 -0.540136 -1.217191
3 -0.561248 0.305545
4 0.158832 0.075283
height = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(1,6),columns=['height1','height2'])
height
Out[3]:
height1 height2
1 -1.105751 1.089808
2 -0.360827 -0.803927
3 0.454469 -0.766144
4 0.476534 -0.855870
5 -0.007049 0.038307
length*height
Out[4]:
height1 height2 length1 length2
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
This is probably a safety measure to make sure you are only operating on the intended data. Still, is there a way to perform operations between two DataFrames (with the same number of columns) while aligning only on the index axis?
Edit: The original example was oversimplified in that the two DataFrames had the same index [0,1,2,3,4]. I shifted the second DataFrame's index by 1 to make it a better example.
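To make the all-NaN result above easier to reproduce, here is a minimal sketch with small fixed numbers instead of random data (the values and the shorter three-row frames are made up for illustration). It shows that the product takes the union of both the row labels and the column labels, and that no cell has a partner on both sides:

```python
import pandas as pd

# Illustrative data: same shape, disjoint column names, index shifted by 1.
length = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=[0, 1, 2],
    columns=["length1", "length2"],
)
height = pd.DataFrame(
    [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]],
    index=[1, 2, 3],
    columns=["height1", "height2"],
)

# The * operator aligns on BOTH axes before multiplying.
result = length * height

print(sorted(result.columns))   # union of the column labels
print(list(result.index))       # union of the row labels
print(result.isna().all().all())  # every cell is NaN
```

Every column exists in only one of the two frames, so each cell is missing an operand and comes out as NaN.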
Solution 1:[1]
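ans = pd.DataFrame(length.values * height.values)
Convert both DataFrames to NumPy arrays and multiply those. The multiplication is then purely positional: neither the index nor the column labels are aligned, and the result gets a default integer index and default column labels.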
0 1
0 0.396724 -0.264562
1 -0.460419 -0.285086
2 0.126083 -0.494675
3 -0.272121 0.305155
4 -0.159292 0.444439
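If you go the NumPy route and still want labels on the result, one option is to pass them back in explicitly. This is only a sketch: the data is made up, the `area1`/`area2` column names are hypothetical, and reusing the index of `length` is an arbitrary choice. Note that rows are paired by position, so the shifted index of `height` is ignored:

```python
import pandas as pd

# Illustrative data: index of height is shifted by 1 relative to length.
length = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=[0, 1, 2],
    columns=["length1", "length2"],
)
height = pd.DataFrame(
    [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]],
    index=[1, 2, 3],
    columns=["height1", "height2"],
)

# Element-wise product of the raw arrays: row i is paired with row i,
# regardless of the row labels.
raw = length.to_numpy() * height.to_numpy()

# Re-attach labels by hand (hypothetical column names, index taken from length).
area = pd.DataFrame(raw, index=length.index, columns=["area1", "area2"])
print(area)
```

Here the row labelled 0 in `length` is multiplied with the row labelled 1 in `height`, which is fine when the frames are known to be in matching order but is not index alignment.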
Solution 2:[2]
Building on what user3589054 did, I think this code might work for you:
height.multiply(length.values, axis = 0)
Here is my output:
>>> length = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['length1','length2'])
>>> height = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['height1','height2'])
>>> length
length1 length2
0 1.000865 -0.758316
1 0.285942 -2.000440
2 -0.399625 0.686547
3 0.809561 1.238211
4 2.216696 -1.347227
>>> height
height1 height2
0 0.505477 -0.299634
1 -0.234154 -2.490459
2 -0.134534 1.063768
3 0.010025 0.435895
4 2.290053 -0.096494
>>> height.multiply(length.values, axis = 0)
height1 height2
0 0.505915 0.227217
1 -0.066954 4.982013
2 0.053763 0.730326
3 0.008116 0.539730
4 5.076352 0.129999
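A small check of what this approach does with labels, using made-up numbers. The result keeps the index and columns of the frame the method is called on (`height` here), and the rows of the array are matched by position. As far as I can tell, `axis` only matters when the other operand is one-dimensional or a Series, so for a 2-D array of the same shape the plain `*` operator should give the same result; treat that as an assumption and verify it on your pandas version:

```python
import pandas as pd

# Illustrative data with a shifted index on height.
length = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0]],
    index=[0, 1],
    columns=["length1", "length2"],
)
height = pd.DataFrame(
    [[10.0, 20.0], [30.0, 40.0]],
    index=[1, 2],
    columns=["height1", "height2"],
)

# Multiply height by the raw values of length (positional pairing of rows).
via_method = height.multiply(length.to_numpy(), axis=0)

# Same operation written with the operator.
via_operator = height * length.to_numpy()

print(via_method)
print(via_method.equals(via_operator))
```

Because the labels of `length` are dropped before the multiplication, this is still positional rather than index-aligned.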
Solution 3:[3]
Rename the columns of one DataFrame to match the other, so that the column labels agree and index alignment is preserved:
length * height.set_axis(length.columns, axis=1)
# output:
length1 length2
0 NaN NaN
1 0.010236 -0.144040
2 -0.342200 -1.320554
3 -0.223242 -0.545550
4 -4.178892 0.139534
5 NaN NaN
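The same idea with fixed numbers, so the alignment is easy to verify by hand. The data is made up, and the last step (keeping only the row labels present in both frames) is an optional extra that is not part of the answer above:

```python
import pandas as pd

# Illustrative data: labels 1 and 2 are shared, 0 and 3 are not.
length = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=[0, 1, 2],
    columns=["length1", "length2"],
)
height = pd.DataFrame(
    [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]],
    index=[1, 2, 3],
    columns=["height1", "height2"],
)

# Give height the column labels of length (matched by position),
# then let pandas align on the index as usual.
product = length * height.set_axis(length.columns, axis=1)
print(product)

# Optional: keep only the rows whose label exists in both frames.
common = length.index.intersection(height.index)
shared = product.loc[common]
print(shared)
```

Row 1 of the product is `3 * 10` and `4 * 20`, i.e. the row labelled 1 in `length` times the row labelled 1 in `height`; labels 0 and 3 exist in only one frame and come out as NaN.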
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | NinjaGaiden |
| Solution 2 | walker_4 |
| Solution 3 |
