Multiply and sum two columns in two dataframes in Python
I have two dataframes, both with 6 rows. I want to multiply the values in two selected columns from the two dataframes (one from each df) and then sum the products:
result = sum(a * b for a, b in zip(list(df1['col1']), list(df2['col3'])))
I do not seem to get what I want. I did the calculation "manually" in Excel (for one date in my time series), which gave me the expected result. So my question is: did I do something wrong?
Solution 1:[1]
If both columns have the same number of rows and the same indices, simply multiply them and then use sum:
result = (df1['col1'] * df2['col3']).sum()
If the indices may differ but the lengths are the same:
result = (df1['col1'] * df2['col3'].to_numpy()).sum()
Or use numpy.dot:
result = np.dot(df1['col1'], df2['col3'])
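The difference between the label-aligned and the positional variants can be seen in a minimal sketch. The data and index values here are made up for illustration, and mismatched indices are only one possible reason for a result that differs from a manual calculation:

```python
import numpy as np
import pandas as pd

# Hypothetical data: same length, but df2's index is shifted by one.
df1 = pd.DataFrame({'col1': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'col3': [10, 20, 30]}, index=[1, 2, 3])

# Series * Series aligns on index labels: only labels 1 and 2 exist in both,
# the rest become NaN and are skipped by sum().
aligned = (df1['col1'] * df2['col3']).sum()

# Converting one side to a NumPy array multiplies by position instead.
positional = (df1['col1'] * df2['col3'].to_numpy()).sum()

# np.dot on plain arrays is also positional.
dot = np.dot(df1['col1'].to_numpy(), df2['col3'].to_numpy())

print(aligned, positional, dot)
```

Here the aligned product only covers the two shared labels, while the positional variants pair the rows in order, as zip does.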
If both the lengths and the indices may differ:
result = (df1['col1'].reset_index(drop=True)
          .mul(df2['col3'].reset_index(drop=True), fill_value=1)
          .sum())
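As a rough illustration of what fill_value=1 does when the lengths differ, here is a small sketch with made-up data. A row that has no partner is multiplied by 1, so it enters the sum unchanged, whereas zip would silently drop it:

```python
import pandas as pd

# Hypothetical data: df1 has three rows, df2 only two, with unrelated indices.
df1 = pd.DataFrame({'col1': [1, 2, 3]})
df2 = pd.DataFrame({'col3': [10, 20]}, index=[5, 6])

# Reset both indices to 0..n-1 so the rows are paired by position.
s1 = df1['col1'].reset_index(drop=True)
s2 = df2['col3'].reset_index(drop=True)

# The third row of s1 has no partner in s2, so it is multiplied by 1.
result = s1.mul(s2, fill_value=1).sum()

# For comparison: zip stops at the shorter input and drops the third row.
zipped = sum(a * b for a, b in zip(s1, s2))

print(result, zipped)
```

Whether an unmatched row should count with its own value (fill_value=1) or be ignored (fill_value=0, or the zip behaviour) depends on what the sum is meant to represent.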
Solution 2:[2]
You can do it like this:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'col1':[0, 1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1':[0, 1, 2, 3, 4, 5]})
result = np.matmul(df1.col1, df2.col1)
This will also sum the multiplications.
Your formulation works too; the square brackets are optional, since sum also accepts a generator expression:
result = sum([a * b for a, b in zip(list(df1['col1']), list(df2['col1']))])
This gives the same result.
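A quick cross-check on the example data, as a sketch: the variable names below are only for illustration, and the columns are converted with to_numpy() before calling np.dot to keep the comparison purely positional.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': [0, 1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1': [0, 1, 2, 3, 4, 5]})

# Pure-Python versions, with and without the list brackets.
with_list = sum([a * b for a, b in zip(df1['col1'], df2['col1'])])
with_generator = sum(a * b for a, b in zip(df1['col1'], df2['col1']))

# NumPy dot product on the underlying arrays.
with_dot = np.dot(df1['col1'].to_numpy(), df2['col1'].to_numpy())

# pandas element-wise product followed by sum (indices match here).
with_pandas = (df1['col1'] * df2['col1']).sum()

print(with_list, with_generator, with_dot, with_pandas)
```

All four agree on this data because both dataframes share the default index 0..5.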
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
