Multiply and sum two columns in two dataframes in Python

I have two dataframes, both with 6 rows. I want to multiply the values in two selected columns (one from each dataframe) row by row and then sum the products:

result = sum(a * b for a, b in zip(list(df1['col1']), list(df2['col3'])))

This does not seem to give me what I want. I did the calculation manually in Excel (for one date in my time series), which gave the expected result. Did I do something wrong?



Solution 1:[1]

If both columns have the same number of rows and the same indices, simply multiply and then use sum:

result = (df1['col1'] * df2['col3']).sum()

If the indices may differ but the lengths are the same, multiply by position instead of by index label:

result = (df1['col1'] * df2['col3'].to_numpy()).sum()
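To illustrate why the index matters here, the following is a minimal sketch with made-up data (the values and index labels are assumptions, not the asker's data). pandas multiplies two Series by index label, so rows whose labels do not match become NaN and are silently skipped by sum, whereas converting one side with to_numpy() multiplies row by row.

```python
import pandas as pd

# Hypothetical data: same length, but df2's index is shifted by one label
df1 = pd.DataFrame({'col1': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'col3': [10, 20, 30]}, index=[1, 2, 3])

# Label-aligned product: only labels 1 and 2 exist in both frames,
# labels 0 and 3 produce NaN
aligned = df1['col1'] * df2['col3']
aligned_sum = aligned.sum()  # NaN values are skipped by default

# Positional product: row i of df1 times row i of df2
positional_sum = (df1['col1'] * df2['col3'].to_numpy()).sum()

print(aligned_sum, positional_sum)
```

If the two sums differ on your own data, mismatched indices are a likely cause of a wrong total.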

Or use numpy.dot:

import numpy as np
result = np.dot(df1['col1'], df2['col3'])

If both the lengths and the indices may differ (fill_value=1 treats a missing value as 1, so an unmatched row contributes its own value to the sum):

result = (df1['col1'].reset_index(drop=True)
             .mul(df2['col3'].reset_index(drop=True), fill_value=1)
             .sum())
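The effect of fill_value is easy to miss, so here is a small sketch with made-up Series of different lengths (the values and index labels are hypothetical). It compares fill_value=1 with fill_value=0, which would instead drop unmatched rows from the sum; which of the two is appropriate depends on what the data means.

```python
import pandas as pd

# Hypothetical inputs with different lengths and unrelated index labels
s1 = pd.Series([1, 2, 3], index=[10, 11, 12])
s2 = pd.Series([4, 5], index=[20, 21])

# Reset both to a 0..n-1 index so rows pair up by position
a = s1.reset_index(drop=True)
b = s2.reset_index(drop=True)

# The third row of `a` has no partner in `b`
with_fill_1 = a.mul(b, fill_value=1).sum()  # 1*4 + 2*5 + 3*1
with_fill_0 = a.mul(b, fill_value=0).sum()  # 1*4 + 2*5 + 3*0

print(with_fill_1, with_fill_0)
```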

Solution 2:[2]

You can do it like this:

import pandas as pd 
import numpy as np 

df1 = pd.DataFrame({'col1':[0, 1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1':[0, 1, 2, 3, 4, 5]})

result = np.matmul(df1.col1, df2.col1)

For two one-dimensional inputs, np.matmul computes the inner product, so it also sums the products.

Your formulation works too. The built-in sum accepts a generator expression, so the [] are optional:

result = sum([a * b for a, b in zip(list(df1['col1']), list(df2['col1']))])

This gives the same result.
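As a quick check of that claim, this sketch runs the zip-based sum, the pandas expression, and numpy.dot on the example frames above and compares the results. The variable names via_zip, via_pandas, and via_dot are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': [0, 1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1': [0, 1, 2, 3, 4, 5]})

# Pure-Python version: pair rows by position and sum the products
via_zip = sum(a * b for a, b in zip(df1['col1'], df2['col1']))

# pandas version: element-wise product (aligned on the index), then sum
via_pandas = (df1['col1'] * df2['col1']).sum()

# NumPy version: inner product of the two columns
via_dot = np.dot(df1['col1'].to_numpy(), df2['col1'].to_numpy())

print(via_zip, via_pandas, via_dot)
```

All three agree here because both frames share the default 0..5 index; with differing indices the pandas version aligns by label and can give a different total.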

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2