Something strange is happening with groupby and agg functions in Python pandas

I have a dataset that looks like this:

 A     B     C     year   CompanyName   sector
 1     nan   1     1999   tesla         10
 4     3     4     2000   tesla         10
 nan   nan   7     2001   tesla         10
 2     nan   8     2002   tesla         10
 3     nan   10    1999   BMW           12
 2     -1    234   2000   BMW           12
 2     nan   548   2002   BMW           12

Column B is the difference in column A between two consecutive years for the same company (B = A(n) - A(n-1)).

I calculate a new column D, defined as D = (B(n) - C(n)) / B(n).
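
To make the two definitions concrete, here is a minimal sketch that rebuilds the sample rows above and computes B and D. The name `df` is only for illustration, and it takes the difference per company with `g["A"].diff()`, a small variation on the code further down; on this sample both give the same B.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table above (B is recomputed below).
df = pd.DataFrame({
    "A": [1, 4, np.nan, 2, 3, 2, 2],
    "C": [1, 4, 7, 8, 10, 234, 548],
    "year": [1999, 2000, 2001, 2002, 1999, 2000, 2002],
    "CompanyName": ["tesla"] * 4 + ["BMW"] * 3,
    "sector": [10] * 4 + [12] * 3,
})

g = df.groupby("CompanyName")

# True only where the previous row of the same company is the previous year.
consecutive = df["year"].eq(g["year"].shift() + 1)

# B = A(n) - A(n-1), kept only for consecutive years.
df["B"] = g["A"].diff().where(consecutive)

# D = (B - C) / B; a NaN in B propagates to D.
df["D"] = (df["B"] - df["C"]) / df["B"]

print(df[["CompanyName", "year", "A", "B", "C", "D"]])
```

On this sample only two rows end up with a non-missing B (tesla 2000 and BMW 2000), so only those two rows have a D.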

After calculating all these columns, I group by sector and year so that my data looks like this:

 Sector   year   Amean   Bmean   Cmean   Dmean   Dmedian
 10       2000   ..      ..      ..      ..      ..
 10       2001   ..      ..      ..      ..      ..
 ............................................................

The strange thing is that Dmean has many missing values: the column contains many np.nan entries even where Dmedian is a numeric value. All other values are present. What am I doing wrong? Here is my code:

 import numpy

 g = finalData.groupby('CompanyName')
 # Shift the year within each company and add one, so that only consecutive years are subtracted
 finalData['B'] = finalData['A'].diff().where(finalData['year'].eq(g['year'].shift() + 1))
 # D = (B - C) / B where B is present, NaN otherwise
 finalData['D'] = numpy.where(finalData['B'].notnull(), (finalData['B'] - finalData['C']) / finalData['B'], numpy.nan)
 finalData = finalData.groupby(['Sector', 'year']).agg({'C': 'mean', 'A': 'mean', 'B': ['mean', 'median'], 'D': ['mean', 'median']}).reset_index()

P.S. I think the problem is in the line where I use numpy to assign column D.
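
One thing that may be worth ruling out (an assumption, since the rows shown above contain no such case): if B is exactly 0 for some row, the division in D produces inf or -inf rather than NaN. A group that contains both inf and -inf has a NaN mean, while its median can still be finite. The sketch below uses made-up values for a single group to show the effect and one way to count and mask the infinities.

```python
import numpy as np
import pandas as pd

# Hypothetical D values for one (Sector, year) group; not taken from the real data.
d = pd.Series([np.inf, -np.inf, 1.0, 2.0, 3.0])

n_infinite = int(np.isinf(d).sum())  # how many +/-inf entries the group holds
group_mean = d.mean()                # inf + (-inf) is NaN, so the mean is NaN
group_median = d.median()            # the middle value is still finite

# One option: treat infinities as missing before aggregating.
cleaned = d.replace([np.inf, -np.inf], np.nan)
cleaned_mean = cleaned.mean()        # NaN entries are skipped by default

print(n_infinite, group_mean, group_median, cleaned_mean)
```

Running the same `np.isinf` count on the real D column before the final groupby would show whether this applies to the actual data.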



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
