Explained and Unexplained Effects don't add up to the Gap in Oaxaca-Blinder decomposition (statsmodels)

Here's a Python implementation of Oaxaca-Blinder decomposition analysis using statsmodels.

I have two questions. First, how do I interpret the negative value for the Explained Effect here? Second, I expected the two effects (Explained and Unexplained) to add up to the Gap (0.07160), but here they sum to only 0.06066. In the past, I've definitely seen the two add up to the Gap.

from statsmodels.stats.oaxaca import OaxacaBlinder

features = ['geo', 'figure', 'multipart', 'long']

ob = OaxacaBlinder(endog=df['rate'], exog=df[features], bifurcate='geo')
ob.two_fold().summary()
ob.three_fold().summary()

Oaxaca-Blinder Two-fold Effects

Unexplained Effect: 0.06270
Explained Effect: -0.00204
Gap: 0.07160

Oaxaca-Blinder Three-fold Effects

Characteristic Effect: -0.00191
Coefficient Effect: 0.06242
Interaction Effect: 0.00014
Gap: 0.07160
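
To make the mismatch concrete, here is a small arithmetic check on the figures printed above. The variable names are my own, and the numbers are copied directly from the two summaries.

```python
# Values copied from the two-fold and three-fold summaries above.
gap = 0.07160

two_fold_sum = 0.06270 + (-0.00204)              # Unexplained + Explained
three_fold_sum = -0.00191 + 0.06242 + 0.00014    # Characteristic + Coefficient + Interaction

# How far each decomposition falls short of the reported Gap.
two_fold_shortfall = gap - two_fold_sum
three_fold_shortfall = gap - three_fold_sum

print(f"two-fold sum:   {two_fold_sum:.5f}  (short by {two_fold_shortfall:.5f})")
print(f"three-fold sum: {three_fold_sum:.5f}  (short by {three_fold_shortfall:.5f})")
```

Both decompositions come up short of the Gap by roughly 0.011, which is far larger than the rounding error in five-decimal output, so the two summaries disagree with the Gap in the same way.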

I've also run the underlying linear regression models and confirmed that the gap is identical to the difference between the mean predicted outcomes of the two models, so I believe the issue has to do with the individual Effects values.

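from sklearn.linear_model import LinearRegression

# Split the sample on the bifurcation column
us = df[df['geo'] == 0]
intl = df[df['geo'] == 1]

# Fit one model per group, then predict at that group's mean feature vector
# (passed as a one-row DataFrame so the column names match those used in fit)
a = LinearRegression()
a.fit(us[features], us['rate'])
a.predict(us[features].mean().to_frame().T)  # 0.59456195

b = LinearRegression()
b.fit(intl[features], intl['rate'])
b.predict(intl[features].mean().to_frame().T)  # 0.66615859

0.66615859 - 0.59456195  # 0.07159663999999999, same as the Gap above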
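
For reference, here is a minimal hand-rolled sketch of the identity I expected to hold. It uses synthetic data with a single regressor, plain Python instead of statsmodels, and group B's coefficients as the reference (one common two-fold convention, and not necessarily the one statsmodels applies). All names and numbers are hypothetical. It also shows one situation in which the identity is known to break, namely group regressions fit without an intercept. Whether that applies to my statsmodels call is an assumption I have not verified.

```python
def mean(values):
    return sum(values) / len(values)


def ols_with_intercept(x, y):
    """One-regressor OLS with an intercept: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def ols_no_intercept(x, y):
    """One-regressor OLS forced through the origin: returns (0.0, slope)."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    return 0.0, slope


def two_fold(x_a, y_a, x_b, y_b, fit):
    """Two-fold decomposition of mean(y_a) - mean(y_b), with B as reference."""
    a0, a1 = fit(x_a, y_a)
    b0, b1 = fit(x_b, y_b)
    gap = mean(y_a) - mean(y_b)
    # Explained: difference in mean characteristics, priced at B's slope.
    explained = (mean(x_a) - mean(x_b)) * b1
    # Unexplained: difference in coefficients, evaluated at A's mean.
    unexplained = (a0 - b0) + mean(x_a) * (a1 - b1)
    return gap, explained, unexplained


# Synthetic (hypothetical) data for two groups.
x_a, y_a = [1, 2, 3, 4], [2.1, 2.9, 4.2, 4.8]
x_b, y_b = [2, 3, 5, 6], [1.0, 1.8, 2.9, 3.7]

gap, explained, unexplained = two_fold(x_a, y_a, x_b, y_b, ols_with_intercept)
gap0, explained0, unexplained0 = two_fold(x_a, y_a, x_b, y_b, ols_no_intercept)

print("with intercept:   ", explained + unexplained, "vs gap", gap)
print("without intercept:", explained0 + unexplained0, "vs gap", gap0)
```

With an intercept, each fitted line passes through its group's means, so the two parts sum to the gap exactly (up to floating-point error). Without one, that property is lost and the sum drifts away from the gap. The explained part is also negative in this toy data, because group A has the higher mean outcome despite a lower mean of x.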


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
