XGBoost: Multioutput Regression with a Custom Objective
Since version 1.6, XGBoost supports multi-output regression. According to the XGBoost website, this support is still at a very early stage, so it is unclear to what extent it already works properly. I would like to know whether anybody has gotten it running with a custom objective function. For single-output regression I was using the following code (taken from the XGBoost website's example for custom objective functions):
from typing import Tuple

import numpy as np
import xgboost as xgb


def gradient(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
    '''Compute the gradient squared log error.'''
    y = dtrain.get_label()
    return (np.log1p(predt) - np.log1p(y)) / (predt + 1)


def hessian(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
    '''Compute the hessian for squared log error.'''
    y = dtrain.get_label()
    return ((-np.log1p(predt) + np.log1p(y) + 1) /
            np.power(predt + 1, 2))


def squared_log(predt: np.ndarray,
                dtrain: xgb.DMatrix) -> Tuple[np.ndarray, np.ndarray]:
    '''Squared Log Error objective. A simplified version for RMSLE used as
    objective function.
    '''
    # Clamp predictions below -1 so that log1p stays defined.
    predt[predt < -1] = -1 + 1e-6
    grad = gradient(predt, dtrain)
    hess = hessian(predt, dtrain)
    return grad, hess


# Remaining arguments (params, dtrain, ...) omitted.
xgb.train(obj=squared_log)
This code no longer works, and it is unclear to me whether that is due to the still incomplete multi-output support, or to an error that arises only when the code is used for multi-output regression.
The error it produces is:
ValueError: operands could not be broadcast together with shapes
It works again if I change the label to a dataset with only one column and perform a single-output regression.
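One plausible cause, offered as a hedged sketch rather than a confirmed diagnosis: with several targets the predictions arrive as a 2-D array of shape (n_samples, n_targets), while `get_label()` returns the labels flattened to 1-D, so the element-wise arithmetic cannot broadcast. The sketch below reshapes the labels to the prediction shape before computing the same squared log error terms. `FakeDMatrix` and `squared_log_multi` are hypothetical names used only for illustration; the stand-in class replaces `xgb.DMatrix` so the example runs without XGBoost. Whether the gradient and hessian should be returned as 2-D arrays or flattened has differed between XGBoost versions, so treat the returned shape as an assumption to verify for your version.

```python
from typing import Tuple

import numpy as np


class FakeDMatrix:
    """Hypothetical stand-in for xgb.DMatrix (illustration only)."""

    def __init__(self, label: np.ndarray) -> None:
        # Assumption: labels are stored and returned as a flat 1-D array.
        self._label = np.asarray(label, dtype=float).ravel()

    def get_label(self) -> np.ndarray:
        return self._label


def squared_log_multi(predt: np.ndarray, dtrain) -> Tuple[np.ndarray, np.ndarray]:
    """Squared log error for predictions of shape (n_samples, n_targets)."""
    # Reshape the flat labels so they line up with the prediction matrix.
    y = dtrain.get_label().reshape(predt.shape)
    # Work on a copy and clamp predictions below -1 so log1p stays defined.
    predt = predt.copy()
    predt[predt < -1] = -1 + 1e-6
    grad = (np.log1p(predt) - np.log1p(y)) / (predt + 1)
    hess = (-np.log1p(predt) + np.log1p(y) + 1) / np.power(predt + 1, 2)
    # Assumption: 2-D output; some versions may expect grad.ravel(), hess.ravel().
    return grad, hess


# Three samples, two targets.
y = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
predt = np.full_like(y, 0.5)
grad, hess = squared_log_multi(predt, FakeDMatrix(y))
print(grad.shape, hess.shape)
```

With a real `xgb.DMatrix`, the same function would be passed as `obj=squared_log_multi`; if the shapes are still rejected, flattening both return values is the first thing to try.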
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
