XGBoost: Multioutput Regression with a Custom Objective

Since version 1.6, XGBoost supports multi-output regression. According to the website, this support is still at a very early stage, so it is unclear to what extent it already works properly. I would like to know whether anybody has already got it running with a custom objective function. In my single-output regression I was working with the following code (taken from the XGBoost website, where it serves as an example of a custom objective function):

from typing import Tuple

import numpy as np
import xgboost as xgb

def gradient(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
    '''Compute the gradient squared log error.'''
    y = dtrain.get_label()
    return (np.log1p(predt) - np.log1p(y)) / (predt + 1)

def hessian(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
    '''Compute the hessian for squared log error.'''
    y = dtrain.get_label()
    return ((-np.log1p(predt) + np.log1p(y) + 1) /
            np.power(predt + 1, 2))

def squared_log(predt: np.ndarray,
                dtrain: xgb.DMatrix) -> Tuple[np.ndarray, np.ndarray]:
    '''Squared Log Error objective. A simplified version for RMSLE used as
    objective function.
    '''
    predt[predt < -1] = -1 + 1e-6
    grad = gradient(predt, dtrain)
    hess = hessian(predt, dtrain)
    return grad, hess

xgb.train(obj=squared_log)

This code no longer works, and unfortunately it is unclear to me whether that is due to the still-limited support for multi-output regression or to an error that arises when this code is used for multi-output regression.

The error it produces is:

ValueError: operands could not be broadcast together with shapes

It works again if I change the label to a dataset with only one column and perform a single-output regression.
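One possible explanation for the broadcast error, offered as a sketch rather than a confirmed diagnosis: with several targets, predt is typically a 2-D array of shape (n_samples, n_targets), while dtrain.get_label() returns the labels flattened to 1-D, so the two cannot be combined element-wise. Reshaping the labels to predt.shape inside the gradient and hessian avoids the mismatch. The example below uses only NumPy; FakeDMatrix is a hypothetical stand-in for xgb.DMatrix that assumes get_label() returns a flat array, and the data is made up for illustration.

```python
import numpy as np


class FakeDMatrix:
    """Hypothetical stand-in for xgb.DMatrix (assumption: labels come back flat)."""

    def __init__(self, label):
        self._label = np.asarray(label, dtype=float)

    def get_label(self):
        # Mimics the assumed behaviour: a 1-D array of length n_samples * n_targets.
        return self._label.ravel()


def gradient(predt, dtrain):
    """Gradient of the squared log error, with labels reshaped to match predt."""
    y = dtrain.get_label().reshape(predt.shape)
    return (np.log1p(predt) - np.log1p(y)) / (predt + 1)


def hessian(predt, dtrain):
    """Hessian of the squared log error, with labels reshaped to match predt."""
    y = dtrain.get_label().reshape(predt.shape)
    return (-np.log1p(predt) + np.log1p(y) + 1) / np.power(predt + 1, 2)


def squared_log(predt, dtrain):
    """Squared log error objective for a (n_samples, n_targets) prediction array."""
    predt = predt.copy()  # work on a copy so the caller's array is not modified
    predt[predt < -1] = -1 + 1e-6
    return gradient(predt, dtrain), hessian(predt, dtrain)


# Made-up data: 3 samples, 2 targets.
labels = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
dtrain = FakeDMatrix(labels)
predt = np.full(labels.shape, 0.5)

# Without the reshape, a (3, 2) array meets a (6,) array and broadcasting fails.
try:
    np.log1p(predt) - np.log1p(dtrain.get_label())
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

grad, hess = squared_log(predt, dtrain)
print(broadcast_failed, grad.shape, hess.shape)
```

Depending on the XGBoost version, the objective may be expected to return the gradient and hessian flattened again (for example with grad.reshape(-1)) rather than as 2-D arrays, so it is worth checking the multi-output demo shipped with the installed release.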



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
