The loss function and evaluation metric of XGBoost
I am confused about the loss functions used in XGBoost. Here is what confuses me:
- We have objective, which is the loss function that needs to be minimized, and eval_metric, the metric used to represent the learning result. These two are totally unrelated (setting aside restrictions such as only logloss and mlogloss being usable as eval_metric for classification). Is this correct? If it is, then for a classification problem, how can you use rmse as a performance metric?
- Take two options for objective as an example, reg:logistic and binary:logistic. For 0/1 classification, usually the binary logistic loss, or cross entropy, should be considered as the loss function, right? So which of the two options corresponds to this loss function, and what is the purpose of the other one? Say, if binary:logistic represents the cross entropy loss function, then what does reg:logistic do?
- What's the difference between multi:softmax and multi:softprob? Do they use the same loss function and just differ in the output format? If so, that should be the same for reg:logistic and binary:logistic as well, right?
Supplement for the 2nd problem
Say the loss function for a 0/1 classification problem should be
L = -sum(y_i*log(P_i) + (1-y_i)*log(1-P_i)). Do I need to choose binary:logistic or reg:logistic here to make the XGBoost classifier use the loss function L? If it is binary:logistic, then what loss function does reg:logistic use?
Solution 1:[1]
'binary:logistic' uses -(y*log(y_pred) + (1-y)*(log(1-y_pred)))
'reg:logistic' uses the same log loss, not the squared error (y - y_pred)^2 that 'reg:linear' uses
To get a total estimate of the error, we sum the per-sample errors and divide by the number of samples.
You can find this in the basics, when comparing linear regression with logistic regression.
Linear regression uses (y - y_pred)^2 as the cost function
Logistic regression uses -(y*log(y_pred) + (1-y)*(log(1-y_pred))) as the cost function
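To make the two cost functions concrete, here is a minimal sketch in plain Python that evaluates both on the same predictions and averages them over the samples. The labels and probabilities are made up for illustration, and the function names are my own, not part of XGBoost.

```python
import math

def squared_error(y, y_pred):
    """Per-sample squared error, (y - y_pred)^2."""
    return (y - y_pred) ** 2

def log_loss(y, y_pred, eps=1e-15):
    """Per-sample binary cross entropy, -(y*log(p) + (1-y)*log(1-p))."""
    # Clip the probability so that log(0) never occurs.
    p = min(max(y_pred, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def mean_loss(loss_fn, ys, preds):
    """Sum the per-sample losses and divide by the number of samples."""
    return sum(loss_fn(y, p) for y, p in zip(ys, preds)) / len(ys)

# Hypothetical labels and predicted probabilities, for illustration only.
ys = [1, 0, 1, 0]
preds = [0.9, 0.1, 0.6, 0.99]

mse = mean_loss(squared_error, ys, preds)
ll = mean_loss(log_loss, ys, preds)
print(f"mean squared error: {mse:.4f}")
print(f"mean log loss:      {ll:.4f}")
```

The last sample is a confident wrong prediction (0.99 for a true label of 0). Squared error can never exceed 1 for probabilities in [0, 1], while log loss grows without bound as the prediction approaches the wrong extreme, so the two functions punish that mistake very differently.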
Evaluation metrics are a completely different thing. They are designed to evaluate your model. They can be confusing because it is logical to use an evaluation metric that is the same as the loss function, like MSE in regression problems. However, in binary problems it is not always wise to look at the logloss. My experience has taught me (in classification problems) to generally look at ROC AUC.
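As a rough illustration of why the loss and the metric can tell different stories, the sketch below computes log loss and ROC AUC by hand for two made-up sets of predictions. Everything here is plain Python with hypothetical data; the params dict at the end only shows where the two roles would be set in XGBoost and is not executed against the library.

```python
import math

def log_loss(ys, preds, eps=1e-15):
    """Mean binary cross entropy over all samples."""
    total = 0.0
    for y, p in zip(ys, preds):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(ys)

def roc_auc(ys, preds):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(ys, preds) if y == 1]
    neg = [p for y, p in zip(ys, preds) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical data: both models rank every positive above every negative,
# but the second one is much less confident.
ys = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.2, 0.1]
timid = [0.55, 0.52, 0.48, 0.45]

for name, preds in [("confident", confident), ("timid", timid)]:
    print(name, "logloss =", round(log_loss(ys, preds), 4),
          "auc =", roc_auc(ys, preds))

# In XGBoost the two roles are set by separate parameters, for example in a
# dict that would be passed to xgboost.train (shown only, not run here).
params = {"objective": "binary:logistic", "eval_metric": "auc"}
```

Both prediction sets have a perfect ranking, so AUC cannot tell them apart, while log loss clearly prefers the confident one. Which of the two you monitor depends on whether you care about ranking or about calibrated probabilities.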
EDIT
According to the XGBoost documentation:
reg:linear: linear regression
reg:logistic: logistic regression
binary:logistic: logistic regression for binary classification, output probability
So I'm guessing:
reg:linear is, as we said, (y - y_pred)^2
reg:logistic is -(y*log(y_pred) + (1-y)*(log(1-y_pred))) and rounding predictions with a 0.5 threshold
binary:logistic is plain -(y*log(y_pred) + (1-y)*(log(1-y_pred))) (returns the probability)
You can test it out and see if it behaves as I've described in the edit. If so, I will update the answer; otherwise, I'll just delete it :<
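For reference, here is a small sketch of what "output probability" and "0.5 threshold" mean in terms of a raw model score. It is plain Python, not XGBoost code, and the function names and scores are hypothetical. It also shows the first and second derivatives of the log loss with respect to the raw score, which is the kind of information a gradient boosting library needs from an objective.

```python
import math

def sigmoid(margin):
    """Map a raw score (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-margin))

def logistic_grad_hess(y, margin):
    """Derivatives of -(y*log(p) + (1-y)*log(1-p)) with respect to the raw score."""
    p = sigmoid(margin)
    grad = p - y          # first derivative
    hess = p * (1.0 - p)  # second derivative
    return grad, hess

def to_label(prob, threshold=0.5):
    """Turn a probability into a 0/1 label by thresholding."""
    return 1 if prob > threshold else 0

# Hypothetical raw scores, for illustration only.
margins = [-2.0, 0.3, 4.0]
probs = [sigmoid(m) for m in margins]
labels = [to_label(p) for p in probs]

print("probabilities:", [round(p, 3) for p in probs])
print("labels:", labels)
print("grad, hess at y=1, margin=0:", logistic_grad_hess(1, 0.0))
```

Whether a given objective returns the probabilities or the thresholded labels is exactly what the test suggested above would show: predictions that are all 0 or 1 indicate rounding, while values spread between 0 and 1 indicate probabilities.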
Solution 2:[2]
- Yes, a loss function and evaluation metric serve two different purposes. The loss function is used by the model to learn the relationship between input and output. The evaluation metric is used to assess how good the learned relationship is. Here is a link to a discussion of model evaluation: https://scikit-learn.org/stable/modules/model_evaluation.html
- I'm not sure exactly what you are asking here. Can you clarify this question?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Joshua Cook |
