Target encoding in the tidymodels framework using embedd
I would like to do target encoding for a categorical variable with too many levels.
I have seen this vignette, which proposes the following steps for target encoding a variable:
step_lencode_glm()
step_lencode_bayes()
step_lencode_mixed()
All three steps estimate the encoding from every record in the training set, so the encoded value for a row is partly derived from that row's own outcome, which tends to overfit to that column.
Using tidymodels, is there an easy way to split my training set into 5 folds and compute the target encoding for each fold from the other 4 folds?
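To make the request concrete, here is a minimal sketch of the out-of-fold scheme described above, done by hand with rsample and embed. The data frame `train`, the column names `cat` and `y`, and the helper `encode_fold()` are hypothetical and only serve as an illustration; this is one possible way to do it, not necessarily the recommended tidymodels approach.

```r
library(dplyr)
library(purrr)
library(rsample)
library(recipes)
library(embed)

set.seed(123)

# Hypothetical training data: `cat` is the high-cardinality predictor,
# `y` is a numeric outcome.
train <- tibble(
  cat = factor(sample(letters, 500, replace = TRUE)),
  y   = rnorm(500)
)

folds <- vfold_cv(train, v = 5)

# Hypothetical helper: estimate the encoding on the other 4 folds
# (analysis set) and apply it to the held-out fold (assessment set).
encode_fold <- function(split) {
  rec <- recipe(y ~ cat, data = analysis(split)) %>%
    step_lencode_glm(cat, outcome = vars(y)) %>%
    prep()

  bake(rec, new_data = assessment(split)) %>%
    # Keep the original row position so the folds can be reassembled.
    mutate(.row = as.integer(split, data = "assessment"))
}

# One encoded row per training row, in the original order.
oof <- folds$splits %>%
  map(encode_fold) %>%
  bind_rows() %>%
  arrange(.row)
```

After this runs, `oof$cat` holds a numeric encoding in which no row's value was estimated from that row's own outcome. The same pattern should work with `step_lencode_bayes()` or `step_lencode_mixed()` swapped in for `step_lencode_glm()`.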
Thanks
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
