Dummy variables: must I scale* them when I scale the whole dataset, or leave them as they are? *(center, scale, normalize...)
When working with scaled data, should dummy variables also be scaled, or should they be left untouched? Can ML algorithms produce different results depending on this choice, and which option is best? As a reprex I have uploaded two images of the iris dataset after the Species variable has been converted to dummy variables: one dataset without scaling and one with all variables scaled.
If I use an algorithm that requires normalized data, my questions are:

1. Which is the best option? a) scale the whole dataset, dummy variables included, or b) scale only the numeric variables and leave the dummy variables unscaled?
2. Can scaled dummy variables affect an ML algorithm's performance? If so, does scaling them produce better or worse results than leaving them unscaled?
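Option (b) above can be sketched with scikit-learn's `ColumnTransformer`, which applies a scaler to the numeric columns only and passes the dummy columns through unchanged. The tiny iris-like frame below is a made-up stand-in for the uploaded datasets (column names and values are illustrative, not the real data):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical iris-like frame: two numeric columns plus one-hot Species dummies.
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 7.0, 6.3],
    "Sepal.Width":  [3.5, 3.0, 3.2, 3.3],
    "Species_setosa":     [1, 1, 0, 0],
    "Species_versicolor": [0, 0, 1, 0],
    "Species_virginica":  [0, 0, 0, 1],
})

numeric_cols = ["Sepal.Length", "Sepal.Width"]
dummy_cols = ["Species_setosa", "Species_versicolor", "Species_virginica"]

# Option (b): standardize only the numeric columns, pass dummies through as 0/1.
ct = ColumnTransformer(
    [("num", StandardScaler(), numeric_cols)],
    remainder="passthrough",  # dummy columns are left untouched
)
X = ct.fit_transform(df)

# The first two output columns are centered/scaled; the last three keep 0/1.
print(X)
```

Switching `remainder="passthrough"` for a scaler over all columns would give option (a); the transformer makes it easy to compare both preprocessing choices in the same pipeline.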
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow