Multiple regression: R splits variable into multiple coefficients

Hey there, I want to explore the effect of Age and Gender on the points of a test via multiple linear regression (MLR). Yet when I type

model <- lm(punkte~ Age + Gender, data = df)

R gives me the following results:

(Intercept)   5.677369   0.176482  32.170  < 2e-16 ***
Age          -0.017953   0.004932  -3.640 0.000300 ***
GenderFemale  0.595369   0.154697   3.849 0.000134 ***
GenderDivers -1.416150   0.684191  -2.070 0.038964 *  

But I don't want the Gender variable to be split into multiple terms. Also, GenderMale is missing and I don't know why. Help would be very much appreciated.



Solution 1:[1]

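"Male" is missing because your model uses "Male" as the reference level of the categorical variable Gender. The GenderFemale and GenderDivers coefficients are differences relative to that reference, so the split into several terms is expected for a categorical predictor with more than two levels.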
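
To see where the extra terms come from, here is a minimal sketch on made-up data. The data frame `toy` and its values are hypothetical, and the level order is an assumption chosen to match the output in the question. With R's default treatment contrasts, the first factor level gets no column of its own and is absorbed into the intercept:

```r
# Hypothetical toy data, not the asker's df
toy <- data.frame(
  Gender = factor(c("Male", "Female", "Divers", "Female"),
                  levels = c("Male", "Female", "Divers"))
)

# The first level is the reference level
levels(toy$Gender)

# Design matrix that lm() would build for a formula with Gender in it:
# one 0/1 column per non-reference level, none for "Male"
X <- model.matrix(~ Gender, data = toy)
colnames(X)
X
```

Each row with Gender "Male" has 0 in both dummy columns, so the intercept is the prediction for males and the other coefficients are shifts from it.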

You can always change the reference level with something like:

df <- within(df, Gender <- relevel(factor(Gender), ref = "Female"))
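
As a quick check of what releveling does, here is a small sketch on hypothetical data (`toy` is made up for illustration). After `relevel()`, "Female" becomes the first level, so the model would report GenderMale and GenderDivers instead:

```r
# Hypothetical toy data with "Male" as the original reference level
toy <- data.frame(
  Gender = factor(c("Male", "Female", "Divers"),
                  levels = c("Male", "Female", "Divers"))
)

# Move "Female" to the front; the other levels keep their order
toy$Gender <- relevel(toy$Gender, ref = "Female")
new_levels <- levels(toy$Gender)
new_levels

# Names of the terms a model with Gender would now contain
new_terms <- colnames(model.matrix(~ Gender, data = toy))
new_terms
```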

You can only get a single Gender coefficient if you change the data at the root (and normally we don't do that), for example by combining "Female" and "Divers" into one level such as "non-male" or "others".
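
If you do decide to merge levels, one possible way is sketched below. The data frame `toy` and the new column name `GenderGroup` are hypothetical; this recodes into a separate column so the original Gender variable stays intact:

```r
# Hypothetical toy data, not the asker's df
toy <- data.frame(
  Gender = factor(c("Male", "Female", "Divers", "Female"),
                  levels = c("Male", "Female", "Divers"))
)

# Collapse "Female" and "Divers" into a single "non-male" level
toy$GenderGroup <- factor(
  ifelse(toy$Gender == "Male", "Male", "non-male"),
  levels = c("Male", "non-male")
)

# Cross-check the recoding against the original variable
table(toy$Gender, toy$GenderGroup)

# A two-level factor yields a single dummy column in the model
group_terms <- colnames(model.matrix(~ GenderGroup, data = toy))
group_terms
```

Keep in mind that the merged coefficient then averages over groups whose estimates in the question point in opposite directions, which is one reason this is usually avoided.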

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ElleryC