Fitting a line to the mean values of a multilevel variable using geom_smooth

I have this dataframe.

I create a plot with "value" on the y-axis and the levels of "Columna_S" (S1 to S10) on the x-axis. The plot is faceted by "VS" and "Inst" (with facet_grid) and coloured by "Estatus".

na.omit(data) %>% 
  group_by(VS,Estatus,Inst, Columna_S) %>% 
  summarise(media = mean(value), 
            desvio = sd(value),                             
            error_est = desvio / sqrt(n()),            
            intervalo_sup = media + (2*error_est),      
            intervalo_inf = media - (2*error_est)) -> stats

ggplot() +  
  geom_errorbar(
    data = stats,   
    aes(x = Columna_S,
        ymin = intervalo_inf,
        ymax = intervalo_sup,
        color = Estatus   
    ), width = 0.4 )+
  geom_point(data = stats,
             aes(x = Columna_S,
                 y = media,
             color = Estatus))+
  facet_grid(Inst ~ VS, scales = "free") +
  theme_classic()  

However, instead of an independent point for each mean, I'd like to use geom_smooth to plot a line (plus standard error) that fits the means along the x-axis.

I have tried the code below, but it is not generating the desired outcome.

na.omit(data) %>% 
  ggplot(aes(x=Columna_S,y=value,colour=Estatus, na.rm = TRUE) ) +
  geom_smooth(aes(fill=Estatus,na.rm = TRUE)) +
  scale_y_continuous(breaks = seq(0,100, by=25), limits=c(0,100))+
  facet_grid(Institution~Value_system, scales = "free")+
  theme_classic()  


Solution 1:[1]

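You have a factor on the x-axis, so I am not sure it makes sense to put a smooth line through it. If that's what you need, though, set the group aesthetic explicitly so that geom_smooth fits one curve per Estatus rather than one per factor level: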

data %>% 
filter(!is.na(Columna_S)) %>%
ggplot(aes(x=Columna_S,y=value,colour=Estatus)) +
geom_smooth(aes(group=Estatus)) +
facet_grid(Inst~VS, scales = "free")+
theme_classic() 


You can consider putting a line through all the means:

data %>% 
filter(!is.na(Columna_S)) %>%
ggplot(aes(x=Columna_S,y=value,colour=Estatus)) +
stat_summary(aes(group=Estatus),fun=mean,geom="line") +
facet_grid(Inst~VS, scales = "free")+
theme_classic()
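
The question also asked for a standard-error band around the line. A minimal sketch of one way to get it with the same stat_summary approach is shown here, assuming ggplot2 3.3.0 or later and using mean_se with mult = 2 to mirror the question's 2 * error_est intervals. The toy data frame is hypothetical and only stands in for the asker's data, which was not posted.

```r
library(ggplot2)

# Hypothetical toy data standing in for the asker's data frame
set.seed(42)
data <- expand.grid(
  Columna_S = factor(paste0("S", 1:10), levels = paste0("S", 1:10)),
  Estatus   = c("A", "B"),
  Inst      = c("I1", "I2"),
  VS        = c("V1", "V2"),
  rep       = 1:5
)
data$value <- rnorm(nrow(data), mean = 50, sd = 10)

p <- ggplot(data, aes(x = Columna_S, y = value, colour = Estatus)) +
  # Ribbon spanning mean +/- 2 standard errors at each factor level
  stat_summary(aes(group = Estatus, fill = Estatus),
               fun.data = mean_se, fun.args = list(mult = 2),
               geom = "ribbon", alpha = 0.2, colour = NA) +
  # Line through the means, as in the answer above
  stat_summary(aes(group = Estatus), fun = mean, geom = "line") +
  facet_grid(Inst ~ VS, scales = "free") +
  theme_classic()

print(p)
```

Note that this connects the observed means; it is not a fitted model the way geom_smooth is.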


Solution 2:[2]

It is a year late, but…

I had a similar problem: I have an ordered factor and want to fit a straight line through its levels. The solution above did not work for me. It gave no errors, but no lines appeared on the plot.

The solution I found works because in ggplot you can draw layers on top of each other, and each layer can have its own data and aesthetics (you are not restricted to one data frame or one x variable). So:

  1. Create a numeric version of the factor.
  2. Plot the box plots on the factor, with the faceting etc. as above.
  3. Add a geom_smooth layer on top of the factor layer, passing data= explicitly (the same data frame).
  4. Use the numeric version of the factor as x for this layer, and keep the same y.

It works! Because the positions of the factor levels on the axis equal the numeric values, the lines are drawn in the right place. If it is still needed I can add the code.
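
The four steps above can be sketched as follows. This is not the answerer's original code, only a minimal illustration under some assumptions: the toy data frame and the column name S_num are hypothetical, and method = "lm" is chosen because the answer asks for a straight line. The factor levels are set explicitly so that S10 sorts after S9 and as.integer() returns positions 1 to 10.

```r
library(ggplot2)

# Hypothetical toy data; only the column names follow the question
set.seed(1)
data <- expand.grid(
  Columna_S = factor(paste0("S", 1:10), levels = paste0("S", 1:10)),
  Estatus   = c("A", "B"),
  rep       = 1:5
)
data$value <- 50 + 2 * as.integer(data$Columna_S) + rnorm(nrow(data), sd = 5)

# Step 1: numeric version of the factor (level codes 1..10)
data$S_num <- as.integer(data$Columna_S)

# Step 2: the factor layer comes first, so the x scale is discrete
p <- ggplot(data, aes(x = Columna_S, y = value, colour = Estatus)) +
  geom_boxplot() +
  # Steps 3 and 4: smooth layer on the same data, numeric x, same y
  geom_smooth(data = data,
              aes(x = S_num, group = Estatus, fill = Estatus),
              method = "lm", formula = y ~ x) +
  theme_classic()

print(p)
```

The order of the layers matters in this sketch: the discrete layer has to be added before the numeric one, so that the numeric values are placed on the already discrete axis.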

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 StupidWolf
Solution 2 DaveG