How to remove outliers from a nonlinear regression curve with 3-sigma limits?
I am trying to remove outliers from a nonlinear regression curve and then get the updated formula and error criteria. Someone suggested that I use 3-sigma limits, a statistical rule that keeps only the data lying within three standard deviations of the mean. But I don't know how to implement it in my case.
Here is the original data.
ISIDOR <- structure(list(Pos_heliaphen = c("W30", "X41", "Y27", "Z24",
"Y27", "W30", "W30", "X41", "Y27", "W30", "X41", "Z40", "Z99"
), traitement = c("WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW",
"WW", "WW", "WW", "WW", "WW"), Variete = c("Isidor", "Isidor",
"Isidor", "Isidor", "Isidor", "Isidor", "Isidor", "Isidor", "Isidor",
"Isidor", "Isidor", "Isidor", "Cali"), FTSW_apres_arros = c(0.462837958498518,
0.400045032939416, 0.352560790392534, 0.377856799586057, 0.170933345859364,
0.315689846065931, 0.116825600914318, 0.0332444780173884, 0.00966070114456602,
0.0871102539376406, 0.0107280083093036, 0.195548432729584, 1),
NLE = c(0.903498791068124, 0.954670066942938, 0.970762905436272,
0.873838605282389, 0.647875257025359, 0.53056603773585, 0.0384548155916796,
0.0470924009989314, 0.00403163281128882, 0.193696514297641,
0.0718450645564359, 0.295346695941639, 1)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -13L))
Here is my original code.
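library(ggplot2)
library(Metrics)  # assumed source of rmse() and mse()
# Fit the model first, so that predict() below has a model to work with
mod = nls(NLE ~ 2/(1+exp(a*FTSW_apres_arros))-1,start = list(a=1),data = ISIDOR)
# Grid of FTSW values used to draw the fitted curve
pred_df <- data.frame(FTSW_apres_arros = seq(min(ISIDOR$FTSW_apres_arros),
max(ISIDOR$FTSW_apres_arros),
length.out = 100))
pred_df$NLE <- predict(mod, newdata = pred_df)
# Fitted values, coefficient and error criteria on the observed data
ISIDOR$pred = predict(mod,ISIDOR)
a = coef(mod)
RMSE = rmse(ISIDOR$NLE, ISIDOR$pred)
MSE = mse(ISIDOR$NLE, ISIDOR$pred)
Rsquared = summary(lm(ISIDOR$NLE~ ISIDOR$pred))$r.squared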
ggplot(ISIDOR, aes(FTSW_apres_arros, NLE)) +
geom_point(aes(color = Variete), pch = 19, cex = 3) +
geom_line(data = pred_df) +
scale_color_manual(values = c("red3","blue3"))+
scale_y_continuous(limits = c(0, 1.0)) +
scale_x_continuous(limits = c(0, 1)) +
labs(title = "Isidor",
y = "Expansion folliaire totale relative",
x = "FTSW",
subtitle = paste0("y = 2/(1 + exp(", round(a, 3), " * x)) - 1","\n",
"R^2 = ", round(Rsquared, 3)," RMSE = ",
round(RMSE, 3), " MSE = ", round(MSE, 3)))+
theme(plot.title = element_text(hjust = 0, size = 14, face = "bold",
colour = "black"),
plot.subtitle = element_text(hjust = 0,size=10, face = "italic",
colour = "black"),
legend.position = "none")
Here is the picture I got. I also want to get the updated formula and error criteria (circled in red).
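One way to read the 3-sigma suggestion is to apply it to the residuals of the fitted curve rather than to the raw values: fit once, drop the points whose residual exceeds three residual standard errors, then refit and recompute the criteria. Below is a minimal base-R sketch of that idea, not a definitive method. The names `dat`, `fit_curve`, `keep` and `dat_clean`, the shortened column name `FTSW`, and the start value `a = -5` are assumptions made for illustration. With only 13 points, this rule may well keep every point.

```r
# Sketch only: 3-sigma screening on nls residuals, using base R.
# Data copied from the question (only the two modelled columns).
dat <- data.frame(
  FTSW = c(0.462837958498518, 0.400045032939416, 0.352560790392534,
           0.377856799586057, 0.170933345859364, 0.315689846065931,
           0.116825600914318, 0.0332444780173884, 0.00966070114456602,
           0.0871102539376406, 0.0107280083093036, 0.195548432729584, 1),
  NLE = c(0.903498791068124, 0.954670066942938, 0.970762905436272,
          0.873838605282389, 0.647875257025359, 0.53056603773585,
          0.0384548155916796, 0.0470924009989314, 0.00403163281128882,
          0.193696514297641, 0.0718450645564359, 0.295346695941639, 1)
)

# Same curve as in the question. The negative start value is an assumption:
# the curve only rises from 0 towards 1 when a < 0.
fit_curve <- function(d) {
  nls(NLE ~ 2 / (1 + exp(a * FTSW)) - 1, data = d, start = list(a = -5))
}

# Step 1: fit on all points and estimate the residual standard error.
fit_all   <- fit_curve(dat)
res       <- as.numeric(residuals(fit_all))
sigma_hat <- sqrt(sum(res^2) / df.residual(fit_all))

# Step 2: keep only the points within three standard errors of the curve.
keep      <- abs(res) <= 3 * sigma_hat
dat_clean <- dat[keep, ]

# Step 3: refit on the retained points and recompute the criteria.
fit_clean <- fit_curve(dat_clean)
pred      <- predict(fit_clean, newdata = dat_clean)
a_new     <- coef(fit_clean)[["a"]]
MSE_new   <- mean((dat_clean$NLE - pred)^2)
RMSE_new  <- sqrt(MSE_new)
R2_new    <- cor(dat_clean$NLE, pred)^2  # same value as the lm() r.squared

cat("points removed:", sum(!keep), "\n")
cat("a =", round(a_new, 3), " R^2 =", round(R2_new, 3),
    " RMSE =", round(RMSE_new, 3), " MSE =", round(MSE_new, 3), "\n")
```

The updated values (`a_new`, `R2_new`, `RMSE_new`, `MSE_new`) could then replace `a`, `Rsquared`, `RMSE` and `MSE` in the plot subtitle, with `dat_clean` as the plotted data.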

If 3-sigma limits don't work for my case, could anyone recommend other ways to deal with outliers?
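One alternative that is sometimes used with small samples is to replace the standard deviation by a robust scale estimate, such as the median absolute deviation (MAD) of the residuals, because a single extreme point inflates the standard deviation and can hide itself. The sketch below is only an illustration under assumptions: the names `dat`, `fit_curve`, `keep_mad` and `dat_mad`, the shortened column name `FTSW`, the start value `a = -5`, and the cutoff of 3 are all choices made here, not something prescribed by the question.

```r
# Sketch only: MAD-based screening of nls residuals, using base R.
# Data copied from the question (only the two modelled columns).
dat <- data.frame(
  FTSW = c(0.462837958498518, 0.400045032939416, 0.352560790392534,
           0.377856799586057, 0.170933345859364, 0.315689846065931,
           0.116825600914318, 0.0332444780173884, 0.00966070114456602,
           0.0871102539376406, 0.0107280083093036, 0.195548432729584, 1),
  NLE = c(0.903498791068124, 0.954670066942938, 0.970762905436272,
          0.873838605282389, 0.647875257025359, 0.53056603773585,
          0.0384548155916796, 0.0470924009989314, 0.00403163281128882,
          0.193696514297641, 0.0718450645564359, 0.295346695941639, 1)
)

# Same curve as in the question; the negative start value is an assumption.
fit_curve <- function(d) {
  nls(NLE ~ 2 / (1 + exp(a * FTSW)) - 1, data = d, start = list(a = -5))
}

fit_all <- fit_curve(dat)
res     <- as.numeric(residuals(fit_all))

# mad() centres on the median and rescales so that it is comparable
# to a standard deviation for normally distributed residuals.
centre    <- median(res)
scale_mad <- mad(res)

# Keep points whose residual lies within 3 robust "sigmas" of the centre.
keep_mad <- abs(res - centre) <= 3 * scale_mad
dat_mad  <- dat[keep_mad, ]

# Refit on the retained points and recompute the criteria.
fit_mad  <- fit_curve(dat_mad)
pred_mad <- predict(fit_mad, newdata = dat_mad)
a_mad    <- coef(fit_mad)[["a"]]
RMSE_mad <- sqrt(mean((dat_mad$NLE - pred_mad)^2))

cat("points removed:", sum(!keep_mad), "\n")
cat("a =", round(a_mad, 3), " RMSE =", round(RMSE_mad, 3), "\n")
```

Whichever rule is used, it is worth checking each flagged point against the experiment before dropping it, since a screening rule only says that a point is far from the curve, not that the measurement is wrong.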
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
