R - how to do regression y~x for different id? [duplicate]
Say I have a data frame with columns id, x and y:
df <- data.frame(id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D"),
                 y = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
                 x = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))
How can I fit the regression y~x separately for each id?
I see a similar question here.
But is there a simple way to just do what I need here?
Solution 1:[1]
How about this? If you only want the model coefficients, you can use sapply() to build a matrix of results. Otherwise, lapply() (or sapply() again) can be used to build a list of the fitted models.
df <- data.frame(id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D"),
                 y = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
                 x = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))
# One column of coefficients per id
coefs <- sapply(unique(df$id), function(i) coef(lm(y ~ x, data = subset(df, id == i))))
coefs
#> A B C D
#> (Intercept) -1 2.117647 -1.411765 1.833333
#> x 2 1.029412 2.823529 1.500000
# Keep the full fitted models in a list, one element per id
mods <- lapply(unique(df$id), function(i) lm(y ~ x, data = subset(df, id == i)))
mods
#> [[1]]
#>
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#>
#> Coefficients:
#> (Intercept) x
#> -1 2
#>
#>
#> [[2]]
#>
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#>
#> Coefficients:
#> (Intercept) x
#> 2.118 1.029
#>
#>
#> [[3]]
#>
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#>
#> Coefficients:
#> (Intercept) x
#> -1.412 2.824
#>
#>
#> [[4]]
#>
#> Call:
#> lm(formula = y ~ x, data = subset(df, id == i))
#>
#> Coefficients:
#> (Intercept) x
#> 1.833 1.500
Created on 2022-02-15 by the reprex package (v2.0.1)
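A possible variation, not part of the original answer: split() can divide the data frame by id first, so that lapply() returns a list named by id instead of the unnamed [[1]], [[2]], ... list printed above. This is only a sketch; the names groups, mods_named and coef_mat are made up for illustration.

```r
df <- data.frame(
  id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D"),
  y  = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
  x  = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
)

# One data frame per id, named "A", "B", "C", "D"
groups <- split(df, df$id)

# Fit y ~ x within each group; the list keeps the group names
mods_named <- lapply(groups, function(d) lm(y ~ x, data = d))

# Look up a single model by its id
coef(mods_named[["B"]])

# Stack all coefficients into a matrix with one row per id
coef_mat <- t(sapply(mods_named, coef))
coef_mat
```

A side effect of fitting on d rather than on subset(df, id == i) is that each printed Call reads lm(formula = y ~ x, data = d), so the list names are what identify the group.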
Solution 2:[2]
library(tidyverse)
df <- data.frame(
id = c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "D", "D", "D"),
y = c(1, 3, 5, 4, 3, 4, 6, 8, 1, 4, 7, 10, 2, 5, 6, 8),
x = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
)
# Nest the rows of each id into a list-column, then fit one model per group
df %>%
  nest(data = -id) %>%
  mutate(model = map(data, ~ lm(y ~ x, data = .x)))
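The result of this pipeline is a tibble with one row per id and the fitted models stored in a list-column. As a hedged follow-up (not from the original answer), the coefficients could be pulled out into ordinary numeric columns with map_dbl(); the names fits, intercept and slope are assumptions chosen for this sketch.

```r
library(tidyverse)

df <- data.frame(
  id = c("A", "A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "D", "D", "D"),
  y = c(1, 3, 5, 4, 3, 4, 6, 8, 1, 4, 7, 10, 2, 5, 6, 8),
  x = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
)

fits <- df %>%
  nest(data = -id) %>%
  mutate(
    model     = map(data, ~ lm(y ~ x, data = .x)),
    # Extract each coefficient as a plain number, one value per model
    intercept = map_dbl(model, ~ coef(.x)[["(Intercept)"]]),
    slope     = map_dbl(model, ~ coef(.x)[["x"]])
  )

# Compact table: one row per id
fits %>% select(id, intercept, slope)
```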
Solution 3:[3]
I got a warning message about an essentially perfect fit when I used your data set, but with a slightly different data frame this works.
You can use by(): pass the data set df, the identifier variable df$id, and the function you want applied to each subset, here one that returns summary(lm(y ~ x, data = df)).
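set.seed(1)  # make the random example reproducible
df <- data.frame(id = rep(letters[1:3], 10),
                 y = rnorm(30),
                 x = rnorm(30))
# Apply summary(lm()) to the subset of rows belonging to each id
by(df, df$id, function(df) summary(lm(y ~ x, data = df)))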
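The perfect-fit warning comes from summary(): in the question's data, group A lies exactly on the line y = 2x - 1, so its residuals are zero. If only the coefficients are needed, one way around the warning is to return coef() from by() instead. This is a sketch rather than the answerer's code, and df_orig, res and coef_by_id are illustrative names.

```r
# The data frame from the question
df_orig <- data.frame(
  id = c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D"),
  y  = c(1,3,5,4,3,4,6,8,1,4,7,10,2,5,6,8),
  x  = c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
)

# Fit y ~ x per id, keeping only the coefficient vector of each model
res <- by(df_orig, df_orig$id, function(d) coef(lm(y ~ x, data = d)))

# Bind the per-group vectors into a matrix with one row per id
coef_by_id <- do.call(rbind, res)
coef_by_id
```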
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | DaveArmstrong |
| Solution 2 | danlooo |
| Solution 3 | POC |
