'How to use broom::tidy with multiple models?
I'm trying to summarize the results of 19 polynomial regression models by using broom. I've followed this SO Question and am trying to use it with broom::tidy. My script is as follows:
ALTER PROCEDURE [dbo].[spRegressionPeak]
@StudyID int
AS
BEGIN
Declare @sStudyID VARCHAR(50)
Set @sStudyID = CONVERT(VARCHAR(50),@StudyID)
--We are selecting the distinct StudyID, Productnumber, ResponseID and mean
values 1 thorugh 6 from the CodeMeans table.
--Note that spCodeMeans must be run before running this stored procedure to
ensure response data exists in the CodeMeans table.
--We use IsNull values to pass zeroes where an average wasn't calculated os that
the polynomial regression can be calculated.
DECLARE @inquery AS NVARCHAR(MAX) = '
Select
c.StudyID, c.RespID, c.LikingOrder, avg(isnull(C1,0)) as C1, avg(isnull(C2,0)) as C2, avg(isnull(C3,0)) as C3, avg(isnull(C4,0)) as C4,
avg(isnull(C5,0)) as C5, avg(isnull(C6,0)) as C6, avg(isnull(C7,0)) as C7, avg(isnull(C8,0)) as C8, avg(isnull(C9,0)) as C9,
avg(isnull(C10,0)) as C10, avg(isnull(C11,0)) as C11, avg(isnull(C12,0)) as C12, avg(isnull(C13,0)) as C13, avg(isnull(C14,0)) as C14,
avg(isnull(C15,0)) as C15, avg(isnull(C16,0)) as C16, avg(isnull(C17,0)) as C17, avg(isnull(C18,0)) as C18, avg(isnull(C19,0)) as C19
from ClosedStudyResponses c
where c.StudyID = @StudyID
group by StudyID, RespID, LikingOrder
order by RespID
'
--We are setting @inquery aka InputDataSet to be our initial dataset.
--R Services requires that a data.frame be passed to any calculations being
generated. As such, df is simply data framing the @inquery data.
--The res object holds the polynomial regression results by RespondentID and
LikingOrder for each of the averages in the @inquery resultset.
EXEC sp_execute_external_script @language = N'R'
, @script = N'
library(tidyr, broom)
studymeans <- InputDataSet
df <- data.frame(studymeans)
lin.mod.1 <- lm(df$LikingOrder ~ poly(df$C1,3, raw=TRUE))
lin.mod.2 <- lm(df$LikingOrder ~ poly(df$C2,3, raw=TRUE))
lin.mod.3 <- lm(df$LikingOrder ~ poly(df$C3,3, raw=TRUE))
lin.mod.4 <- lm(df$LikingOrder ~ poly(df$C4,3, raw=TRUE))
lin.mod.5 <- lm(df$LikingOrder ~ poly(df$C5,3, raw=TRUE))
lin.mod.6 <- lm(df$LikingOrder ~ poly(df$C6,3, raw=TRUE))
lin.mod.7 <- lm(df$LikingOrder ~ poly(df$C7,3, raw=TRUE))
lin.mod.8 <- lm(df$LikingOrder ~ poly(df$C8,3, raw=TRUE))
lin.mod.9 <- lm(df$LikingOrder ~ poly(df$C9,3, raw=TRUE))
lin.mod.10 <- lm(df$LikingOrder ~ poly(df$C10,3, raw=TRUE))
lin.mod.11 <- lm(df$LikingOrder ~ poly(df$C11,3, raw=TRUE))
lin.mod.12 <- lm(df$LikingOrder ~ poly(df$C12,3, raw=TRUE))
lin.mod.13 <- lm(df$LikingOrder ~ poly(df$C13,3, raw=TRUE))
lin.mod.14 <- lm(df$LikingOrder ~ poly(df$C14,3, raw=TRUE))
lin.mod.15 <- lm(df$LikingOrder ~ poly(df$C15,3, raw=TRUE))
lin.mod.16 <- lm(df$LikingOrder ~ poly(df$C16,3, raw=TRUE))
lin.mod.17 <- lm(df$LikingOrder ~ poly(df$C17,3, raw=TRUE))
lin.mod.18 <- lm(df$LikingOrder ~ poly(df$C18,3, raw=TRUE))
lin.mod.19 <- lm(df$LikingOrder ~ poly(df$C19,3, raw=TRUE))
lst <- lapply(ls(pattern="lin.mod"), get)
allmodels <- lapply(lst, summary)
res <- broom::tidy(allmodels)
'
, @input_data_1 = @inquery
, @output_data_1_name = N'res'
, @params = N'@StudyID int'
,@StudyID = @StudyID
--- Edit this line to handle the output data frame.
--WITH RESULT SETS ((StudyID int, RespID int, LikingOrder int, NewColumn int,
res varchar(max)));
END;
The above script throws the following error when a valid StudyID input parameter is passed to it:
Error in setNames(data.frame(data), value.name) :
'names' attribute [1] must be the same length as the vector [0]
Calls: source ... <Anonymous> -> <Anonymous> -> melt.default -> setNames
In addition: There were 50 or more warnings (use warnings() to see the first
50)
My input data is as follows:
The desired outcome is to obtain the summary of all 19 models in a data.frame. How do I resolve the error and modify my code to accomplish the end result?
Solution 1:[1]
You haven't given us a reproducible example; here's something that seems to work. A few potential issues: you need to run tidy on models, not summaries; it's better to avoid $-indexing in model formulas.
library(purrr)
df <- mtcars
predvars <- colnames(mtcars)[-1]
... this would be paste0("C",1:19) in your case ...
respvar <- "mpg" ## would be "LikingOrder"
predpolys <- sprintf("poly(%s,3,raw=TRUE)",predvars)
forms <- map(predpolys, reformulate,
response=respvar) ## construct formulas
names(forms) <- predvars ## names will be inherited by model lists
modList <- map(forms, lm, data= df) ## fit all models
res <- map(modList, broom::tidy) ## tidy all models
If desired you can dplyr::bind_rows(res,.id="predvar") at this point, or you can replace map() with map_dfr(..., .id = "predvar") ...
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
