Why does cur_data() within summarize() return a df_slice() error?
I ran into trouble today when using cur_data() within summarize().
Example data:
library(tidyverse)
dat <- tibble(id = 1:6,
              type = c(1, 1, 2, 2, 3, 3),
              value = c(2, 4, 6, 8, 7, NA))
This first pipeline throws an error, mentioning df_slice():
dat %>%
  group_by(type) %>%
  summarize(mean = mean(value),
            n = length(cur_data() %>% filter(!is.na(value)) %>% pull(id) %>% unique()),
            .groups = "drop")
#> Error in `summarize()`:
#> ! Problem while computing `n = length(...)`.
#> ℹ The error occurred in group 1: type = 1.
#> Caused by error:
#> ! Internal error in `df_slice()`: Columns must match the data frame size.
However, switching the order of the summary stats within summarize() avoids the error:
dat %>%
  group_by(type) %>%
  summarize(n = length(cur_data() %>% filter(!is.na(value)) %>% pull(id) %>% unique()),
            mean = mean(value),
            .groups = "drop")
#> # A tibble: 3 × 3
#>    type     n  mean
#>   <dbl> <int> <dbl>
#> 1     1     2     3
#> 2     2     2     7
#> 3     3     1    NA
Additionally, piping cur_data() into as.data.frame() also avoids the error:
dat %>%
  group_by(type) %>%
  summarize(mean = mean(value),
            n = length(cur_data() %>% as.data.frame() %>% filter(!is.na(value)) %>% pull(id) %>% unique()),
            .groups = "drop")
#> # A tibble: 3 × 3
#>    type  mean     n
#>   <dbl> <dbl> <int>
#> 1     1     3     2
#> 2     2     7     2
#> 3     3    NA     1
Created on 2022-02-15 by the reprex package (v2.0.1)
Why does the first pipeline fail? As a workaround, I calculated everything that required cur_data() within mutate() and then kept only the first() value in a later summarize() call, but I'd like to know what I'm missing about summarize().
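For reference, the mutate() workaround looks roughly like this. It is a minimal sketch, assuming the same dat as in the example data and dplyr 1.0.8 as in the session info; the name result is only for illustration. In dplyr 1.1.0 and later, cur_data() is deprecated in favor of pick(), so a deprecation warning is to be expected there.

```r
library(dplyr)

# Same example data as above
dat <- tibble(id = 1:6,
              type = c(1, 1, 2, 2, 3, 3),
              value = c(2, 4, 6, 8, 7, NA))

result <- dat %>%
  group_by(type) %>%
  # Count distinct ids with a non-missing value in the current group;
  # mutate() recycles the single number to every row of that group.
  mutate(n = length(cur_data() %>% filter(!is.na(value)) %>% pull(id) %>% unique())) %>%
  # Collapse each group, keeping the first (constant within group) n.
  summarize(mean = mean(value),
            n = first(n),
            .groups = "drop")

result
```

Because cur_data() is only called inside mutate(), it is never evaluated after summarize() has already created a column of a different length, which seems to be the situation that triggers the error in the first pipeline.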
Additional session info:
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20.6.0 (64-bit)
Running under: macOS Monterey 12.1
Matrix products: default
LAPACK: /opt/homebrew/Cellar/r/4.1.2/lib/R/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reprex_2.0.1 palmerpenguins_0.1.0 forcats_0.5.1 stringr_1.4.0 readr_2.1.2
[6] tibble_3.1.6 ggplot2_3.3.5 tidyverse_1.3.1 tidyr_1.2.0 purrr_0.3.4
[11] dplyr_1.0.8
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
