gtsummary - present proportion of unknown/missing values separately
I have a categorical variable (ethnicity) with unknown values, and I am trying to present it with gtsummary so that the proportions are computed separately for known and unknown values.
I have managed to get to this stage.
However, I cannot find a way to calculate the proportion of missing values separately.
My code is something along these lines:
trial %>%
  select(age, response, trt) %>%
  tbl_summary(
    by = trt,
    missing = "if_any",
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      response ~ c("{n} ({p}%)",
                   "{N_miss} ({p_miss})"),
      all_categorical() ~ "{n} ({p}%)"
    )
  )
I did try the solution suggested here (i.e. fct_replace_na and setting missing = "no"), but it keeps including the unknown rows in the overall proportions.
Thank you
Solution 1:[1]
That's a great question, and I think I should implement something to make this easier. Anyway, here's how I would go about this:
1. Define a new variable indicating whether the variable is missing.
2. Summarize this variable in the table and update the default label to "Unknown".
3. Indent the missing rows.
Example below!
library(gtsummary)
library(dplyr, warn.conflicts = FALSE)
packageVersion("gtsummary")
#> [1] '1.6.0'

tbl <-
  trial %>%
  # 1. add a TRUE/FALSE indicator column for each variable's missingness
  mutate(across(c(age, response), is.na, .names = "{.col}_missing")) %>%
  select(age, age_missing, response, response_missing, trt) %>%
  tbl_summary(
    by = trt,
    missing = "no",
    # 2. show the indicator rows under the label "Unknown"
    label = ends_with("_missing") ~ "Unknown",
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      response ~ c("{n} ({p}%)",
                   "{N_miss} ({p_miss})"),
      all_categorical() ~ "{n} ({p}%)"
    )
  ) %>%
  # 3. indent the "Unknown" rows so they sit under their parent variable
  modify_column_indent(columns = label, rows = endsWith(variable, "_missing"))
Created on 2022-04-27 by the reprex package (v2.0.1)
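As a rough sanity check on the denominators, the two proportions can be computed by hand with dplyr. This is only a sketch, assuming the `trial` data that ships with gtsummary. The object name `check` and the output column names (`n_total`, `n_unknown`, `p_unknown`, `p_response`) are made up for illustration and are not part of gtsummary.

```r
library(gtsummary)                      # only needed for the `trial` example data
library(dplyr, warn.conflicts = FALSE)

check <-
  trial %>%
  group_by(trt) %>%
  summarise(
    n_total    = n(),                       # all rows in the arm
    n_unknown  = sum(is.na(response)),      # rows with a missing response
    # proportion unknown: the denominator is every row in the arm
    p_unknown  = mean(is.na(response)),
    # proportion responding: the denominator is rows with a known response only
    p_response = mean(response == 1, na.rm = TRUE)
  )

check
```

If the table is set up as intended, the "Unknown" rows should correspond to `p_unknown`, while the `response` rows should correspond to `p_response`, which leaves the unknown rows out of the denominator.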
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Daniel D. Sjoberg |
