How do I assign a group-level value, based on row-level values, to a df using dplyr?
I have the following decision rules:
RELIABILITY LEVEL     DESCRIPTION
LEVEL I               Multiple regression
LEVEL II              Multiple regression + mechanisms specified (all interest variables)
LEVEL III             Multiple regression + mechanisms specified (all interest + control vars)
The first three columns are the input data; the fourth column is the one I want to reproduce using dplyr.
The reliability level should be the same for every row of a table (model).
Here is my attempt so far. As you can see, I can't get the value to be the same across the whole model:
library(tidyverse)
library(readxl)
library(effectsize)
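# Note: read_excel() reads local files only, so the workbook has to be downloaded before this call will work.
df <- read_excel("https://github.com/timverlaan/relia/blob/59d2cbc5d7830c41542c5f65449d5f324d6013ad/relia.xlsx")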
df1 <- df %>%
  group_by(study, table, function_var) %>%
  mutate(count_vars = n()) %>%
  ungroup() %>%
  group_by(study, table, function_var, mechanism_described) %>%
  mutate(count_int = case_when(
    function_var == 'interest' & mechanism_described == 'yes' ~ n()
    )) %>%
  mutate(count_con = case_when(
    function_var == 'control' & mechanism_described == 'yes' ~ n()
    )) %>% 
  mutate(reliable_int = case_when(
    function_var == 'interest' & count_vars/count_int == 1 ~ 1)) %>%
  mutate(reliable_con = case_when(
    function_var == 'control' & count_vars/count_con == 1 ~ 1)) %>%
  # group_by(study, source) %>%
  mutate(reliable = case_when(
    reliable_int != 1 ~ 1,
    reliable_int == 1 ~ 2,
    reliable_int + reliable_con == 2 ~ 3))
  # ungroup()
Solution 1:[1]
The code settled on is:
library(tidyverse)
library(readxl)
df <- read_excel("C:/Users/relia.xlsx")
df <- df %>% select(-reliability_score)
# Per study, model and variable type: share of variables with a described mechanism
test <- df %>%
  group_by(study, model, function_var) %>%
  summarise(count_yes = sum(mechanism_described == "yes"), n = n(), frac = count_yes / n) %>%
  # summarise() drops the last grouping level, so this runs per study and model
  mutate(frac_control = frac[function_var == "control"],
         frac_interest = frac[function_var == "interest"]) %>%
  mutate(reliability = case_when(
    frac_control == 1 & frac_interest != 1 ~ -99,
    frac_control != 1 & frac_interest != 1 ~ 2,
    frac_interest == 1 & frac_control != 1 ~ 3,
    frac_interest == 1 & frac_control == 1 ~ 4)) %>%
  group_by(study, model) %>%
  summarise(reliability = mean(reliability))
# Attach the model-level score to every row of the original data
df_reliability <- left_join(df, test)
View(df_reliability)
However, I would prefer to do this all within one dplyr pipe. If anyone has a solution I would love to hear it...
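One possible way to get this into a single pipe is a grouped mutate() instead of summarise() plus a join: inside group_by(study, model), a length-one result such as mean(...) is recycled to every row of the group, so the score ends up identical for the whole model. The sketch below is untested against the real spreadsheet. The toy data is invented for illustration, and the column names (study, model, function_var, mechanism_described) are assumed to match the code above. One difference to be aware of: a model with no control (or no interest) variables gets NA here, rather than stopping with an error.

```r
library(dplyr)

# Hypothetical toy data: column names follow the code above,
# the values are made up for illustration only.
df <- tibble(
  study = c("A", "A", "A", "A", "B", "B", "B"),
  model = c(1, 1, 1, 1, 1, 1, 1),
  function_var = c("interest", "interest", "control", "control",
                   "interest", "control", "control"),
  mechanism_described = c("yes", "yes", "yes", "no",
                          "yes", "yes", "yes")
)

df_reliability <- df %>%
  group_by(study, model) %>%
  mutate(
    # Share of variables of each type with a described mechanism.
    # Each mean() is a single number per model, recycled to all its rows.
    frac_interest = mean(mechanism_described[function_var == "interest"] == "yes"),
    frac_control  = mean(mechanism_described[function_var == "control"] == "yes"),
    # Same scoring rules as in the two-step version above.
    reliability = case_when(
      frac_control == 1 & frac_interest != 1 ~ -99,
      frac_control != 1 & frac_interest != 1 ~ 2,
      frac_interest == 1 & frac_control != 1 ~ 3,
      frac_interest == 1 & frac_control == 1 ~ 4
    )
  ) %>%
  ungroup()
```

The helper columns frac_interest and frac_control can be dropped afterwards with select(-frac_interest, -frac_control) if only the score is needed.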
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | DeMelkbroer | 
