Using R to measure congruence between levels in two variables
I have two variables in R. The first ("candimmi") represents political candidates' opinions on immigration. The second ("voterimmi") represents voters' opinions on immigration. Both variables have the same three levels: anti-immigration, intermediate, or pro-immigration.
My issue is that I want to create a new variable stating whether there is congruence between the voter and the political candidate. The levels of the new variable would be "both anti-immigrant", "both intermediate", "both pro-immigration" and "mismatch".
Can any of you give me some advice on how to do this?
Thanks in advance!
Best, Malte
I have tried finding solutions already, but can't find any answers to my question online.
Solution 1:[1]
You can use case_when(), dplyr's vectorised alternative to a chain of nested ifelse() calls:
set.seed(05062020)
library(dplyr)
responses <- c("Anti","Intermed","Pro")
df <- data.frame(candidate = sample(responses, 10, replace = TRUE),
                 voter = sample(responses, 10, replace = TRUE))
df2 <- df %>% mutate(result = case_when(
  candidate %in% "Anti" & voter %in% "Anti" ~ "Both Anti",
  candidate %in% "Intermed" & voter %in% "Intermed" ~ "Both Intermed",
  candidate %in% "Pro" & voter %in% "Pro" ~ "Both Pro",
  candidate != voter ~ "Discordant"))
# candidate voter result
# 1 Pro Intermed Discordant
# 2 Anti Anti Both Anti
# 3 Pro Pro Both Pro
# 4 Pro Anti Discordant
# 5 Pro Anti Discordant
# 6 Pro Pro Both Pro
# 7 Pro Intermed Discordant
# 8 Intermed Pro Discordant
# 9 Intermed Intermed Both Intermed
# 10 Anti Pro Discordant
A base R way to do it is using nested ifelse statements:
df$result <- ifelse(df$candidate %in% "Anti" & df$voter %in% "Anti", "Both Anti",
             ifelse(df$candidate %in% "Intermed" & df$voter %in% "Intermed", "Both Intermed",
             ifelse(df$candidate %in% "Pro" & df$voter %in% "Pro", "Both Pro",
             ifelse(df$candidate != df$voter, "Discordant", NA))))
# > df
# candidate voter result
# 1 Pro Intermed Discordant
# 2 Anti Anti Both Anti
# 3 Pro Pro Both Pro
# 4 Pro Anti Discordant
# 5 Pro Anti Discordant
# 6 Pro Pro Both Pro
# 7 Pro Intermed Discordant
# 8 Intermed Pro Discordant
# 9 Intermed Intermed Both Intermed
# 10 Anti Pro Discordant
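One way to sanity-check either version is to cross-tabulate the new column against a direct equality test. The following is only a sketch on hand-typed rows (illustrative data, not the sampled data above); the name `check` is an assumption introduced here.

```r
# Hand-typed rows (illustrative data, not from the original post)
df <- data.frame(
  candidate = c("Pro", "Anti", "Pro", "Intermed", "Intermed"),
  voter     = c("Intermed", "Anti", "Pro", "Pro", "Intermed"),
  stringsAsFactors = FALSE
)

# Same nested ifelse() as in the base R version above
df$result <- ifelse(df$candidate %in% "Anti" & df$voter %in% "Anti", "Both Anti",
             ifelse(df$candidate %in% "Intermed" & df$voter %in% "Intermed", "Both Intermed",
             ifelse(df$candidate %in% "Pro" & df$voter %in% "Pro", "Both Pro",
             ifelse(df$candidate != df$voter, "Discordant", NA))))

# Rows: do the two opinions agree? Columns: the label that was assigned.
check <- table(same = df$candidate == df$voter, result = df$result)
print(check)
```

If the labels are right, the "Discordant" column should only have counts in the FALSE row, and the "Both ..." columns only in the TRUE row.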
Solution 2:[2]
Here is a simple approach using the base R functions factor and interaction (using @jpsmith's example data.frame with a different random seed). At its core, interaction automatically creates a new factor whose levels are every combination of the two input factors' levels; you can then rename those combined levels as you like, which might be useful with many factor levels.
set.seed(234) # fixed random seed for reproducibility
responses <- c("Anti", "Intermed", "Pro")
congruence <- c("both anti-immigrant", "both intermediate", "both pro-immigration", "mismatch")
df <- data.frame(candidate = sample(responses, 10, replace = TRUE),
                 voter = sample(responses, 10, replace = TRUE))
df$candidate <- factor(df$candidate, levels=responses) # make sure you have all the levels
df$voter <- factor(df$voter, levels=responses) # make sure you have all the levels
df$congruence <- with(df, interaction(candidate, voter)) # create new factor representing both levels
levels(df$congruence) <- congruence[c(1,4,4,4,2,4,4,4,3)] # match up factor levels to rename
df
#> candidate voter congruence
#> 1 Anti Pro mismatch
#> 2 Pro Pro both pro-immigration
#> 3 Intermed Intermed both intermediate
#> 4 Intermed Pro mismatch
#> 5 Intermed Intermed both intermediate
#> 6 Intermed Intermed both intermediate
#> 7 Anti Anti both anti-immigrant
#> 8 Anti Anti both anti-immigrant
#> 9 Pro Intermed mismatch
#> 10 Intermed Pro mismatch
Created on 2022-04-05 by the reprex package (v2.0.1)
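The index vector c(1,4,4,4,2,4,4,4,3) above is typed by hand, which gets error-prone with more levels. A minimal sketch of deriving the renamed levels instead, assuming interaction() is called with its default level ordering (first factor varies fastest), which is also the order expand.grid() uses. The names `grid`, `labels` and `new_levels` and the three demo rows are illustrative and not part of the answer.

```r
responses <- c("Anti", "Intermed", "Pro")
labels <- c("both anti-immigrant", "both intermediate", "both pro-immigration")

# expand.grid() varies its first column fastest, the same order that
# interaction(candidate, voter) uses for its levels by default
grid <- expand.grid(candidate = responses, voter = responses,
                    stringsAsFactors = FALSE)

# One label per combination: the matching "both ..." label when the two
# opinions agree, "mismatch" everywhere else
new_levels <- ifelse(grid$candidate == grid$voter,
                     labels[match(grid$candidate, responses)],
                     "mismatch")

# Small demo with hand-typed rows (illustrative data)
df <- data.frame(
  candidate = factor(c("Anti", "Pro", "Intermed"), levels = responses),
  voter     = factor(c("Anti", "Intermed", "Intermed"), levels = responses)
)
df$congruence <- with(df, interaction(candidate, voter))
levels(df$congruence) <- new_levels  # duplicated names merge into one level
```

For the three levels used here, `new_levels` should line up with the hand-typed index vector, so the two approaches are interchangeable.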
Solution 3:[3]
Both of the other answers work fine, but the simplest solution is to use just one ifelse(). Below I first create some sample data and then show how you would use ifelse() in either the tidyverse or base R if you prefer.
library(tidyverse)
# Create data sample
d <- crossing(
  candimmi = c("anti", "inter", "pro"),
  voterimmi = candimmi
)
d |>
  mutate(new_tidy = ifelse(candimmi != voterimmi,
                           "mismatch",
                           str_c("both ", candimmi)))
#> # A tibble: 9 × 3
#> candimmi voterimmi new_tidy
#> <chr> <chr> <chr>
#> 1 anti anti both anti
#> 2 anti inter mismatch
#> 3 anti pro mismatch
#> 4 inter anti mismatch
#> 5 inter inter both inter
#> 6 inter pro mismatch
#> 7 pro anti mismatch
#> 8 pro inter mismatch
#> 9 pro pro both pro
d$new_base <- ifelse(d$candimmi != d$voterimmi,
                     "mismatch",
                     paste("both", d$candimmi))
d
#> # A tibble: 9 × 3
#> candimmi voterimmi new_base
#> <chr> <chr> <chr>
#> 1 anti anti both anti
#> 2 anti inter mismatch
#> 3 anti pro mismatch
#> 4 inter anti mismatch
#> 5 inter inter both inter
#> 6 inter pro mismatch
#> 7 pro anti mismatch
#> 8 pro inter mismatch
#> 9 pro pro both pro
Created on 2022-04-05 by the reprex package (v2.0.1)
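The single-ifelse() approach builds labels such as "both anti" from the raw codes. To get the exact labels asked for in the question, one option is a named lookup vector. This is a sketch under assumptions: the short codes "anti", "inter" and "pro" and the names `labels`, `cand` and `voter` are illustrative, and the columns are converted with as.character() because comparing two factors whose level sets differ raises an error in base R.

```r
# Lookup from the short codes to the labels requested in the question
labels <- c(anti  = "both anti-immigrant",
            inter = "both intermediate",
            pro   = "both pro-immigration")

# Illustrative data stored as factors, including one missing value
d <- data.frame(candimmi  = c("anti", "inter", "pro", "anti", NA),
                voterimmi = c("anti", "pro", "pro", "inter", "pro"),
                stringsAsFactors = TRUE)

# Compare as character so factors with different level sets do not error
cand  <- as.character(d$candimmi)
voter <- as.character(d$voterimmi)

# Rows with a missing opinion stay NA rather than being called a mismatch
d$congruence <- ifelse(cand == voter, unname(labels[cand]), "mismatch")
d
```

Whether a missing opinion should count as NA or as "mismatch" is a substantive choice; the sketch keeps it as NA.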
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | user12728748 |
| Solution 3 | shs |
