How do I generate a series of tables displaying counts for each unique pairing of a set of binary numeric variables?
I'm trying to get R to generate a series of tables that show the joint distribution of values for each unique pairing of a set of 10 binary variables (values are 0, 1 or NA). Then, I want to run a chi-square test of independence on each of those tables. I could just run individual table and chi-square commands:
TAB1_2 = table(var1, var2)
CHI1_2 = chisq.test(TAB1_2, correct = TRUE)
TAB1_3 = table(var1, var3)
CHI1_3 = chisq.test(TAB1_3, correct = TRUE)
TAB1_4 = table(var1, var4)
CHI1_4 = chisq.test(TAB1_4, correct = TRUE)
and so on, but it's tedious. Is there a way I can run some kind of loop to do this?
Here's a fictitious dataset that is similar in structure to the one I'm using:
data = structure(list(var1 = c(0, 1, 0, 1, 1, 0, 0, 1, 0, 1), var2 = c(1,
0, 0, NA, 1, 1, 1, 0, 1, 0), var3 = c(1, 0, 0, 1, 0, 1, 1, 0,
0, 1), var4 = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 1), var5 = c(1, 0,
1, 1, 1, 1, 0, 0, 1, 0), var6 = c(0, 1, 0, 0, NA, 0, 1, 0, 0,
1), var7 = c(1, 1, 0, 1, 0, 0, 1, 0, 1, 0), var8 = c(1, 1, 0,
1, 0, 0, 1, 1, 0, 0), var9 = c(0, 1, 1, 0, 0, 0, 1, 0, 0, 0),
var10 = c(1, 1, 0, 0, 1, 0, 0, 0, NA, 1)), row.names = c(NA,
10L), class = "data.frame")
Help would be much appreciated!
Solution 1:[1]
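You can use lapply() to "loop" through all columns. The result is a list of length ncol(data). Note that this tests var1 against each column in turn; it does not cover every unique pairing of the variables.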
lapply(data, function(x) chisq.test(x = data$var1, y = x, correct = TRUE))
Output
The names of the list elements are the names of the second variable in each test. Note that the first entry is var1 tested against itself.
$var1
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 6.4, df = 1, p-value = 0.01141
$var2
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0.95063, df = 1, p-value = 0.3296
$var3
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0, df = 1, p-value = 1
$var4
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0.625, df = 1, p-value = 0.4292
$var5
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0.41667, df = 1, p-value = 0.5186
$var6
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0.05625, df = 1, p-value = 0.8125
$var7
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0, df = 1, p-value = 1
$var8
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0, df = 1, p-value = 1
$var9
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0, df = 1, p-value = 1
$var10
Pearson's Chi-squared test with Yates' continuity correction
data: data$var1 and x
X-squared = 0.14062, df = 1, p-value = 0.7077
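The call above only pairs var1 with each column. To cover every unique pairing, as the question asks, one option is to enumerate the column pairs with combn() and build a table and a test for each. This is a minimal sketch, not part of the original answer: the object names pairs, tabs, chis and p_values are made up for illustration, and the warnings are suppressed only because this toy dataset has very small expected counts.

```r
# Toy data copied from the question (0/1 values with a few NAs).
data <- data.frame(
  var1  = c(0, 1, 0, 1, 1, 0, 0, 1, 0, 1),
  var2  = c(1, 0, 0, NA, 1, 1, 1, 0, 1, 0),
  var3  = c(1, 0, 0, 1, 0, 1, 1, 0, 0, 1),
  var4  = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 1),
  var5  = c(1, 0, 1, 1, 1, 1, 0, 0, 1, 0),
  var6  = c(0, 1, 0, 0, NA, 0, 1, 0, 0, 1),
  var7  = c(1, 1, 0, 1, 0, 0, 1, 0, 1, 0),
  var8  = c(1, 1, 0, 1, 0, 0, 1, 1, 0, 0),
  var9  = c(0, 1, 1, 0, 0, 0, 1, 0, 0, 0),
  var10 = c(1, 1, 0, 0, 1, 0, 0, 0, NA, 1)
)

# All unique unordered pairs of column names: choose(10, 2) = 45 pairs.
pairs <- combn(names(data), 2, simplify = FALSE)
# Label each pair, e.g. "var1_var2", so results can be looked up by name.
names(pairs) <- vapply(pairs, paste, character(1), collapse = "_")

# One contingency table per pair; table() drops NA values by default.
tabs <- lapply(pairs, function(p) table(data[[p[1]]], data[[p[2]]], dnn = p))

# One chi-square test per table. With only 10 rows the expected counts are
# small, so chisq.test() would warn that the approximation may be incorrect.
chis <- lapply(tabs, function(tb) suppressWarnings(chisq.test(tb, correct = TRUE)))

# Collect the p-values into a named numeric vector for a quick overview.
p_values <- vapply(chis, function(ch) ch$p.value, numeric(1))
```

Individual results can then be pulled out by name, for example tabs[["var1_var2"]] or chis[["var1_var2"]]; the latter should agree with the $var2 entry in the output above. With 45 tests, a multiple-comparison adjustment such as p.adjust(p_values) may be worth considering.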
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | benson23 |