How do I generate a series of tables displaying counts for each unique pairing of a set of binary numeric variables?

I'm trying to get R to generate a series of tables that show the joint distribution of values for each unique pairing of a set of 10 binary variables (values are 0, 1 or NA). Then I want to run a chi-square test of independence on each of those tables. I could just run individual table and chi-square commands:

TAB1_2 = table(var1, var2)
CHI1_2 = chisq.test(TAB1_2, correct = TRUE)

TAB1_3 = table(var1, var3)
CHI1_3 = chisq.test(TAB1_3, correct = TRUE)

TAB1_4 = table(var1, var4)
CHI1_4 = chisq.test(TAB1_4, correct = TRUE)

and so on, but it's tedious. Is there a way I can run some kind of loop to do this?

Here's a fictitious dataset that is similar in structure to the one I'm using:

data = structure(list(var1 = c(0, 1, 0, 1, 1, 0, 0, 1, 0, 1), var2 = c(1, 
0, 0, NA, 1, 1, 1, 0, 1, 0), var3 = c(1, 0, 0, 1, 0, 1, 1, 0, 
0, 1), var4 = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 1), var5 = c(1, 0, 
1, 1, 1, 1, 0, 0, 1, 0), var6 = c(0, 1, 0, 0, NA, 0, 1, 0, 0, 
1), var7 = c(1, 1, 0, 1, 0, 0, 1, 0, 1, 0), var8 = c(1, 1, 0, 
1, 0, 0, 1, 1, 0, 0), var9 = c(0, 1, 1, 0, 0, 0, 1, 0, 0, 0), 
    var10 = c(1, 1, 0, 0, 1, 0, 0, 0, NA, 1)), row.names = c(NA, 
10L), class = "data.frame")

Help would be much appreciated!



Solution 1:[1]

You can use lapply() to "loop" over all columns, testing each one against var1. The result is a list of length ncol(data).

lapply(data, function(x) chisq.test(x = data$var1, y = x, correct = TRUE))

Output

Each list element is named after the second variable in the pairing. Note that the first entry is var1 tested against itself, so it is not a meaningful result.

$var1

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 6.4, df = 1, p-value = 0.01141


$var2

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0.95063, df = 1, p-value = 0.3296


$var3

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0, df = 1, p-value = 1


$var4

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0.625, df = 1, p-value = 0.4292


$var5

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0.41667, df = 1, p-value = 0.5186


$var6

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0.05625, df = 1, p-value = 0.8125


$var7

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0, df = 1, p-value = 1


$var8

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0, df = 1, p-value = 1


$var9

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0, df = 1, p-value = 1


$var10

    Pearson's Chi-squared test with Yates' continuity correction

data:  data$var1 and x
X-squared = 0.14062, df = 1, p-value = 0.7077
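
The lapply() call above only pairs var1 with each column. To cover every unique pairing, as the question asks, one possible approach is to build the pairs with combn() and then create a table and a test for each. The following is a minimal sketch under that assumption, not part of the original answer. The object names (pairs, tabs, tests, summary_df) are illustrative, and suppressWarnings() is used only because counts this small trigger the "Chi-squared approximation may be incorrect" warning.

```r
# Example data copied from the question (10 binary columns, a few NAs).
data <- structure(list(
  var1 = c(0, 1, 0, 1, 1, 0, 0, 1, 0, 1),
  var2 = c(1, 0, 0, NA, 1, 1, 1, 0, 1, 0),
  var3 = c(1, 0, 0, 1, 0, 1, 1, 0, 0, 1),
  var4 = c(1, 0, 1, 0, 1, 1, 1, 1, 1, 1),
  var5 = c(1, 0, 1, 1, 1, 1, 0, 0, 1, 0),
  var6 = c(0, 1, 0, 0, NA, 0, 1, 0, 0, 1),
  var7 = c(1, 1, 0, 1, 0, 0, 1, 0, 1, 0),
  var8 = c(1, 1, 0, 1, 0, 0, 1, 1, 0, 0),
  var9 = c(0, 1, 1, 0, 0, 0, 1, 0, 0, 0),
  var10 = c(1, 1, 0, 0, 1, 0, 0, 0, NA, 1)
), row.names = c(NA, 10L), class = "data.frame")

# All unique unordered pairs of column names: choose(10, 2) = 45 pairs.
pairs <- combn(names(data), 2, simplify = FALSE)
names(pairs) <- vapply(pairs, paste, character(1), collapse = "_")

# One contingency table per pair. By default table() leaves out
# observations where either variable is NA.
tabs <- lapply(pairs, function(p) {
  table(data[[p[1]]], data[[p[2]]], dnn = p)
})

# One chi-square test per table, with Yates' continuity correction.
tests <- lapply(tabs, function(tab) {
  suppressWarnings(chisq.test(tab, correct = TRUE))
})

# Optional: collect the statistic and p-value of every test in one data frame.
summary_df <- data.frame(
  pair      = names(tests),
  statistic = vapply(tests, function(t) unname(t$statistic), numeric(1)),
  p_value   = vapply(tests, function(t) t$p.value, numeric(1)),
  row.names = NULL
)
```

Individual results can then be looked up by name, for example tabs$var1_var2 or tests$var1_var2. Because combn() returns each pair only once, there is no var1 against var1 entry and no duplicate such as var2 against var1.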

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 benson23