Eliminating values in CrossTable in R

I'm just getting started in R and I'm trying to wrap my head around the chi-square test for a university assignment.

Specifically, I am using the General Social Survey 2018 dataset (codebook: https://www.thearda.com/Archive/Files/Codebooks/GSS2018_CB.asp), and I am trying to figure out whether religion has any effect on the way people seek out help for mental health.

I want to use reliten (self-assessment of religiousness, from strong to no religion) as the independent variable, and mentloth (asks whether a person with mental health issues should reach out to a mental health professional, yes or no) as the dependent variable. Alongside the chi-square test, I also want to add CrossTable(GSS18$reliten, GSS18$mentloth), but I'm not sure how to remove the "Not applicable", "Don't know" and "No answer" values, which are coded as 0, 8 and 9. Does anyone have any tips?

Below is a short preview of my data, in case it helps.

structure(list(reliten = structure(c(1, 1, 4, 1, 1, 2, 1, 1, 
4, 2, 2, 3, 2, 2, 4, 1, 4, 3, 2, 1, 2, 1, 2, 2, 1), label = "Would you call yourself a strong [religious preference] or a not very strong [re", format.stata = "%8.0g", labels = c(`Not applicable` = 0, 
Strong = 1, `Not very strong` = 2, `Somewhat strong` = 3, `No religion` = 4, 
`Don't know` = 8, `No answer` = 9), class = c("haven_labelled", 
"vctrs_vctr", "double")), mentloth = structure(c(0, 1, 0, 1, 
2, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0
), label = "Should [NAME] go to a therapist, or counselor, like a psychologist, social worke", format.stata = "%8.0g", labels = c(`Not applicable` = 0, 
Yes = 1, No = 2, `Don't know` = 8, `No answer` = 9), class = c("haven_labelled", 
"vctrs_vctr", "double"))), row.names = c(NA, -25L), class = c("tbl_df", 
"tbl", "data.frame"))

Any help would be much appreciated!



Solution 1:[1]

The CrossTable function is from the gmodels package, which doesn't know how to handle objects of class haven_labelled, so it treats them as plain numeric vectors and the value labels are lost in the output.

To get a nicer output, you can convert them into base R factors for CrossTable to retain the names. Fortunately, the haven package contains the function as_factor for doing exactly that.
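As a quick illustration of what as_factor does, here is a minimal sketch on a made-up vector. The vector x is hypothetical; it only borrows the value labels of reliten from the question.

```r
library(haven)

# Hypothetical toy vector carrying the same value labels as reliten
x <- labelled(
  c(1, 4, 2, 0, 3),
  labels = c(`Not applicable` = 0, Strong = 1, `Not very strong` = 2,
             `Somewhat strong` = 3, `No religion` = 4,
             `Don't know` = 8, `No answer` = 9)
)

class(x)            # includes "haven_labelled"

f <- as_factor(x)   # each numeric code is replaced by its label
class(f)            # "factor"
as.character(f)
```

The codes are still numbers in x, but f holds the label text, which is what CrossTable needs in order to print readable row and column names.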

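Dropping the values you don't want is then straightforward: filter out the rows coded 0, 8 or 9 before converting, and list only the levels you want to keep when rebuilding the factor, as shown below: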

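library(gmodels)  # provides CrossTable()
library(haven)    # provides as_factor()

# Keep only rows with a substantive mentloth answer (drop codes 0, 8 and 9)
df <- GSS18[!GSS18$mentloth %in% c(0, 8, 9),]
# Convert the labelled columns to factors so the value labels become level names
df$reliten <- as_factor(df$reliten)
df$mentloth <- as_factor(df$mentloth)
# Rebuild reliten with only the levels of interest, in display order; any other
# value (e.g. "Don't know") becomes NA, which CrossTable leaves out by default
df$reliten <- factor(as.character(df$reliten), 
                     levels = c("No religion", "Somewhat strong", 
                                "Not very strong", "Strong"))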

Now you can run:

CrossTable(df$reliten, df$mentloth)

   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

 
Total Observations in Table:  12 

 
                | df$mentloth 
     df$reliten |       Yes |        No | Row Total | 
----------------|-----------|-----------|-----------|
    No religion |         1 |         0 |         1 | 
                |     0.008 |     0.083 |           | 
                |     1.000 |     0.000 |     0.083 | 
                |     0.091 |     0.000 |           | 
                |     0.083 |     0.000 |           | 
----------------|-----------|-----------|-----------|
Somewhat strong |         1 |         0 |         1 | 
                |     0.008 |     0.083 |           | 
                |     1.000 |     0.000 |     0.083 | 
                |     0.091 |     0.000 |           | 
                |     0.083 |     0.000 |           | 
----------------|-----------|-----------|-----------|
Not very strong |         3 |         0 |         3 | 
                |     0.023 |     0.250 |           | 
                |     1.000 |     0.000 |     0.250 | 
                |     0.273 |     0.000 |           | 
                |     0.250 |     0.000 |           | 
----------------|-----------|-----------|-----------|
         Strong |         6 |         1 |         7 | 
                |     0.027 |     0.298 |           | 
                |     0.857 |     0.143 |     0.583 | 
                |     0.545 |     1.000 |           | 
                |     0.500 |     0.083 |           | 
----------------|-----------|-----------|-----------|
   Column Total |        11 |         1 |        12 | 
                |     0.917 |     0.083 |           | 
----------------|-----------|-----------|-----------|
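
Since the question also mentions the chi-square test, here is a minimal sketch of one way to request it. It rebuilds the 25-row preview from the question so that it runs on its own, and it filters the 0, 8 and 9 codes on both variables rather than only on mentloth. The names missing_codes, keep and tab are illustrative, and the id column exists only to create the data frame. With just 12 usable rows, R is expected to warn that the chi-square approximation may be unreliable, so treat the result on the preview as a demonstration only.

```r
library(haven)
library(gmodels)

# Rebuild the preview data from the question
GSS18 <- data.frame(id = 1:25)
GSS18$reliten <- labelled(
  c(1, 1, 4, 1, 1, 2, 1, 1, 4, 2, 2, 3, 2, 2, 4, 1, 4, 3, 2, 1, 2, 1, 2, 2, 1),
  labels = c(`Not applicable` = 0, Strong = 1, `Not very strong` = 2,
             `Somewhat strong` = 3, `No religion` = 4,
             `Don't know` = 8, `No answer` = 9)
)
GSS18$mentloth <- labelled(
  c(0, 1, 0, 1, 2, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0),
  labels = c(`Not applicable` = 0, Yes = 1, No = 2,
             `Don't know` = 8, `No answer` = 9)
)

# Drop rows where either variable holds a non-substantive code
missing_codes <- c(0, 8, 9)
keep <- !(zap_labels(GSS18$reliten) %in% missing_codes) &
        !(zap_labels(GSS18$mentloth) %in% missing_codes)
df <- GSS18[keep, ]

# Convert to factors and discard the levels that are no longer used
df$reliten  <- droplevels(as_factor(df$reliten))
df$mentloth <- droplevels(as_factor(df$mentloth))

# Option 1: let CrossTable print the chi-square test below the table
CrossTable(df$reliten, df$mentloth, chisq = TRUE)

# Option 2: base R test on the plain contingency table
tab <- table(df$reliten, df$mentloth)
chisq.test(tab)
```

On the full GSS18 data the same steps apply; only the block that rebuilds the preview would be dropped.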

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
