Analyzing proportions/count data with a chi-squared test in R
df <- structure(list(Zone = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("Crocodile",
"Rankin", "West", "Whipray"), class = "factor"),
Year = c(2016L, 2017L, 2018L, 2019L, 2016L, 2017L, 2018L, 2019L, 2016L, 2017L,
2018L, 2019L, 2016L, 2017L, 2018L, 2019L), total = c(1L, 18L,
7L, 0L, 14L, 46L, 69L, 66L, 29L, 67L, 58L, 71L, 9L, 7L, 15L,
10L), empty = c(0L, 8L, 2L, 0L, 3L, 17L, 8L, 19L, 7L,
31L, 17L, 17L, 4L, 4L, 0L, 3L), full = c(1L, 10L, 5L,
0L, 11L, 29L, 61L, 47L, 22L, 36L, 41L, 54L, 5L, 3L, 15L, 7L)), row.names = c(NA,
-16L), groups = structure(list(Zone = structure(1:4, .Label = c("Crocodile",
"Rankin", "West", "Whipray"), class = "factor"), .rows = structure(list(
1:4, 5:8, 9:12, 13:16), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -4L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
My data frame consists of fish counts, with the year and zone in which each batch was caught. The 'empty' column is the number of fish with empty stomachs out of the total, and the 'full' column is the number of fish with full stomachs out of the total. Can someone help me run a chi-squared test on this data in R and understand its output? I would like to see if the proportion of fish with empty vs. full stomachs is the same in a given year (for all zones) and in a given zone (over all years).
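One way the comparison described above could be set up is sketched below. It assumes the question is whether the empty/full split differs between years when zones are pooled, and between zones when years are pooled. The names `fish`, `by_year`, `by_zone`, `test_year` and `test_zone` are made up for this illustration, and the counts are copied from the data above into a plain data frame so the block runs without dplyr.

```r
# Same counts as in the question, rebuilt as a plain (ungrouped) data frame
fish <- data.frame(
  Zone  = rep(c("Crocodile", "Rankin", "West", "Whipray"), each = 4),
  Year  = rep(2016:2019, times = 4),
  empty = c(0, 8, 2, 0, 3, 17, 8, 19, 7, 31, 17, 17, 4, 4, 0, 3),
  full  = c(1, 10, 5, 0, 11, 29, 61, 47, 22, 36, 41, 54, 5, 3, 15, 7)
)

# Pool over zones: one row per year, columns = empty / full counts
by_year <- rowsum(fish[, c("empty", "full")], group = fish$Year)

# Pool over years: one row per zone, columns = empty / full counts
by_zone <- rowsum(fish[, c("empty", "full")], group = fish$Zone)

# Chi-squared test of independence on each 4 x 2 table of counts
test_year <- chisq.test(as.matrix(by_year))
test_zone <- chisq.test(as.matrix(by_zone))

print(test_year)           # X-squared statistic, degrees of freedom, p-value
print(test_year$expected)  # counts expected if the proportion were the same every year
print(test_zone)
```

In the printed output, a small p-value suggests that the proportion of empty stomachs is not the same across the rows of that table. Note that pooling treats every fish as an independent observation and ignores any zone-by-year interaction, which is an assumption of this sketch rather than something the data guarantee.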
I've been told it's the correct test to use, but it seems very versatile and I'm getting confused about how to apply it, given that my type of data doesn't match any of the examples I've been looking at. I'm also not sure if I need a "continuity correction" (or what that means). Any help is greatly appreciated!
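On the "continuity correction": in `chisq.test()` the `correct` argument (TRUE by default) applies Yates' continuity correction, and only when the table is 2 by 2. For larger tables, such as four years or four zones against empty/full, it has no effect. The sketch below illustrates the difference on a 2 by 2 table built from the pooled Rankin and West counts in the data above; the object names are hypothetical and the choice of these two zones is purely for illustration.

```r
# 2 x 2 table: pooled empty/full counts for two zones (summed over 2016-2019)
two_zones <- matrix(
  c(47, 72,     # empty: Rankin, West
    148, 153),  # full:  Rankin, West
  nrow = 2,
  dimnames = list(Zone = c("Rankin", "West"), Stomach = c("empty", "full"))
)

# Default: Yates' continuity correction is applied for 2 x 2 tables
with_correction <- chisq.test(two_zones)

# Same test without the correction
without_correction <- chisq.test(two_zones, correct = FALSE)

print(with_correction$statistic)
print(without_correction$statistic)
```

The correction shrinks each |observed - expected| difference by 0.5 before squaring, so the corrected statistic is the smaller of the two and the test is slightly more conservative.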
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
