Rank Biserial Correlation with R

I have non-normally distributed data and two variables, one ordinal and one binary categorical, both from the same sample. I've read that a rank-biserial correlation is the appropriate measure for this kind of data.

Is there a package, or can somebody help me calculate a rank-biserial correlation with a p-value and an effect size?



Solution 1:[1]

Look at the polycor package: http://cran.r-project.org/web/packages/polycor/polycor.pdf. Its polyserial() function might be what you are looking for.

Solution 2:[2]

Use rstatix::wilcox_test() for the p-value and rstatix::wilcox_effsize() for the effect size. The second function implements the methods described in the section "Effect size estimates used with non-parametric test" of article [1], on page 23. The article explains how the regular and squared versions of the coefficient should be interpreted.

Example (the columns of interest are p and effsize):

rstatix::wilcox_test(len ~ supp, data = ToothGrowth)
#> # A tibble: 1 x 7
#>   .y.   group1 group2    n1    n2 statistic      p
#> * <chr> <chr>  <chr>  <int> <int>     <dbl>  <dbl>
#> 1 len   OJ     VC        30    30      576. 0.0645


rstatix::wilcox_effsize(len ~ supp, data = ToothGrowth)
#> # A tibble: 1 x 7
#>   .y.   group1 group2 effsize    n1    n2 magnitude
#> * <chr> <chr>  <chr>    <dbl> <int> <int> <ord>    
#> 1 len   OJ     VC       0.240    30    30 small
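
For context, the rstatix documentation describes this effect size as r = Z / sqrt(N), which is a different quantity from the rank-biserial correlation. The following is a rough base R sketch, not the rstatix implementation. It assumes the normal approximation without continuity correction, which should closely reproduce the effsize value above, and it also derives the rank-biserial correlation from the W statistic for comparison. The variable names (wt, r_z, r_rb) are illustrative only.

```r
# Base R only; ToothGrowth ships with R.
# Assumption: uncorrected normal approximation (exact = FALSE, correct = FALSE).
wt <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE, correct = FALSE)

# Effect size r = Z / sqrt(N); |Z| is recovered from the two-sided p-value.
n_total <- nrow(ToothGrowth)
z <- qnorm(wt$p.value / 2, lower.tail = FALSE)
r_z <- z / sqrt(n_total)

# Rank-biserial correlation from W: r = 2 * W / (n1 * n2) - 1.
# W counts the pairs in which the first group (OJ) exceeds the second (VC),
# with ties counted as one half.
n_by_group <- table(ToothGrowth$supp)
w <- unname(wt$statistic)
r_rb <- 2 * w / prod(n_by_group) - 1

round(c(r_z = r_z, r_rb = r_rb), 3)
```

The two numbers differ because they measure different things: r_z rescales the test statistic, while r_rb is the difference between the proportions of favorable and unfavorable pairs.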

References:

  1. Tomczak M., Tomczak E. The need to report effect size estimates revisited. An overview of some recommended measures of effect size, Trends in Sport Sciences 21: 19–25 (2014). Available at: http://tss.awf.poznan.pl/files/3_Trends_Vol21_2014__no1_20.pdf.

Solution 3:[3]

The command wilcoxonRG in the library rcompanion could help.

Let's create some synthetic data for illustration:

library(rcompanion)
Criticism = c(1, 1, 0, 0, 2, 2, 2, 3, 4)
Praise = c(4, 4, 5, 5, 6, 6, 4, 3, 6)
Y = c(Criticism, Praise)
Bi_group = factor(c(rep("Criticism", length(Criticism)),  
                 rep("Praise", length(Praise))))
cbind(Y, Bi_group)

This is the data we created (cbind() shows the factor as its integer codes: 1 = Criticism, 2 = Praise):

      Y Bi_group
 [1,] 1        1
 [2,] 1        1
 [3,] 0        1
 [4,] 0        1
 [5,] 2        1
 [6,] 2        1
 [7,] 2        1
 [8,] 3        1
 [9,] 4        1
[10,] 4        2
[11,] 4        2
[12,] 5        2
[13,] 5        2
[14,] 6        2
[15,] 6        2
[16,] 4        2
[17,] 3        2
[18,] 6        2

where Y is the ordinal variable and Bi_group is the binary categorical variable.

Then we can use the command

wilcoxonRG(x = Y, g = Bi_group, verbose=TRUE)

to get

Levels: Criticism Praise
n for Criticism = 9
n for Praise = 9
Mean of ranks for Criticism = 5.333333
Mean of ranks for Praise = 13.66667
Difference in mean of ranks = -8.333333
Total n = 18
2 * difference / total n = -0.926

    rg 
-0.926 

Note

You can also use the command

wilcoxonRG(table(Bi_group, Y))   # put the binary categorical variable first

to get

    rg 
-0.926 
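
Since the question also asks for a p-value, here is a minimal base R sketch that cross-checks the rg value above and adds a Wilcoxon rank-sum p-value. It is not how rcompanion computes its result internally. It simply applies the formula printed in the verbose output (2 * difference in mean ranks / total n) and the equivalent formula based on the W statistic. The names mean_ranks, rg_ranks and rg_w are illustrative, and exact = FALSE is an assumption made because the data contain ties.

```r
# Recreate the example data so the block is self-contained.
Criticism <- c(1, 1, 0, 0, 2, 2, 2, 3, 4)
Praise <- c(4, 4, 5, 5, 6, 6, 4, 3, 6)
Y <- c(Criticism, Praise)
Bi_group <- factor(rep(c("Criticism", "Praise"),
                       times = c(length(Criticism), length(Praise))))

# Route 1: difference in mean ranks, as in the verbose output.
mean_ranks <- tapply(rank(Y), Bi_group, mean)
rg_ranks <- unname(2 * (mean_ranks["Criticism"] - mean_ranks["Praise"]) / length(Y))

# Route 2: from the Wilcoxon W statistic, r = 2 * W / (n1 * n2) - 1.
# exact = FALSE requests the normal approximation (the data contain ties).
wt <- wilcox.test(Y ~ Bi_group, exact = FALSE)
rg_w <- unname(2 * wt$statistic / (length(Criticism) * length(Praise)) - 1)

round(c(rg_ranks = rg_ranks, rg_w = rg_w), 3)
wt$p.value  # two-sided p-value of the rank-sum test
```

Both routes should agree with each other and with the wilcoxonRG result, because the two formulas are algebraically equivalent.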

References:

https://www.rdocumentation.org/packages/rcompanion/versions/2.4.1/topics/wilcoxonRG

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Rentrop
Solution 2 GegznaV
Solution 3 Tranle