Python Kruskal-Wallis test reliability?

I have a question about SciPy's Kruskal-Wallis test. I recently ran this test over many groups and got several p-values that were exactly the same. I also noticed that the test can be performed on strings (?). Here is an example of what I mean:

In [40]: scipy.stats.kruskal("x","y","z")
Out [40]: KruskalResult(statistic=2.0, pvalue=0.36787944117144245)

As you can see, this ran the Kruskal-Wallis test on three letters and returned a p-value and a test statistic. How is this possible? Is this test reliable at all?



Solution 1:[1]

To me this makes sense: the Kruskal-Wallis test statistic depends only on the ranks of the observations, not on their values, and strings have an order relation (lexicographic order), so their ranks are well defined. R gives the same p-value as Python for three groups that each contain a single value, as long as the three values are distinct:

> kruskal.test(x = c(0, 1, 2), g = 1:3)

    Kruskal-Wallis rank sum test

data:  c(0, 1, 2) and 1:3
Kruskal-Wallis chi-squared = 2, df = 2, p-value = 0.3679

> kruskal.test(x = c(0, 11, 22), g = 1:3)

    Kruskal-Wallis rank sum test

data:  c(0, 11, 22) and 1:3
Kruskal-Wallis chi-squared = 2, df = 2, p-value = 0.3679

Unlike SciPy, however, R's kruskal.test only accepts numeric observations.
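
To make the rank argument concrete, here is a minimal sketch that computes the statistic by hand. It is not SciPy's implementation: `kruskal_h` is a hypothetical helper written for this illustration, it assumes there are no tied observations, and it uses the textbook formula H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), where R_i is the rank sum of group i.

```python
import math


def kruskal_h(*groups):
    """Kruskal-Wallis H statistic for groups with no tied observations.

    Hypothetical helper for illustration only, not SciPy's code.
    """
    # Pool all observations and rank them 1..N by their sort order.
    pooled = sorted(x for g in groups for x in g)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch does not handle ties")
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n_total = len(pooled)
    # Sum over groups of (rank sum squared) / (group size).
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


h_small = kruskal_h([0], [1], [2])
h_large = kruskal_h([0], [11], [22])
h_text = kruskal_h(["x"], ["y"], ["z"])  # strings sort lexicographically

# With three groups the chi-squared reference distribution has
# 2 degrees of freedom, and its survival function is exp(-H / 2).
p_value = math.exp(-h_small / 2)

print(h_small, h_large, h_text, p_value)
```

All three calls see the ranks 1, 2, 3, so they give the same H of 2 and the same p-value of about 0.3679, which matches the outputs above. This is also why many different data sets can return identical p-values: once the group sizes and the rank pattern are the same, the actual values no longer matter. As for reliability, the SciPy documentation notes that the chi-squared approximation needs groups that are not too small (a typical rule is at least 5 measurements per group), so a p-value from groups of size 1 should not be interpreted.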

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Stéphane Laurent