How do I compare each element in a list with each other element and output the results as a pairwise comparison matrix in R?
I am trying to automate the process of calculating Jaccard's index of similarity for every possible pair of sites surveyed in a recent vegetation study.
Below is a dummy list in the format of my data, where x, y, and z are discrete survey sites, along with my function jaccard().
x <- c("sp1","sp2","sp3")
y <- c("sp2","sp3","sp4")
z <- c("sp3","sp4","sp5")
dummy_list <- list(x,y,z)
jaccard <- function(a, b) {
  intersection = length(intersect(a, b))
  union = length(a) + length(b) - intersection
  return(intersection / union)
}
I want to pass each pairwise comparison (x-y, x-z, y-z) to jaccard() and output a matrix of the calculated Jaccard indices. How can I achieve this?
Solution 1:[1]
We could first Vectorize your jaccard function and then use outer:
x <- c("sp1","sp2","sp3")
y <- c("sp2","sp3","sp4")
z <- c("sp3","sp4","sp5")
dummy_list <- setNames(list(x, y, z), c("x","y","z"))
jaccard <- function(a, b) {
  intersection = length(intersect(a, b))
  union = length(a) + length(b) - intersection
  return(intersection / union)
}
vjaccard <- Vectorize(jaccard)
outer(dummy_list, dummy_list, FUN = "vjaccard")
#>     x   y   z
#> x 1.0 0.5 0.2
#> y 0.5 1.0 0.5
#> z 0.2 0.5 1.0
Created on 2022-03-02 by the reprex package (v2.0.1)
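To see what the combination of outer() and Vectorize() is doing, the same matrix can be built by hand. The following is a minimal sketch rather than part of the original answer: outer() expands the two lists into every pair, and Vectorize() makes jaccard() run once per pair instead of once on the whole expanded list. The names idx, vals, and manual are made up for illustration.

```r
x <- c("sp1", "sp2", "sp3")
y <- c("sp2", "sp3", "sp4")
z <- c("sp3", "sp4", "sp5")
dummy_list <- list(x = x, y = y, z = z)

jaccard <- function(a, b) {
  intersection <- length(intersect(a, b))
  intersection / (length(a) + length(b) - intersection)
}

# Every (i, j) index pair; i varies fastest, matching column-major matrix fill
idx <- expand.grid(i = seq_along(dummy_list), j = seq_along(dummy_list))

# mapply() calls jaccard() once per pair, which is what Vectorize() arranges
vals <- mapply(function(i, j) jaccard(dummy_list[[i]], dummy_list[[j]]),
               idx$i, idx$j)

manual <- matrix(vals, nrow = length(dummy_list),
                 dimnames = list(names(dummy_list), names(dummy_list)))

# Compare against the outer() approach
via_outer <- outer(dummy_list, dummy_list, Vectorize(jaccard))
all.equal(unname(manual), unname(via_outer))
```

Without Vectorize(), outer() would call jaccard() a single time with two length-9 lists, which is why the plain function cannot be passed to outer() directly.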
Solution 2:[2]
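jaccard <- function(List) {
  # combn() passes each pair of sites to the function as a two-element list
  ln <- combn(List, 2, function(x) {
    n <- length(intersect(x[[1]], x[[2]]))  # shared species
    m <- length(unlist(x))                  # length(a) + length(b)
    n / (m - n)
  })
  # Size must be the number of sites, not the number of pairs
  structure(ln, Size = length(List), Diag = FALSE, class = 'dist')
}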
jaccard(dummy_list)
    1   2
2 0.5
3 0.2 0.5
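A dist object stores only the lower triangle, so if a full square matrix is wanted, as.matrix() can expand it. The sketch below is an illustration under a few assumptions: the function name jaccard_pairs is hypothetical (chosen so it does not clash with the two-argument jaccard() from the question), Size is set to the number of sites as a dist object expects, and the diagonal is set to 1 by hand because the values are similarities rather than distances.

```r
x <- c("sp1", "sp2", "sp3")
y <- c("sp2", "sp3", "sp4")
z <- c("sp3", "sp4", "sp5")
dummy_list <- list(x = x, y = y, z = z)

# Hypothetical helper: Jaccard index for every unordered pair of sites
jaccard_pairs <- function(sites) {
  sim <- combn(sites, 2, function(p) {
    shared <- length(intersect(p[[1]], p[[2]]))
    shared / (length(p[[1]]) + length(p[[2]]) - shared)
  })
  # Size is the number of sites; Labels carries the site names if there are any
  structure(sim, Size = length(sites), Labels = names(sites),
            Diag = FALSE, Upper = FALSE, class = "dist")
}

full <- as.matrix(jaccard_pairs(dummy_list))

# as.matrix() fills the diagonal with 0; each site is identical to itself
diag(full) <- 1
full
```

This works because combn() emits pairs in the same order in which a dist object stores its lower triangle (column by column).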
Solution 3:[3]
We can use the following base R approach (without using the jaccard function, but following the same definition):
> dummy_list <- list(x = x, y = y, z = z)
> 1 / (outer(lengths(dummy_list), lengths(dummy_list), `+`) / crossprod(table(stack(dummy_list))) - 1)
    x   y   z
x 1.0 0.5 0.2
y 0.5 1.0 0.5
z 0.2 0.5 1.0
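The one-liner packs several steps together, so it may help to see them unrolled. This is a sketch of the same idea with intermediate names (incidence, shared, sizes, total, jac) that are purely illustrative; it assumes the list is named and that no species is listed twice within a site, since duplicates would inflate the counts.

```r
x <- c("sp1", "sp2", "sp3")
y <- c("sp2", "sp3", "sp4")
z <- c("sp3", "sp4", "sp5")
dummy_list <- list(x = x, y = y, z = z)

# Species-by-site incidence table: 1 where a species occurs at a site
incidence <- table(stack(dummy_list))

# t(incidence) %*% incidence: entry [i, j] counts species shared by sites i and j
shared <- crossprod(incidence)

# length(a) + length(b) for every pair of sites
sizes <- lengths(dummy_list)
total <- outer(sizes, sizes, `+`)

# Jaccard index: |intersection| / (|a| + |b| - |intersection|)
jac <- shared / (total - shared)
jac
```

Dividing by `total - shared` is algebraically the same as the `1 / (total / shared - 1)` form used above, and it avoids an intermediate division by zero when two sites share no species.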
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | TimTeaFan |
| Solution 2 | onyambu |
| Solution 3 | ThomasIsCoding |
