Build identity matrix from dataframe (sparsematrix) in R

I am trying to create an identity matrix from a dataframe. The dataframe is like so:


i <- c("South Korea", "South Korea", "France", "France", "France")
j <- c("Rwanda", "France", "Rwanda", "South Korea", "France")
distance <- c(10844.6822, 9384, 6003, 9384, 0)
dis_matrix <- data.frame(i, j, distance)

dis_matrix

              i               j      distance
1   South Korea     South Korea        0.0000
2   South Korea          Rwanda    10844.6822
3   South Korea          France     9384.1793
4        France          Rwanda     6003.3498
5        France     South Korea     9384.1793
6        France          France        0.0000

I am trying to create a matrix that will look like this:

                South Korea           France         Rwanda     
South Korea               0        9384.1793     10844.6822
France            9384.1793                0      6003.3498
Rwanda           10844.6822        6003.3498              0

I have tried using sparseMatrix from the Matrix package, as described here (Create sparse matrix from data frame). The issue is that i and j have to be integers, and I have character strings. I have not been able to find another function that does what I am looking for. I would appreciate any help. Thank you.



Solution 1:[1]

A possible solution:

tidyr::pivot_wider(dis_matrix, id_cols = i, names_from = j,
         values_from = distance, values_fill = 0)

#> # A tibble: 2 × 4
#>   i           Rwanda France `South Korea`
#>   <chr>        <dbl>  <dbl>         <dbl>
#> 1 South Korea 10845.   9384             0
#> 2 France       6003       0          9384
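
Note that the result above has one row per distinct value of i, so it is not square (there is no Rwanda row). If a square, symmetric matrix is needed, one base R sketch is to preallocate a zero matrix over all labels and fill it by name in both directions. This is an alternative to the pivot_wider call, not part of the original answer. The names labels and m are illustrative, and the sketch assumes each distance applies in both directions.

```r
# Sample data from the question
i <- c("South Korea", "South Korea", "France", "France", "France")
j <- c("Rwanda", "France", "Rwanda", "South Korea", "France")
distance <- c(10844.6822, 9384, 6003, 9384, 0)
dis_matrix <- data.frame(i, j, distance)

# as.character() guards against the columns being factors (older R defaults)
from <- as.character(dis_matrix$i)
to   <- as.character(dis_matrix$j)

# Every label that occurs in either column
labels <- unique(c(from, to))

# Square zero matrix with the labels as row and column names
m <- matrix(0, nrow = length(labels), ncol = length(labels),
            dimnames = list(labels, labels))

# A two-column character matrix indexes cells by (row name, column name)
m[cbind(from, to)] <- dis_matrix$distance
m[cbind(to, from)] <- dis_matrix$distance  # mirror: assumes symmetric distances

m
```

Pairs that never occur in the data stay at 0, which matches values_fill = 0 in the call above.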

Solution 2:[2]

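You can use igraph::get.adjacency to create the desired matrix; pass sparse = TRUE to get a sparse matrix instead of a dense one. (Recent igraph versions also expose these functions as graph_from_data_frame and as_adjacency_matrix.)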

library(igraph)

g <- graph.data.frame(dis_matrix, directed = FALSE)
get.adjacency(g, attr="distance", sparse = FALSE)

            South Korea France   Rwanda
South Korea        0.00   9384 10844.68
France          9384.00      0  6003.00
Rwanda         10844.68   6003     0.00

Solution 3:[3]

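We can convert the first two columns to factors whose levels are the unique values from both columns, and then use xtabs from base R. Sharing the levels is what makes the result square: Rwanda gets a row even though it never appears in i, so that row is all zeros.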

un1 <- unique(unlist(dis_matrix[1:2]))
dis_matrix[1:2] <- lapply(dis_matrix[1:2], factor, levels = un1)
xtabs(distance ~ i + j, dis_matrix)

Output:

           j
i             South Korea   France   Rwanda
  South Korea        0.00  9384.00 10844.68
  France          9384.00     0.00  6003.00
  Rwanda             0.00     0.00     0.00
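
The same shared-labels idea also addresses the sparseMatrix problem from the question: match() turns the character labels into the integer indices that Matrix::sparseMatrix expects. The following is a minimal sketch, assuming the Matrix package is installed; labels, sp and full are illustrative names. The final step mirrors the entries with pmax(), which assumes the distances are symmetric and non-negative.

```r
library(Matrix)

# Sample data from the question
i <- c("South Korea", "South Korea", "France", "France", "France")
j <- c("Rwanda", "France", "Rwanda", "South Korea", "France")
distance <- c(10844.6822, 9384, 6003, 9384, 0)
dis_matrix <- data.frame(i, j, distance)

# One common label set for rows and columns
labels <- unique(c(as.character(dis_matrix$i), as.character(dis_matrix$j)))

# match() maps each label to its integer position in `labels`
sp <- sparseMatrix(
  i = match(dis_matrix$i, labels),
  j = match(dis_matrix$j, labels),
  x = dis_matrix$distance,
  dims = c(length(labels), length(labels)),
  dimnames = list(labels, labels)
)

# Dense copy; the Rwanda row is empty because Rwanda never appears in i
m <- as.matrix(sp)

# Mirror across the diagonal by taking the larger of m[a, b] and m[b, a]
full <- pmax(m, t(m))
full
```

pmax() is used rather than m + t(m) because pairs listed in both directions, such as South Korea and France, would otherwise be counted twice.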

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 PaulS
Solution 2
Solution 3