Pairwise similarity with consecutive points

I have a large document-similarity matrix created with paragraph2vec_similarity from the doc2vec package. I converted it to a data frame and added a Title column at the beginning so I can sort or group it later.

Current Dummy Output:

Title	Header	DocName_1900.txt_1	DocName_1900.txt_2	DocName_1900.txt_3	DocName_1901.txt_1	DocName_1901.txt_2
Doc1	DocName_1900.txt_1	1.000000	0.7369358	0.6418045	0.6268959	0.6823404
Doc1	DocName_1900.txt_2	0.7369358	1.000000	0.6544884	0.7418507	0.5174367
Doc1	DocName_1900.txt_3	0.6418045	0.6544884	1.000000	0.6180578	0.5274650
Doc2	DocName_1901.txt_1	0.6268959	0.7418507	0.6180578	1.000000	0.5755243
Doc2	DocName_1901.txt_2	0.6823404	0.5174367	0.5274650	0.5755243	1.000000
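
To make the dummy table reproducible, here is a minimal sketch that rebuilds it in base R. The object names m (the similarity matrix) and df (the data frame with Title and Header columns) are assumptions chosen to match the code used elsewhere in this thread; the values are copied from the table above.

```r
# Paragraph labels, copied from the Header column of the dummy table
hdr <- c("DocName_1900.txt_1", "DocName_1900.txt_2", "DocName_1900.txt_3",
         "DocName_1901.txt_1", "DocName_1901.txt_2")

# Symmetric similarity matrix with the dummy values from the table
m <- matrix(c(1.0000000, 0.7369358, 0.6418045, 0.6268959, 0.6823404,
              0.7369358, 1.0000000, 0.6544884, 0.7418507, 0.5174367,
              0.6418045, 0.6544884, 1.0000000, 0.6180578, 0.5274650,
              0.6268959, 0.7418507, 0.6180578, 1.0000000, 0.5755243,
              0.6823404, 0.5174367, 0.5274650, 0.5755243, 1.0000000),
            nrow = 5, byrow = TRUE, dimnames = list(hdr, hdr))

# Same layout as the table: Title, Header, then one column per paragraph
df <- data.frame(Title = c("Doc1", "Doc1", "Doc1", "Doc2", "Doc2"),
                 Header = hdr, m,
                 check.names = FALSE, stringsAsFactors = FALSE)
rownames(df) <- NULL

str(df)
```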

What I want is a data frame giving the similarity between consecutive paragraphs within each document: that is, the score for Doc1.1 and Doc1.2, then Doc1.2 and Doc1.3, and so on. I am only interested in similarity scores inside each individual document, i.e. the entries just off the main diagonal within each document's block of the matrix.

Expected Output

Title	Similarity for 1-2	Similarity for 2-3	Similarity for 3-4
Doc1	0.7369358	0.6544884	NA
Doc2	0.5755243	NA	NA
Doc3	0.6049844	0.5250659	0.5113757

I was able to produce a long table giving the similarity of each paragraph with every other paragraph, using x <- data.frame(col = colnames(m)[col(m)], row = rownames(m)[row(m)], similarity = c(m)). This is the closest I could get. Is there a better way? I am dealing with more than 500 titles of varying lengths. There is still the option of using diag, but it runs straight through to the end of the matrix and I lose the document grouping.

Solution 1:^[1]

Another solution:

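library(dplyr); library(tidyr); library(stringr)

df %>%
  group_by(Title) %>%
  # embed() pairs each Header with the one before it (dplyr >= 1.1.0 prefers reframe() here)
  summarize(name = embed(Header, 2), .groups = 'drop') %>%
  # look up each (row, column) pair, then keep only the trailing paragraph numbers
  mutate(value = transform(df, row.names = Header)[name],
         name = paste(str_extract(name[,2], '[^_]+$'),
                      str_extract(name[,1], '[^_]+$'), sep = '_')) %>%
  pivot_wider()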

# A tibble: 2 x 3
  Title `1_2`     `2_3`    
  <chr> <chr>     <chr>    
1 Doc1  0.7369358 0.6544884
2 Doc2  0.5755243 NA
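
For comparison, a base R sketch of the same idea that keeps the scores numeric rather than character. It assumes the rows of the matrix are ordered so that the paragraphs of each document are adjacent and in sequence; consecutive_similarity() is a hypothetical helper written for this illustration, not part of doc2vec or dplyr, and the toy matrix simply repeats the dummy values from the question.

```r
hdr <- c("DocName_1900.txt_1", "DocName_1900.txt_2", "DocName_1900.txt_3",
         "DocName_1901.txt_1", "DocName_1901.txt_2")
m <- matrix(c(1.0000000, 0.7369358, 0.6418045, 0.6268959, 0.6823404,
              0.7369358, 1.0000000, 0.6544884, 0.7418507, 0.5174367,
              0.6418045, 0.6544884, 1.0000000, 0.6180578, 0.5274650,
              0.6268959, 0.7418507, 0.6180578, 1.0000000, 0.5755243,
              0.6823404, 0.5174367, 0.5274650, 0.5755243, 1.0000000),
            nrow = 5, byrow = TRUE, dimnames = list(hdr, hdr))
title <- c("Doc1", "Doc1", "Doc1", "Doc2", "Doc2")  # one label per row of m

# Hypothetical helper: similarity of each paragraph with the next one,
# restricted to pairs that belong to the same document.
consecutive_similarity <- function(m, title) {
  i <- seq_len(nrow(m) - 1)
  # Entries directly above the main diagonal: m[1, 2], m[2, 3], ...
  sim <- m[cbind(i, i + 1)]
  # Drop pairs that straddle two different documents
  same <- title[i] == title[i + 1]
  pairs <- split(sim[same], factor(title[i][same], levels = unique(title)))
  # Indexing past the end of a vector yields NA, which pads shorter documents
  width <- max(lengths(pairs))
  out <- do.call(rbind, lapply(pairs, function(x) x[seq_len(width)]))
  colnames(out) <- paste(seq_len(width), seq_len(width) + 1, sep = "_")
  res <- data.frame(Title = rownames(out), out,
                    check.names = FALSE, stringsAsFactors = FALSE)
  rownames(res) <- NULL
  res
}

res <- consecutive_similarity(m, title)
print(res)
```

Because only the n - 1 entries above the diagonal are read, this avoids reshaping the full matrix, which may matter with several hundred titles. A document with a single paragraph ends up as a row of NA here, whereas embed(Header, 2) stops with an error when a group has fewer than two rows.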

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	onyambu
