Create clusters based on pairwise columns (iteratively)
I have a long table of pairs in two columns, and I would like to merge these pairs into clusters, so that items linked directly or through a chain of other pairs end up in the same cluster. For example, I would like:
input:
A B
A C
A D
A E
A F
B G
Y Z
output:
Cluster1 A,B,C,D,E,F,G
Cluster2 Y,Z
The part that is difficult for me is the iteration. For example, I can use datamash (datamash -W -g1 collapse 2) to get this far:
A B,C,D,E,F
B G
Y Z
However, I also want G to be grouped with A (through B), and I want the merging to repeat until the clusters stop changing. Any help is appreciated. Thanks!
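One possible way to get the saturation described above is to treat each pair as an edge in a graph and collect its connected components. The sketch below is only an illustration, not a datamash solution: it swaps in a union-find (disjoint-set) structure in Python, and the names `cluster_pairs` and `format_clusters` are hypothetical helpers made up for this example. It assumes each row holds exactly two whitespace-separated items.

```python
from collections import defaultdict


def cluster_pairs(pairs):
    """Group items into connected components with union-find.

    `pairs` is an iterable of (left, right) tuples, one per input row.
    Returns a sorted list of clusters, each a sorted list of items.
    """
    parent = {}

    def find(x):
        # Register unseen items as their own root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(list)
    for item in parent:
        groups[find(item)].append(item)
    return sorted(sorted(members) for members in groups.values())


def format_clusters(clusters):
    """Render clusters as 'ClusterN item,item,...' lines."""
    return [
        f"Cluster{i} {','.join(members)}"
        for i, members in enumerate(clusters, start=1)
    ]


# Sample rows taken from the question.
sample = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
          ("A", "F"), ("B", "G"), ("Y", "Z")]

for line in format_clusters(cluster_pairs(sample)):
    print(line)
# Cluster1 A,B,C,D,E,F,G
# Cluster2 Y,Z
```

Because every merge joins two whole clusters, one pass over the rows is enough and no explicit "repeat until nothing changes" loop is needed. To run it on a real file, the pairs could be built by splitting each line on whitespace, for example `[tuple(line.split()) for line in open(path)]`, where `path` is a placeholder for the actual file name.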
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow