How can I quickly identify indirect relations in a dependency matrix?
|A |B |C |
A|Nan|x |x |
B|x |Nan|Nan|
C|x |Nan|Nan|
I have this example from a CSV file, and with pandas I managed to replace the x and NaN values with 1 and 0:
|A|B|C|
A|0|1|1|
B|1|0|0|
C|1|0|0|
My aim is to find and add the indirect relations. For example, if A has a dependency on both B and C, then the cells linking B and C should also be set to 1. My table has more than 400 elements, so I cannot pick each element by column name by hand; I therefore plan to use for loops to record the coordinates of the 1 values and then derive the indirect relations. For example: if (1,2) and (1,3) have the value 1, then (2,3) and (3,2) should also have the value 1. My result should look like this table:
|A|B|C|
A|0|1|1|
B|1|0|1|
C|1|1|0|
Does anyone have an idea for an easier way, or has anyone seen something similar? The difficult part for me is setting the new 1 values in the table; I am not sure how that can be done.
Solution 1:[1]
What you have here is a graph problem.
Starting from this input (I added 2 more nodes D/E):
import pandas as pd
df = pd.DataFrame([[0,1,1,0,0],[1,0,0,0,0],[1,0,0,0,0],[0,0,0,0,1],[0,0,0,1,0]],
                  columns=list('ABCDE'), index=list('ABCDE'))
A B C D E
A 0 1 1 0 0
B 1 0 0 0 0
C 1 0 0 0 0
D 0 0 0 0 1
E 0 0 0 1 0
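The graph approach used in this answer treats relations as undirected, which only makes sense if the matrix is symmetric and its rows and columns carry the same labels. A minimal sanity check, as a sketch on the example data (the names `same_labels` and `is_symmetric` are illustrative, not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame([[0,1,1,0,0],[1,0,0,0,0],[1,0,0,0,0],[0,0,0,0,1],[0,0,0,1,0]],
                  columns=list('ABCDE'), index=list('ABCDE'))

# Rows and columns must describe the same elements, in the same order.
same_labels = df.index.equals(df.columns)

# An undirected dependency matrix equals its own transpose.
values = df.to_numpy()
is_symmetric = bool((values == values.T).all())
```

If either check fails on the real CSV data, the relations are directional and a plain undirected graph may not describe them correctly.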
You have a graph with the edges A-B, A-C and D-E, and want to find all edges between nodes that belong to the same connected group (here this adds B-C):
For this you can start by constructing a list of edges:
df2 = df.where(df.eq(1)).stack().rename_axis(['source', 'target']).reset_index()
source target 0
0 A B 1.0
1 A C 1.0
2 B A 1.0
3 C A 1.0
4 D E 1.0
5 E D 1.0
Then compute a graph with networkx and get the connected components (i.e. the disconnected subgroups):
import networkx as nx
G = nx.from_pandas_edgelist(df2)
groups = nx.connected_components(G)
# NB. the above is a generator which gives
# [{'A', 'B', 'C'}, {'D', 'E'}]
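To make the idea of connected components concrete, here is a minimal pure-Python sketch that finds the same groups with a breadth-first search. It is not how networkx is implemented; the function name `connected_groups` and the adjacency dictionary are assumptions made for illustration only.

```python
from collections import deque

def connected_groups(adjacency):
    """Return the connected components of an undirected graph.

    `adjacency` maps each node to the set of its direct neighbours.
    """
    seen = set()
    groups = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        group = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in group:
                    group.add(neighbour)
                    queue.append(neighbour)
        seen |= group
        groups.append(group)
    return groups

# Same relations as the example matrix: A-B, A-C and D-E.
example = {
    'A': {'B', 'C'},
    'B': {'A'},
    'C': {'A'},
    'D': {'E'},
    'E': {'D'},
}
result = connected_groups(example)
```

Every node in a group is directly or indirectly related to every other node in that group, which is exactly the set of pairs that should end up with the value 1.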
Finally, generate all ordered node pairs within each component with itertools.permutations and create the desired output:
from itertools import permutations, chain
idx = pd.MultiIndex.from_tuples(chain.from_iterable(permutations(group, 2)
                                for group in nx.connected_components(G)))
# every listed pair gets 1; all remaining cells are filled with 0 on unstack
out = pd.Series(1, index=idx).unstack(fill_value=0)
A B C D E
A 0 1 1 0 0
B 1 0 1 0 0
C 1 1 0 0 0
D 0 0 0 0 1
E 0 0 0 1 0
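For a larger table, the steps above could be wrapped in one function. The sketch below is a variation rather than the answer's exact code: it assumes a square, symmetric 0/1 DataFrame with identical row and column labels, builds the graph with `nx.from_pandas_adjacency` instead of an edge list, and fills each group with NumPy indexing. The function name `add_indirect_relations` is hypothetical.

```python
import networkx as nx
import numpy as np
import pandas as pd

def add_indirect_relations(dep):
    """Hypothetical helper: set 1 for every pair of nodes in the same connected group."""
    # Assumption: `dep` is square, symmetric, and indexed by the same labels on both axes.
    G = nx.from_pandas_adjacency(dep)
    result = np.zeros(dep.shape, dtype=int)
    for group in nx.connected_components(G):
        # Positions of the group's members in the original row/column order.
        pos = dep.index.get_indexer(list(group))
        # Mark the whole block of pairs inside the group.
        result[np.ix_(pos, pos)] = 1
    # An element has no relation to itself.
    np.fill_diagonal(result, 0)
    return pd.DataFrame(result, index=dep.index, columns=dep.columns)

dep = pd.DataFrame([[0,1,1,0,0],[1,0,0,0,0],[1,0,0,0,0],[0,0,0,0,1],[0,0,0,1,0]],
                   columns=list('ABCDE'), index=list('ABCDE'))
closed = add_indirect_relations(dep)
```

One design difference: because the result is allocated with the original labels, an element without any relation keeps its all-zero row and column, whereas in the edge-list version such an element never enters the graph and is missing from the output.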
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |


