Adding row names as keys and column names as values from a pandas DataFrame

I have a couple of DataFrames of varying sizes (different numbers of rows and columns), such as this:

df = 
        c1    c2    c3    c4
r1      8      4     5     2
r2      2      5     9     2

I need to loop through each column, find the row with the highest value, and add that row name as a key to a dictionary with the column name as its value. So for c1 I would get r1 as the key and c1 as the value, and for c2 I would get r2 as the key and c2 as the value. Then for c3 I would again get r2 as the key, which now has [c2, c3] as its value. Furthermore, for c4, where both rows tie for the highest value, every key needs c4 added to its list of values.

The dataframe I'm working on has many more columns and many more rows.

The final dictionary should look like this:

dict = {"r1": ["c1","c4"], "r2": ["c2","c3","c4"]}  

This seemed at first not that hard, but I am quite stuck.

The hardest part is keeping the list of values separate for each key.

Any help would be greatly appreciated!



Solution 1:[1]

IIUC, you could compare each value with the max per column, then keep only the max values and reshape to dictionary:

s = df.eq(df.max()).stack()
s[s].reset_index(level=1).groupby(level=0)['level_1'].agg(list).to_dict()

output: {'r1': ['c1', 'c4'], 'r2': ['c2', 'c3', 'c4']}
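
To make the chain above easier to follow, here is a self-contained sketch that rebuilds the example frame from the question and names each intermediate step. The names `is_max`, `hits` and `result` are mine, not from the answer, and it assumes the frame's index and columns are unnamed (otherwise the `level_1` column would carry the columns' name instead).

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame(
    {"c1": [8, 2], "c2": [4, 5], "c3": [5, 9], "c4": [2, 2]},
    index=["r1", "r2"],
)

# True wherever a cell equals its column's maximum (tied cells are all True).
is_max = df.eq(df.max())

# Long format: a boolean Series indexed by (row label, column label).
s = is_max.stack()

# Keep only the True entries.
hits = s[s]

# Move the column labels into a regular column named 'level_1',
# then collect them into one list per row label.
result = (
    hits.reset_index(level=1)
        .groupby(level=0)["level_1"]
        .agg(list)
        .to_dict()
)
print(result)
```

Because the comparison is against the column maximum rather than `idxmax`, a tie such as c4 ends up under every tied row.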

Alternative approach taking advantage of the fact that groupby on an Index returns a dict:

s = df.eq(df.max()).stack()
s = s[s].reset_index(level=0)['level_0']
out = s.index.groupby(s)
# dict keyed by row label: 'r1' -> ['c1', 'c4'], 'r2' -> ['c2', 'c3', 'c4']
# (the values are pandas Index objects rather than plain lists)
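
One thing worth checking on your pandas version: as far as I know, `Index.groupby` returns a dict whose values are `Index` objects, not plain Python lists. If you need real lists, a small conversion step like the sketch below should do it (`as_lists` is a name I made up for illustration).

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame(
    {"c1": [8, 2], "c2": [4, 5], "c3": [5, 9], "c4": [2, 2]},
    index=["r1", "r2"],
)

s = df.eq(df.max()).stack()
# Series indexed by column label whose values are the matching row labels.
s = s[s].reset_index(level=0)["level_0"]

# Group the column labels by row label; each value is a pandas Index.
out = s.index.groupby(s)

# Convert every Index to a plain list.
as_lists = {row: list(cols) for row, cols in out.items()}
print(as_lists)
```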

Solution 2:[2]

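Let us try with rank: rank each column in descending order, then for each row collect the columns where that row ranks first.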

d = df.rank(ascending=False, method='dense').apply(lambda x: x.index[x == 1].tolist(), axis=1).to_dict()
# {'r1': ['c1', 'c4'], 'r2': ['c2', 'c3', 'c4']}
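
For comparison, since the question describes the task as a loop, here is a plain-loop sketch of the same logic. It is not taken from either answer; `winners` and `col_max` are illustrative names. Using `defaultdict(list)` gives every key its own fresh list, which sidesteps the "lists not staying separate" problem that can appear when one list object is shared between keys (for example via `dict.fromkeys(keys, [])`). Whether that was the cause in the original attempt is only a guess.

```python
from collections import defaultdict

import pandas as pd

# Example frame from the question.
df = pd.DataFrame(
    {"c1": [8, 2], "c2": [4, 5], "c3": [5, 9], "c4": [2, 2]},
    index=["r1", "r2"],
)

# Each new key gets its own empty list on first access.
winners = defaultdict(list)

for col in df.columns:
    col_max = df[col].max()
    # All row labels that reach the column maximum (more than one on a tie).
    for row in df.index[df[col] == col_max]:
        winners[row].append(col)

winners = dict(winners)  # back to a plain dict
print(winners)
```

This will be slower than the vectorised versions on wide frames, but it is easy to step through when debugging.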

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 BENY