'idxmax and .loc with multiindex dataframes

I am trying to filter a multiindex dataframe based on the maximum values of another multiindex dataframe. Here are the two dataframes:

df1

               Month1 Month2 Month3 .... 
Long Lat
0.0. 90.       20.     2.     12.0
     89.75.    15.     3.     14.0

df2

               Month1 Month2 Month3 .... 
Long Lat
0.0. 90.       14.8     13.8     14.5
     89.75.    12.3     12.2     11.2

max_vals = df1.idxmax(axis=1) gives the following:

Long Lat
0.0. 90.       Month1
     89.75.    Month1

And then combined_dfs = df2.loc[df1]

gives the following error:

KeyError: ['Month1' 'Month1'] not in index

I would like the output to be:

               Month
Long Lat
0.0. 90.       14.8 
     89.75.    12.3   

What am I doing wrong?



Solution 1:[1]

One option is to reindex df2 with the indices and column names obtained from idxmax; then use numpy.diag to get the max values in each location in df1:

import numpy as np
max_pos = df1.idxmax(axis=1)
out = np.diag(df2.reindex(columns=max_pos, index=max_pos.index))

Output:

array([14.8, 12.3])

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1