'Pandas: drop a level from a multi-level column index?

If I've got a multi-level column index:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> pd.DataFrame([[1,2], [3,4]], columns=cols)
    a
   ---+--
    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4

How can I drop the "a" level of that index, so I end up with:

    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4


Solution 1:[1]

You can use MultiIndex.droplevel:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> df = pd.DataFrame([[1,2], [3,4]], columns=cols)
>>> df
   a   
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
>>> df.columns = df.columns.droplevel()
>>> df
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]

Solution 2:[2]

As of Pandas 0.24.0, we can now use DataFrame.droplevel():

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)

df.droplevel(0, axis=1) 

#   b  c
#0  1  2
#1  3  4

This is very useful if you want to keep your DataFrame method-chain rolling.

Solution 3:[3]

Another way to drop the index is to use a list comprehension:

df.columns = [col[1] for col in df.columns]

   b  c
0  1  2
1  3  4

This strategy is also useful if you want to combine the names from both levels like in the example below where the bottom level contains two 'y's:

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1,2, 8 ], [3,4, 9]], columns=cols)

   A     B
   x  y  y
0  1  2  8
1  3  4  9

Dropping the top level would leave two columns with the index 'y'. That can be avoided by joining the names with the list comprehension.

df.columns = ['_'.join(col) for col in df.columns]

    A_x A_y B_y
0   1   2   8
1   3   4   9

That's a problem I had after doing a groupby and it took a while to find this other question that solved it. I adapted that solution to the specific case here.

Solution 4:[4]

Another way to do this is to reassign df based on a cross section of df, using the .xs method.

>>> df

    a
    b   c
0   1   2
1   3   4

>>> df = df.xs('a', axis=1, drop_level=True)

    # 'a' : key on which to get cross section
    # axis=1 : get cross section of column
    # drop_level=True : returns cross section without the multilevel index

>>> df

    b   c
0   1   2
1   3   4

Solution 5:[5]

You could also achieve that by renaming the columns:

df.columns = ['a', 'b']

This involves a manual step but could be an option especially if you would eventually rename your data frame.

Solution 6:[6]

A small trick using sum with level=1(work when level=1 is all unique)

df.sum(level=1,axis=1)
Out[202]: 
   b  c
0  1  2
1  3  4

More common solution get_level_values

df.columns=df.columns.get_level_values(1)
df
Out[206]: 
   b  c
0  1  2
1  3  4

Solution 7:[7]

I have struggled with this problem since I don’t know why my droplevel() function does not work. Work through several and learn that ‘a’ in your table is columns name and ‘b’, ‘c’ are index. Do like this will help

df.columns.name = None
df.reset_index() #make index become label

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ASGM
Solution 2
Solution 3
Solution 4 spacetyper
Solution 5 sedeh
Solution 6 BENY
Solution 7 dhFrank