How to extend multilevel columns in Pandas

Given a df with two-level columns:

      a                                                
  E1_g1 E1_g2 E1_g3 E2_g1 E2_g2 E2_g3 E3_g1 E3_g2 E3_g3
0     4     0     3     3     3     1     3     2     4
1     0     0     4     2     1     0     1     1     0

and a list of tuples:

[('a', 'E1', 'g1'), ('a', 'E1', 'g2'), ('a', 'E1', 'g3'), ('a', 'E2', 'g1'), ('a', 'E2', 'g2'), ('a', 'E2', 'g3'), ('a', 'E3', 'g1'), ('a', 'E3', 'g2'), ('a', 'E3', 'g3')]

The list of tuples is generated by the for-loop shown in the code below.

I would like to expand the columns into three levels using the given list of tuples:

   a                        
  E1       E2       E3      
  g1 g2 g3 g1 g2 g3 g1 g2 g3
0  4  0  3  3  3  1  3  2  4
1  0  0  4  2  1  0  1  1  0

I have the impression this can be achieved simply via

df.colums=pd.MultiIndex.from_tuples(ntuple)

However, applying the above leaves the frame unchanged:

      a                                                
  E1_g1 E1_g2 E1_g3 E2_g1 E2_g2 E2_g3 E3_g1 E3_g2 E3_g3
0     4     0     3     3     3     1     3     2     4
1     0     0     4     2     1     0     1     1     0

What am I missing here?

The full code to reproduce the above is below:

import numpy as np
import pandas as pd
np.random.seed(0)
arr = np.random.randint(5, size=(2, 9))

_names = ['a','a','a','a','a','a','a','a','a']
_idx = ['E1_g1','E1_g2','E1_g3',
        'E2_g1','E2_g2','E2_g3',
        'E3_g1','E3_g2','E3_g3']
columns = pd.MultiIndex.from_arrays([_names, _idx])

df= pd.DataFrame(data=arr, columns=columns)

ntuple=[]
for dg in df.columns:
    A,B=dg
    f,r=B.split('_')
    ntuple.append((A,f,r))


df.colums=pd.MultiIndex.from_tuples(ntuple)
print(df)


Solution 1:[1]

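Your approach is correct; the problem is the typo df.colums (missing "n"). Assigning to df.colums only sets a new attribute on the DataFrame object and leaves df.columns untouched, so the printed frame does not change. If you already have the list of tuples, assign the result of pd.MultiIndex.from_tuples to df.columns: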

tuples = [('a', 'E1', 'g1'), ('a', 'E1', 'g2'), ('a', 'E1', 'g3'), ('a', 'E2', 'g1'), ('a', 'E2', 'g2'), ('a', 'E2', 'g3'), ('a', 'E3', 'g1'), ('a', 'E3', 'g2'), ('a', 'E3', 'g3')]
df.columns = pd.MultiIndex.from_tuples(tuples)

Output:

   a                        
  E1       E2       E3      
  g1 g2 g3 g1 g2 g3 g1 g2 g3
0  4  0  3  3  3  1  3  2  4
1  0  0  4  2  1  0  1  1  0

Full code:

import numpy as np
import pandas as pd
np.random.seed(0)
arr = np.random.randint(5, size=(2, 9))
tuples = [('a', 'E1', 'g1'), ('a', 'E1', 'g2'), ('a', 'E1', 'g3'), ('a', 'E2', 'g1'), ('a', 'E2', 'g2'), ('a', 'E2', 'g3'), ('a', 'E3', 'g1'), ('a', 'E3', 'g2'), ('a', 'E3', 'g3')]
df = pd.DataFrame(data=arr)
df.columns = pd.MultiIndex.from_tuples(tuples)
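
To see why the misspelled assignment has no effect, here is a minimal sketch that compares the number of column levels after each assignment. The variable names levels_after_typo and levels_after_fix are illustrative only, and the warning filter is there because pandas may emit a UserWarning when a list-like value is assigned to an unknown attribute.

```python
import warnings

import numpy as np
import pandas as pd

np.random.seed(0)
arr = np.random.randint(5, size=(2, 9))

# Rebuild the two-level columns from the question: ('a', 'E1_g1'), ...
second = [f"E{i}_g{j}" for i in range(1, 4) for j in range(1, 4)]
df = pd.DataFrame(arr, columns=pd.MultiIndex.from_arrays([["a"] * 9, second]))

# Same tuples as the question's for-loop produces
tuples = [(top, *rest.split("_")) for top, rest in df.columns]

# Misspelled name: stored as an ordinary attribute, columns stay as they were
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about this assignment
    df.colums = pd.MultiIndex.from_tuples(tuples)
levels_after_typo = df.columns.nlevels

# Correct name: the column index is actually replaced
df.columns = pd.MultiIndex.from_tuples(tuples)
levels_after_fix = df.columns.nlevels

print(levels_after_typo, levels_after_fix)  # 2 3
```

Checking df.columns.nlevels right after the assignment is a quick way to catch this kind of silent typo.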

To build the three-level columns directly from _names and _idx:

df.columns = pd.MultiIndex.from_tuples([(name, *idx.split('_')) for name, idx in zip(_names, _idx)])
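
If the frame already carries the two-level columns, the same split can be derived from df.columns itself rather than from the original lists. The sketch below uses a hypothetical helper, split_last_level, which is not part of pandas; it assumes every label in the last level is a string containing the same number of separators.

```python
import pandas as pd


def split_last_level(columns, sep="_"):
    """Hypothetical helper: split the last level of a MultiIndex on `sep`.

    Assumes each last-level label is a string with the same number of
    separators, so that all resulting tuples have equal length.
    """
    tuples = [(*col[:-1], *col[-1].split(sep)) for col in columns]
    return pd.MultiIndex.from_tuples(tuples)


# Two-level columns shaped like the ones in the question
second = [f"E{i}_g{j}" for i in range(1, 4) for j in range(1, 4)]
two_level = pd.MultiIndex.from_arrays([["a"] * 9, second])

three_level = split_last_level(two_level)
print(three_level.nlevels)  # 3
```

With the question's frame this would be used as df.columns = split_last_level(df.columns).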

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1