'two header rows to one header row pandas

Given a dataframe that looks like this:

            min     max      mean       std       wav
symbol
1000038  0.0110  0.0022  0.003296 -0.002611  0.005514
1000043  0.0090  0.0102  0.004124 -0.000507  0.004847
1000055  0.0100  0.0030  0.001358 -0.001719  0.004363
1000059  0.0050  0.0017  0.003372 -0.001089  0.008595
1000060  0.0012  0.0062  0.002367  0.000407  0.006935

The rows headers are on two different rows, but are not multi-index as far as i am aware, so when i try to columns.droplevel(0) this fails with the error: Cannot remove 1 levels from an index with 1 levels: at least one level must be left.

The same csv file looks flat, like the below (but wish to avoid writing and reading back a csv) to correct the format.

symbol      min     max      mean       std       wav
1000038  0.0110  0.0022  0.003296 -0.002611  0.005514
1000043  0.0090  0.0102  0.004124 -0.000507  0.004847
1000055  0.0100  0.0030  0.001358 -0.001719  0.004363
1000059  0.0050  0.0017  0.003372 -0.001089  0.008595
1000060  0.0012  0.0062  0.002367  0.000407  0.006935

so my question is how to turn the above into a tidy (square) shape like the above with just one header row ?



Solution 1:[1]

You can convert index with level symbol to column by DataFrame.reset_index:

print (df.index.name)
symbol

print (df.columns.name)
None

df1 = df.reset_index()
print (df1)
    symbol     min     max      mean       std       wav
0  1000038  0.0110  0.0022  0.003296 -0.002611  0.005514
1  1000043  0.0090  0.0102  0.004124 -0.000507  0.004847
2  1000055  0.0100  0.0030  0.001358 -0.001719  0.004363
3  1000059  0.0050  0.0017  0.003372 -0.001089  0.008595
4  1000060  0.0012  0.0062  0.002367  0.000407  0.006935

but then get default index, RangeIndex.

print (df1.index)
RangeIndex(start=0, stop=5, step=1)

Then if write to file is possible omit indices:

df1.to_csv(file, index=False)

If need only display like square by DataFrame.rename_axis - set index name to column name:

df2 = df.rename_axis(index=None, columns=df.index.name)
print (df2)
symbol      min     max      mean       std       wav
1000038  0.0110  0.0022  0.003296 -0.002611  0.005514
1000043  0.0090  0.0102  0.004124 -0.000507  0.004847
1000055  0.0100  0.0030  0.001358 -0.001719  0.004363
1000059  0.0050  0.0017  0.003372 -0.001089  0.008595
1000060  0.0012  0.0062  0.002367  0.000407  0.006935

but now first column is not column, it is index, only looks like column ;)

print (df2.index)
Int64Index([1000038, 1000043, 1000055, 1000059, 1000060], dtype='int64')

print (df2.columns)
Index(['min', 'max', 'mean', 'std', 'wav'], dtype='object', name='symbol')

print (df2.columns.name)
symbol

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1