'Convert row to column header for Pandas DataFrame,
The data I have to work with is a bit messy.. It has header names inside of its data. How can I choose a row from an existing pandas dataframe and make it (rename it to) a column header?
I want to do something like:
header = df[df['old_header_name1'] == 'new_header_name1']
df.columns = header
Solution 1:[1]
This works (pandas v'0.19.2'):
df.rename(columns=df.iloc[0])
Solution 2:[2]
It would be easier to recreate the data frame. This would also interpret the columns types from scratch.
headers = df.iloc[0]
new_df = pd.DataFrame(df.values[1:], columns=headers)
Solution 3:[3]
To rename the header without reassign df:
df.rename(columns=df.iloc[0], inplace = True)
To drop the row without reassign df:
df.drop(df.index[0], inplace = True)
Solution 4:[4]
You can specify the row index in the read_csv or read_html constructors via the header parameter which represents Row number(s) to use as the column names, and the start of the data. This has the advantage of automatically dropping all the preceding rows which supposedly are junk.
import pandas as pd
from io import StringIO
In[1]
csv = '''junk1, junk2, junk3, junk4, junk5
junk1, junk2, junk3, junk4, junk5
pears, apples, lemons, plums, other
40, 50, 61, 72, 85
'''
df = pd.read_csv(StringIO(csv), header=2)
print(df)
Out[1]
pears apples lemons plums other
0 40 50 61 72 85
Solution 5:[5]
Just for fun :
df.T.set_index(0).T
Transposes the dataframe, sets index with the first row and transposes again, so, no need to remove the fist row :)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Rob |
| Solution 2 | shahar_m |
| Solution 3 | Govinda |
| Solution 4 | |
| Solution 5 | Arnaud Desplanche |
