How to change the index of a Pandas DataFrame
In the last line of this code I want to set the index to 'Country', but when I look at the columns of the data frame it is still called 'index'. I have tried without inplace, assigning the result to a new df, and with the option drop=True, but that doesn't seem to work.
import pandas as pd
import numpy as np
Energy = pd.read_excel('./assets/Energy Indicators.xls', header=None, footer=None, usecols=range(2,6))
Energy = Energy[18:245].reset_index()
Energy.rename(columns={2 : 'Country', 3 :'Energy Supply', 4 : 'Energy Supply per Capita', 5 : '% Renewable'}, inplace=True)
Energy.replace('...', np.nan, inplace=True)
Energy.replace(["Republic of Korea", "United States of America", "United Kingdom of Great Britain and Northern Ireland", "China, Hong Kong Special Administrative Region"],["South Korea", "United States", "United Kingdom", "Hong Kong"], inplace = True)
Energy['Country'] = Energy['Country'].str.replace(r"\(.*\)", "", regex=True)
Energy['Country'] = Energy['Country'].str.replace(r'\d+', '', regex=True)
Energy['Energy Supply'] = Energy['Energy Supply'].apply(lambda x : x * 1000000)
Energy.set_index('Country', inplace=True)
print(Energy.index)
print(Energy.columns.values)
The output is:
Index(['Afghanistan', 'Albania', 'Algeria', 'American Samoa', 'Andorra',
'Angola', 'Anguilla', 'Antigua and Barbuda', 'Argentina', 'Armenia',
...
'United States Virgin Islands', 'Uruguay', 'Uzbekistan', 'Vanuatu',
'Venezuela ', 'Viet Nam', 'Wallis and Futuna Islands', 'Yemen',
'Zambia', 'Zimbabwe'],
dtype='object', name='Country', length=227)
['index' 'Energy Supply' 'Energy Supply per Capita' '% Renewable']
How do you set the index?
Solution 1:[1]
You have done it right!
When you did Energy.set_index('Country', inplace=True), it did work!
That's why printing Energy.index gave you the countries as the result, with name='Country'. Index is a class within Pandas, and it holds the row labels separately from the columns.
The next output, print(Energy.columns.values), shows an 'index' column because you called reset_index() earlier: without drop=True, reset_index() moves the old row labels into a new column named 'index'. Hope this helps!
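To see this behaviour in isolation, here is a minimal sketch. The small DataFrame is hypothetical toy data standing in for the Excel sheet (the numbers are made up), so only the shape of the result matters:

```python
import pandas as pd

# Hypothetical toy data standing in for the Excel sheet; values are illustrative only.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Algeria", "Andorra"],
    "Energy Supply": [321.0, 102.0, 1959.0, 9.0],
})

# Slicing keeps the original row labels (1, 2, 3).
subset = df[1:4]

# reset_index() without drop=True moves those labels into a new column named 'index'.
subset = subset.reset_index()
columns_after_reset = list(subset.columns)

# set_index() promotes 'Country' to the index, but 'index' stays a regular column.
subset = subset.set_index("Country")
print(subset.index.name)     # Country
print(list(subset.columns))  # ['index', 'Energy Supply']
```

So the index really is 'Country'; the 'index' entry is just an ordinary column holding the old row numbers.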
Solution 2:[2]
The 'index' you see in your columns is not your index; it is a column left over from when you did Energy = Energy[18:245].reset_index().
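If the leftover column is unwanted, two possible ways to avoid it are sketched below. This is an illustration on hypothetical toy data rather than the original spreadsheet: either pass drop=True to reset_index() so the old row labels are discarded, or drop the 'index' column afterwards.

```python
import pandas as pd

# Hypothetical toy data; the real data comes from the Excel file in the question.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Algeria", "Andorra"],
    "Energy Supply": [321.0, 102.0, 1959.0, 9.0],
})

# Option 1: discard the old row labels while resetting, so no 'index' column is created.
fixed = df[1:4].reset_index(drop=True).set_index("Country")

# Option 2: keep the original reset_index() call and remove the leftover column afterwards.
leftover = df[1:4].reset_index().set_index("Country")
cleaned = leftover.drop(columns="index")

print(list(fixed.columns))    # ['Energy Supply']
print(list(cleaned.columns))  # ['Energy Supply']
```

Both options end with 'Country' as the index and no 'index' column. Note that drop=True belongs on reset_index() here; on set_index() it controls something different (whether the column being promoted is removed from the columns).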
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | pnv |
| Solution 2 | BeRT2me |
