pandas how to swap or reorder columns
I know that there are ways to change the column order in pandas. Let's say I have this example dataset:
import pandas as pd
employee = {'EmployeeID' : [0,1,2],
'FirstName' : ['a','b','c'],
'LastName' : ['a','b','c'],
'MiddleName' : ['a','b', None],
'Contact' : ['(M) 133-245-3123', '(F)[email protected]', '(F)312-533-2442 [email protected]']}
df = pd.DataFrame(employee)
One basic way to do this would be:
neworder = ['EmployeeID','FirstName','MiddleName','LastName','Contact']
df=df.reindex(columns=neworder)
However, as you can see, I only want to swap two columns. Listing every name was doable only because there are just 5 columns, but what if I have 100 columns? What would be an effective way to swap or reorder columns?
There might be 2 cases:
- when you just want 2 columns swapped.
- when you want 3 columns reordered. (I am pretty sure that this case can be applied to more than 3 columns.)
Solution 1:[1]
Say the current order of your columns is [b,c,d,a] and you want to reorder them as [a,b,c,d]. You could do it this way:
new_df = old_df[['a', 'b', 'c', 'd']]
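For the two-column swap from the question, the same list-based selection can be built programmatically so that no other column has to be typed out. The following is only a sketch: `swap_columns` is a hypothetical helper name, not a pandas function, and it assumes the column labels are unique.

```python
import pandas as pd

def swap_columns(df, col_a, col_b):
    """Return a new DataFrame with col_a and col_b exchanged.

    Hypothetical helper; assumes column labels are unique.
    """
    cols = list(df.columns)
    i, j = cols.index(col_a), cols.index(col_b)
    cols[i], cols[j] = cols[j], cols[i]  # exchange the two labels in place
    return df[cols]                      # select columns in the new order

df = pd.DataFrame({'EmployeeID': [0, 1, 2],
                   'FirstName': ['a', 'b', 'c'],
                   'LastName': ['a', 'b', 'c'],
                   'MiddleName': ['a', 'b', None],
                   'Contact': ['x', 'y', 'z']})  # placeholder contact values

swapped = swap_columns(df, 'LastName', 'MiddleName')
print(list(swapped.columns))
```

All other columns keep their positions, so this works the same way for 5 or 100 columns.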
Solution 2:[2]
When faced with the same problem at a larger scale, I came across a very elegant solution at this link: http://www.datasciencemadesimple.com/re-arrange-or-re-order-the-column-of-dataframe-in-pandas-python-2/ under the heading "Rearrange the column of dataframe by column position in pandas python".
Basically if you have the column order as a list, you can read that in as the new column order.
##### Rearrange the column of dataframe by column position in pandas python
df2=df1[df1.columns[[3,2,1,0]]]
print(df2)
In my case, I had a pre-calculated column linkage that determined the new order I wanted. If this order is stored as an array of positions L, then:
a_L_order = a[a.columns[L]]
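As a small illustration of the position-array idea, the sketch below builds L with NumPy instead of typing it by hand. The frame `a` and its values are made up for the example; `np.roll` is used here only as one convenient way to produce a permutation that moves the last column to the front.

```python
import numpy as np
import pandas as pd

# Illustrative frame whose columns are in the order [b, c, d, a]
a = pd.DataFrame([[2, 3, 4, 1]], columns=['b', 'c', 'd', 'a'])

# Positions 0..n-1 shifted right by one: [3, 0, 1, 2]
L = np.roll(np.arange(a.shape[1]), 1)

# Index the column labels by position, then select in that order
a_L_order = a[a.columns[L]]
print(list(a_L_order.columns))
```

Any integer array that is a permutation of the column positions can be used as L in the same way.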
Solution 3:[3]
If you want to have a fixed list of columns at the beginning, you could do something like
cols = ['EmployeeID','FirstName','MiddleName','LastName']
df = df[cols + [c for c in df.columns if c not in cols]]
This will put these 4 columns first and leave the rest untouched (without any duplicate column).
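The same list comprehension scales to wide frames and can also be mirrored to push columns to the end. Here is a minimal sketch on a made-up 100-column frame; the names `wide`, `first`, and `last` are only for illustration.

```python
import pandas as pd

# Made-up frame with 100 columns named col0 .. col99
wide = pd.DataFrame([list(range(100))],
                    columns=[f'col{i}' for i in range(100)])

# Put a fixed list of columns first, keep the rest in their original order
first = ['col42', 'col7']
front = wide[first + [c for c in wide.columns if c not in first]]

# Mirror image: push a fixed list of columns to the end
last = ['col0']
back = front[[c for c in front.columns if c not in last] + last]

print(list(front.columns[:3]), back.columns[-1])
```

If `first` or `last` grows large, turning it into a set for the `not in` check keeps the comprehension fast, though for a few hundred columns the difference is negligible.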
Solution 4:[4]
When writing to a file
Columns can also be reordered when the dataframe is written out to a file (e.g. CSV):
df.to_csv('employees.csv',
columns=['EmployeeID','FirstName','MiddleName','LastName','Contact'])
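To check what the `columns` argument does without touching the disk, the sketch below writes to an in-memory buffer instead of a file. The small frame is made up for the example, and `index=False` is added only so that the header line contains nothing but the column names.

```python
import io
import pandas as pd

df = pd.DataFrame({'EmployeeID': [0, 1],
                   'FirstName': ['a', 'b'],
                   'LastName': ['c', 'd'],
                   'MiddleName': ['e', 'f']})

buf = io.StringIO()  # stands in for a file path
df.to_csv(buf,
          columns=['EmployeeID', 'FirstName', 'MiddleName', 'LastName'],
          index=False)

header = buf.getvalue().splitlines()[0]
print(header)            # column order as written out
print(list(df.columns))  # the DataFrame itself is not modified
```

Note that this only changes the order in the output; `df` keeps its original column order.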
Solution 5:[5]
A concise way to reorder columns when you don't have too many columns and don't want to list the column names is with .iloc[].
df_reordered = df.iloc[:, [0, 1, 3, 2, 4]]
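With many columns, the position list for .iloc[] can be derived from the names of the few columns being rearranged rather than typed out. The following is a rough sketch for the "reorder 3 columns" case, assuming unique column labels; `new_block` and `slots` are illustrative names, and the contact values are placeholders.

```python
import pandas as pd

df = pd.DataFrame({'EmployeeID': [0, 1, 2],
                   'FirstName': ['a', 'b', 'c'],
                   'LastName': ['a', 'b', 'c'],
                   'MiddleName': ['a', 'b', None],
                   'Contact': ['x', 'y', 'z']})  # placeholder contact values

# Desired order for the three columns; they reuse the slots they occupy now
new_block = ['LastName', 'FirstName', 'MiddleName']

positions = list(range(df.shape[1]))                # identity order
slots = sorted(df.columns.get_indexer(new_block))   # slots currently holding them
for slot, name in zip(slots, new_block):
    positions[slot] = df.columns.get_loc(name)      # fill each slot with the wanted column

df_reordered = df.iloc[:, positions]
print(positions, list(df_reordered.columns))
```

Columns outside `new_block` stay where they are, because their entries in `positions` are never modified.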
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | sanster9292 |
| Solution 2 | rssmith01 |
| Solution 3 | Jean-Francois T. |
| Solution 4 | Asclepius |
| Solution 5 | jeffhale |
