Pandas: How to drop multiple columns with NaN as the column name?

As per the title here's a reproducible example:

import numpy as np
import pandas as pd

raw_data = {'x': ['this', 'that', 'this', 'that', 'this'],
            np.nan: [np.nan, np.nan, np.nan, np.nan, np.nan], 
            'y': [np.nan, np.nan, np.nan, np.nan, np.nan],
            np.nan: [np.nan, np.nan, np.nan, np.nan, np.nan]}

df = pd.DataFrame(raw_data, columns = ['x', np.nan, 'y', np.nan])
df

   x     NaN  y    NaN
0  this  NaN  NaN  NaN
1  that  NaN  NaN  NaN
2  this  NaN  NaN  NaN
3  that  NaN  NaN  NaN
4  this  NaN  NaN  NaN

The aim is to drop only the columns whose name is NaN (so column y must be kept). dropna() doesn't work because it conditions on the NaN values inside a column, not on NaN as the column name.
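
To make the dropna() point concrete, here is a minimal sketch. It builds the frame from a list of rows instead of the dict above, and the names rows, df_demo, only_x and remaining are purely illustrative (not from the original post). Dropping all-NaN columns removes y too, because dropna() looks at the values and never at the labels:

```python
import numpy as np
import pandas as pd

# Same shape as the question's data: 'x' has values, every other column is all NaN.
rows = [[word, np.nan, np.nan, np.nan]
        for word in ['this', 'that', 'this', 'that', 'this']]
df_demo = pd.DataFrame(rows, columns=['x', np.nan, 'y', np.nan])

# dropna inspects the values in each column, so every all-NaN column is
# dropped, including 'y', which was supposed to be kept.
only_x = df_demo.dropna(axis=1, how='all')
remaining = list(only_x.columns)
print(remaining)
```

Only x survives here, which is why a label-based approach is needed.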

df.drop(np.nan, axis=1, inplace=True) works if the data has a single column with NaN as its name, but not when several columns are named NaN, as in my data.

So how can I drop multiple columns whose name is NaN?



Solution 1:[1]

You can rename the NaN column labels to a placeholder and then drop that placeholder:

df.columns = df.columns.fillna('to_drop')
df.drop('to_drop', axis = 1, inplace = True)
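
A possible variation, not part of the original answer: if a real column could already be called 'to_drop', the rename would delete it as well. A sketch that avoids the placeholder by selecting columns with a boolean mask built from Index.notna() might look like this (rows, df_demo, keep, result and kept_columns are illustrative names):

```python
import numpy as np
import pandas as pd

rows = [[word, np.nan, np.nan, np.nan] for word in ['this', 'that', 'this']]
df_demo = pd.DataFrame(rows, columns=['x', np.nan, 'y', np.nan])

# Index.notna() is True for every real label and False for each NaN label.
keep = df_demo.columns.notna()

# The mask is applied by position, so duplicate NaN labels are not a problem.
result = df_demo.loc[:, keep]
kept_columns = list(result.columns)
print(kept_columns)
```

This returns a new frame and leaves the column labels of the original untouched.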

Solution 2:[2]

As of pandas 1.4.0, df.drop is the simplest solution, as it now handles multiple NaN headers properly:

df = df.drop(columns=np.nan)

#    x     y
# 0  this  NaN
# 1  that  NaN
# 2  this  NaN
# 3  that  NaN
# 4  this  NaN

Or the equivalent axis syntax:

df = df.drop(np.nan, axis=1)

Note that it's possible to use inplace=True instead of assigning back to df, but inplace is generally discouraged, and the pandas developers have discussed deprecating it.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1  Vaishali
Solution 2  tdy