ValueError: Length mismatch: Expected axis has 0 elements while creating hierarchical columns in pandas dataframe

I was going through the pandas documentation on hierarchical indexing and tried adapting one of its examples to create an empty DataFrame with hierarchical columns:

In [5]: df = pd.DataFrame()

In [6]: df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])

However, it throws an error:

ValueError                                Traceback (most recent call last)
<ipython-input-6-dd823f9b8d22> in <module>()
----> 1 df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])

/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in __setattr__(self, name, value)
   2755         try:
   2756             object.__getattribute__(self, name)
-> 2757             return object.__setattr__(self, name, value)
   2758         except AttributeError:
   2759             pass

pandas/src/properties.pyx in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44873)()

/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
    446 
    447     def _set_axis(self, axis, labels):
--> 448         self._data.set_axis(axis, labels)
    449         self._clear_item_cache()
    450 

/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
   2800             raise ValueError('Length mismatch: Expected axis has %d elements, '
   2801                              'new values have %d elements' %
-> 2802                              (old_len, new_len))
   2803 
   2804         self.axes[axis] = new_labels

ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements

I don't see any problem with my code. Any ideas what is happening?



Solution 1:[1]

This solution does not require numpy:

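import pandas as pd

# Create an empty DataFrame with 4 placeholder columns, so that the
# number of existing columns matches the 4 labels assigned below.
df = pd.DataFrame(columns=range(4))

# Replace the placeholder labels with the two-level MultiIndex.
df.columns = pd.MultiIndex(
    levels=[['first', 'second'], ['a', 'b']],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]]
)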

(Note: I changed labels to codes because the labels argument was renamed to codes in pandas 0.24.0, and the old name was removed in v1.0.0.)
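The original assignment fails because pd.DataFrame() has zero columns, and assigning to df.columns only relabels existing columns, so the new index must have the same length as the old one. As a possible alternative to the placeholder columns above, the MultiIndex can be passed to the constructor directly. This is only a minimal sketch; the use of MultiIndex.from_product here is my choice and not part of the original answer:

```python
import pandas as pd

# Build the four (level-0, level-1) column labels up front.
columns = pd.MultiIndex.from_product([["first", "second"], ["a", "b"]])

# Passing the index to the constructor avoids relabeling a 0-column frame.
df = pd.DataFrame(columns=columns)

print(df.columns.tolist())
print(df.shape)
```

The resulting frame has no rows and four columns, so later assignments such as df[("first", "a")] address the hierarchical labels directly.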

Solution 2:[2]

This error can also occur if you have used df.loc[<condition>, <col_name>] = value without wrapping each condition in parentheses (). Because & and | bind more tightly than comparison operators such as == and >, always wrap each condition in a loc statement in its own parentheses.

It should be something similar to the one below:

df.loc[<(condition1) & (condition2)>, <col_name>] = value
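The answer gives only a template, so here is a minimal sketch of the parenthesized form with concrete values. The column names x, y and flag and the data are hypothetical, chosen purely for illustration:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"x": [1, 5, 10], "y": [2, 2, 9], "flag": ["", "", ""]})

# Each comparison gets its own parentheses, because & binds more
# tightly than > and == in Python.
mask = (df["x"] > 2) & (df["y"] == 2)

# Assign only to the rows where both conditions hold.
df.loc[mask, "flag"] = "hit"

print(df)
```

Building the mask on its own line also makes it easy to inspect before it is used for assignment.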

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 wisbucky
Solution 2 kulvinder kakar