ValueError: Length mismatch: Expected axis has 0 elements while creating hierarchical columns in pandas DataFrame
I was reading the pandas documentation on hierarchical indexing and tried adapting one of its examples to create an empty DataFrame with hierarchical columns:
In [5]: df = pd.DataFrame()
In [6]: df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
However, it throws an error:
ValueError Traceback (most recent call last)
<ipython-input-6-dd823f9b8d22> in <module>()
----> 1 df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in __setattr__(self, name, value)
2755 try:
2756 object.__getattribute__(self, name)
-> 2757 return object.__setattr__(self, name, value)
2758 except AttributeError:
2759 pass
pandas/src/properties.pyx in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44873)()
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
446
447 def _set_axis(self, axis, labels):
--> 448 self._data.set_axis(axis, labels)
449 self._clear_item_cache()
450
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
2800 raise ValueError('Length mismatch: Expected axis has %d elements, '
2801 'new values have %d elements' %
-> 2802 (old_len, new_len))
2803
2804 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements
I don't see any problem with my code. Any ideas what is happening?
Solution 1:[1]
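The assignment fails because pd.DataFrame() has zero columns, and assigning to df.columns only relabels existing columns; it cannot add new ones. Create the DataFrame with four columns first, then assign the MultiIndex. This solution does not require numpy: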
# create empty DataFrame with 4 columns
df = pd.DataFrame(columns = range(4))
df.columns = pd.MultiIndex(
    levels = [['first', 'second'], ['a', 'b']],
    codes = [[0, 0, 1, 1], [0, 1, 0, 1]]
)
(Note: I changed labels to codes because the labels argument was renamed to codes in pandas 0.24.0 and removed entirely in v1.0.0.)
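As an alternative sketch (not part of the original answer), the MultiIndex can be passed straight to the DataFrame constructor, so there is never a zero-column frame to relabel. This assumes the four columns are the full product of the two levels, which is what the codes above describe; pd.MultiIndex.from_product is used here in place of explicit levels and codes.

```python
import pandas as pd

# Build the two-level column index as the product of both levels:
# (first, a), (first, b), (second, a), (second, b)
columns = pd.MultiIndex.from_product([["first", "second"], ["a", "b"]])

# Passing the index to the constructor creates an empty frame
# that already has four hierarchical columns
df = pd.DataFrame(columns=columns)

print(df.columns.tolist())
# [('first', 'a'), ('first', 'b'), ('second', 'a'), ('second', 'b')]
```

If the columns are not a full product, pd.MultiIndex.from_tuples with an explicit list of pairs would be the closer equivalent.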
Solution 2:[2]
This error can also occur if you have used df.loc[<condition>, <col_name>] = value without wrapping each condition in parentheses (). In Python, & and | bind more tightly than comparisons such as > or ==, so an unparenthesized compound condition is not evaluated the way it reads. Make sure to always wrap each condition in a loc statement in parentheses.
It should look similar to the one below:
df.loc[<(condition1) & (condition2)>, <col_name>] = value
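A minimal sketch of that pattern, using a made-up frame: the column names a, b, and flag and the thresholds are hypothetical and only serve to illustrate the parenthesized conditions.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40], "flag": [0, 0, 0, 0]})

# Each comparison is wrapped in parentheses before combining with &
mask = (df["a"] > 1) & (df["b"] < 40)

# Assign only to the rows where both conditions hold
df.loc[mask, "flag"] = 1

print(df["flag"].tolist())
# [0, 1, 1, 0]
```

Building the mask on its own line first also makes it easy to inspect which rows will be affected before assigning.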
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | wisbucky |
| Solution 2 | kulvinder kakar |
