Getting a shape error when passing DataFrames into a Python function?
I have this function that tries to take in a dataset and separate it by its labels.
def separate_by_classes(self, X, y):
    ''' This function separates our dataset in subdatasets by classes '''
    self.classes = np.unique(y)
    classes_index = {}
    subdatasets = {}
    cls, counts = np.unique(y, return_counts=True)
    self.class_freq = dict(zip(cls, counts))
    print(self.class_freq)
    for class_type in self.classes:
        classes_index[class_type] = np.argwhere(y==class_type)
        subdatasets[class_type] = X[classes_index[class_type], :]
        self.class_freq[class_type] = self.class_freq[class_type]/sum(list(self.class_freq.values()))
    return subdatasets
I'm passing in the dataset as two DataFrames: one with the features and another with the labels. Their dimensions are, respectively:
(5, 13) , (5, 1)
I am calling this function like so:
nbCLF = gaussClf() #the function is a part of a class
nbCLF.separate_by_classes(df_1, df_1_label)
But I'm getting this error:
Traceback (most recent call last):
File "c:\Users\KASH\Desktop\ML_DATA\classification.py", line 11, in <module>
nbCLF.separate_by_classes(df_1, df_1_label)
File "c:\Users\KASH\Desktop\ML_DATA\naivebayes.py", line 20, in separate_by_classes
classes_index[class_type] = np.argwhere(y==class_type)
File "<__array_function__ internals>", line 5, in argwhere
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\numpy\core\numeric.py", line 617, in argwhere
return transpose(nonzero(a))
File "<__array_function__ internals>", line 5, in nonzero
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\numpy\core\fromnumeric.py", line 1919, in nonzero
return _wrapfunc(a, 'nonzero')
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\numpy\core\fromnumeric.py", line 55, in _wrapfunc
return _wrapit(obj, method, *args, **kwds)
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\numpy\core\fromnumeric.py", line 48, in _wrapit
result = wrap(result)
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\generic.py", line 2095, in __array_wrap__
return self._constructor(res, **d).__finalize__(self, method="__array_wrap__")
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\frame.py", line 694, in __init__
mgr = ndarray_to_mgr(
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\internals\construction.py", line 351, in ndarray_to_mgr
_check_values_indices_shape_match(values, index, columns)
File "C:\Users\KASH\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\internals\construction.py", line 422, in _check_values_indices_shape_match
raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
ValueError: Shape of passed values is (2, 2), indices imply (5, 1)
Here's a sample of the features data:
2 1.0 5 4.0 5.0 5.0 3 3.0 0 1.0 1.0 7.0 1.000000e+99
1 1.0 5 5.0 5.0 5.0 3 5.0 2 1.0 1.0 7.0 1.000000e+00
2 1.0 3 5.0 1.0 5.0 2 3.0 1 2.0 3.0 7.0 1.000000e+00
2 5.0 1 2.0 6.0 5.0 1 4.0 2 3.0 1.0 7.0 1.000000e+00
2 5.0 1 2.0 6.0 3.0 1 4.0 2 3.0 1.0 7.0 1.000000e+00
Here's a sample of the label data:
9
9
9
1
1
Any help is appreciated.
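One likely cause, judging from the traceback: y==class_type is still a (5, 1) DataFrame, and np.argwhere appears to hand its result back to pandas, which then tries to build a DataFrame with the original index and columns from an array of a different shape. Below is a minimal sketch of a workaround, assuming it is acceptable to convert both inputs to plain NumPy arrays first. It is written as a standalone function rather than a method, the data in the demo is made up to match the shapes above, and the column name "label" is hypothetical. It also divides the counts by their total once, since normalizing inside the loop changes the denominator on every iteration.

```python
import numpy as np
import pandas as pd


def separate_by_classes(X, y):
    """Split X into one sub-array per class label found in y."""
    # Convert pandas objects to plain NumPy arrays so that NumPy functions
    # do not try to wrap their results back into a DataFrame.
    X = np.asarray(X)
    y = np.asarray(y).ravel()  # (n, 1) label frame -> (n,) vector

    classes, counts = np.unique(y, return_counts=True)
    # Divide by the total once, so every class uses the same denominator.
    class_freq = dict(zip(classes, counts / counts.sum()))

    # A boolean mask selects the matching rows and keeps each result 2-D.
    subdatasets = {c: X[y == c] for c in classes}
    return subdatasets, class_freq


# Made-up data with the same shapes as in the question: (5, 13) and (5, 1).
df_1 = pd.DataFrame(np.arange(65, dtype=float).reshape(5, 13))
df_1_label = pd.DataFrame({"label": [9, 9, 9, 1, 1]})

subdatasets, class_freq = separate_by_classes(df_1, df_1_label)
print({c: rows.shape for c, rows in subdatasets.items()})
print(class_freq)
```

Inside the original class, the same two conversions at the top of the method, plus replacing the np.argwhere lookup with the boolean mask, should be enough. Note that indexing with the output of np.argwhere on a 1-D array would add an extra axis to each subset, which is why the sketch uses a mask instead.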
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
