ValueError: Found input variables with inconsistent numbers of samples: [650, 1300]
I am trying to run a multivariate (multiple y) regression algorithm. The code is as follows.
import pandas as pd
from sklearn.model_selection import train_test_split
data = pd.read_csv('data.csv')
X = data[['PM', 'Na', 'Cl', 'Al', 'Si', 'Ti']].values
y = data[['AD', 'SS']].values
X_train, X_test, y_train, y_test = train_test_split(X, y.flatten(), test_size = 0.3, random_state = 42)
This raises ValueError: Found input variables with inconsistent numbers of samples: [650, 1300] at the splitting step.
I tried searching on Google but couldn't find anything. Could someone please guide me on how to select multiple y values?
Thanks in advance!
Solution 1:[1]
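X and y don't have the same length: y.flatten() collapses the (650, 2) target array into a 1-D array of 1300 values, while X still has 650 rows.
To see for yourself, compare X.shape with y.flatten().shape. Passing y to train_test_split without .flatten() keeps one row of targets per sample, so the lengths match.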
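As a minimal sketch of the check and the fix: the snippet below uses randomly generated arrays as a stand-in for data.csv (an assumption, since the real file isn't available), with the same 650 rows, 6 feature columns and 2 target columns. The LinearRegression step is only there to illustrate that a scikit-learn estimator with multi-output support can take a 2-D y; it is not necessarily the model the asker intends to use.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for data.csv (assumption): 650 samples,
# 6 feature columns and 2 target columns, as in the question.
rng = np.random.default_rng(42)
X = rng.normal(size=(650, 6))
y = rng.normal(size=(650, 2))

print(X.shape)            # (650, 6)
print(y.shape)            # (650, 2)
print(y.flatten().shape)  # (1300,) -> no longer matches X's 650 rows

# Pass y as a 2-D array so every row of X stays paired with its two targets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
print(X_train.shape, y_train.shape)  # (455, 6) (455, 2)
print(X_test.shape, y_test.shape)    # (195, 6) (195, 2)

# Illustrative only: LinearRegression accepts a 2-D y and predicts
# one column per target.
model = LinearRegression().fit(X_train, y_train)
print(model.predict(X_test).shape)   # (195, 2)
```

With the real data, the same idea applies: keep y = data[['AD', 'SS']].values as is and drop the .flatten() call in train_test_split.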
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alex Metsai |
