ValueError: Found input variables with inconsistent numbers of samples: [650, 1300]

I am trying to run a multivariate (multiple y) regression algorithm. The code is as follows.

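import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv('data.csv')
X = data[['PM', 'Na', 'Cl', 'Al', 'Si', 'Ti']].values
y = data[['AD', 'SS']].values
# The next line is the one that raises the ValueError.
X_train, X_test, y_train, y_test = train_test_split(X, y.flatten(), test_size = 0.3, random_state = 42)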

The splitting step fails with: ValueError: Found input variables with inconsistent numbers of samples: [650, 1300]

I tried searching on Google but couldn't find anything. Could someone please guide me on how to select multiple y values?

Thanks in advance!



Solution 1:[1]

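X and y don't have the same length. X has 650 rows, but y.flatten() collapses the (650, 2) target array into a 1-D array of 1300 values, and train_test_split requires every array it receives to have the same number of samples.

To see for yourself, compare X.shape with y.flatten().shape. Passing y without .flatten() keeps one row per sample and both target columns.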
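The shape check can be sketched as follows. This is only an illustration: since data.csv is not available here, it assumes randomly generated arrays with the same shapes the error message implies (650 rows, 6 features, 2 targets), and it passes y to train_test_split without flattening it.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for data.csv with 650 rows,
# 6 feature columns and 2 target columns ('AD' and 'SS' in the question).
rng = np.random.default_rng(42)
X = rng.normal(size=(650, 6))
y = rng.normal(size=(650, 2))

print(X.shape)            # (650, 6)
print(y.shape)            # (650, 2)
print(y.flatten().shape)  # (1300,) -> no longer one entry per sample

# Keep y two-dimensional so each row of X still lines up with one row of y.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

Several scikit-learn regressors, LinearRegression among them, accept a two-column y like this directly, so no flattening is needed for a multi-output fit.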

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Alex Metsai