Jupyter Notebook in VSCode computes wrong confusion matrix and gives different output than Google Colab and .py file with the exact same code

A Jupyter notebook I'm editing in VSCode computes a confusion matrix incorrectly (the positive samples are flipped), while I get the right result when I run the exact same code on Google Colab.

After training an ML model, I'm trying to get the confusion matrix for the entire dataset by running a KFold, computing the confusion matrix for each split, and summing them, like this:

import numpy as np
from lightgbm import LGBMClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=10)
test_model = LGBMClassifier(objective="binary", **study.best_params)
cm_array = []

for train_idx, test_idx in kf.split(X, y):
    X_train, X_test = X.iloc[train_idx, :], X.iloc[test_idx, :] #also tried X.iloc[train_idx], same result
    y_train, y_test = y[train_idx], y[test_idx]

    test_model.fit(
        X_train,
        y_train
    )
    
    y_pred = test_model.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    cm_array.append(cm)
        
cm_sum = np.sum(cm_array, axis=0)
cm_sum

My problem is that the matrix I get when doing this has the values for the positive labels switched, so instead of getting:

[[TP, FP], 
 [FN, TN]]

I get:

[[FP, TP], 
 [FN, TN]]
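For reference when reading these layouts: scikit-learn's `confusion_matrix` puts true labels on the rows and predicted labels on the columns, both in sorted label order. The sketch below uses made-up toy labels (an assumption, with 0 as negative and 1 as positive) to show which cell is which under that convention.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels, made up for illustration: 0 = negative, 1 = positive.
y_true = np.array([0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 1, 0])

# Rows are true labels, columns are predicted labels, in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

# With labels=[0, 1], the flattened order is TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(cm)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

With classes encoded this way, the top row holds the samples whose true label is 0, so a swap within the top row would mean the true negatives are being counted as false positives.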

While this could be that the model is just bad, there are two reasons why I think this is not the problem:

When I compute the confusion matrix with just one manual train_test_split, I get the correct matrix.
When I run the exact same code on both a "normal" .py file and on Google Colab, it outputs the correct matrix:

array([[0, 26],
       [4, 145]])  # on VSCode

array([[26, 0],
       [4, 145]])  # on Google Colab and .py script
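One thing that may be worth ruling out in the loop above is a mismatch between positional and label-based indexing: `kf.split` yields integer positions, `X` is indexed with `.iloc`, but `y` is indexed with plain brackets, which on a pandas Series looks up index labels. The following is only a sketch under stated assumptions, not a confirmed diagnosis: it uses synthetic data and a `LogisticRegression` as a stand-in for the tuned `LGBMClassifier`, indexes both `X` and `y` positionally, and passes `labels=[0, 1]` so that every fold's matrix has the same shape and ordering.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold

# Synthetic stand-in data (assumption: the real X is a DataFrame and y a Series).
X_arr, y_arr = make_classification(n_samples=175, n_features=5, random_state=0)
X = pd.DataFrame(X_arr)
y = pd.Series(y_arr)

kf = KFold(n_splits=5, shuffle=True, random_state=10)
model = LogisticRegression(max_iter=1000)  # stand-in for the tuned LGBMClassifier
cm_sum = np.zeros((2, 2), dtype=np.int64)

for train_idx, test_idx in kf.split(X, y):
    # kf.split yields positions, so index both X and y positionally.
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Fixing `labels` keeps every fold's matrix 2x2 with the same row/column order.
    cm_sum += confusion_matrix(y_test, y_pred, labels=[0, 1])

print(cm_sum)
```

Because each sample lands in exactly one test fold, the entries of `cm_sum` should add up to the number of rows in the dataset, which gives a quick sanity check in each environment.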

The VSCode notebook file and the .py file are part of the same venv, so dependencies and versions shouldn't be an issue, either.
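A notebook kernel in VSCode is selected separately from the interpreter used to run .py files, so it may be worth confirming that both really use the same environment. The helper below is a hypothetical sketch (the function name and package list are assumptions) that uses only the standard library; running it in a notebook cell, in the .py script, and on Colab lets the outputs be compared side by side.

```python
import sys
from importlib import metadata


def describe_environment(packages=("scikit-learn", "lightgbm", "numpy", "pandas")):
    """Return the interpreter path and installed versions of the given packages."""
    info = {"python": sys.executable}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


env_info = describe_environment()
for key, value in env_info.items():
    print(f"{key}: {value}")
```

If the interpreter path or any version differs between the notebook and the script, the two are not running in the same environment after all.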

Why could this be happening?

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
