How sklearn.metrics.r2_score works

I tried to implement the formula from Wikipedia, but my result differs from the one sklearn returns. Why is that?

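import statistics

import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])

# Reference value from scikit-learn
r2 = r2_score(y_true, y_pred)
print(r2)

# Manual formula: 1 - SS_res / SS_tot
y_true_mean = statistics.mean(y_true)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true_mean) ** 2)
print(r2)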

-1.9999999999999996
0.0



Solution 1:[1]

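The different outcome originates in the statistics module you use for the mean. With a NumPy integer array, statistics.mean appears to convert its result back to the integer type of the input, so the mean of y_true (2/3) is truncated to 0 and the denominator becomes 2 instead of 2/3. Use np.mean instead, which returns a float. That gives the same R2 as sklearn: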

import numpy as np

y_true = np.array([1, 1, 0])
y_pred = np.array([1, 0, 1])

y_true_mean = np.mean(y_true)
r2 = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true_mean) ** 2)
print(r2)
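
To see how much the mean matters here, the following is a minimal plain-Python sketch, not sklearn's implementation. The helper r2_manual is a hypothetical name introduced for illustration, and the int() cast is an assumption that stands in for a mean that has been converted back to an integer type. statistics.fmean (Python 3.8+) is used because it always returns a float.

```python
import statistics


def r2_manual(y_true, y_pred, mean):
    """Hypothetical helper: 1 - SS_res / SS_tot for a supplied mean of y_true."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1, 1, 0]
y_pred = [1, 0, 1]

# fmean always returns a float, so the mean of y_true is about 0.667.
float_mean = statistics.fmean(y_true)
# Assumption: int() simulates a mean that was cast back to an integer type.
truncated_mean = int(float_mean)

r2_float = r2_manual(y_true, y_pred, float_mean)
r2_truncated = r2_manual(y_true, y_pred, truncated_mean)

print(r2_float)      # close to -2, in line with the r2_score output in the question
print(r2_truncated)  # 0.0
```

With the float mean, SS_tot is 2/3 and R2 comes out at roughly 1 - 2 / (2/3) = -2. With the truncated mean, SS_tot is 2 and R2 is 1 - 2/2 = 0, which reproduces both numbers from the question.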


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 agtoever