'ImportError when importing metric from sklearn
When I am trying to import a metric from sklearn, I get the following error:
from sklearn.metrics import mean_absolute_percentage_error
ImportError: cannot import name 'mean_absolute_percentage_error' from 'sklearn.metrics'
/Users/carter/opt/anaconda3/lib/python3.8/site-packages/sklearn/metrics/__init__.py)
I have used conda update all, and reinstalled scikit-learn to no avail. Any other reasons this might happen and solutions?
Solution 1:[1]
The answer above is the right one. For those who cannot upgrade/install from source, below is the required code.
The function itself relies on other functions - one defined in the same module and others is from sklearn.utils.validation.
Here is the required code I pulled from the source - if anyone needs it (and I hope I am not violating any license):
from sklearn.utils.validation import check_consistent_length, check_array
def mean_absolute_percentage_error(y_true, y_pred,
sample_weight=None,
multioutput='uniform_average'):
"""Mean absolute percentage error regression loss.
Note here that we do not represent the output as a percentage in range
[0, 100]. Instead, we represent it in range [0, 1/eps]. Read more in the
:ref:`User Guide <mean_absolute_percentage_error>`.
.. versionadded:: 0.24
Parameters
----------
y_true : array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
y_pred : array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
multioutput : {'raw_values', 'uniform_average'} or array-like
Defines aggregating of multiple output values.
Array-like value defines weights used to average errors.
If input is list then the shape must be (n_outputs,).
'raw_values' :
Returns a full set of errors in case of multioutput input.
'uniform_average' :
Errors of all outputs are averaged with uniform weight.
Returns
-------
loss : float or ndarray of floats in the range [0, 1/eps]
If multioutput is 'raw_values', then mean absolute percentage error
is returned for each output separately.
If multioutput is 'uniform_average' or an ndarray of weights, then the
weighted average of all output errors is returned.
MAPE output is non-negative floating point. The best value is 0.0.
But note the fact that bad predictions can lead to arbitarily large
MAPE values, especially if some y_true values are very close to zero.
Note that we return a large value instead of `inf` when y_true is zero.
Examples
--------
>>> from sklearn.metrics import mean_absolute_percentage_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.3273...
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.5515...
>>> mean_absolute_percentage_error(y_true, y_pred, multioutput=[0.3, 0.7])
0.6198...
"""
y_type, y_true, y_pred, multioutput = _check_reg_targets(
y_true, y_pred, multioutput)
check_consistent_length(y_true, y_pred, sample_weight)
epsilon = np.finfo(np.float64).eps
mape = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
output_errors = np.average(mape,
weights=sample_weight, axis=0)
if isinstance(multioutput, str):
if multioutput == 'raw_values':
return output_errors
elif multioutput == 'uniform_average':
# pass None as weights to np.average: uniform mean
multioutput = None
return np.average(output_errors, weights=multioutput)
def _check_reg_targets(y_true, y_pred, multioutput, dtype="numeric"):
"""Check that y_true and y_pred belong to the same regression task.
Parameters
----------
y_true : array-like
y_pred : array-like
multioutput : array-like or string in ['raw_values', uniform_average',
'variance_weighted'] or None
None is accepted due to backward compatibility of r2_score().
Returns
-------
type_true : one of {'continuous', continuous-multioutput'}
The type of the true target data, as output by
'utils.multiclass.type_of_target'.
y_true : array-like of shape (n_samples, n_outputs)
Ground truth (correct) target values.
y_pred : array-like of shape (n_samples, n_outputs)
Estimated target values.
multioutput : array-like of shape (n_outputs) or string in ['raw_values',
uniform_average', 'variance_weighted'] or None
Custom output weights if ``multioutput`` is array-like or
just the corresponding argument if ``multioutput`` is a
correct keyword.
dtype : str or list, default="numeric"
the dtype argument passed to check_array.
"""
check_consistent_length(y_true, y_pred)
y_true = check_array(y_true, ensure_2d=False, dtype=dtype)
y_pred = check_array(y_pred, ensure_2d=False, dtype=dtype)
if y_true.ndim == 1:
y_true = y_true.reshape((-1, 1))
if y_pred.ndim == 1:
y_pred = y_pred.reshape((-1, 1))
if y_true.shape[1] != y_pred.shape[1]:
raise ValueError("y_true and y_pred have different number of output "
"({0}!={1})".format(y_true.shape[1], y_pred.shape[1]))
n_outputs = y_true.shape[1]
allowed_multioutput_str = ('raw_values', 'uniform_average',
'variance_weighted')
if isinstance(multioutput, str):
if multioutput not in allowed_multioutput_str:
raise ValueError("Allowed 'multioutput' string values are {}. "
"You provided multioutput={!r}".format(
allowed_multioutput_str,
multioutput))
elif multioutput is not None:
multioutput = check_array(multioutput, ensure_2d=False)
if n_outputs == 1:
raise ValueError("Custom weights are useful only in "
"multi-output cases.")
elif n_outputs != len(multioutput):
raise ValueError(("There must be equally many custom weights "
"(%d) as outputs (%d).") %
(len(multioutput), n_outputs))
y_type = 'continuous' if n_outputs == 1 else 'continuous-multioutput'
return y_type, y_true, y_pred, multioutput
Solution 2:[2]
You can go with one of these two solutions:
- Upgrade your sklearn version
!pip install scikit-learn==0.24
Then,
from sklearn.metrics import mean_absolute_percentage_error
- Build your own function to calculate MAPE
def MAPE(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
But the problem with the above function is that when you have (0) true value your MAPE will go (inf). So, to solve this problem we use,
def MAPE(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / np.maximum(np.ones(len(y_true)), np.abs(y_true))))*100
Solution 3:[3]
Just had the same issue. Accessing Anaconda Prompt for the environment that one is working on, and running
pip install scikit-learn
Solved the problem.
It updated scikit-learn's version (at this precise moment it was upgraded to version 1.0.2, but it is present in versions starting at 0.24.), and now one is able to import and run sklearn.metrics.mean_absolute_percentage_error.
Here is an example (Source):
>>> from sklearn.metrics import mean_absolute_percentage_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.3273...
Note: You may want to keep in mind that the MAPE can be problematic, as it may result in divisions by zero (see my answer here).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Mohana |
| Solution 3 |
