Can I pass a dataframe to @pytest.mark.parametrize?
I would like to pass a dataframe to @pytest.mark.parametrize. The dataframes are defined in conftest.py. Unit tests that reference the dataframes without @pytest.mark.parametrize run successfully.
However, when I apply @pytest.mark.parametrize, the test fails with TypeError: 'function' object is not subscriptable
The dataframes are defined as fixture functions in conftest.py. For example:
@pytest.fixture(scope="module")
def df_vartypes():
    data = {
        "Name": ["tom", "nick", "krish", "jack"],
        "City": ["London", "Manchester", "Liverpool", "Bristol"],
        "Age": [20, 21, 19, 18],
        "Marks": [0.9, 0.8, 0.7, 0.6],
        "dob": pd.date_range("2020-02-24", periods=4, freq="T"),
    }
    df = pd.DataFrame(data)
    return df
The unit tests:
_cat_num_vars = [
    (df_enc, "var_A", ["var_A"], []),
    (df_enc_numeric, "var_B", [], ["var_B"]),
    # TODO: Datetime test
    (df_vartypes, None, ["Name", "City"], ["Age", "Marks"]),
    (df_enc_numeric, None, [], ["var_A", "var_B", "target"])
]

@pytest.mark.parametrize(
    "_df, _variables, _categorical_vars, _numerical_vars", _cat_num_vars
)
def test_find_categorical_and_numeric_vars_pass_diff_var_permutations(
    _df, _variables, _categorical_vars, _numerical_vars
):
    assert (_find_categorical_and_numerical_variables(
        _df, _variables) == (_categorical_vars, _numerical_vars)
    )
Traceback:
X = <function df_vartypes at 0x7fa8a1647310>, variables = None

    def _find_categorical_and_numerical_variables(
        X: pd.DataFrame, variables: Variables = None
    ) -> Tuple[List[Union[str, int]], List[Union[str, int]]]:
        """
        Find numerical and categorical variables.

        Parameters
        ----------
        X : pandas DataFrame
        variables : List of variables. Defaults to None.

        Returns
        -------
        variables : Tuple with List of numerical and list of categorical variables.
        """
        # If the user passes just 1 variable outside a list.
        if isinstance(variables, (str, int)):
            if is_categorical(X[variables]) or is_object(X[variables]):
                variables_cat = [variables]
                variables_num = []
            elif is_numeric(X[variables]):
                variables_num = [variables]
                variables_cat = []
            else:
                raise TypeError("The variable entered is neither numerical "
                                "nor categorical.")
        # If user leaves default None parameter.
        elif variables is None:
            # find categorical variables
            if variables is None:
                variables_cat = [
                    column
>                   for column in X.select_dtypes(include=["O", "category"]).columns
                    if _is_categorical_and_is_not_datetime(X[column])
                ]
E       AttributeError: 'function' object has no attribute 'select_dtypes'

feature_engine/variable_manipulation.py:321: AttributeError
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
