Can I pass a dataframe to @pytest.mark.parametrize?

I would like to pass a dataframe to @pytest.mark.parametrize. The dataframes are defined in conftest.py. Unit tests that reference the dataframes without @pytest.mark.parametrize run successfully.

However, when I apply @pytest.mark.parametrize, the test fails with TypeError: 'function' object is not subscriptable

The dataframes are defined as fixture functions in conftest.py. For example:

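import pandas as pd
import pytest


@pytest.fixture(scope="module")
def df_vartypes():
    # Small frame mixing object, integer, float and datetime columns.
    data = {
        "Name": ["tom", "nick", "krish", "jack"],
        "City": ["London", "Manchester", "Liverpool", "Bristol"],
        "Age": [20, 21, 19, 18],
        "Marks": [0.9, 0.8, 0.7, 0.6],
        # "min" is the minute alias; "T" is deprecated in recent pandas.
        "dob": pd.date_range("2020-02-24", periods=4, freq="min"),
    }

    df = pd.DataFrame(data)

    return df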

The unit tests:

_cat_num_vars = [
    (df_enc, "var_A", ["var_A"], []),
    (df_enc_numeric, "var_B", [], ["var_B"]),
    # TODO: Datetime test
    (df_vartypes, None, ["Name", "City"], ["Age", "Marks"]),
    (df_enc_numeric, None, [], ["var_A", "var_B", "target"])
]

@pytest.mark.parametrize(
    "_df, _variables, _categorical_vars, _numerical_vars", _cat_num_vars
)
def test_find_categorical_and_numeric_vars_pass_diff_var_permutations(
        _df, _variables, _categorical_vars, _numerical_vars
):
    assert _find_categorical_and_numerical_variables(_df, _variables) == (
        _categorical_vars,
        _numerical_vars,
    )

Traceback:

X = <function df_vartypes at 0x7fa8a1647310>, variables = None

    def _find_categorical_and_numerical_variables(
            X: pd.DataFrame, variables: Variables = None
    ) -> Tuple[List[Union[str, int]], List[Union[str, int]]]:
        """
        Find numerical and categorical variables.
    
        Parameters
        ----------
        X :  pandas DataFrame
        variables : List of variables. Defaults to None.
    
        Returns
        -------
        variables : Tuple with List of numerical and list of categorical variables.
        """
    
        # If the user passes just 1 variable outside a list.
        if isinstance(variables, (str, int)):
    
            if is_categorical(X[variables]) or is_object(X[variables]):
                variables_cat = [variables]
                variables_num = []
            elif is_numeric(X[variables]):
                variables_num = [variables]
                variables_cat = []
            else:
                raise TypeError("The variable entered is neither numerical "
                                "nor categorical.")
    
        # If user leaves default None parameter.
        elif variables is None:
            # find categorical variables
            if variables is None:
                variables_cat = [
                    column
>                   for column in X.select_dtypes(include=["O", "category"]).columns
                    if _is_categorical_and_is_not_datetime(X[column])
                ]
E               AttributeError: 'function' object has no attribute 'select_dtypes'

feature_engine/variable_manipulation.py:321: AttributeError
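
For what it's worth, the traceback shows X arriving as <function df_vartypes at ...>. The parameter list is built when the module is imported, so the name df_vartypes inside it refers to the fixture function object, not to the DataFrame it returns; pytest only resolves fixtures that are requested as test arguments. One common workaround is to parametrize over fixture names as strings and resolve them inside the test with request.getfixturevalue. Below is a minimal sketch of that idea, not a confirmed fix for this suite. build_df_vartypes and split_columns are hypothetical stand-ins introduced for illustration and are not part of feature_engine.

```python
import pandas as pd
import pytest


def build_df_vartypes():
    # Hypothetical helper: plain function that builds the frame.
    return pd.DataFrame(
        {
            "Name": ["tom", "nick", "krish", "jack"],
            "City": ["London", "Manchester", "Liverpool", "Bristol"],
            "Age": [20, 21, 19, 18],
            "Marks": [0.9, 0.8, 0.7, 0.6],
        }
    )


@pytest.fixture(scope="module")
def df_vartypes():
    return build_df_vartypes()


def split_columns(X):
    # Hypothetical stand-in for the function under test:
    # returns (non-numeric column names, numeric column names).
    cat = list(X.select_dtypes(exclude="number").columns)
    num = list(X.select_dtypes(include="number").columns)
    return cat, num


@pytest.mark.parametrize(
    "fixture_name, expected_cat, expected_num",
    [
        # Pass the fixture NAME as a string, not the function object.
        ("df_vartypes", ["Name", "City"], ["Age", "Marks"]),
    ],
)
def test_split_columns(fixture_name, expected_cat, expected_num, request):
    # Resolve the fixture at run time; this yields the DataFrame itself.
    df = request.getfixturevalue(fixture_name)
    assert split_columns(df) == (expected_cat, expected_num)
```

Run it with pytest as usual. Each name is looked up when the test executes, so the module-scoped fixture should still be created only once per module.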


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
