Use variable in Pandas query

I'm trying to query a Pandas dataframe like this:

        inv = pd.read_csv(infile)
        inv.columns = ['County','Site','Role','Hostname']
        clist = inv.County.unique() # Get list of counties
        for county in clist: # for each county
            csub=inv.query('County == county') # create a county subset
            # ... do stuff on subset

But I get an error:

pandas.core.computation.ops.UndefinedVariableError: name 'county' is not defined

I'm sure it's a trivial error, but I can't figure it out. How do I pass a variable to the query method?



Solution 1:[1]

Format String Function

Another, more generic solution is Python's string format function, str.format (for examples, see 6.1.3.2. Format examples).

xyz = df.query('ColumnName >= {}'.format(VariableName))

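The {} is replaced by the value of VariableName. This works as written for numbers; a string value also needs quotes around the placeholder, as in 'County == "{}"'.format(county).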
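
It can help to print the text that query actually receives. The sketch below builds that text for a string and for a number. The column name County comes from the question; the value 'Kent' and the number 5 are made up for illustration. Using {!r} is one way (an assumption here, not part of the original answer) to insert a string together with its quotes.

```python
county = 'Kent'  # hypothetical value, for illustration only

# Plain {} inserts the bare text, so query() would read Kent as a name
unquoted = 'County == {}'.format(county)

# {!r} inserts repr(county), i.e. the string including its quotes
quoted = 'County == {!r}'.format(county)

# Numbers need no quoting
numeric = 'ColumnName >= {}'.format(5)

print(unquoted)  # County == Kent
print(quoted)    # County == 'Kent'
print(numeric)   # ColumnName >= 5
```

Only the second and third strings are valid comparisons against a literal value.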

f-Strings

In addition, user pciunkiewicz mentioned in a comment another solution using so-called f-strings, which were introduced in Python 3.6 (released December 2016):

xyz = df.query(f'ColumnName >= {VariableName}')

A more general f-string example:

>>> name = "Eric"
>>> age = 74
>>> f"Hello, {name}. You are {age}."
'Hello, Eric. You are 74.'
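
Applied to the loop from the question, this might look like the sketch below. The data frame contents are invented stand-ins for the CSV file, and the sizes dictionary is only there to show a result. The sketch also includes pandas' own @ prefix, which lets query look up a local variable by name; that is an alternative to string formatting, not something from the answer above.

```python
import pandas as pd

# Made-up inventory data standing in for the CSV in the question
inv = pd.DataFrame({
    'County': ['Kent', 'Kent', 'Essex'],
    'Site': ['A', 'B', 'C'],
    'Role': ['web', 'db', 'web'],
    'Hostname': ['h1', 'h2', 'h3'],
})

sizes = {}
for county in inv.County.unique():
    # f-string: !r adds the quotes that a string value needs
    csub = inv.query(f'County == {county!r}')
    # pandas syntax: @ refers to the local variable county
    csub_at = inv.query('County == @county')
    assert csub.equals(csub_at)  # both give the same subset
    sizes[county] = len(csub)

print(sizes)  # {'Kent': 2, 'Essex': 1}
```

The @ form avoids building the query text by hand, so there is no quoting to get wrong.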

PS: I am new to Python.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1