'How to count the occurrences of certain rows in a column from Python dataframe

So I have a dataframe of wildfires in California by county. It looks a little something like:

Year Counties Name
2016 Monterey Soberanes
Kern Erskine
San Luis Bishop Chimney
San Bernardino Blue Cut
Siskiyou Gap
... ... ...

I also have some lists of certain counties I'm most interested in.

relevant_counties = ['Monterey', 'Kern', ' San Bernardino']

If I wanted to see how many wildfires occurred in my list of relevant counties, how would I go about doing that? I have tried df['Counties'].value_counts() and it gives me a series with a count for each county, but how do I write a function that will then add the number of wildfires that occurred in a county in my relevant_counties list?

Thank you!



Solution 1:[1]

Ideally filter before counting the values:

df.loc[df['Counties'].isin(relevant_counties), 'counties'].value_counts()

Unless you plan on reusing the counts with various lists of counties:

counts = df['counties'].value_counts()

counts[relevant_counties]

counts[relevant_counties_2]

Note on the spaces

I notice that you have leading spaces in the list:

relevant_counties = ['Monterey', 'Kern', ' San Bernardino']

Try to avoid this, if you potentially have those spaces or uncertain case in the data, you can homogenize using:

# no leading/trailing spaces
relevant_counties = ['Monterey', 'Kern', 'San Bernardino']

clean = df['counties'].str.strip().str.title() # capitalize each word "Abc Def"

clean[clean.isin(relevant_counties)].value_counts()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway