Trying to filter out unique values in a pandas data frame

Hi, is there a way to filter out unique values in a pandas data frame? I am using the code below to get the unique values. However, the same cuisines listed in a different order are counted as different values, for example ['Creative, Modern Cuisine', 'Modern Cuisine, Creative']. Is there a way to filter these out?

[Part of the data]

cuisine = df.Cuisine.unique()
cuisine_count = df.Cuisine.nunique()
print(cuisine, cuisine_count)
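
A minimal reproduction of the problem might look like the sketch below. The three sample rows are made up for illustration, since the real data is not included in the post; only the Cuisine column name comes from the code above.

```python
import pandas as pd

# Made-up sample rows (assumption): the real data is not shown in the post.
df = pd.DataFrame({
    "Cuisine": [
        "Creative, Modern Cuisine",
        "Modern Cuisine, Creative",
        "Creative",
    ]
})

# unique() compares whole strings, so the same two cuisines listed in a
# different order are reported as two separate values.
cuisine = df.Cuisine.unique()
cuisine_count = df.Cuisine.nunique()
print(cuisine, cuisine_count)
```

With these rows, all three strings are reported as distinct even though only two individual cuisines appear.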


Solution 1:[1]

If I understand your intent, you are trying to get a list of all distinct cuisines which appear in your DataFrame. Try this:

df['Cuisine'].str.split(',').explode().str.strip().unique().tolist()

Explanation:

  • df['Cuisine'].str.split(','): split Cuisine strings at commas, producing a Series with a Python list in each row, where each list item holds an individual cuisine string
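  • .explode(): transform each item of each list into its own row, so the Series holds one cuisine string per row
  • .str.strip(): strip the leading and trailing whitespace left over from the split (e.g. ' Modern Cuisine')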
  • .unique().tolist(): get list of unique cuisines
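
The steps above can be sketched one at a time on a small example. The sample rows are made up for illustration, and the sketch assumes the Cuisine column has no missing values. Series.explode needs pandas 0.25 or later. The last part is an alternative, not part of the answer above: if the goal is instead to keep combinations but ignore their order, one option is to sort the parts of each string before calling unique().

```python
import pandas as pd

# Made-up sample rows (assumption): the real data is not shown in the post.
df = pd.DataFrame({
    "Cuisine": [
        "Creative, Modern Cuisine",
        "Modern Cuisine, Creative",
        "French",
    ]
})

# Step 1: split each string at commas, giving one Python list per row.
split = df["Cuisine"].str.split(",")

# Step 2: give every list item its own row.
exploded = split.explode()

# Step 3: remove the whitespace left around items after the split.
stripped = exploded.str.strip()

# Step 4: collect the distinct individual cuisines, in order of appearance.
distinct_cuisines = stripped.unique().tolist()
print(distinct_cuisines)

# Alternative (assumption about intent): treat each row as a combination and
# ignore the order of its parts by sorting them before rejoining.
normalized = df["Cuisine"].str.split(",").apply(
    lambda parts: ", ".join(sorted(part.strip() for part in parts))
)
distinct_combinations = normalized.unique().tolist()
print(distinct_combinations)
```

Here distinct_cuisines holds each cuisine once, while distinct_combinations collapses 'Creative, Modern Cuisine' and 'Modern Cuisine, Creative' into a single entry.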

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Peter Leimbigler