'Counting unique list items

Assuming I have a datatable dt.recipes which consists of lists with various items, for example:

recipe_id     ingredients
1             apple, banana, cucumber, water
2             apple, meat, water
3             water

How can I create a table, counting the amount of unique items present in dt.recipes$ingredients? In other words, I am looking for a result similar to this:

ingredient    count
water         3
apple         2
banana        1
cucumber      1
meat          1

Any pointers would be greatly appreciated, thanks in advance!



Solution 1:[1]

You can do:

 as.data.frame(table(unlist(strsplit(df$ingredients, ", "))))
#>       Var1 Freq
#> 1    apple    2
#> 2   banana    1
#> 3 cucumber    1
#> 4     meat    1
#> 5    water    3

Data

df <- structure(list(recipe_id = 1:3, 
               ingredients = c("apple, banana, cucumber, water", 
                               "apple, meat, water", 
                               "water")), 
          class = "data.frame", row.names = c(NA, -3L))

df
#>   recipe_id                    ingredients
#> 1         1 apple, banana, cucumber, water
#> 2         2             apple, meat, water
#> 3         3                          water

Created on 2022-03-07 by the reprex package (v2.0.1)

Solution 2:[2]

With functions from tidyverse:

library(tidyverse)
df %>% 
  separate_rows(ingredients) %>% 
  count(ingredients, name = "count") %>% 
  arrange(desc(count))

# A tibble: 5 x 2
#  ingredients count
#  <chr>       <int>
#1 water           3
#2 apple           2
#3 banana          1
#4 cucumber        1
#5 meat            1

Solution 3:[3]

A base R option with scan + table + as.data.frame

> with(df, as.data.frame(table(trimws(scan(text = ingredients, what = "", sep = ",", quiet = TRUE)))))
      Var1 Freq
1    apple    2
2   banana    1
3 cucumber    1
4     meat    1
5    water    3

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Allan Cameron
Solution 2
Solution 3 ThomasIsCoding