'Counting unique list items
Assuming I have a datatable dt.recipes which consists of lists with various items, for example:
recipe_id ingredients
1 apple, banana, cucumber, water
2 apple, meat, water
3 water
How can I create a table, counting the amount of unique items present in dt.recipes$ingredients? In other words, I am looking for a result similar to this:
ingredient count
water 3
apple 2
banana 1
cucumber 1
meat 1
Any pointers would be greatly appreciated, thanks in advance!
Solution 1:[1]
You can do:
as.data.frame(table(unlist(strsplit(df$ingredients, ", "))))
#> Var1 Freq
#> 1 apple 2
#> 2 banana 1
#> 3 cucumber 1
#> 4 meat 1
#> 5 water 3
Data
df <- structure(list(recipe_id = 1:3,
ingredients = c("apple, banana, cucumber, water",
"apple, meat, water",
"water")),
class = "data.frame", row.names = c(NA, -3L))
df
#> recipe_id ingredients
#> 1 1 apple, banana, cucumber, water
#> 2 2 apple, meat, water
#> 3 3 water
Created on 2022-03-07 by the reprex package (v2.0.1)
Solution 2:[2]
With functions from tidyverse:
library(tidyverse)
df %>%
separate_rows(ingredients) %>%
count(ingredients, name = "count") %>%
arrange(desc(count))
# A tibble: 5 x 2
# ingredients count
# <chr> <int>
#1 water 3
#2 apple 2
#3 banana 1
#4 cucumber 1
#5 meat 1
Solution 3:[3]
A base R option with scan + table + as.data.frame
> with(df, as.data.frame(table(trimws(scan(text = ingredients, what = "", sep = ",", quiet = TRUE)))))
Var1 Freq
1 apple 2
2 banana 1
3 cucumber 1
4 meat 1
5 water 3
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Allan Cameron |
| Solution 2 | |
| Solution 3 | ThomasIsCoding |
