Counting Variables in R with 3 columns
I have been working on this, but nothing seems to work. I have a dataset of approximately 10k rows.
After cleaning the data, I want to count the products sold (there are more than 30 types, each repeated many times) to see which one sells the most and flag the top 10. However, I also want to include the price of one unit next to the count (n) column. For example, if Apple was sold 1111 times, I want $1 next to the count.
| Product_name | Sold | Price |
|---|---|---|
| Apple | 1 | 1.00 |
| Orange | 1 | 2.00 |
| Apple | 1 | 1.00 |
| Orange | 1 | 2.00 |
| Apple | 1 | 1.00 |
| Orange | 1 | 2.00 |
Using df %>% count(Product_name) gives this:
| Product_name | n |
|---|---|
| Apple | 1111 |
| Orange | 2222 |
and I want to get this:
| Product_name | n | Price |
|---|---|---|
| Apple | 1111 | 1.00 |
| Orange | 2222 | 2.00 |
My real data looks similar to this example, with probably 30 different Product_name values. I would really appreciate the help.
Thanks,
Solution 1:[1]
If the price does not vary for a particular product, you could just use Price = first(Price) within the summarise() call.
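A minimal sketch of that idea, assuming dplyr is installed and that each product has a single price. The toy data and the `top10` column name are made up for illustration; they are not from the original post.

```r
library(dplyr)

# Toy data in the same shape as the question (values are illustrative)
df <- data.frame(
  Product_name = c("Apple", "Orange", "Apple", "Orange", "Apple"),
  Sold = c(1, 1, 1, 1, 1),
  Price = c(1.00, 2.00, 1.00, 2.00, 1.00)
)

counts <- df %>%
  group_by(Product_name) %>%
  summarise(
    n = n(),               # number of rows per product
    Price = first(Price),  # assumes the price is constant within a product
    .groups = "drop"
  ) %>%
  arrange(desc(n)) %>%
  mutate(top10 = row_number() <= 10)  # hypothetical flag for the ten most-sold products

print(counts)
```

If the price really is constant per product, df %>% count(Product_name, Price) should produce the same n and Price columns, since count() accepts several grouping columns.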
Solution 2:[2]
You can achieve this using the base R aggregate() function. Note that this sums the Sold column for each product and price rather than counting rows.
Product_name <- c("Apple", "Apple", "Orange", "Apple", "Orange")
Sold <- c(1,2,1,1,1)
Price <- c(1,1,2,1,2)
df <- data.frame(Product_name, Sold, Price)
new_df <- aggregate(Sold ~ Product_name + Price, data = df, sum)
Resulting in:
| Product_name | Price | Sold |
|---|---|---|
| Apple | 1 | 4 |
| Orange | 2 | 2 |
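If you want the number of rows per product (the n from the question) rather than the sum of Sold, one possible variation is to pass length as the aggregating function. This is a sketch using the same toy data; renaming the column to n is my own choice to match the question's desired output.

```r
df <- data.frame(
  Product_name = c("Apple", "Apple", "Orange", "Apple", "Orange"),
  Sold = c(1, 2, 1, 1, 1),
  Price = c(1, 1, 2, 1, 2)
)

# FUN = length counts the rows in each Product_name/Price group
row_counts <- aggregate(Sold ~ Product_name + Price, data = df, FUN = length)

# Rename the aggregated column so it reads as a count
names(row_counts)[names(row_counts) == "Sold"] <- "n"

# Sort so the most frequently sold product comes first
row_counts <- row_counts[order(-row_counts$n), ]

print(row_counts)
```

Because Price is part of the grouping formula, a product that appears with two different prices would end up on two separate rows, which can be a quick way to spot inconsistent prices in the data.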
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mkpt_uk |
| Solution 2 | edv |
