Obtaining the percentage of repeated non-zero values

My data looks like this:

df<-structure(list(team_3_F = c("browingal ", "browingal ", "browingal ", 
"browingal ", "browingal ", "browingal ", "browingal ", "browingal ", 
"browingal ", "browingal ", "browingal ", "browingal ", "newyorkish", 
"newyorkish", "newyorkish", "newyorkish", "site", "site", "site", 
"site", "site", "site", "team ", "team ", "team ", "team ", "team ", 
"team ", "team ", "team ", "team ", "team ", "team ", "team ", 
"team ", "team ", "team ", "team ", "team ", "team ", "team ", 
"team ", "team ", "team "), AAA_US = c(0L, 1L, 0L, 0L, 0L, 0L, 
1L, 0L, 0L, 0L, 0L, 0L, 88L, 5L, 11L, 1L, 0L, 0L, 0L, 45L, 0L, 
0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 19L), BBB_US = c(0L, 2L, 3L, 2L, 1L, 
0L, 1L, 0L, 0L, 2L, 1L, 0L, 0L, 3L, 0L, 0L, 8L, 0L, 0L, 0L, 0L, 
0L, 0L, 4L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 45L, 0L, 0L, 0L, 18L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 19L), CCC_US = c(0L, 0L, 0L, 0L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 88L, 5L, 2L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 19L)), class = "data.frame", row.names = c(NA, 
-44L))

I want to obtain the percentage of each combination relative to each category, for instance:

AAA_BBB_US   AAA_CCC_US   rows   team_3_F
2            1            12     browingal
2            2            4      newyorkish
0            0            6      site
4            2            22     team

which means the percentages would be computed as follows:

AAA_BBB_US                     AAA_CCC_US       
    2/12*100               1/12*100           
    2/4*100                2/4*100             
    0/6*100                0/6*100              
    4/22*100               2/22*100
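
As a quick check of the arithmetic above, the fractions can be evaluated directly in R. This is only a sketch: the counts and group sizes are copied from the table above, and the vector names (`both_ab`, `both_ac`, `n_rows`) are made up for illustration.

```r
# Counts of rows where both columns are non-zero, copied from the table above
both_ab <- c(browingal = 2, newyorkish = 2, site = 0, team = 4)
both_ac <- c(browingal = 1, newyorkish = 2, site = 0, team = 2)

# Number of rows in each category
n_rows <- c(browingal = 12, newyorkish = 4, site = 6, team = 22)

# Percentage = count / group size * 100, rounded to one decimal place
pct <- data.frame(
  AAA_BBB_US = round(both_ab / n_rows * 100, 1),
  AAA_CCC_US = round(both_ac / n_rows * 100, 1)
)
pct
```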

so the output would look like this:

AAA_BBB_US    AAA_CCC_US
16.7%          8.3%
50%            50%
0%             0%
18.2%          9.1%

Solution 1:^[1]

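You can create the AAA_BBB_US, AAA_CCC_US and AAA_BBB_CCC_US columns as below. Each one is TRUE when the product of the corresponding columns is non-zero, i.e. when every column in the combination is non-zero. Then, by team, sum the values and divide by the number of rows (n()) in each group. The result is a proportion; multiply it by 100 to get a percentage.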

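library(dplyr)

df %>% 
  # TRUE where every column in the combination is non-zero
  mutate(AAA_BBB_US = AAA_US * BBB_US != 0,
         AAA_CCC_US = AAA_US * CCC_US != 0,
         AAA_BBB_CCC_US = AAA_US * BBB_US * CCC_US != 0) %>% 
  group_by(team_3_F) %>%
  # share of TRUE rows per team: number of TRUEs divided by the group size
  summarize(across(AAA_BBB_US:AAA_BBB_CCC_US, ~ sum(.x) / n()))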

Output:

# A tibble: 4 x 4
  team_3_F     AAA_BBB_US AAA_CCC_US AAA_BBB_CCC_US
  <chr>             <dbl>      <dbl>          <dbl>
1 "browingal "      0.167     0.0833         0.0833
2 "newyorkish"      0.25      1              0.25  
3 "site"            0         0              0     
4 "team "           0.182     0.0909         0.0909
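
The values above are proportions, while the question asks for percentages. A minimal sketch of one way to get there, assuming dplyr is installed: take the mean of the logical condition, multiply by 100, and round. The data frame is rebuilt here with `rep()` only to keep the example self-contained (the values are the same as in the question, stored as doubles rather than integers). The name `pct`, the `trimws()` clean-up of the team labels, and the one-decimal rounding are choices made for this illustration, not part of the original answer.

```r
library(dplyr)

# Same values as the question's data, rebuilt compactly
df <- data.frame(
  team_3_F = rep(c("browingal ", "newyorkish", "site", "team "),
                 times = c(12, 4, 6, 22)),
  AAA_US = c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
             88, 5, 11, 1,
             0, 0, 0, 45, 0, 0,
             0, 1, 0, 0, 1, 0, 0, 0, 0, 2,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19),
  BBB_US = c(0, 2, 3, 2, 1, 0, 1, 0, 0, 2, 1, 0,
             0, 3, 0, 0,
             8, 0, 0, 0, 0, 0,
             0, 4, 0, 0, 1, 0, 0, 0, 0, 45,
             0, 0, 0, 18, 0, 0, 0, 1, 0, 0, 0, 19),
  CCC_US = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
             88, 5, 2, 1,
             0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19)
)

pct <- df %>%
  # remove the trailing spaces in labels such as "browingal " and "team "
  mutate(team_3_F = trimws(team_3_F)) %>%
  group_by(team_3_F) %>%
  summarize(
    # mean() of a logical vector is the share of TRUE rows in the group
    AAA_BBB_US = mean(AAA_US != 0 & BBB_US != 0) * 100,
    AAA_CCC_US = mean(AAA_US != 0 & CCC_US != 0) * 100,
    .groups = "drop"
  ) %>%
  # round every percentage column to one decimal place
  mutate(across(-team_3_F, ~ round(.x, 1)))

pct
```

On this data the newyorkish row comes out as 25 and 100, in line with the proportions 0.25 and 1 in the output above, rather than the 50% used in the question's illustration.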

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
