Exclude observations below a certain threshold in a stacked bar chart in ggplot2

I need to exclude observations below a certain threshold from a stacked bar chart made with ggplot2.

An example of my dataframe:

[image: sample of the dataframe]

My code:

  ggplot(df, aes(x=reorder(UserName,-Nb_Interrogations, sum), y=Nb_Interrogations, fill=Folder)) + 
  geom_bar(stat="identity") +
  theme_bw()+
  theme(legend.key.size = unit(0.5,"line"), legend.position = c(0.8,0.7)) +
  labs(x = "UserName") +
  scale_y_continuous(limits = c(0, 95000), breaks = seq(0, 95000, 10000)) +
  scale_fill_brewer(palette = "Blues") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) 

The problem is that I have many observations (UserName) with low values on the y-axis (Nb_Interrogations). So I'd like to exclude every UserName below a certain threshold from the bar plot, let's say 100.

I tried using the which function, changing my code to:

ggplot(df[which(df$Nb_Interrogations>100),], aes(x=reorder(UserName,-Nb_Interrogations, sum), y=Nb_Interrogations, fill=Folder)) + 
  geom_bar(stat="identity") +
  theme_bw()+
  theme(legend.key.size = unit(0.5,"line"), legend.position = c(0.8,0.7)) +
  labs(x = "UserName") +
  scale_y_continuous(limits = c(0, 95000), breaks = seq(0, 95000, 10000)) +
  scale_fill_brewer(palette = "Blues") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) 

But it doesn't fit my case, since it removes every individual observation below the threshold of 100 from the overall computation, which also changes the y-axis values. How can I solve this problem? Thanks.



Solution 1:[1]

It looks like the simplest solution for you will involve subsetting your data first, and then plotting. Without workable data to test, this is just a theoretical answer, so you may have to adapt for your needs. You can pipe the subsetting and plotting together for ease. Something like this might do the trick for you:

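library(dplyr)
library(ggplot2)

df %>%
  group_by(UserName) %>%
  # keep every row of a user whose total across folders exceeds the threshold
  filter(sum(Nb_Interrogations) > 100) %>%
  ggplot(aes(x=reorder(UserName,-Nb_Interrogations, sum), y=Nb_Interrogations, fill=Folder)) +
  geom_bar(stat="identity")  # then add the rest of your plotting layers with +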
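
To see why the threshold has to be applied to each user's total rather than to individual rows, here is a minimal sketch on made-up data. The `toy` data frame and its values are hypothetical stand-ins for the asker's `df`; only the column names come from the question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data (values invented for illustration only)
toy <- data.frame(
  UserName = c("alice", "alice", "bob", "bob", "carol", "carol"),
  Folder = c("A", "B", "A", "B", "A", "B"),
  Nb_Interrogations = c(5000, 40, 60, 70, 20, 30)
)

# Row-level filter (what which() does): alice loses her small "B" segment,
# and bob disappears even though his total is 130
by_row <- toy[which(toy$Nb_Interrogations > 100), ]

# Group-level filter: keep all rows of any user whose total exceeds 100
by_user <- toy %>%
  group_by(UserName) %>%
  filter(sum(Nb_Interrogations) > 100) %>%
  ungroup()

# Stacked bars for the remaining users, with their full totals intact
p <- ggplot(by_user, aes(x = reorder(UserName, -Nb_Interrogations, sum),
                         y = Nb_Interrogations, fill = Folder)) +
  geom_bar(stat = "identity")
```

With the group-level filter, carol (total 50) is dropped, while alice and bob keep all of their folder segments, so the bar heights match the unfiltered plot.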

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1