'How to calculate SE and add error bars to R bar graph, SE only calculating NA

I am trying to calculate SE in R to add error bars to my species richness graph. Here is my code:

#calculate means and standard errors, by habitat type and richness
bmir<-rich%>% 
  group_by(habitat_type,habitat_richness)%>%
  summarise(mean = mean(habitat_richness),num_obs=n(),sum_habitat_richness=sum(habitat_richness),
            sd_habitat_richness= sd(habitat_richness),se_mean=sd_habitat_richness/sqrt(num_obs),
            se_upper=mean+se_mean,se_lower=mean-se_mean)

#add error bars
ggplot(data=bmir, aes(x=habitat_type,y=habitat_richness,fill=habitat_type))+
  geom_bar(stat="identity")+
  geom_errorbar (aes( x=habitat_type, ymin= se_lower , ymax = se_upper), width=0.2) + 
  ylab("Major Group Richness") + 
  xlab('Habitat')+
  theme_bw()+
  theme(text=element_text(size=18))+
  theme(axis.text.x = element_text(angle = 45,vjust=0.5))+
  theme(legend.position="none")+ #remove legend

and my data frames:

bmir
habitat_type        habitat_richness  mean num_obs sum_habitat_richness sd_habitat_richness se_mean se_upper se_lower
  <chr>                          <int> <dbl>   <int>                <int>               <dbl>   <dbl>    <dbl>    <dbl>
1 backwater                          6     6       1                    6                  NA      NA       NA       NA
2 main stem pool                    10    10       1                   10                  NA      NA       NA       NA
3 main stem riffle                   8     8       1                    8                  NA      NA       NA       NA
4 off-channel pond                  11    11       1                   11                  NA      NA       NA       NA
5 side channel pool                 11    11       1                   11                  NA      NA       NA       NA
6 side channel riffle                7     7       1                    7                  NA      NA       NA       NA
 
rich
habitat_type        habitat_richness
  <chr>                          <int>
1 backwater                          6
2 main stem pool                    10
3 main stem riffle                   8
4 off-channel pond                  11
5 side channel pool                 11
6 side channel riffle                7

I am not sure why it is only calculating NA as my SE. It seems simple enough, but I am not sure what the problem is.



Solution 1:[1]

So looking through your code and coming up with a reproducible example, I think the issue may simply be that you either a) calculate richness by habitat type, instead of by sample site or b) you included habitat richness in the group_by() function when you created bmir. Below is a reproducible example using fake data that separates the calculation of site species richness from mean habtiat species richness, and I think gives the results you are after:

##Loading Necessary Packages##
library(dplyr)
library(ggplot2)

set.seed(94)#For reproducibility

##Creating Fake Data#
#First get potential habitat types and sample sites##
habitats<-c("backwater", "main stem pool", "main stem pool", "main stem riffle", "off channel pond", "side channel pool", "side channel riffle")
sites<-paste("site", c(1:30), sep="_")

#Randomly assign a habitat type to each site##
habitat<-sample(habitats, length(sites), replace = TRUE)

#Create a dataframe of sites and habitat
site_info<-data.frame(sites, habitat)

#Next make species pool#
species<-c("Ephemeroptera A", "Ephemeroptera B", "Ephemeroptera C",
           "Ephemeroptera D", "Ephemeroptera E", "Plecoptera A","Plecoptera B",
           "Plecoptera C", "Plecoptera D")

#Randomly assign a species richness to each site#
tmp<-sample(1:9, length(sites), replace = TRUE)

##Using a loop, we will randomly draw 
#the number of species from the species pool 
#for each site and assign them to the site 
#and given them an abundance value

species_comp<-list()#A blank list to save iterative results of loop to

for(i in 1:length(sites)){##Seting the looping range, in this case we will perform the functions below for each site
  species_comp[[i]]<-data.frame(species=sample(species, tmp[i], replace=FALSE))
  species_comp[[i]][,"abundance"]<-sample(c(1:80), tmp[i], replace=TRUE)
  species_comp[[i]][,"sites"]<-sites[i]
}#end loop

#Make the species abundace data a single dataframe
species_info<-do.call(rbind, species_comp)

#Finalize our fake data set by merging the habitat and species datasets
aquatic_data<-merge(species_info, site_info, by="sites")

#Calculate richness for each site#
rich<-aquatic_data %>% group_by(sites, habitat) %>% summarize(richness=n())


#Calculate summary statistics for each habitat type#
bmir<-rich%>% 
  group_by(habitat)%>%
  summarise(mean = mean(richness),num_obs=n(),sum_habitat_richness=sum(richness),
            sd_habitat_richness= sd(richness),se_mean=sd_habitat_richness/sqrt(num_obs),
            se_upper=mean+se_mean,se_lower=mean-se_mean)

#plot the data adding error bars
ggplot(data=bmir, aes(x=habitat,y=mean,fill=habitat))+
  geom_bar(stat="identity")+
  geom_errorbar (aes( x=habitat, ymin= se_lower , ymax = se_upper), width=0.2) + 
  ylab("Major Group Richness") + 
  xlab('Habitat')+
  theme_bw()+
  theme(text=element_text(size=18))+
  theme(axis.text.x = element_text(angle = 45,vjust=0.5))+
  theme(legend.position="none") #remove legend

enter image description here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Sean McKenzie