'How to calculate SE and add error bars to R bar graph, SE only calculating NA
I am trying to calculate SE in R to add error bars to my species richness graph. Here is my code:
#calculate means and standard errors, by habitat type and richness
bmir<-rich%>%
group_by(habitat_type,habitat_richness)%>%
summarise(mean = mean(habitat_richness),num_obs=n(),sum_habitat_richness=sum(habitat_richness),
sd_habitat_richness= sd(habitat_richness),se_mean=sd_habitat_richness/sqrt(num_obs),
se_upper=mean+se_mean,se_lower=mean-se_mean)
#add error bars
ggplot(data=bmir, aes(x=habitat_type,y=habitat_richness,fill=habitat_type))+
geom_bar(stat="identity")+
geom_errorbar (aes( x=habitat_type, ymin= se_lower , ymax = se_upper), width=0.2) +
ylab("Major Group Richness") +
xlab('Habitat')+
theme_bw()+
theme(text=element_text(size=18))+
theme(axis.text.x = element_text(angle = 45,vjust=0.5))+
theme(legend.position="none")+ #remove legend
and my data frames:
bmir
habitat_type habitat_richness mean num_obs sum_habitat_richness sd_habitat_richness se_mean se_upper se_lower
<chr> <int> <dbl> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 backwater 6 6 1 6 NA NA NA NA
2 main stem pool 10 10 1 10 NA NA NA NA
3 main stem riffle 8 8 1 8 NA NA NA NA
4 off-channel pond 11 11 1 11 NA NA NA NA
5 side channel pool 11 11 1 11 NA NA NA NA
6 side channel riffle 7 7 1 7 NA NA NA NA
rich
habitat_type habitat_richness
<chr> <int>
1 backwater 6
2 main stem pool 10
3 main stem riffle 8
4 off-channel pond 11
5 side channel pool 11
6 side channel riffle 7
I am not sure why it is only calculating NA as my SE. It seems simple enough, but I am not sure what the problem is.
Solution 1:[1]
So looking through your code and coming up with a reproducible example, I think the issue may simply be that you either a) calculate richness by habitat type, instead of by sample site or b) you included habitat richness in the group_by() function when you created bmir. Below is a reproducible example using fake data that separates the calculation of site species richness from mean habtiat species richness, and I think gives the results you are after:
##Loading Necessary Packages##
library(dplyr)
library(ggplot2)
set.seed(94)#For reproducibility
##Creating Fake Data#
#First get potential habitat types and sample sites##
habitats<-c("backwater", "main stem pool", "main stem pool", "main stem riffle", "off channel pond", "side channel pool", "side channel riffle")
sites<-paste("site", c(1:30), sep="_")
#Randomly assign a habitat type to each site##
habitat<-sample(habitats, length(sites), replace = TRUE)
#Create a dataframe of sites and habitat
site_info<-data.frame(sites, habitat)
#Next make species pool#
species<-c("Ephemeroptera A", "Ephemeroptera B", "Ephemeroptera C",
"Ephemeroptera D", "Ephemeroptera E", "Plecoptera A","Plecoptera B",
"Plecoptera C", "Plecoptera D")
#Randomly assign a species richness to each site#
tmp<-sample(1:9, length(sites), replace = TRUE)
##Using a loop, we will randomly draw
#the number of species from the species pool
#for each site and assign them to the site
#and given them an abundance value
species_comp<-list()#A blank list to save iterative results of loop to
for(i in 1:length(sites)){##Seting the looping range, in this case we will perform the functions below for each site
species_comp[[i]]<-data.frame(species=sample(species, tmp[i], replace=FALSE))
species_comp[[i]][,"abundance"]<-sample(c(1:80), tmp[i], replace=TRUE)
species_comp[[i]][,"sites"]<-sites[i]
}#end loop
#Make the species abundace data a single dataframe
species_info<-do.call(rbind, species_comp)
#Finalize our fake data set by merging the habitat and species datasets
aquatic_data<-merge(species_info, site_info, by="sites")
#Calculate richness for each site#
rich<-aquatic_data %>% group_by(sites, habitat) %>% summarize(richness=n())
#Calculate summary statistics for each habitat type#
bmir<-rich%>%
group_by(habitat)%>%
summarise(mean = mean(richness),num_obs=n(),sum_habitat_richness=sum(richness),
sd_habitat_richness= sd(richness),se_mean=sd_habitat_richness/sqrt(num_obs),
se_upper=mean+se_mean,se_lower=mean-se_mean)
#plot the data adding error bars
ggplot(data=bmir, aes(x=habitat,y=mean,fill=habitat))+
geom_bar(stat="identity")+
geom_errorbar (aes( x=habitat, ymin= se_lower , ymax = se_upper), width=0.2) +
ylab("Major Group Richness") +
xlab('Habitat')+
theme_bw()+
theme(text=element_text(size=18))+
theme(axis.text.x = element_text(angle = 45,vjust=0.5))+
theme(legend.position="none") #remove legend
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sean McKenzie |

