phyloseq: Discrepancies in OTU counts before and after using tax_glom

I may have missed something about how tax_glom works, but I could not find an explanation here or elsewhere on the web, so perhaps someone can help. I have not included the data, but I can provide it on request. The code below shows the issue:

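library(phyloseq)
library(magrittr)  # for %>%

# Per-sample totals before agglomeration
colSums(CYANO %>% otu_table())

# Agglomerate at the Genus rank, then recompute per-sample totals
CYANO_gen <- CYANO %>%
  tax_glom(taxrank = "Genus")
colSums(CYANO_gen %>% otu_table())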

CYANO is a phyloseq object that I wanted to agglomerate at the Genus rank. I noticed that one sample (named 100) was missing from a plot, which led me to check where the problem arose. Of the 54 samples, 7 show different total counts before and after tax_glom, as shown in the last line of the attached image. Weird, isn't it?

[Image: results of the code above, plus two additional lines showing how large the discrepancies are and that they do not occur in every sample]

Thanks, Guillaume



Solution 1:[1]

The NArm argument of tax_glom is TRUE by default, so taxa with NA at the chosen rank (here, Genus) are removed before agglomeration and their counts drop out of the sample totals. To avoid losing those observations, set NArm = FALSE. Cheers
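To make the effect of NArm concrete, here is a minimal sketch on a toy phyloseq object. The object `ps`, the OTU and sample names, and the counts are all made up for illustration and are not the asker's CYANO data. Two of the four taxa have NA at the Genus rank, so the default call should lose their counts, while NArm = FALSE should preserve the per-sample totals.

```r
library(phyloseq)

# Toy data (hypothetical): 4 taxa x 2 samples, taxa as rows.
otu <- matrix(
  c(10, 0,
     5, 3,
     2, 7,
     0, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), c("S1", "S2"))
)

# OTU3 and OTU4 have no Genus assignment (NA).
tax <- matrix(
  c("Cyanobacteria", "Nostoc",
    "Cyanobacteria", "Nostoc",
    "Cyanobacteria", NA,
    "Cyanobacteria", NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), c("Phylum", "Genus"))
)

ps <- phyloseq(otu_table(otu, taxa_are_rows = TRUE), tax_table(tax))

# Default behaviour: NArm = TRUE drops taxa with NA at the Genus rank.
glom_default <- tax_glom(ps, taxrank = "Genus")

# Keep the NA-genus taxa; those sharing the same higher ranks are merged.
glom_keep_na <- tax_glom(ps, taxrank = "Genus", NArm = FALSE)

# Compare per-sample totals before and after agglomeration.
sample_sums(ps)
sample_sums(glom_default)
sample_sums(glom_keep_na)

# Samples whose totals changed under the default call.
names(which(sample_sums(ps) != sample_sums(glom_default)))
```

The same comparison on the real object (replacing `ps` with CYANO) should list exactly the samples that carry counts in taxa unassigned at the Genus rank, which would explain why only some samples are affected.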

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Zina