How to log transform the y-axis of R geom_histogram in the right direction?

I'm forgetting something very fundamental which would explain why I'm seeing very inflated y values after a log10 transformation of the y-axis.

I have the following stacked ggplot + geom_histogram.

ggTherapy <- ggplot(genderTherapyDF, aes(freq, fill = name)) +
  geom_histogram(binwidth = 1, alpha = 0.5, color = "black") +
  theme_bw() +
  theme(legend.position = "none", axis.title = element_text(size = 14),
        legend.text = element_text(size = 14), axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12, angle = 45),
        legend.background = element_rect(fill = "transparent")) +
  ylab("No. of patients") + xlab("Events") + labs(fill = "") + ggtitle("Therapy")


The y-values are exactly what I expect. However, the distribution is so skewed that the plot is hard to read by eye, so I'd rather see a transformed plot.

I first tried transforming x, but quickly realised that a transformation along the binned axis is very difficult to interpret. So I transformed the counts on the y-axis instead:

ggTherapy <- ggplot(genderTherapyDF, aes(freq, fill = name)) +
  geom_histogram(binwidth = 1, alpha = 0.5, color = "black") +
  theme_bw() +
  theme(legend.position = "none", axis.title = element_text(size = 14),
        legend.text = element_text(size = 14), axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12, angle = 45),
        legend.background = element_rect(fill = "transparent")) +
  ylab("No. of patients") + xlab("Events") + labs(fill = "") + ggtitle("Therapy") +
  scale_y_log10()


Visually, the plot makes sense. However, I'm struggling to come to terms with the y-axis labels! Why are they so huge after a log10 transformation?



Solution 1:[1]

My guess is that you should transform the data manually, as described in the ggplot2 book (https://ggplot2-book.org/scales.html#continuous-position-scales):

"Note that there is nothing preventing you from performing the transformation manually. For example, instead of using scale_x_log10() to transform the scale, you could transform the data instead and plot log10(x). The appearance of the geom will be the same, but the tick labels will be different. Specifically, if you use a transformed scale, the axes will be labelled in the original data space; if you transform the data, the axes will be labelled in the transformed space. Regardless of which method you use, the transformation occurs before any statistical summaries. To transform after statistical computation use coord_trans(). See Section 14.1 for more details."
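
As a rough sketch of what "transforming manually" could look like here: count the patients per event value yourself, take log10 of the counts, and plot those. The data frame toy below is made up to stand in for genderTherapyDF, and counts, log_n, a, b and stacked_label are names chosen only for this illustration. The sketch also rests on an assumption about the cause of the inflated labels: as far as I understand, ggplot2 applies the scale transformation to the computed counts before the bars are stacked, so the stacked heights are sums of log10 values, and a sum of logs reads back on the axis as a product of counts.

```r
library(ggplot2)

# Hypothetical toy data standing in for genderTherapyDF (not from the question).
set.seed(42)
toy <- data.frame(
  freq = c(rpois(500, 2), rpois(300, 3)),
  name = rep(c("A", "B"), times = c(500, 300))
)

# Count patients per event value and group, which is what binwidth = 1
# amounts to for integer-valued data.
counts <- as.data.frame(table(freq = toy$freq, name = toy$name),
                        responseName = "n")
counts$freq <- as.numeric(as.character(counts$freq))  # factor -> numeric
counts <- counts[counts$n > 0, ]  # log10(0) is -Inf, so drop empty bins

# Assumed cause of the inflated labels: adding heights in log10 space
# multiplies the counts instead of adding them.
a <- 100
b <- 50
stacked_label <- 10^(log10(a) + log10(b))  # equals a * b, not a + b

# Manual transformation: the y-axis is now labelled in log10 units.
# Bars are dodged rather than stacked, because stacking log10 values
# would add logs again.
counts$log_n <- log10(counts$n)
p <- ggplot(counts, aes(freq, log_n, fill = name)) +
  geom_col(position = "dodge", alpha = 0.5, color = "black") +
  theme_bw() +
  ylab("log10(No. of patients)") + xlab("Events") + ggtitle("Therapy")
```

Printing p draws the plot. Note that a bin containing a single patient has log10(1) = 0 and therefore shows as a bar of zero height. If you would rather keep the labels in the original units, keeping scale_y_log10() but passing position = "identity" to geom_histogram() (overlapping instead of stacked bars) should avoid the summing of logs as well.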

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 AlbertRapp