How to write a loop to plot histograms and box plots for many columns of my data frame in R?
I am new to R.
I have a data frame with 80 columns (variables), each with 100 observations (rows); all the observations are numeric. I need to plot a histogram and a box plot for each column (variable). I think the easiest way is to write a loop, but I don't know how to do it.
Solution 1:[1]
Here is one approach to plotting histograms/boxplots with a similar 'sample' dataset:
library(ggplot2)
library(tidyr)
set.seed(1)
# create sample data
df <- data.frame(matrix(sample(1:1000, size = 8000, replace = TRUE),
                        ncol = 80,
                        nrow = 100,
                        dimnames = list(c(), paste0("Variable_", 1:80))))
# check format of first 5 columns and 5 rows
head(df[,1:5], n = 5)
#>   Variable_1 Variable_2 Variable_3 Variable_4 Variable_5
#> 1        836        620        218        441        464
#> 2        679        304        610        294        674
#> 3        129        545        194         62        733
#> 4        930        557         19        390        493
#> 5        509        661        273        644        675
# reformat the data to 'long' format (https://tidyr.tidyverse.org/reference/pivot_longer.html)
df_long <- pivot_longer(df, everything(), names_to = "variable", values_to = "observation")
# specify that 'Variables' are ordered (i.e. Variable_2 is after Variable_1)
df_long$variable <- factor(df_long$variable, levels = unique(df_long$variable), ordered = TRUE)
# plot histograms with ggplot
ggplot(df_long, aes(x = observation)) +
  geom_histogram(bins = 15) +
  facet_wrap(~variable, ncol = 6, strip.position = "right")

# plot boxplots with ggplot
ggplot(df_long, aes(x = variable, y = observation)) +
  geom_boxplot(outlier.shape = NA) +
  theme(axis.text.x = element_text(angle = 90))

Created on 2022-01-28 by the reprex package (v2.0.1)
With 80 variables the figures come out 'squashed' at the default size, but you can set a larger width and height when you export or save them so that the axes and text are readable. Does this solve your problem?
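If you would rather have an explicit loop, as the question asks, here is a minimal base R sketch. It is an alternative to the faceted plots, not a replacement for them. It recreates the same sample data and writes one PDF page per column, with the histogram on the left and the box plot on the right. The output file name `all_variables.pdf` and the page size of 8 x 4 inches are assumptions chosen for illustration, so adjust them to suit your data.

```r
# Recreate the sample data used above (80 numeric columns, 100 rows)
set.seed(1)
df <- data.frame(matrix(sample(1:1000, size = 8000, replace = TRUE),
                        ncol = 80,
                        nrow = 100,
                        dimnames = list(c(), paste0("Variable_", 1:80))))

# Hypothetical output path; replace with wherever you want the file saved
out_file <- file.path(tempdir(), "all_variables.pdf")

# Open a multi-page PDF device; width/height are in inches (illustrative values)
pdf(out_file, width = 8, height = 4)

# Two panels side by side on each page
par(mfrow = c(1, 2))

# Loop over the column names; each iteration fills one page
for (col_name in names(df)) {
  hist(df[[col_name]],
       main = paste("Histogram of", col_name),
       xlab = col_name)
  boxplot(df[[col_name]],
          main = paste("Box plot of", col_name),
          ylab = col_name)
}

# Close the device so the file is actually written to disk
invisible(dev.off())
```

Because each variable gets its own page, nothing is squashed, at the cost of not seeing all 80 variables side by side. If some of your columns are not numeric, you could loop over `names(df)[sapply(df, is.numeric)]` instead.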
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jared_mamrot |
