Plotting multiple columns in R ggplot and getting the legend right

I have a data frame, with 5 columns and 51 rows like the one below:

year var1 Var2
1990 2.2 4.1
1991 2.3 4.4

I want to plot one line per column, to see the trend of each variable over the years.

I tried with:

lt <- c("VAR2" = "twodash", "VAR3" = "dotted", "VAR4" = "dashed", "VAR5" = "solid")

ggp1 <- ggplot(DATAFRAME, aes(VAR1)) +       # Create ggplot2 plot
  geom_line(aes(y = VAR2, linetype = "twodash", group=1, )) +
  geom_line(aes(y = VAR3, linetype = "dotted", group=1)) +
  geom_line(aes(y = VAR4, linetype = "dashed", group=1)) + 
  geom_line(aes(y = VAR5, linetype = "solid", group=1)) +
  labs(y = "number of parties in government", x= "year", lintype = "Legend")+
  scale_linetype_manual(values = lt) 
ggp1 

But it doesn't work. How can I do this?

Thank you all for the help.



Solution 1:[1]

This should do it:

library(ggplot2)
DATAFRAME <- data.frame(
  VAR1 = 1:20, 
  VAR2 = runif(20,0,1), 
  VAR3 = runif(20,1,2), 
  VAR4 = runif(20,2,3), 
  VAR5 = runif(20,3,4)
)


lt <- c("VAR2" = "twodash", "VAR3" = "dotted", "VAR4" = "dashed", "VAR5" = "solid")

ggplot(DATAFRAME, aes(VAR1)) +       # Create ggplot2 plot
  geom_line(aes(y = VAR2, linetype = "VAR2", group = 1)) +
  geom_line(aes(y = VAR3, linetype = "VAR3", group = 1)) +
  geom_line(aes(y = VAR4, linetype = "VAR4", group = 1)) +
  geom_line(aes(y = VAR5, linetype = "VAR5", group = 1)) +
  labs(y = "number of parties in government", x = "year", linetype = "Legend") +
  scale_linetype_manual(values = unname(lt))

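The trick here is that when you create aesthetics this way (e.g., linetype = "VAR2" or linetype = "twodash"), ggplot2 treats the strings as levels of a discrete variable and orders them alphabetically. In your original code, the levels would be, in order, dashed, dotted, solid and twodash. What I did above was to use labels (VAR2 to VAR5) that already sort in the intended order, and then supply the line-type values in that same order. Another catch is that a named values vector is matched to the levels by name rather than by position. In your original code the names of lt (VAR2, VAR3, ...) did not match the levels (dashed, dotted, ...), so nothing matched; unname() forces matching by position instead.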
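The alphabetical ordering can be checked in base R, without ggplot2. This is only a minimal sketch using the four labels from the question; the object names labels and lvls are made up for illustration.

```r
# The four linetype labels, in the order they appear in the question's code
labels <- c("twodash", "dotted", "dashed", "solid")

# factor() sorts its levels alphabetically by default
lvls <- levels(factor(labels))
print(lvls)
```

The levels come back sorted (dashed, dotted, solid, twodash) rather than in the order they were written, which is why an unnamed values vector has to follow the sorted order.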

You could also specify breaks in scale_linetype_manual to set the levels of the factor that is created in the background. Then, your code would work as well.


ggplot(DATAFRAME, aes(VAR1)) +       # Create ggplot2 plot
  geom_line(aes(y = VAR2, linetype = "twodash", group = 1)) +
  geom_line(aes(y = VAR3, linetype = "dotted", group = 1)) +
  geom_line(aes(y = VAR4, linetype = "dashed", group = 1)) +
  geom_line(aes(y = VAR5, linetype = "solid", group = 1)) +
  labs(y = "number of parties in government", x = "year", linetype = "Legend") +
  scale_linetype_manual(values = unname(lt), breaks = c("twodash", "dotted", "dashed", "solid"))

Created on 2022-04-12 by the reprex package (v2.0.1)

Solution 2:[2]

To get the names into the legend, you can follow the answer from u/DaveArmstrong, but arguably the better way is to transform your dataset into "long form", also known as a tidy data frame.

The general idea is that each column represents one variable and each row represents an observation. The column layout should be year | variable | value instead of year | VAR1 | VAR2 | VAR3.... There are lots of ways to convert your dataframe. A pretty straightforward way is to use pivot_longer() as you'll see below:

library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(8675309)  # so you get the same randomness I do

df <- data.frame(
  year = 2000:2020,
  VAR1 = rnorm(21, 10, 1),
  VAR2 = rnorm(21, 15, 2),
  VAR3 = rbinom(21, size=20, prob = 0.9),
  VAR4 = rcauchy(21, location=12, scale=2)
)

df %>% pivot_longer(cols = -year, names_to = "variable", values_to = "value") %>%
  ggplot(aes(x=year, y=value)) +
  geom_line(aes(linetype=variable)) +
  theme_bw()
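To see what the reshaping step does on its own, here is a small sketch using the two sample rows from the question. The object names wide and long are just illustrative.

```r
library(tidyr)

# Two rows in the original "wide" layout shown in the question
wide <- data.frame(
  year = c(1990, 1991),
  var1 = c(2.2, 2.3),
  Var2 = c(4.1, 4.4)
)

# Every column except year becomes a (variable, value) pair
long <- pivot_longer(wide, cols = -year,
                     names_to = "variable", values_to = "value")
print(long)
```

Each year now appears once per variable, so the result has 4 rows with the columns year, variable and value.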


Solution 3:[3]

You can also reshape the data to long format with pivot_longer() so you don't have to call geom_line() multiple times. The new name column, which holds the original column names, can then be mapped to linetype as well.

library(ggplot2)
library(dplyr)
library(tidyr)

DATAFRAME <- data.frame(
  VAR1 = 1:20, 
  VAR2 = runif(20,0,1), 
  VAR3 = runif(20,1,2), 
  VAR4 = runif(20,2,3), 
  VAR5 = runif(20,3,4)
)
# reshape to long format, keeping VAR1 (the year) as the identifier column
df2 <- pivot_longer(DATAFRAME, cols =  paste0("VAR",2:5))

ggplot(df2) +
  geom_line(aes(x=VAR1 , y = value,  group =name, linetype = name)) +
  labs(y = "number of parties in government", x = "year", linetype = "Legend")
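
If you want to keep the specific line types from the question, the long format can be combined with a named values vector. This is a sketch, assuming (as the ggplot2 documentation for manual scales describes) that a named vector passed to scale_linetype_manual() is matched to the levels by name. The seed and the object name p are illustrative.

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # arbitrary seed, only so the example is reproducible
DATAFRAME <- data.frame(
  VAR1 = 1:20,
  VAR2 = runif(20, 0, 1),
  VAR3 = runif(20, 1, 2),
  VAR4 = runif(20, 2, 3),
  VAR5 = runif(20, 3, 4)
)

# The names match the values of the "name" column created by pivot_longer()
lt <- c("VAR2" = "twodash", "VAR3" = "dotted", "VAR4" = "dashed", "VAR5" = "solid")

# Reshape every column except VAR1 into name/value pairs
df2 <- pivot_longer(DATAFRAME, cols = -VAR1)

p <- ggplot(df2, aes(x = VAR1, y = value, linetype = name)) +
  geom_line() +
  labs(y = "number of parties in government", x = "year", linetype = "Legend") +
  scale_linetype_manual(values = lt)
p
```

Because the names of lt match the variable names, the legend shows VAR2 to VAR5 and each keeps its intended line type, with no need for unname() or breaks.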
 

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 chemdork123
Solution 3 Mike