Is there a function in R to colour code data points on a regression based on another column?
I am trying to make a regression plot of two variables where the points are colour coded based on another column that contains "Yes" or "No".
Here is a random example with data I have made up. I hope this is okay, as I'm not sure how to make a reproducible example.
| km | litre | WOF |
|---|---|---|
| 133 | 1 | Yes |
| 88 | 2 | Yes |
| 222 | 1 | No |
I have tried something like this, based on my lecture notes, but it is not working:
cars <- read.csv("random_cardata.csv")
plot(km ~ litre, data = cars, pch = 19, col=c("black", "red")[car$WOF])
I am also trying to figure out how to make a legend where on the side it says:
"Green = has WOF" and
"Red = no WOF"
Any help would be appreciated.
Solution 1:[1]
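library(ggplot2)
# Recreate the example data from the question
data <- tibble::tribble(
  ~km, ~litre, ~WOF,
  133L, 1L, "Yes",
  88L, 2L, "Yes",
  222L, 1L, "No"
)
# Scatter plot with points coloured by WOF; the legend is added automatically
ggplot(data, aes(litre, km, color = WOF)) +
  geom_point()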

Created on 2022-03-30 by the reprex package (v2.0.0)
Or, more comprehensively, with custom legend labels and colours:
library(tidyverse)
data <- tibble::tribble(
  ~km, ~litre, ~WOF,
  133L, 1L, "Yes",
  88L, 2L, "Yes",
  222L, 1L, "No"
)
data %>%
  # Recode Yes/No into the labels wanted in the legend
  mutate(WOF2 = ifelse(WOF == "Yes", "has WOF", "no WOF")) %>%
  ggplot(aes(litre, km, color = WOF2)) +
  geom_point() +
  # Map each label to a fixed colour and set the legend title
  scale_color_manual(values = c("has WOF" = "green", "no WOF" = "red")) +
  labs(color = "WOF")

Created on 2022-03-30 by the reprex package (v2.0.0)
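For a base R version closer to the attempt in the question, here is a minimal sketch. The attempt most likely fails for two reasons: it refers to `car$WOF` although the data frame is called `cars`, and since R 4.0.0 `read.csv()` keeps text columns as character rather than factor, so `WOF` cannot index the colour vector directly. The data frame below is typed in by hand to stand in for `random_cardata.csv`. The names `wof_cols`, `point_cols` and `fit`, the legend position, and the optional regression line are assumptions added for illustration.

```r
# Made-up data from the question (stands in for random_cardata.csv)
cars <- data.frame(
  km    = c(133, 88, 222),
  litre = c(1, 2, 1),
  WOF   = c("Yes", "Yes", "No")
)

# Convert WOF to a factor so its integer codes can index the colour vector.
# Levels are set explicitly: "Yes" -> 1 -> green, "No" -> 2 -> red.
cars$WOF <- factor(cars$WOF, levels = c("Yes", "No"))
wof_cols <- c("green", "red")
point_cols <- wof_cols[cars$WOF]

# Scatter plot with one colour per point
plot(km ~ litre, data = cars, pch = 19, col = point_cols)

# Optional: overlay the fitted line from a simple linear regression
fit <- lm(km ~ litre, data = cars)
abline(fit)

# Legend entries are listed in the same order as the factor levels
legend("topright", legend = c("has WOF", "no WOF"), col = wof_cols, pch = 19)
```

Setting the factor levels explicitly keeps the colours and the legend labels in the same order; with the default alphabetical levels, "No" would come first and take the first colour.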
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | danlooo |
