'How to create a table with if statements for each row in each column in r
I would like to create a new output that is based on an if statement for each row in each column for example:
If the value of the year for each flower in table 1 is smaller than the min for each flower in each year in table 2 then the value will be the value of min in table 2, else the value of year in table 1
I have an idea of how the code could look but not sure how to type it to create the output I need so that it is applied to each flower for each year.
IF (Table1$year < Table2$min, Table2$min, Table1$year)
(table 1)
| flower | 1902 | 1950 | 2010 | 2012 | 2021 |
|---|---|---|---|---|---|
| lily | 23 | 23 | 23 | 31 | 36 |
| rose | 50 | 110 | 115 | 131 | 131 |
| daisy | 30 | 37 | 47 | 49 | 49 |
(table 2)
| flower | year | min |
|---|---|---|
| lily | 1902 | 15 |
| rose | 1902 | 100 |
| daisy | 1902 | 35 |
| lily | 1950 | 23 |
| rose | 1950 | 50 |
| daisy | 1950 | 30 |
| lily | 2010 | 16 |
| rose | 2010 | 50 |
| daisy | 2010 | 67 |
The output for the first year will look like:
| flower | 1902 |
|---|---|
| lily | 23 |
| rose | 100 |
| daisy | 35 |
Solution 1:[1]
I think it will help to make the first table match the "long" form of the 2nd table. Then you could join the two tables together, pick the max value for each flower-year combination, and then reshape wide:
library(dplyr)
t1_long %>%
left_join(t2) %>%
group_by(flower, year) %>%
summarize(max = max(value, min, na.rm = TRUE), .groups = "drop") %>%
tidyr::pivot_wider(names_from = year, values_from = max, names_prefix = "x"))
Result
Joining, by = c("flower", "year")
# A tibble: 3 × 6
flower x1902 x1950 x2010 x2012 x2021
<chr> <int> <int> <int> <int> <int>
1 daisy 35 37 67 49 49
2 lily 23 23 23 31 36
3 rose 100 110 115 131 131
To make your first table long, I used:
t1_long <- t1 %>%
tidyr::pivot_longer(-flower, names_to = "year") %>%
mutate(year = parse_number(year))
Starting from the data below. Note that R advises against column names that start with a number: https://cran.r-project.org/doc/FAQ/R-FAQ.html#What-are-valid-names_003f:
A "syntactic" name for a data frame column "starts with either a letter or a dot not followed by a number."
t1 <- data.frame(
stringsAsFactors = FALSE,
flower = c("lily", "rose", "daisy"),
x1902 = c(23L, 50L, 30L),
x1950 = c(23L, 110L, 37L),
x2010 = c( 23L, 115L, 47L),
x2012 = c( 31L, 131L, 49L),
x2021 = c( 36L, 131L, 49L)
)
t2 <- data.frame(
stringsAsFactors = FALSE,
flower = c("lily","rose","daisy",
"lily","rose","daisy","lily","rose","daisy"),
year = c(1902L,1902L,1902L,1950L,
1950L,1950L,2010L,2010L,2010L),
min = c(15L, 100L, 35L, 23L, 50L, 30L, 16L, 50L, 67L)
)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
