'Assign values in one vector based on matching values in two other vectors in R

I have dataset1 with an id column and a number column. The id number appears multiple times, it can be 5 times or 60 times, or something else.

id    number
2       NA
4       NA
9       ...
1
2
2
3
5
5
5
12

I have dataset2 where each id only appears once and then a value for the number

id  number
1     12 
2     25
3     33
4     121
5     35
9     1500

There are id's in dataset1 tat do not exist in dataset2.

I want to assign the number of dataset2 to the empty number column in dataset1 for the right id's. I tried a few things, but it always gives me an error that the number of observations I want to replace doesn´t match with the input of observations. I know this is the case because my dataset2 has less observations than dataset1. Some things I tried:

dataset1[na.omit(match(dataset1$id, dataset2$id)), ]$number <- data_new[dataset2$id %in% dataset1$id, ]$number

dataset1$number <- ifelse(dataset2$id %in% dataset1$id, dataset2$number, NA)

I would appreciate any help! Thanks!!



Solution 1:[1]

Assuming that the "number" column in your dataset1 only contains NAs, the simplest solution would be to tidy up your dataset1 to a vector of ids and left_join dataset2:

library(dplyr)

tidy <- dataset1 %>%
          group_by(id) %>%
          summarise()

merge <- left_join(tidy, dataset2, by = "id")

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Peter Kamal