Function in R for getting a 0 in new column when value in another column equals any of the rows in another dataset [closed]
I have a list of names in one dataset and a column for 'name' in another dataset. I want R to give me a new column that says 1 if any of the names in my first dataset appear in the column 'name' in that row. In other words, I want it to go row by row and, for the value in a cell of that row, look in my first dataset. If the value appears in my first dataset, I want it coded as 1 in a new column. Can you help? I apologize for not providing the data structure; it's my first time posting. Here is what I am trying to do.
# data.frame() keeps year and dob numeric; as.data.frame(cbind(...)) would coerce every column to character
myDataSet1 <- data.frame( "firstname" = c("Jenny", "Jane", "Jessica", "Jamie", "Hannah"), "year" = c(2018, 2019, 2020, 2021, 2022) )
myDataSet2 <- data.frame( "name" = c("Jenny", "John", "Andy", "Jamie", "Hannah", "Donny"), "dob" = c(1, 2, 3, 4, 5, 6) )
I want to know whether the name in each row of myDataSet1$firstname appears anywhere in the myDataSet2$name column. So, in this case, an ideal result would look like this:
myDataSet1
firstname year namematch
Jenny 2018 1
Jane 2019 0
Jessica 2020 0
Jamie 2021 1
Hannah 2022 1
Solution 1:[1]
Please supply an example of your data. In the meantime, I'm guessing with some made-up data:
myDataSet1 <- data.frame( "PersonName" = c("Peter", "Jane", "John", "Louis", "Hannah"),
                          "NumberOfDogs" = c(9, 2, 5, 3, 5) )
myDataSet2 <- data.frame( "Name" = c("Nora", "John", "Andy", "Louis", "Hannah", "Donny"),
                          "NumberOfCats" = c(1, 2, 3, 4, 5, 6) )
myDataSet1
myDataSet2
# This applies an anonymous function to each name in myDataSet1$PersonName,
# tests whether it appears anywhere in myDataSet2$Name, and returns the result as 0/1.
myDataSet1$IsInDataSet2 <- sapply(myDataSet1$PersonName,
    function(currentName) as.integer( currentName %in% myDataSet2$Name) )
Result
myDataSet1
PersonName NumberOfDogs IsInDataSet2
1 Peter 9 0
2 Jane 2 0
3 John 5 1 #contained in DataSet2
4 Louis 3 1 #contained in DataSet2
5 Hannah 5 1 #contained in DataSet2
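Because `%in%` is already vectorized, the `sapply()` wrapper is not strictly needed. As a minimal sketch, here is the same idea applied to the data from the question. It assumes the column names `firstname` and `name` from the question, and it builds the data with `data.frame()` instead of `as.data.frame(cbind(...))`, which is a substitution so that `year` and `dob` stay numeric.

```r
# Data from the question, rebuilt with data.frame() (an assumption, see above)
myDataSet1 <- data.frame(
  firstname = c("Jenny", "Jane", "Jessica", "Jamie", "Hannah"),
  year      = c(2018, 2019, 2020, 2021, 2022)
)
myDataSet2 <- data.frame(
  name = c("Jenny", "John", "Andy", "Jamie", "Hannah", "Donny"),
  dob  = c(1, 2, 3, 4, 5, 6)
)

# %in% returns one TRUE/FALSE per element of firstname;
# as.integer() turns that into 1/0
myDataSet1$namematch <- as.integer(myDataSet1$firstname %in% myDataSet2$name)

myDataSet1
```

With this data, Jenny, Jamie, and Hannah get a 1 because all three appear in `myDataSet2$name`; Jane and Jessica get a 0.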
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | L D |
