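Compare one column with a column in another data frame containing multiple entries
We are trying to find whether each string in A$symbol_A matches any of the strings in B$symbol_B, where a single B$symbol_B entry can contain several symbols. The result should be three data frames: one with the matches, one with the rows only in A, and one with the rows only in B.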
Data example:
### task dfs
A = data.frame(
symbol_A = c("ABC", "ABB", "ACC", "BCG", "AAG"),
id_A = c("1", "2", "3", "4", "5"))
B = data.frame(
symbol_B = c("XXC", "XCT ABB", "TTG WHO ACC", "AAG", "TTR, YHD"),
id_B = c("ab", "dy", "hu", "uh", "yz"))
### expected solution
solution_overlaps <- data.frame(
symbol_A = c("ABB", "ACC", "AAG"),
symbol_B = c("XCT ABB", "TTG WHO ACC", 'AAG'),
id_x = c("2", "3", "5"),
id_y = c('dy', 'hu', "uh" )
)
solution_only_in_A <- data.frame(
symbol_A = c("ABC", "BCG"),
id_A = c("1", "4")
)
solution_only_in_B <- data.frame(
symbol_B = c("XXC", "TTR, YHD"),
id_B = c('ab', "yz")
)
Thanks a lot for helping, hope this is relevant for others as well! (as always, dplyr appreciated!)
Sebastian
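One possible approach, sketched with dplyr and tidyr: split each symbol_B entry into its individual symbols, then use joins. This assumes that the symbols inside symbol_B are separated by spaces and/or commas, that a match means exact equality with one of those symbols, and that id_B is unique per row. The names B_long, token, overlaps, only_in_A and only_in_B are made up for this example.

```r
library(dplyr)
library(tidyr)

A <- data.frame(
  symbol_A = c("ABC", "ABB", "ACC", "BCG", "AAG"),
  id_A = c("1", "2", "3", "4", "5"))
B <- data.frame(
  symbol_B = c("XXC", "XCT ABB", "TTG WHO ACC", "AAG", "TTR, YHD"),
  id_B = c("ab", "dy", "hu", "uh", "yz"))

# One row per individual symbol in B, keeping the original symbol_B string.
# Assumption: symbols are separated by spaces and/or commas.
B_long <- B %>%
  mutate(token = symbol_B) %>%
  separate_rows(token, sep = "[ ,]+")

# Rows of A whose symbol equals one of the symbols in some row of B
overlaps <- A %>%
  inner_join(B_long, by = c("symbol_A" = "token")) %>%
  select(symbol_A, symbol_B, id_x = id_A, id_y = id_B)

# Rows of A with no matching symbol anywhere in B
only_in_A <- A %>%
  anti_join(B_long, by = c("symbol_A" = "token"))

# Rows of B that never matched (assumes id_B identifies a row uniquely)
only_in_B <- B %>%
  filter(!id_B %in% overlaps$id_y)
```

If partial matches should count as well (for example "AB" inside "ABB"), the exact join would need to be replaced by a pattern-based match such as stringr::str_detect, which this sketch does not cover.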
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
