Compare one col with another col in another df containing multiple entries

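We are trying to find out whether the string in A$symbol_A matches any of the strings in B$symbol_B, where a single cell of B$symbol_B can hold several symbols. The result should be 3 dfs: one with the matches, one with the rows found only in A, and one with the rows found only in B.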

Data example:

### task dfs

A = data.frame(
  symbol_A = c("ABC", "ABB", "ACC", "BCG", "AAG"),
  id_A = c("1", "2", "3", "4", "5"))

B = data.frame(
  symbol_B = c("XXC", "XCT ABB", "TTG WHO ACC", "AAG", "TTR, YHD"),
  id_B = c("ab", "dy", "hu", "uh", "yz"))


### expected solution 

solution_overlaps <- data.frame(
  symbol_A = c("ABB", "ACC", "AAG"),
  symbol_B = c("XCT ABB", "TTG WHO ACC", 'AAG'),
  id_x = c("2", "3", "5"),
  id_y = c('dy', 'hu', "uh" )
)

solution_only_in_A <- data.frame(
  symbol_A = c("ABC", "BCG"),
  id_A = c("1", "4")
)

solution_only_in_B <- data.frame(
  symbol_B = c("XXC", "TTR, YHD"),
  id_B = c('ab', "yz")
)

Thanks a lot for helping, hope this is relevant for others as well! (as always, dplyr appreciated!)

Sebastian
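
One possible way to get the three dfs, as a minimal sketch rather than a definitive answer: split each symbol_B cell into its individual symbols, join A against those, and use anti-joins for the leftovers. It assumes the symbols inside a symbol_B cell are separated by spaces and/or commas and that a match means an exact, whole-symbol match. The names B_long, token, overlaps, only_in_A and only_in_B are made up for illustration, and tidyr is used alongside dplyr for the splitting step.

```r
library(dplyr)
library(tidyr)

A <- data.frame(
  symbol_A = c("ABC", "ABB", "ACC", "BCG", "AAG"),
  id_A = c("1", "2", "3", "4", "5"))

B <- data.frame(
  symbol_B = c("XXC", "XCT ABB", "TTG WHO ACC", "AAG", "TTR, YHD"),
  id_B = c("ab", "dy", "hu", "uh", "yz"))

# One row per individual symbol in symbol_B (assumption: symbols are
# separated by spaces and/or commas). The original symbol_B string is kept.
B_long <- B %>%
  mutate(token = symbol_B) %>%
  separate_rows(token, sep = "[ ,]+")

# Rows of A whose symbol equals one of the individual symbols in B
overlaps <- A %>%
  inner_join(B_long, by = c("symbol_A" = "token")) %>%
  select(symbol_A, symbol_B, id_x = id_A, id_y = id_B)

# Rows of A with no matching symbol anywhere in B
only_in_A <- A %>%
  anti_join(B_long, by = c("symbol_A" = "token"))

# Rows of B in which none of the symbols matched anything in A
only_in_B <- B %>%
  anti_join(overlaps, by = "symbol_B")
```

If a symbol from A occurs in more than one row of B, overlaps will contain one row per match, so it may need a distinct() depending on what is wanted.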



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
