Advanced Lookup in R: How can I look through strings and add a value from a dataframe?
Neither Excel's VLOOKUP function nor R's join functions help here. I am trying to look up specific strings in one dataframe and add new columns based on matches from a different dataframe, but as far as I can see the _join functions don't handle my particular problem. My two dataframes and my code are below:
| id | address |
|---|---|
| 3811 | bb |
| 4803 | dd |
| 4820 | dd |
| 852 | aa |
| 4031 | dd |
I want to search the address variable for matches against the local variable in the other dataframe below, and then add the corresponding values from its district column.
| local | district |
|---|---|
| aa | AA |
| bb | BB |
| cc | CC |
| dd | DD |
I ran this code to complete the task. I believe it works when I run the body without the for loop, but with the for loop it produces an error.
distr <- data.frame(1:7000)
for (word in df2$local) {
ind = stringi::stri_detect_fixed(train$address, word) %>% which(.==T)
ind2 = stringi::stri_detect_fixed(df2$local, word) %>% which(.==T)
distr[ind, 2] <- df2[ind2, 3]
}
The code is designed this way so that I can add the column from the distr dataframe to the train dataframe later on. What specifically am I doing wrong that keeps this code from running properly? Can anyone with string expertise help?
P.S. I chose the stri_detect_fixed function because regular expressions did not work for every value here.
Solution 1:[1]
As I_O suggests, this seems to work fine with fuzzyjoin::fuzzy_join():
library(fuzzyjoin)
fuzzy_join(d1, d2, match_fun = stringi::stri_detect_fixed,
by = c("address" = "local"))
gives
id address local district
1 3811 Yntymak,???????/???? Yntymak leninskyi
2 4803 JD station JD station pervomayskyi
3 4820 JD station, Panfilov JD station pervomayskyi
4 852 Ak-Bata Ak-Bata sverdlovskyi
5 4031 JD station JD station pervomayskyi
The example data used above:
d1 <- read.table(header = TRUE,
sep = ";",
text = "
id;address
3811;Yntymak,???????/????
4803;JD station
4820;JD station, Panfilov
852;Ak-Bata
4031;JD station
")
d2 <- read.table(header = TRUE,
sep = ";",
text = "
local;district
Ak-Bata;sverdlovskyi
Yntymak;leninskyi
Zhilgorodok Sovmina;oktyabrskyi
JD station;pervomayskyi
")
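For readers who would rather keep a loop like the one in the question, here is a minimal base-R sketch of the same lookup. It rests on some assumptions: it reuses the d1 and d2 example data from this answer rather than the asker's train and df2, and it swaps in base R's grepl(fixed = TRUE) for stringi::stri_detect_fixed so that no packages are needed. If df2 really has only the two columns shown (local and district), then df2[ind2, 3] in the original loop asks for a third column that does not exist, which is one likely source of the error. The `%>% which(. == T)` step also looks unnecessary, since a logical vector can be used for indexing directly.

```r
# Example data copied from d1 and d2 in the answer above
d1 <- data.frame(
  id      = c(3811, 4803, 4820, 852, 4031),
  address = c("Yntymak,???????/????", "JD station", "JD station, Panfilov",
              "Ak-Bata", "JD station"),
  stringsAsFactors = FALSE
)
d2 <- data.frame(
  local    = c("Ak-Bata", "Yntymak", "Zhilgorodok Sovmina", "JD station"),
  district = c("sverdlovskyi", "leninskyi", "oktyabrskyi", "pervomayskyi"),
  stringsAsFactors = FALSE
)

# Start with NA so that addresses matching nothing stay visibly unmatched
d1$district <- NA_character_

# Loop over the rows of d2 by index, so the district for each local
# is available directly and no second lookup is required
for (i in seq_len(nrow(d2))) {
  # fixed = TRUE treats d2$local[i] as a literal substring, not a regex
  hit <- grepl(d2$local[i], d1$address, fixed = TRUE)
  # If several locals occur in one address, the later row of d2 wins
  d1$district[hit] <- d2$district[i]
}

print(d1)
```

Unlike the inner join performed by fuzzy_join() above, this keeps exactly one row per address and leaves district as NA where no local matches. Which behaviour is preferable depends on whether an address can legitimately contain more than one local.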
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ben Bolker |
