Merge rows with specific name then rename them
I have this sample dataset:
| | province | region_vn | region_en | sub_region_vn | sub_region_en | province_latin |
|---|---|---|---|---|---|---|
| | chr | chr | chr | chr | chr | chr |
| 1 | Điện Biên | Bắc Bộ | Northern | Tây Bắc Bộ | Northwest | Dien Bien |
| 2 | Lạng Sơn | Bắc Bộ | Northern | Tây Bắc Bộ | Northeast | Lang Son |

How do I merge the two sub_region_en values Northwest and Northeast by renaming both to Northern midlands and mountain areas?
The expected outcome would be:
| | province | region_vn | region_en | sub_region_vn | sub_region_en | province_latin |
|---|---|---|---|---|---|---|
| | chr | chr | chr | chr | chr | chr |
| 1 | Điện Biên | Bắc Bộ | Northern | Tây Bắc Bộ | Northern midlands and mountain areas | Dien Bien |
| 2 | Lạng Sơn | Bắc Bộ | Northern | Tây Bắc Bộ | Northern midlands and mountain areas | Lang Son |

I would appreciate any help.
Solution 1:[1]
For example, if your dataset is called "df", you can simply do the following:
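# Loop over every row; seq_len() also handles an empty data frame safely
for (i in seq_len(nrow(df))) {
  # Replace the value only when it is one of the two sub-regions to merge
  if (df$sub_region_en[i] %in% c("Northwest", "Northeast")) {
    df$sub_region_en[i] <- "Northern midlands and mountain areas"
  }
}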
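The loop above can also be written without an explicit loop, since %in% works on whole vectors. The following is only a minimal sketch: the small "df" built here is a hypothetical stand-in for your data that keeps just two of the columns.

```r
# Hypothetical stand-in for the asker's data; only two columns are kept.
df <- data.frame(
  province_latin = c("Dien Bien", "Lang Son"),
  sub_region_en  = c("Northwest", "Northeast"),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE where the row belongs to one of the two sub-regions.
to_merge <- df$sub_region_en %in% c("Northwest", "Northeast")

# Overwrite only those rows; every other row is left untouched.
df$sub_region_en[to_merge] <- "Northern midlands and mountain areas"
```

This does the same replacement in a single assignment, which tends to be easier to read and faster on larger data frames.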
Solution 2:[2]
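Another option is to use a regular expression to match the values and the gsub() function to substitute them. Note that the pattern below matches anywhere in the string, so it would also rewrite part of a longer value that contains Northwest or Northeast. Here are the steps: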
# A simplified version of your data
yourdf <- structure(list(region_en = c("Northern", "Northern"), sub_region_en = c("Northwest",
"Northeast")), class = "data.frame", row.names = c(NA, -2L))
yourdf
# region_en sub_region_en
#1 Northern Northwest
#2 Northern Northeast
# Substitute the data
yourdf$sub_region_en <- gsub("Northwest|Northeast",
                             "Northern midlands and mountain areas",
                             yourdf$sub_region_en)
# The result
yourdf
# region_en sub_region_en
#1 Northern Northern midlands and mountain areas
#2 Northern Northern midlands and mountain areas
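If your real data might contain longer values that merely include one of these words, anchoring the pattern with ^ and $ restricts the replacement to exact matches. This is only a sketch of that variation: the value "Northeast coast" is a hypothetical example and does not come from your data.

```r
# Sample values; "Northeast coast" is a hypothetical extra value for illustration.
x <- c("Northwest", "Northeast", "Northeast coast")
new_name <- "Northern midlands and mountain areas"

# Unanchored pattern: also rewrites the matching part of "Northeast coast".
unanchored <- gsub("Northwest|Northeast", new_name, x)

# Anchored pattern: ^ and $ require the whole string to match.
anchored <- sub("^(Northwest|Northeast)$", new_name, x)
```

With the anchored pattern the third value stays as it is, while the unanchored version replaces only the "Northeast" part of it.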
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Fujibayashi Kyou |
| Solution 2 |
