'How to join a dataset by row index?
I have one dataset that looks like this
| Rownumber | Name |
|---|---|
| 5 | John |
| 7 | Ken |
| 9 | Marvin |
I have another that looks like this
| LastName |
|---|
| Ming |
| Roe |
| Martin |
| Hugo |
| Sawyer |
| Wallace |
| Thomas |
| Wang |
| Abdul |
Note that I want to join second dataset to first. In the first I have a column called row number and the second I don't, but I have a backend index.
I would like my dataset to look like this
| Rownumber | Name | Last name |
|---|---|---|
| 5 | John | Sawyer |
| 7 | Ken | Thomas |
| 9 | Marvin | Abdul |
You see that in the second dataset Sawyer is the 5th row, Thomas is the 7th, and Abdul is the 9th that is way it merged.
Solution 1:[1]
Suppose your data are stored in two data.frames named df and df2, you could use a dplyr join:
library(dplyr)
df2 %>%
mutate(Rownumber = row_number()) %>%
right_join(df, by = "Rownumber") %>%
select(Rownumber, Name, LastName)
This returns
Rownumber Name LastName
1 5 John Sawyer
2 7 Ken Thomas
3 9 Marvin Abdul
- First create row numbers for your data.frame containing the last names.
- Next
right_jointhe first data.frame byRownumber.
Data
df <- structure(list(Rownumber = c(5, 7, 9), Name = c("John", "Ken",
"Marvin")), row.names = c(NA, -3L), class = "data.frame")
df2 <- structure(list(LastName = c("Ming", "Roe", "Martin", "Hugo",
"Sawyer", "Wallace", "Thomas", "Wang", "Abdul")), row.names = c(NA,
-9L), class = "data.frame")
Solution 2:[2]
I guess you can try this without left_join
> transform(df,LastName = df2$LastName[Rownumber])
Rownumber Name LastName
1 5 John Sawyer
2 7 Ken Thomas
3 9 Marvin Abdul
Solution 3:[3]
Another option is to use merge, where we use Rownumber from df1 for by.x, then use 0 from df2 for by.y, which refers to the row names.
merge(df1, df2, by.x = "Rownumber", by.y = 0, all.x = TRUE, all.y = FALSE)
Output
Rownumber Name LastName
1 5 John Sawyer
2 7 Ken Thomas
3 9 Marvin Abdul
Data
df1 <- structure(list(Rownumber = c(5, 7, 9), Name = c("John", "Ken",
"Marvin")), row.names = c(NA, -3L), class = "data.frame")
df2 <- structure(list(LastName = c("Ming", "Roe", "Martin", "Hugo",
"Sawyer", "Wallace", "Thomas", "Wang", "Abdul")), row.names = c(NA,
-9L), class = "data.frame")
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Martin Gal |
| Solution 2 | ThomasIsCoding |
| Solution 3 |
