'How to join a dataset by row index?

I have one dataset that looks like this

Rownumber Name
5 John
7 Ken
9 Marvin

I have another that looks like this

LastName
Ming
Roe
Martin
Hugo
Sawyer
Wallace
Thomas
Wang
Abdul

Note that I want to join second dataset to first. In the first I have a column called row number and the second I don't, but I have a backend index.

I would like my dataset to look like this

Rownumber Name Last name
5 John Sawyer
7 Ken Thomas
9 Marvin Abdul

You see that in the second dataset Sawyer is the 5th row, Thomas is the 7th, and Abdul is the 9th that is way it merged.



Solution 1:[1]

Suppose your data are stored in two data.frames named df and df2, you could use a dplyr join:

library(dplyr)
df2 %>% 
  mutate(Rownumber = row_number()) %>% 
  right_join(df, by = "Rownumber") %>% 
  select(Rownumber, Name, LastName)

This returns

  Rownumber   Name LastName
1         5   John   Sawyer
2         7    Ken   Thomas
3         9 Marvin    Abdul
  • First create row numbers for your data.frame containing the last names.
  • Next right_join the first data.frame by Rownumber.

Data

df <- structure(list(Rownumber = c(5, 7, 9), Name = c("John", "Ken", 
"Marvin")), row.names = c(NA, -3L), class = "data.frame")

df2 <- structure(list(LastName = c("Ming", "Roe", "Martin", "Hugo", 
"Sawyer", "Wallace", "Thomas", "Wang", "Abdul")), row.names = c(NA, 
-9L), class = "data.frame")

Solution 2:[2]

I guess you can try this without left_join

> transform(df,LastName = df2$LastName[Rownumber])
  Rownumber   Name LastName
1         5   John   Sawyer
2         7    Ken   Thomas
3         9 Marvin    Abdul

Solution 3:[3]

Another option is to use merge, where we use Rownumber from df1 for by.x, then use 0 from df2 for by.y, which refers to the row names.

merge(df1, df2, by.x = "Rownumber", by.y = 0, all.x = TRUE, all.y = FALSE)

Output

  Rownumber   Name LastName
1         5   John   Sawyer
2         7    Ken   Thomas
3         9 Marvin    Abdul

Data

df1 <- structure(list(Rownumber = c(5, 7, 9), Name = c("John", "Ken", 
"Marvin")), row.names = c(NA, -3L), class = "data.frame")

df2 <- structure(list(LastName = c("Ming", "Roe", "Martin", "Hugo", 
"Sawyer", "Wallace", "Thomas", "Wang", "Abdul")), row.names = c(NA, 
-9L), class = "data.frame")

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martin Gal
Solution 2 ThomasIsCoding
Solution 3