'R dplyr: how do I use left_join keeping only the first match?

I have two tables I want to join with a left_join.

table_joined <- left_join(table_x,table_y, by = type)

The problem is that type has multiple entries per type, such that there are multiple matches for table_y in table_x. This makes the resulting table_joined have more rows than table_x, accommodating for all the multiple matches. Is there any way I could keep only the first match, such that table_joined has the same number of rows as table_x?

I'm connecting to SQL tables on a server, so I'd love to keep using dplyr if possible. Thanks!

Here are examples of my tables.

table_x

x1 type
1 A
2 B
3 C

table_y

type y1
A AA
A AA
B BB
B BB
C CC

table_joined:

x1 type y1
1 A AA
1 A AA
2 B BB
2 B BB
3 C CC

desired output:

x1 type y1
1 A AA
2 B BB
3 C CC


Solution 1:[1]

What you're describing isn't how SQL joins work.

If you do want only one line per type, you need to make sure that each table used in the join only has one line per type.

You can use something like distinct to remove duplicates on table_y but it requires the rows completely match. If they don't, you need to figure out the logic about which row to keep on your own.

table_joined <- left_join(table_x, distinct(table_y), by = type)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Roger-123