'Joining pandas DataFrames by Column names

I have two DataFrames with the following column names:

frame_1:
event_id, date, time, county_ID

frame_2:
countyid, state

I would like to get a DataFrame with the following columns by joining (left) on county_ID = countyid:

joined_dataframe
event_id, date, time, county, state

I cannot figure out how to do it if the columns on which I want to join are not the index.



Solution 1:[1]

you need to make county_ID as index for the right frame:

frame_2.join ( frame_1.set_index( [ 'county_ID' ], verify_integrity=True ),
               on=[ 'countyid' ], how='left' )

for your information, in pandas left join breaks when the right frame has non unique values on the joining column. see this bug.

so you need to verify integrity before joining by , verify_integrity=True

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1