'Pandas index match without merge

So I have two dataframes. activity_log records when a client logs in based on their client_id. A client_id can appear multiple times given the client logs in multiple times over a period.

I need to create a third column in this activity_log looks up the date that the client was created. This created_date is calculated by looking at the earliest user_created date in the user_table.

activity_log

client_id activity_date created_date
1 12/12/2022
1 11/12/2022
1 9/12/2022
1 8/12/2022
2 12/12/2022
2 11/12/2022
3 10/12/2022
3 9/12/2022

user_table

client_id user_id user_created
1 12asdasd3 12/12/2021
1 1sads23 11/12/2021
1 asasdsa2 10/12/2021
2 32asdasd1 12/12/2021
2 3asdasd21 11/12/2021
3 1asdsaa22 2/12/2021

I have tried using pandas merge

activity_log.merge(client_table[['client_id','user_created']], how='inner', on='client_id')

The problem with this is that I end up with a table that is grater in size than the original activity_log because the client_id appears multiple times in the activity_log and multiple times in the user_table.

I want to look up the client_id in the user_table, get the earliest user_created value and put that into the created_date column in the activity_log.

Any ideas on what else I need to do to achieve this?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source