'finding the nearest date to the another date column in python pandas dataframe

epm_vin repm_milg repm_ro_dtime complaints_flag date_flag Dates
MALAN51CMBM972380 25645 2020-10-01 Yes Nan 2021-08-07
MALAN51CMBM972380 25845 2021-08-06 Yes Nan 2021-08-07
MALPC813MLM018007 39305 2021-06-12 Yes Nan 2020-10-01
MALPC813MLM018007 28795 2021-03-05 Yes Nan 2020-10-01
|   repm_vin         |repm_milg |repm_ro_dtime  |complaints_flag    |date_flag  |Dates|
| :--------      |:-------|:---------|:-----|:----------|:------|:---------|:-----|
|MALAN51CMBM972380  |25845| 2021-08-06| Yes |Yes|   2021-08-07|
|MALPC813MLM018007| 28795|  2021-03-05| Yes|    Yes |2020-10-01|

solution should be like above second table means i want the rows where repm_ro_dtime is nearest to Dates column and flag date_flag as Yes.



Solution 1:[1]

Use:

idx = df['repm_ro_dtime'].sub(df['Dates']).dt.days.abs().groupby(df['epm_vin']).idxmin()
df = df.loc[idx].assign(date_flag = 'Yes')
print (df)
             epm_vin  repm_milg repm_ro_dtime complaints_flag date_flag  \
1  MALAN51CMBM972380      25845    2021-08-06             Yes       Yes   
3  MALPC813MLM018007      28795    2021-03-05             Yes       Yes   

       Dates  
1 2021-08-07  
3 2020-10-01 

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jezrael