'finding the nearest date to the another date column in python pandas dataframe
| epm_vin | repm_milg | repm_ro_dtime | complaints_flag | date_flag | Dates |
|---|---|---|---|---|---|
| MALAN51CMBM972380 | 25645 | 2020-10-01 | Yes | Nan | 2021-08-07 |
| MALAN51CMBM972380 | 25845 | 2021-08-06 | Yes | Nan | 2021-08-07 |
| MALPC813MLM018007 | 39305 | 2021-06-12 | Yes | Nan | 2020-10-01 |
| MALPC813MLM018007 | 28795 | 2021-03-05 | Yes | Nan | 2020-10-01 |
| repm_vin |repm_milg |repm_ro_dtime |complaints_flag |date_flag |Dates|
| :-------- |:-------|:---------|:-----|:----------|:------|:---------|:-----|
|MALAN51CMBM972380 |25845| 2021-08-06| Yes |Yes| 2021-08-07|
|MALPC813MLM018007| 28795| 2021-03-05| Yes| Yes |2020-10-01|
solution should be like above second table means i want the rows where repm_ro_dtime is nearest to Dates column and flag date_flag as Yes.
Solution 1:[1]
Use:
idx = df['repm_ro_dtime'].sub(df['Dates']).dt.days.abs().groupby(df['epm_vin']).idxmin()
df = df.loc[idx].assign(date_flag = 'Yes')
print (df)
epm_vin repm_milg repm_ro_dtime complaints_flag date_flag \
1 MALAN51CMBM972380 25845 2021-08-06 Yes Yes
3 MALPC813MLM018007 28795 2021-03-05 Yes Yes
Dates
1 2021-08-07
3 2020-10-01
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
