'How to exclude future dates from excel data file using pandas?
I'm trying to limit my dataset to dates before today. Below creates a graph but the mask doesn't have any impact. Any help appreciated.
df = pd.read_excel("./data/Weight.xlsx", sheet_name='Data')
date = df.iloc[:, 0]
kilos = df.iloc[:, 3]
date = pd.to_datetime(date, format='%Y-%m-%d')
mask = (date < pd.to_datetime('today'))
df = df.loc[mask]
df.head()
2021-02-06 00:00:00 18 2 115.208
42 2022-12-11 14.0 4.0 90.716
43 2022-12-18 14.0 0.0 88.900
44 2022-12-25 14.0 0.0 88.900
45 2023-01-01 14.0 2.0 89.808
46 2023-01-08 14.0 1.0 89.354
Solution 1:[1]
please check if you have a header in your excel file. The df may not have a header. Also, assuming the numbers - 41, 42, ... are all indeces. This appears to work fine with making header=None while reading the file.
The updated code is here.
df = pd.read_excel("input.xlsx", sheet_name='Data', header=None)
date = df.iloc[:, 0]
kilos = df.iloc[:, 3]
date = pd.to_datetime(date, format='%Y-%m-%d')
mask = (date < pd.to_datetime('today'))
df = df.loc[mask]
Output looks fine. Here is what I am getting...
>> df
0 1 2 3
0 2021-02-06 18 2 115.208
>> mask
0 True
1 False
2 False
3 False
4 False
5 False
Name: 0, dtype: bool
Solution 2:[2]
Thanks, that fixed it for me.
Chris
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Redox |
| Solution 2 | Chris Matchett |

