ValueError: Missing column provided to 'parse_dates': 'date'
I am working on this ML project; here is a look at the training dataset:

Since the training dataset is really large, I am trying to load a random 1% sample of it using the following code:
import random

import pandas as pd
from numpy import float32

dtypes = {
    'id': float32,
    'store_nbr': float32,
    'item_nbr': float32,
    'unit_sales': float32,
    'onpromotion': bool,
}

sample_fraction = 0.01  # keep roughly 1% of the rows

def skip_row(row_idx):
    # Always keep row 0, the header.
    if row_idx == 0:
        return False
    # random.random() returns a float in [0, 1), so this is False (keep the row)
    # for about 1% of the rows and True (skip the row) for the remaining 99%.
    return random.random() > sample_fraction

# Fixing the seed gives the same sample every time the notebook runs.
random.seed(42)

df = pd.read_csv(data_dir + "/train.csv",
                 usecols=selected_cols,
                 parse_dates=['date'],
                 dtype=dtypes,
                 skiprows=skip_row)
Solution 1:[1]
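Well, I just rechecked, and it turns out that I hadn't included the 'date' column in the columns I selected (selected_cols), so pandas could not find it for parse_dates. Adding it solved the error.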
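For anyone hitting the same message, here is a minimal sketch of the cause and the fix. The in-memory CSV and the `load` helper are made up for illustration (they are not the real train.csv or the asker's code); the only assumption is that the column list passed to `usecols` left out 'date' while `parse_dates` still asked for it.

```python
import io

import pandas as pd

# Hypothetical stand-in for train.csv with the same column names.
csv_text = (
    "id,date,store_nbr,item_nbr,unit_sales,onpromotion\n"
    "0,2013-01-01,25,103665,7.0,False\n"
    "1,2013-01-01,25,105574,1.0,False\n"
    "2,2013-01-02,25,105575,2.0,True\n"
)


def load(selected_cols):
    """Read the sample CSV, keeping only selected_cols and parsing 'date'."""
    return pd.read_csv(
        io.StringIO(csv_text),
        usecols=selected_cols,
        parse_dates=["date"],
    )


# 'date' is not in usecols, so parse_dates refers to a column that was dropped.
try:
    load(["id", "store_nbr", "unit_sales"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Including 'date' in usecols lets pandas find and parse the column.
df = load(["id", "date", "store_nbr", "unit_sales"])
print(df.dtypes)
```

So any column named in `parse_dates` also has to appear in `usecols` (and in the file header), otherwise pandas has nothing to parse.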
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Vishnu |

