Extracting columns of a dataframe by comparing with the date in the file name
I am working with a CSV file whose name contains a date. I have read the file into a pandas dataframe, and its column names also contain dates, but in a different format from the one used in the file name. I want to keep only those columns whose date is less than or equal to the date in the file name.
Here is the code I have tried:
import pandas as pd
import os
import sys
import glob
from datetime import datetime

path = r"C:\Users\DELL\Downloads\stocksData\drive-download-20210904T164255Z-001"
csv_files = glob.glob(os.path.join(path, "*.csv"))

niftyFilesList = []
bankNiftyFilesList = []
for efile in csv_files:
    if 'BANKNIFTY' in efile:
        bankNiftyFilesList.append(efile)
        # print(len(bankNiftyFilesList))
    else:
        niftyFilesList.append(efile)
        # print(len(niftyFilesList))
# niftyFilesList.append(efile)
# print(len(bankNiftyFilesList))

file1 = niftyFilesList[1]
print("NAME OF FILE", file1)
data = pd.read_csv(file1)
print("columns in the file:", data.columns)
Output of the above code:
NAME OF FILE C:\Users\DELL\Downloads\stocksData\drive-download-20210904T164255Z-001\_data_minutelyData_2021-02-23NIFTY.csv
columns in the file: Index(['timestamp', 'NIFTY 25 FEB2021 16800.00 CE',
'NIFTY 25 FEB2021 16750.00 CE', 'NIFTY 25 FEB2021 16700.00 CE',
'NIFTY 25 FEB2021 16650.00 CE', 'NIFTY 25 FEB2021 16600.00 CE',
'NIFTY 25 FEB2021 16550.00 CE', 'NIFTY 25 FEB2021 16500.00 CE',
'NIFTY 25 FEB2021 16450.00 CE', 'NIFTY 25 FEB2021 16400.00 CE',
...
'NIFTY 29 APR2021 12500.00 PE', 'NIFTY 29 APR2021 12450.00 PE',
'NIFTY 29 APR2021 12400.00 PE', 'NIFTY 29 APR2021 12350.00 PE',
'NIFTY 29 APR2021 12300.00 PE', 'NIFTY 29 APR2021 12250.00 PE',
'NIFTY 29 APR2021 12200.00 PE', 'NIFTY 29 APR2021 12150.00 PE', 'index',
'spotPrice'],
dtype='object', length=1796)
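One possible way to do the comparison is sketched below. It is only a sketch under assumptions inferred from the output above: the file name is assumed to contain a date like `2021-02-23`, the column names a date like `25 FEB2021` with an English three-letter month, and columns without any date (`timestamp`, `index`, `spotPrice`) are assumed to be worth keeping. The helper names `file_date`, `column_date` and `columns_on_or_before` are made up for illustration.

```python
import re
from datetime import date, datetime

# Assumed formats, inferred from the sample file name and column names.
# The file-name pattern only looks at the last path component, so digits
# in the folder name are ignored.
FILE_DATE_RE = re.compile(r"(\d{4}-\d{2}-\d{2})[^\\/]*$")       # 2021-02-23
COLUMN_DATE_RE = re.compile(r"(\d{1,2}) ([A-Za-z]{3})(\d{4})")  # 25 FEB2021

# Explicit month table so parsing does not depend on the system locale.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def file_date(filename):
    """Return the date embedded in the file name as a datetime.date."""
    match = FILE_DATE_RE.search(filename)
    if match is None:
        raise ValueError(f"no YYYY-MM-DD date found in {filename!r}")
    return datetime.strptime(match.group(1), "%Y-%m-%d").date()


def column_date(column):
    """Return the date embedded in a column name, or None if there is none."""
    match = COLUMN_DATE_RE.search(column)
    if match is None:
        return None
    day, month, year = match.groups()
    month_number = MONTHS.get(month.upper())
    if month_number is None:
        return None
    return date(int(year), month_number, int(day))


def columns_on_or_before(columns, cutoff):
    """Keep columns dated on or before cutoff, plus columns with no date."""
    keep = []
    for column in columns:
        parsed = column_date(column)
        if parsed is None or parsed <= cutoff:
            keep.append(column)
    return keep


# Small demonstration using names taken from the output above.
cutoff = file_date("_data_minutelyData_2021-02-23NIFTY.csv")
sample_columns = [
    "timestamp",
    "NIFTY 25 FEB2021 16800.00 CE",
    "NIFTY 29 APR2021 12150.00 PE",
    "spotPrice",
]
print(columns_on_or_before(sample_columns, cutoff))
# With the dataframe from the question, the same idea would be:
#     data = data[columns_on_or_before(data.columns, file_date(file1))]
```

If the comparison should go the other way (keep columns dated on or after the file date), only the `parsed <= cutoff` test needs to change.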
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
