ValueError: unconverted data remains: 00:00:00: Extract from NetCDF and arrange by location (rows) and values (precipitation) in columns
I want to extract daily precipitation data from NetCDF files for 10 different locations (10 longitude/latitude pairs) for the years 2014 to 2017. The end product should have 10 rows (one per location) and 368 columns (longitude, latitude, year, and 365 columns of precipitation for day 1 to day 365). I tried the following code, and the line marked as the culprit in it gives me this error:
ValueError: unconverted data remains: 00:00:00
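The message can be reproduced without any NetCDF file. Converting a timestamp with `str()` yields a string that includes the time of day, which the format `'%Y-%m-%d'` does not consume. A minimal sketch, assuming a plain `datetime` as a stand-in for one element of `d_range` (a pandas `Timestamp` is a `datetime` subclass and stringifies the same way for a midnight value); the variable names here are illustrative only:

```python
from datetime import datetime

# Hypothetical stand-in for one element of d_range.
stamp = datetime(2014, 1, 1)

# str() includes the time of day: '2014-01-01 00:00:00'
text = str(stamp)

# Parsing with a date-only format leaves the time part unconsumed.
try:
    datetime.strptime(text, '%Y-%m-%d')
    message = None
except ValueError as exc:
    message = str(exc)  # starts with "unconverted data remains"

# Two ways to get the day of year without the failing parse:
# 1) ask the datetime object directly, with no string round trip
day_from_object = stamp.timetuple().tm_yday
# 2) parse with a format that also covers the time part
day_from_string = datetime.strptime(text, '%Y-%m-%d %H:%M:%S').timetuple().tm_yday
```

Of the two, reading the day of year from the object avoids depending on how the timestamp happens to be printed.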
Here is my code
import glob
from datetime import datetime

import numpy as np
import pandas as pd
from netCDF4 import Dataset

all_years = []
for file in glob.glob('*.nc'):
    #print(file)
    data = Dataset(file, 'r')
    time = data.variables['time']
    year = file[0:4]
    all_years.append(year)
year_start = min(all_years)
#type(year_start)
end_year = max(all_years)
date_range = pd.date_range(start = str(year_start) + '-01-01', end = str(end_year) + '-12-31', freq = 'D')
df = pd.DataFrame(0.0, columns=['Latitude','Longitude','Precipitation'], index = date_range)
print(df)
#Accessing locations to extract data (This has 3 columns: latitude, longitude, year)
locations = pd.read_csv('LocationsToBeExtracted.csv')
def datestdtojd (stddate):
    # Convert a 'YYYY-MM-DD' date string to its day of year (1 to 366)
    stddate = stddate.strip()
    fmt='%Y-%m-%d'
    sdtdate = datetime.strptime(stddate, fmt)
    sdtdate = sdtdate.timetuple()
    jdate = sdtdate.tm_yday
    return jdate
for index, row in locations.iterrows():
    year = row['Year']
    location_latitude = row['Latitude']
    location_longitude = row['Longitude']
    all_years.sort()
    for yr in all_years:
        data = Dataset(str(yr)+'.nc', 'r')
        #storing lon lat data of the netCDF file into variable
        lat = data.variables['lat'][:]
        lon = data.variables['lon'][:]
        #Trying to access the nearest location in the netcdf file
        #squared difference between the specified lat, lon and lat lon of the netCDF
        sq_diff_lat = (lat - location_latitude)**2
        sq_diff_lon = (lon - location_longitude)**2
        #identify the index of the min value for lat and lon
        min_index_lat = sq_diff_lat.argmin()
        min_index_lon = sq_diff_lon.argmin()
        #accessing precipitation values
        precip = data.variables['precip']
        #creating date range for every year during iteration
        start = str(yr) + '-01-01'
        end = str(yr) + '-12-31'
        d_range = pd.date_range(start = start,
                                end = end,
                                freq = 'D')
        # now a nested loop that goes for all years dates
        for t_index in np.arange(0,len(d_range)):
            print(d_range[t_index])
            #This is the culprit line that throws me error (below)
            df.loc[d_range[t_index]]['Day'] = datestdtojd(str(d_range[t_index]))
            #I tried the line below also
            #df.loc[d_range[t_index]]['Day'] = datestdtojd(datetime.strptime(str(d_range[t_index]), '%Y-%m-%d %H:%M:%S'))
            df.loc[d_range[t_index]]['Year'] = year
            df.loc[d_range[t_index]]['Latitude'] = location_latitude
            df.loc[d_range[t_index]]['Longitude'] = location_longitude
            df.loc[d_range[t_index]]['Precipitation'] = precip[t_index, min_index_lat, min_index_lon]
#save as csv
df = df.pivot_table(values='Precipitation', index=['Year','Latitude','Longitude'], columns='Day')
df.to_csv('Final_Output_With_368Column_10rows.csv')
ValueError: unconverted data remains: 00:00:00
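One possible way to build the wide table directly, without parsing date strings, is to take the day of year from the date index and collect one row per location. The sketch below uses synthetic arrays in place of the NetCDF variables. The contents of `lat`, `lon`, and `precip`, the year 2015, and the two example locations are assumptions for illustration, not values from the actual files.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the arrays read from one year's NetCDF file.
lat = np.array([10.0, 10.5, 11.0])
lon = np.array([100.0, 100.5, 101.0])
dates = pd.date_range('2015-01-01', '2015-12-31', freq='D')
precip = np.random.default_rng(0).random((len(dates), lat.size, lon.size))

# Hypothetical locations table using the column names from the question's CSV.
locations = pd.DataFrame({'Latitude': [10.4, 11.2],
                          'Longitude': [100.1, 100.9]})

# Day of year as plain Python ints, taken straight from the date index.
days = dates.dayofyear.tolist()

rows = []
for _, loc in locations.iterrows():
    # Index of the grid cell nearest to the requested point on each axis.
    i = np.abs(lat - loc['Latitude']).argmin()
    j = np.abs(lon - loc['Longitude']).argmin()
    # One row: identifying columns first, then one column per day of year.
    row = {'Latitude': loc['Latitude'],
           'Longitude': loc['Longitude'],
           'Year': 2015}
    row.update(zip(days, precip[:, i, j]))
    rows.append(row)

# One row per location: 3 identifying columns plus 365 daily columns.
wide = pd.DataFrame(rows)
```

With real files, the synthetic arrays would be replaced by the variables read from each year's `Dataset`, and the loop would run once per year and location. Note that 2016 is a leap year, so a fixed 365-column layout would need a decision about day 366.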
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
