How to combine more than 300 NetCDF files?
I tried to combine more than 300 NetCDF files into one with xarray, but it has been running for over three days and the final NetCDF file is about 5 GB. Each individual NetCDF file is about 1.5 GB. Can you help me combine these NetCDF files into one with this structure?
<xarray.Dataset>
Dimensions: (lat: 124, lon: 499, time: 79)
Coordinates:
* lat (lat) float64 50.96 50.96 50.97 50.97 ... 51.27 51.27 51.27 51.27
* lon (lon) float64 16.52 16.53 16.53 16.53 ... 17.77 17.77 17.77 17.77
* time (time) datetime64[ns] 2015-07-10 2015-07-22 ... 2017-08-10
Data variables:
vel (lat, lon) float64 ...
coh (lat, lon) float64 ...
cum (time, lat, lon) float64 ...
I tried it with this code, but it is still running (more than 3 days) and the final file is over 5 GB.
import netCDF4
import numpy
import xarray
import dask
dask.config.set({"array.slicing.split_large_chunks": False})
ds = xarray.open_mfdataset('../data/all-nc/*.nc', combine='nested', concat_dim="time")
ds.to_netcdf('../data/all-nc-files.nc')
Thanks a lot!
Solution 1:[1]
You might want to try this with nctoolkit, which uses CDO as a backend. This will probably be faster:
import nctoolkit as nc
ds = nc.open_data('../data/all-nc/*.nc')
ds.merge("time")
ds.to_nc('../data/all-nc-files.nc', zip = True)
Note: I am not sure why you are merging files that are already very large into an even larger file. If these are zipped netCDF files, you will end up with a single file of over 300 GB. I have worked with a lot of netCDF data in my time, but I have never seen anyone produce a file that large. It is almost certainly more efficient to simply leave the files as they are instead of merging them.
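If you would rather stay with xarray, the route the question already uses can be tuned rather than replaced: open the files lazily with dask chunks, avoid concatenating the time-independent variables (vel, coh) along time, and write the output with compression. The sketch below illustrates that idea; the chunk size, compression level, and paths are illustrative assumptions rather than values from the question or the answer, and whether it is faster will depend on the data and the machine.
import xarray
from dask.diagnostics import ProgressBar

# Open lazily; data_vars/coords="minimal" keeps vel and coh as (lat, lon)
# instead of stacking them along time, and parallel=True opens files in parallel.
ds = xarray.open_mfdataset(
    '../data/all-nc/*.nc',
    combine='nested',
    concat_dim='time',
    data_vars='minimal',
    coords='minimal',
    compat='override',
    parallel=True,
    chunks={'time': 10},  # illustrative chunk size
)

# Write with zlib compression; complevel=4 is an illustrative choice.
encoding = {var: {'zlib': True, 'complevel': 4} for var in ds.data_vars}
with ProgressBar():
    ds.to_netcdf('../data/all-nc-files.nc', encoding=encoding)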
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Robert Wilson |
