Merge multiple xarray Datasets with overlapping coordinates

I'm trying to merge multiple Datasets with overlapping coordinates into one. When I set compat='override', only the values of the first Dataset are kept and the rest of the resulting Dataset is set to NaN. For cells with conflicts, I'm fine with using any of the intersecting values.

See the example below:

import numpy as np
import pandas as pd
import xarray as xr


temperature = np.random.randint(1,255,size=(9,10,10))
precipitation = np.random.randint(1,255,size=(9,10,10))
lon = np.linspace(10,100,10)
lat = np.linspace(10,100,10)
time_0 = pd.date_range("2014-09-06", periods=9, freq='2D')
time_1 = pd.date_range("2014-09-06", periods=9, freq='3D')

ds_0 = xr.Dataset(
       data_vars=dict(
           temperature=(('time', 'y', 'x'), temperature),
           precipitation=(('time', 'y', 'x'), precipitation)),
       coords=dict(
           x=lon,
           y=lat,
           time=time_0)
)
ds_1 = xr.Dataset(
       data_vars=dict(
           temperature=(('time', 'y', 'x'), temperature),
           precipitation=(('time', 'y', 'x'), precipitation)),
       coords=dict(
           x=lon + 90,
           y=lat,
           time=time_0)
)
ds_2 = xr.Dataset(
       data_vars=dict(
           temperature=(('time', 'y', 'x'), temperature),
           precipitation=(('time', 'y', 'x'), precipitation)),
       coords=dict(
           x=lon + 90,
           y=lat + 90,
           time=time_1)
)
ds_3 = xr.Dataset(
       data_vars=dict(
           temperature=(('time', 'y', 'x'), temperature),
           precipitation=(('time', 'y', 'x'), precipitation)),
       coords=dict(
           x=lon,
           y=lat + 90,
           time=time_1)
)

# Combine
ds = xr.merge([ds_0, ds_1, ds_2, ds_3], compat='override')
print(ds)


Solution 1:[1]

Yep - that is the intended and documented behavior of xr.merge: with compat='override', it skips the comparison and picks each variable from the first dataset passed. From the xr.merge docs:

compat ({"identical", "equals", "broadcast_equals", "no_conflicts", "override"}, optional) – String indicating how to compare variables of the same name for potential conflicts:

  • “broadcast_equals”: all values must be equal when variables are broadcast against each other to ensure common dimensions.

  • “equals”: all values and dimensions must be the same.

  • “identical”: all values, dimensions and attributes must be the same.

  • “no_conflicts”: only values which are not null in both datasets must be equal. The returned dataset then contains the combination of all non-null values.

  • “override”: skip comparing and pick variable from first dataset
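To see what "pick variable from first dataset" means in practice, here is a minimal sketch on two tiny one-dimensional Datasets. The names a, b and overridden are made up for illustration, and join='outer' is passed explicitly so the result covers the union of the coordinates:

```python
import numpy as np
import xarray as xr

# Two hypothetical tiles that share a variable name but cover different x values.
a = xr.Dataset({"temperature": ("x", [1.0, 2.0, 3.0])}, coords={"x": [10, 20, 30]})
b = xr.Dataset({"temperature": ("x", [4.0, 5.0, 6.0])}, coords={"x": [40, 50, 60]})

# The inputs are first aligned on the union of x, then "override" takes the
# variable from the first dataset only, so b's values never reach the result.
overridden = xr.merge([a, b], compat="override", join="outer")

print(overridden["x"].values)
print(overridden["temperature"].values)  # a's values, NaN where only b had data
```

The NaN cells in the question's output come from the same mechanism: the first Dataset is stretched to the merged coordinates and nothing fills the gaps.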

So I think you're probably looking for compat='no_conflicts'.
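A rough sketch of how that plays out, again on made-up one-dimensional tiles (make_tile, left, right_same and right_diff are hypothetical names). Note that 'no_conflicts' only succeeds when the overlapping cells agree. If they differ, as may be the case with the random data in the question, xr.merge raises a MergeError. The last step uses Dataset.combine_first, which is not part of the answer above. It is shown as one possible fallback, on the assumption that keeping the first Dataset's value on an overlap is acceptable:

```python
import numpy as np
import xarray as xr


def make_tile(x, values):
    """Build a small one-variable Dataset (hypothetical helper for this sketch)."""
    return xr.Dataset(
        data_vars=dict(temperature=(("x",), np.asarray(values, dtype=float))),
        coords=dict(x=x),
    )


left = make_tile([10, 20, 30], [1.0, 2.0, 3.0])
right_same = make_tile([30, 40, 50], [3.0, 4.0, 5.0])  # agrees with left at x=30
right_diff = make_tile([30, 40, 50], [9.0, 4.0, 5.0])  # disagrees with left at x=30

# Overlapping cells agree: all non-null values are combined.
merged = xr.merge([left, right_same], compat="no_conflicts", join="outer")

# Overlapping cells disagree: the merge is rejected.
try:
    xr.merge([left, right_diff], compat="no_conflicts", join="outer")
    conflict_raised = False
except xr.MergeError:
    conflict_raised = True

# Fallback (an assumption, not from the answer): keep left's values and
# fill only the cells where left has no data.
combined = left.combine_first(right_diff)

print(merged["temperature"].values)
print(conflict_raised)
print(combined["temperature"].values)
```

For more than two Datasets, combine_first can be chained or folded with functools.reduce, with earlier Datasets taking priority over later ones.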

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Michael Delgado