What happened:
After concatenating two NetCDF Datasets with different cftime.DatetimeNoLeap time coordinates, attempting to write the result to a Zarr store with ds.to_zarr() fails with an OutOfBoundsDatetime exception.
What you expected to happen:
I expect to_zarr() to execute successfully.
Minimal Complete Verifiable Example:
import xarray as xr
import cftime
import pandas as pd

# open a generic CESM dataset containing a time_bnds variable
url = 'http://adss.apcc21.org/opendap/CMIP5DB/cmip5_daily_BT/pr_day_CESM1-BGC_rcp85_r1i1p1_20760101-21001231.nc'
ds = xr.open_dataset(url)

# create two new Datasets with different, overlapping time indexes
ds2 = ds.sel(time=slice(None, cftime.DatetimeNoLeap(2076, 3, 1, 1, 0, 0, 0)))
ds3 = ds.sel(time=slice(None, cftime.DatetimeNoLeap(2076, 2, 1, 1, 0, 0, 0)))

# concatenate the two Datasets, using the default fill value
ds4 = xr.concat([ds2, ds3], dim=pd.Index(['ds2', 'ds3'], name='ds'))

# fails with an OutOfBoundsDatetime exception
zs = ds4.to_zarr('/tmp/my_zarr.zarr')
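Inspecting the concatenated bounds variable makes the suspected cause visible. This is a sketch reusing the names from the example above, and it assumes the file's time_bnds variable is decoded to cftime objects:

```python
import pandas as pd

# The ds3 slice is shorter than the ds2 slice, so concat pads its
# time_bnds rows with the default fill value (NaN), leaving floats
# mixed into an object array of cftime.DatetimeNoLeap values.
print(pd.isnull(ds4['time_bnds'].values).any())  # prints True if NaNs were introduced
```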
Anything else we need to know?:
I believe the problem is related to the implicit NaN fill value used when concatenating the time_bnds variable. I arrived at this code while trying to produce a minimal example of a similar error, in which concatenating multiple Datasets containing time_bnds variables would fail with a SerializationError. In the example above, as in my earlier troubleshooting in production code, removing time_bnds with ds = ds.drop('time_bnds') made the to_zarr() call work, as sketched below.
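To make that workaround concrete, a minimal sketch (drop_vars() is the non-deprecated spelling of drop() on xarray >= 0.14):

```python
# Dropping the bounds variable before writing avoids the failing
# cftime/NaN encoding path; the rest of the dataset serializes cleanly.
ds4_fixed = ds4.drop_vars('time_bnds')
ds4_fixed.to_zarr('/tmp/my_zarr.zarr', mode='w')
```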
This is a common use case, since time_bnds variables are a standard part of CESM climate model output.
Thanks for raising this issue. Indeed, there is currently no well-defined way of handling missing cftime values, so I'm not surprised that encoding fails with an obscure error. I think a good path forward here would be to start by addressing Unidata/cftime#145, which we could then build on in xarray.
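In the meantime, one could catch this failure mode early with a quick pre-write check; a hedged sketch, where the helper below is illustrative and not part of xarray's API:

```python
import pandas as pd

def variables_with_missing_cftime(ds):
    """Name the object-dtype (e.g. cftime) variables containing NaN,
    which the datetime encoder currently cannot serialize."""
    return [
        name
        for name, var in ds.variables.items()
        if var.dtype == object and pd.isnull(var.values).any()
    ]

# e.g. variables_with_missing_cftime(ds4) would flag 'time_bnds' here
```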
Thank you! I second the idea of a more useful error, Deepak. Hopefully this GitHub issue also helps people whose concatenations mysteriously fail find a workaround in the meantime.
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.7 (default, Mar 23 2020, 22:36:06)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-1032-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.16.0
pandas: 1.0.5
numpy: 1.19.1
scipy: 1.5.0
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.8.0
h5py: 2.10.0
Nio: None
zarr: 2.3.2
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.20.0
distributed: 2.20.0
matplotlib: 3.2.2
cartopy: 0.17.0
seaborn: None
numbagg: None
pint: None
setuptools: 49.2.0.post20200714
pip: 20.1.1
conda: None
pytest: None
IPython: 7.16.1
sphinx: None