Enable passing a CFTimedeltaCoder to decode_timedelta #9966

Merged
merged 18 commits into from
Jan 29, 2025
62 changes: 62 additions & 0 deletions doc/internals/time-coding.rst
@@ -473,3 +473,65 @@ on-disk resolution, if possible.

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-datetimes2.nc", decode_times=coder)

Similar logic applies for decoding timedelta values. The default resolution is
``"ns"``:

.. ipython:: python

attrs = {"units": "hours"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-timedeltas1.nc")

.. ipython:: python
:okwarning:

xr.open_dataset("test-timedeltas1.nc")

By default, timedeltas will be decoded to the same resolution as datetimes:

.. ipython:: python
:okwarning:

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-timedeltas1.nc", decode_times=coder)

but if one would like to decode timedeltas to a different resolution, one can
provide a coder specifically for timedeltas to ``decode_timedelta``:

.. ipython:: python

timedelta_coder = xr.coders.CFTimedeltaCoder(time_unit="ms")
xr.open_dataset(
"test-timedeltas1.nc", decode_times=coder, decode_timedelta=timedelta_coder
)

As with datetimes, if a coarser unit is requested the timedeltas are decoded
into their native on-disk resolution, if possible:

.. ipython:: python

attrs = {"units": "milliseconds"}
ds = xr.Dataset({"time": ("time", [0, 1, 2, 3], attrs)})
ds.to_netcdf("test-timedeltas2.nc")

.. ipython:: python
:okwarning:

xr.open_dataset("test-timedeltas2.nc")

.. ipython:: python
:okwarning:

coder = xr.coders.CFDatetimeCoder(time_unit="s")
xr.open_dataset("test-timedeltas2.nc", decode_times=coder)

To opt out of timedelta decoding (see issue `Undesired decoding to timedelta64 <https://github.com/pydata/xarray/issues/1621>`_), pass ``False`` to ``decode_timedelta``:

.. ipython:: python

xr.open_dataset("test-timedeltas2.nc", decode_timedelta=False)

.. note::
In the future, the default value of ``decode_timedelta`` will be
``False`` rather than ``None``.
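
The defaulting rule described in this section (an unset ``decode_timedelta`` follows ``decode_times``, including the ``time_unit`` of a datetime coder) can be sketched in plain Python. This is an illustrative sketch only, not xarray's implementation: ``DatetimeCoderSketch``, ``TimedeltaCoderSketch`` and ``resolve_decode_timedelta`` are hypothetical names, and the per-variable fallback for names missing from a mapping is an assumption.

```python
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class DatetimeCoderSketch:
    """Hypothetical stand-in for xr.coders.CFDatetimeCoder."""

    time_unit: str = "ns"


@dataclass(frozen=True)
class TimedeltaCoderSketch:
    """Hypothetical stand-in for xr.coders.CFTimedeltaCoder."""

    time_unit: str = "ns"


def resolve_decode_timedelta(name, decode_times=True, decode_timedelta=None):
    # A mapping toggles the option per variable; variables that are not
    # listed are assumed to fall back to the default (None).
    if isinstance(decode_timedelta, Mapping):
        decode_timedelta = decode_timedelta.get(name)
    # An explicit bool or coder always wins.
    if decode_timedelta is not None:
        return decode_timedelta
    # Unset: follow decode_times, matching the time_unit of a datetime coder.
    if isinstance(decode_times, DatetimeCoderSketch):
        return TimedeltaCoderSketch(time_unit=decode_times.time_unit)
    return bool(decode_times)


coder = DatetimeCoderSketch(time_unit="s")
print(resolve_decode_timedelta("time", decode_times=coder))
print(resolve_decode_timedelta("time", coder, TimedeltaCoderSketch("ms")))
print(resolve_decode_timedelta("time", coder, {"time": False}))
```

The three calls correspond to the three cases shown in this section: inheriting the datetime resolution, overriding it with a dedicated timedelta coder, and opting out.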
43 changes: 31 additions & 12 deletions doc/whats-new.rst
@@ -19,23 +19,36 @@ What's New
v2025.01.2 (unreleased)
-----------------------

This release brings non-nanosecond datetime resolution to xarray. In the
last couple of releases xarray has been prepared for that change. The code had
to be changed and adapted in numerous places, affecting especially the test suite.
The documentation has been updated accordingly and a new internal chapter
on :ref:`internals.timecoding` has been added.

To make the transition as smooth as possible this is designed to be fully backwards
compatible, keeping the current default of ``'ns'`` resolution on decoding.
To opt-in decoding into other resolutions (``'us'``, ``'ms'`` or ``'s'``) the
new :py:class:`coders.CFDatetimeCoder` is used as parameter to ``decode_times``
kwarg (see also :ref:`internals.default_timeunit`):
This release brings non-nanosecond datetime and timedelta resolution to xarray.
In the last couple of releases xarray has been prepared for that change. The
code had to be changed and adapted in numerous places, affecting especially the
test suite. The documentation has been updated accordingly and a new internal
chapter on :ref:`internals.timecoding` has been added.

To make the transition as smooth as possible this is designed to be fully
backwards compatible, keeping the current default of ``'ns'`` resolution on
decoding. To opt into decoding to other resolutions (``'us'``, ``'ms'`` or
``'s'``) an instance of the newly public :py:class:`coders.CFDatetimeCoder`
class can be passed through the ``decode_times`` keyword argument (see also
:ref:`internals.default_timeunit`):

.. code-block:: python

coder = xr.coders.CFDatetimeCoder(time_unit="s")
ds = xr.open_dataset(filename, decode_times=coder)

Similar control of the resolution of decoded timedeltas can be achieved through
passing a :py:class:`coders.CFTimedeltaCoder` instance to the
``decode_timedelta`` keyword argument:

.. code-block:: python

coder = xr.coders.CFTimedeltaCoder(time_unit="s")
ds = xr.open_dataset(filename, decode_timedelta=coder)

though by default timedeltas will be decoded to the same ``time_unit`` as
datetimes.

There might be slight changes when encoding/decoding times as some warning and
error messages have been removed or rewritten. Xarray will now also allow
non-nanosecond datetimes (with ``'us'``, ``'ms'`` or ``'s'`` resolution) when
@@ -50,7 +63,7 @@ eventually be deprecated.

New Features
~~~~~~~~~~~~
- Relax nanosecond datetime restriction in CF time decoding (:issue:`7493`, :pull:`9618`, :pull:`9977`).
- Relax nanosecond datetime / timedelta restriction in CF time decoding (:issue:`7493`, :pull:`9618`, :pull:`9966`, :pull:`9977`).
By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_ and `Spencer Clark <https://github.com/spencerkclark>`_.
- Enable the ``compute=False`` option in :py:meth:`DataTree.to_zarr`. (:pull:`9958`).
By `Sam Levang <https://github.com/slevang>`_.
@@ -72,6 +85,12 @@ Breaking changes

Deprecations
~~~~~~~~~~~~
- In a future version of xarray decoding of variables into
:py:class:`numpy.timedelta64` values will be disabled by default. To silence
warnings associated with this, set ``decode_timedelta`` to ``True``,
``False``, or a :py:class:`coders.CFTimedeltaCoder` instance when opening
data (:issue:`1621`, :pull:`9966`). By `Spencer Clark
<https://github.com/spencerkclark>`_.


Bug fixes
43 changes: 31 additions & 12 deletions xarray/backends/api.py
@@ -33,7 +33,7 @@
_normalize_path,
)
from xarray.backends.locks import _get_scheduler
from xarray.coders import CFDatetimeCoder
from xarray.coders import CFDatetimeCoder, CFTimedeltaCoder
from xarray.core import indexing
from xarray.core.combine import (
_infer_concat_order_from_positions,
@@ -487,7 +487,10 @@ def open_dataset(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
decode_timedelta: bool
| CFTimedeltaCoder
| Mapping[str, bool | CFTimedeltaCoder]
| None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
@@ -555,11 +558,14 @@ def open_dataset(
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
decode_timedelta : bool or dict-like, optional
decode_timedelta : bool, CFTimedeltaCoder, or dict-like, optional
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
@@ -712,7 +718,7 @@ def open_dataarray(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | None = None,
decode_timedelta: bool | CFTimedeltaCoder | None = None,
use_cftime: bool | None = None,
concat_characters: bool | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
@@ -785,7 +791,10 @@ def open_dataarray(
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
This keyword may not be supported by all the backends.
use_cftime: bool, optional
Only relevant if encoded dates come from a standard calendar
@@ -927,7 +936,10 @@ def open_datatree(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
decode_timedelta: bool
| CFTimedeltaCoder
| Mapping[str, bool | CFTimedeltaCoder]
| None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
@@ -995,7 +1007,10 @@ def open_datatree(
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
This keyword may not be supported by all the backends.
@@ -1150,7 +1165,10 @@ def open_groups(
| CFDatetimeCoder
| Mapping[str, bool | CFDatetimeCoder]
| None = None,
decode_timedelta: bool | Mapping[str, bool] | None = None,
decode_timedelta: bool
| CFTimedeltaCoder
| Mapping[str, bool | CFTimedeltaCoder]
| None = None,
use_cftime: bool | Mapping[str, bool] | None = None,
concat_characters: bool | Mapping[str, bool] | None = None,
decode_coords: Literal["coordinates", "all"] | bool | None = None,
@@ -1222,9 +1240,10 @@ def open_groups(
If True, decode variables and coordinates with time units in
{"days", "hours", "minutes", "seconds", "milliseconds", "microseconds"}
into timedelta objects. If False, leave them encoded as numbers.
If None (default), assume the same value of decode_time.
Pass a mapping, e.g. ``{"my_variable": False}``,
to toggle this feature per-variable individually.
If None (default), assume the same value of ``decode_times``; if
``decode_times`` is a :py:class:`coders.CFDatetimeCoder` instance, this
takes the form of a :py:class:`coders.CFTimedeltaCoder` instance with a
matching ``time_unit``.
This keyword may not be supported by all the backends.
use_cftime: bool or dict-like, optional
Only relevant if encoded dates come from a standard calendar
6 changes: 2 additions & 4 deletions xarray/coders.py
@@ -3,8 +3,6 @@
"encoding/decoding" process.
"""

from xarray.coding.times import CFDatetimeCoder
from xarray.coding.times import CFDatetimeCoder, CFTimedeltaCoder

__all__ = [
"CFDatetimeCoder",
]
__all__ = ["CFDatetimeCoder", "CFTimedeltaCoder"]
30 changes: 27 additions & 3 deletions xarray/coding/times.py
@@ -1343,6 +1343,21 @@ def decode(self, variable: Variable, name: T_Name = None) -> Variable:


class CFTimedeltaCoder(VariableCoder):
    """Coder for CF Timedelta coding.

    Parameters
    ----------
    time_unit : PDDatetimeUnitOptions
        Target resolution when decoding timedeltas. Defaults to "ns".
    """

    def __init__(
        self,
        time_unit: PDDatetimeUnitOptions = "ns",
    ) -> None:
        self.time_unit = time_unit
        self._emit_decode_timedelta_future_warning = False

def encode(self, variable: Variable, name: T_Name = None) -> Variable:
if np.issubdtype(variable.data.dtype, np.timedelta64):
dims, data, attrs, encoding = unpack_for_encoding(variable)
@@ -1359,12 +1374,21 @@ def encode(self, variable: Variable, name: T_Name = None) -> Variable:
def decode(self, variable: Variable, name: T_Name = None) -> Variable:
units = variable.attrs.get("units", None)
if isinstance(units, str) and units in TIME_UNITS:
if self._emit_decode_timedelta_future_warning:
emit_user_level_warning(
"In a future version of xarray decode_timedelta will "
"default to False rather than None. To silence this "
"warning, set decode_timedelta to True, False, or a "
"'CFTimedeltaCoder' instance.",
FutureWarning,
)
dims, data, attrs, encoding = unpack_for_decoding(variable)

units = pop_to(attrs, encoding, "units")
transform = partial(decode_cf_timedelta, units=units)
# todo: check, if we can relax this one here, too
dtype = np.dtype("timedelta64[ns]")
dtype = np.dtype(f"timedelta64[{self.time_unit}]")
transform = partial(
decode_cf_timedelta, units=units, time_unit=self.time_unit
)
data = lazy_elemwise_func(data, transform, dtype=dtype)

return Variable(dims, data, attrs, encoding, fastpath=True)
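
The warning flag in this hunk is off by default and is only consulted inside ``decode``. A minimal standalone sketch of that gating pattern follows, using only the standard library. ``WarningCoderSketch`` and ``decode_with_default`` are hypothetical names, and the caller-side logic that switches the flag on when the option was left unset is an assumption, since that code is not shown in this hunk.

```python
import warnings


class WarningCoderSketch:
    """Hypothetical stand-in that mirrors only the warning flag."""

    def __init__(self, time_unit="ns"):
        self.time_unit = time_unit
        # Off by default, so a coder built by the user never warns.
        self._emit_future_warning = False

    def decode(self, values):
        if self._emit_future_warning:
            warnings.warn(
                "decode_timedelta will default to False in the future",
                FutureWarning,
                stacklevel=2,
            )
        return list(values)  # placeholder for the real decoding step


def decode_with_default(values, decode_timedelta=None):
    # Assumed caller-side logic: remember whether the option was left unset.
    was_unset = decode_timedelta is None
    coder = WarningCoderSketch() if was_unset else decode_timedelta
    coder._emit_future_warning = was_unset
    return coder.decode(values)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    decode_with_default([0, 1, 2])  # option left unset: warns
    decode_with_default([0, 1, 2], WarningCoderSketch("s"))  # explicit: silent

print(len(caught))
```

Keeping the flag on the coder instance means the warning is raised only when a timedelta variable is actually decoded, not merely because the option was left unset.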