Skip to content

Latest commit

 

History

History
331 lines (235 loc) · 10.2 KB

interpolation.rst

File metadata and controls

331 lines (235 loc) · 10.2 KB

Interpolating data

.. ipython:: python
    :suppress:

    import numpy as np
    import pandas as pd
    import xarray as xr

    np.random.seed(123456)

Xarray offers flexible interpolation routines, which have a similar interface to our :ref:`indexing <indexing>`.

Note

interp requires scipy installed.

Scalar and 1-dimensional interpolation

Interpolating a :py:class:`~xarray.DataArray` works mostly like labeled indexing of a :py:class:`~xarray.DataArray`,

.. ipython:: python

    da = xr.DataArray(
        np.sin(0.3 * np.arange(12).reshape(4, 3)),
        [("time", np.arange(4)), ("space", [0.1, 0.2, 0.3])],
    )
    # label lookup
    da.sel(time=3)

    # interpolation
    da.interp(time=2.5)


Similar to the indexing, :py:meth:`~xarray.DataArray.interp` also accepts an array-like, which gives the interpolated result as an array.

.. ipython:: python

    # label lookup
    da.sel(time=[2, 3])

    # interpolation
    da.interp(time=[2.5, 3.5])

To interpolate data with a :py:doc:`numpy.datetime64 <numpy:reference/arrays.datetime>` coordinate you can pass a string.

.. ipython:: python

    da_dt64 = xr.DataArray(
        [1, 3], [("time", pd.date_range("1/1/2000", "1/3/2000", periods=2))]
    )
    da_dt64.interp(time="2000-01-02")

The interpolated data can be merged into the original :py:class:`~xarray.DataArray` by specifying the time periods required.

.. ipython:: python

    da_dt64.interp(time=pd.date_range("1/1/2000", "1/3/2000", periods=3))

Interpolation of data indexed by a :py:class:`~xarray.CFTimeIndex` is also allowed. See :ref:`CFTimeIndex` for examples.

Note

Currently, our interpolation only works for regular grids. Therefore, similarly to :py:meth:`~xarray.DataArray.sel`, only 1D coordinates along a dimension can be used as the original coordinate to be interpolated.

Multi-dimensional Interpolation

Like :py:meth:`~xarray.DataArray.sel`, :py:meth:`~xarray.DataArray.interp` accepts multiple coordinates. In this case, multidimensional interpolation is carried out.

.. ipython:: python

    # label lookup
    da.sel(time=2, space=0.1)

    # interpolation
    da.interp(time=2.5, space=0.15)

Array-like coordinates are also accepted:

.. ipython:: python

    # label lookup
    da.sel(time=[2, 3], space=[0.1, 0.2])

    # interpolation
    da.interp(time=[1.5, 2.5], space=[0.15, 0.25])


:py:meth:`~xarray.DataArray.interp_like` method is a useful shortcut. This method interpolates an xarray object onto the coordinates of another xarray object. For example, if we want to compute the difference between two :py:class:`~xarray.DataArray` s (da and other) staying on slightly different coordinates,

.. ipython:: python

    other = xr.DataArray(
        np.sin(0.4 * np.arange(9).reshape(3, 3)),
        [("time", [0.9, 1.9, 2.9]), ("space", [0.15, 0.25, 0.35])],
    )

it might be a good idea to first interpolate da so that it will stay on the same coordinates of other, and then subtract it. :py:meth:`~xarray.DataArray.interp_like` can be used for such a case,

.. ipython:: python

    # interpolate da along other's coordinates
    interpolated = da.interp_like(other)
    interpolated

It is now possible to safely compute the difference other - interpolated.

Interpolation methods

We use :py:class:`scipy.interpolate.interp1d` for 1-dimensional interpolation. For multi-dimensional interpolation, an attempt is first made to decompose the interpolation in a series of 1-dimensional interpolations, in which case :py:class:`scipy.interpolate.interp1d` is used. If a decomposition cannot be made (e.g. with advanced interpolation), :py:func:`scipy.interpolate.interpn` is used.

The interpolation method can be specified by the optional method argument.

.. ipython:: python

    da = xr.DataArray(
        np.sin(np.linspace(0, 2 * np.pi, 10)),
        dims="x",
        coords={"x": np.linspace(0, 1, 10)},
    )

    da.plot.line("o", label="original")
    da.interp(x=np.linspace(0, 1, 100)).plot.line(label="linear (default)")
    da.interp(x=np.linspace(0, 1, 100), method="cubic").plot.line(label="cubic")
    @savefig interpolation_sample1.png width=4in
    plt.legend()

Additional keyword arguments can be passed to scipy's functions.

.. ipython:: python

    # fill 0 for the outside of the original coordinates.
    da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": 0.0})
    # 1-dimensional extrapolation
    da.interp(x=np.linspace(-0.5, 1.5, 10), kwargs={"fill_value": "extrapolate"})
    # multi-dimensional extrapolation
    da = xr.DataArray(
        np.sin(0.3 * np.arange(12).reshape(4, 3)),
        [("time", np.arange(4)), ("space", [0.1, 0.2, 0.3])],
    )

    da.interp(
        time=4, space=np.linspace(-0.1, 0.5, 10), kwargs={"fill_value": "extrapolate"}
    )


Advanced Interpolation

:py:meth:`~xarray.DataArray.interp` accepts :py:class:`~xarray.DataArray` as similar to :py:meth:`~xarray.DataArray.sel`, which enables us more advanced interpolation. Based on the dimension of the new coordinate passed to :py:meth:`~xarray.DataArray.interp`, the dimension of the result are determined.

For example, if you want to interpolate a two dimensional array along a particular dimension, as illustrated below, you can pass two 1-dimensional :py:class:`~xarray.DataArray` s with a common dimension as new coordinate.

advanced indexing and interpolation

For example:

.. ipython:: python

    da = xr.DataArray(
        np.sin(0.3 * np.arange(20).reshape(5, 4)),
        [("x", np.arange(5)), ("y", [0.1, 0.2, 0.3, 0.4])],
    )
    # advanced indexing
    x = xr.DataArray([0, 2, 4], dims="z")
    y = xr.DataArray([0.1, 0.2, 0.3], dims="z")
    da.sel(x=x, y=y)

    # advanced interpolation, without extrapolation
    x = xr.DataArray([0.5, 1.5, 2.5, 3.5], dims="z")
    y = xr.DataArray([0.15, 0.25, 0.35, 0.45], dims="z")
    da.interp(x=x, y=y)

where values on the original coordinates (x, y) = ((0.5, 0.15), (1.5, 0.25), (2.5, 0.35), (3.5, 0.45)) are obtained by the 2-dimensional interpolation and mapped along a new dimension z. Since no keyword arguments are passed to the interpolation routine, no extrapolation is performed resulting in a nan value.

If you want to add a coordinate to the new dimension z, you can supply :py:class:`~xarray.DataArray` s with a coordinate. Extrapolation can be achieved by passing additional arguments to SciPy's interpnd function,

.. ipython:: python

    x = xr.DataArray([0.5, 1.5, 2.5, 3.5], dims="z", coords={"z": ["a", "b", "c", "d"]})
    y = xr.DataArray(
        [0.15, 0.25, 0.35, 0.45], dims="z", coords={"z": ["a", "b", "c", "d"]}
    )
    da.interp(x=x, y=y, kwargs={"fill_value": None})

For the details of the advanced indexing, see :ref:`more advanced indexing <more_advanced_indexing>`.

Interpolating arrays with NaN

Our :py:meth:`~xarray.DataArray.interp` works with arrays with NaN the same way that scipy.interpolate.interp1d and scipy.interpolate.interpn do. linear and nearest methods return arrays including NaN, while other methods such as cubic or quadratic return all NaN arrays.

.. ipython:: python

    da = xr.DataArray([0, 2, np.nan, 3, 3.25], dims="x", coords={"x": range(5)})
    da.interp(x=[0.5, 1.5, 2.5])
    da.interp(x=[0.5, 1.5, 2.5], method="cubic")

To avoid this, you can drop NaN by :py:meth:`~xarray.DataArray.dropna`, and then make the interpolation

.. ipython:: python

    dropped = da.dropna("x")
    dropped
    dropped.interp(x=[0.5, 1.5, 2.5], method="cubic")

If NaNs are distributed randomly in your multidimensional array, dropping all the columns containing more than one NaNs by :py:meth:`~xarray.DataArray.dropna` may lose a significant amount of information. In such a case, you can fill NaN by :py:meth:`~xarray.DataArray.interpolate_na`, which is similar to :py:meth:`pandas.Series.interpolate`.

.. ipython:: python

    filled = da.interpolate_na(dim="x")
    filled

This fills NaN by interpolating along the specified dimension. After filling NaNs, you can interpolate:

.. ipython:: python

    filled.interp(x=[0.5, 1.5, 2.5], method="cubic")

For the details of :py:meth:`~xarray.DataArray.interpolate_na`, see :ref:`Missing values <missing_values>`.

Example

Let's see how :py:meth:`~xarray.DataArray.interp` works on real data.

.. ipython:: python

    # Raw data
    ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
    fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
    ds.air.plot(ax=axes[0])
    axes[0].set_title("Raw data")

    # Interpolated data
    new_lon = np.linspace(ds.lon[0], ds.lon[-1], ds.sizes["lon"] * 4)
    new_lat = np.linspace(ds.lat[0], ds.lat[-1], ds.sizes["lat"] * 4)
    dsi = ds.interp(lat=new_lat, lon=new_lon)
    dsi.air.plot(ax=axes[1])
    @savefig interpolation_sample3.png width=8in
    axes[1].set_title("Interpolated data")

Our advanced interpolation can be used to remap the data to the new coordinate. Consider the new coordinates x and z on the two dimensional plane. The remapping can be done as follows

.. ipython:: python

    # new coordinate
    x = np.linspace(240, 300, 100)
    z = np.linspace(20, 70, 100)
    # relation between new and original coordinates
    lat = xr.DataArray(z, dims=["z"], coords={"z": z})
    lon = xr.DataArray(
        (x[:, np.newaxis] - 270) / np.cos(z * np.pi / 180) + 270,
        dims=["x", "z"],
        coords={"x": x, "z": z},
    )

    fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
    ds.air.plot(ax=axes[0])
    # draw the new coordinate on the original coordinates.
    for idx in [0, 33, 66, 99]:
        axes[0].plot(lon.isel(x=idx), lat, "--k")
    for idx in [0, 33, 66, 99]:
        axes[0].plot(*xr.broadcast(lon.isel(z=idx), lat.isel(z=idx)), "--k")
    axes[0].set_title("Raw data")

    dsi = ds.interp(lon=lon, lat=lat)
    dsi.air.plot(ax=axes[1])
    @savefig interpolation_sample4.png width=8in
    axes[1].set_title("Remapped data")