Merge branch 'master' of github.com:xray/xray into feature-plotting
Clark Fitzgerald committed Jul 17, 2015
2 parents 28047e4 + b09c8a5 commit d657bc5
Showing 31 changed files with 1,460 additions and 660 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Expand Up @@ -43,3 +43,5 @@ doc/_build
doc/generated
doc/_static/*.png
xray/version.py

.ipynb_checkpoints
2 changes: 2 additions & 0 deletions README.rst
Expand Up @@ -9,6 +9,8 @@ xray: N-D labeled arrays and datasets
:target: https://coveralls.io/r/xray/xray
.. image:: https://img.shields.io/pypi/v/xray.svg
:target: https://pypi.python.org/pypi/xray/
.. image:: https://badges.gitter.im/Join%20Chat.svg
:target: https://gitter.im/xray/xray

**xray** is an open source project and Python package that aims to bring the
labeled data power of pandas_ to the physical sciences, by providing
Expand Down
1 change: 1 addition & 0 deletions doc/api.rst
Expand Up @@ -346,6 +346,7 @@ Dataset methods
open_dataset
open_mfdataset
Dataset.to_netcdf
save_mfdataset
Dataset.to_array
Dataset.to_dataframe
Dataset.from_dataframe
Expand Down
7 changes: 4 additions & 3 deletions doc/combining.rst
Expand Up @@ -65,9 +65,10 @@ Of course, ``concat`` also works on ``Dataset`` objects:
xray.concat([ds.sel(x='a'), ds.sel(x='b')], 'x')
:py:func:`~xray.concat` has a number of options which provide deeper control
over which variables and coordinates are concatenated and how it handles
conflicting variables between datasets. However, these should rarely be
necessary.
over which variables are concatenated and how it handles conflicting variables
between datasets. With the default parameters, xray will load some coordinate
variables into memory to compare them between datasets. This may be prohibitively
expensive if you are manipulating your dataset lazily using :ref:`dask`.

.. _merge:

Expand Down
91 changes: 29 additions & 62 deletions doc/examples/monthly-means.rst
Expand Up @@ -3,7 +3,9 @@
Calculating Seasonal Averages from Timeseries of Monthly Means
==============================================================

Author: `Joe Hamman <http://www.hydro.washington.edu/~jhamman/>`_
Author: `Joe Hamman <http://uw-hydro.github.io/current_member/joe_hamman/>`_

The data for this example can be found in the `xray-data <https://github.com/xray/xray-data>`_ repository. This example is also available as an IPython Notebook `here <https://github.com/xray/xray/tree/master/examples/xray_seasonal_means.ipynb>`_.

Suppose we have a netCDF or xray Dataset of monthly mean data and we
want to calculate the seasonal average. To do this properly, we need to
Expand All @@ -12,6 +14,7 @@ different number of days.

.. code:: python
%matplotlib inline
import numpy as np
import pandas as pd
import xray
Expand All @@ -24,9 +27,9 @@ different number of days.
.. parsed-literal::
numpy version : 1.9.1
pandas version : 0.15.2
xray version : 0.4rc1-20-g52bbca3
numpy version : 1.9.2
pandas version : 0.16.2
xray version : 0.5.1
Some calendar information so we can support any netCDF calendar.
Expand All @@ -42,6 +45,7 @@ Some calendar information so we can support any netCDF calendar.
'all_leap': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
'366_day': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
'360_day': [0, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30]}
A few calendar functions to determine the number of days in each month
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Expand Down Expand Up @@ -80,59 +84,39 @@ the ``calendar.month_range`` function.
if leap_year(year, calendar=calendar):
month_length[i] += 1
return month_length
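
The month-length lookup can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the functions used in this example: ``DAYS_PER_MONTH``, ``is_leap_year`` and ``days_in_months`` are hypothetical names, the ``noleap`` and ``standard`` rows are filled in from the usual CF calendar definitions, and the leap rule is simplified to the proleptic Gregorian one in Python's ``calendar`` module. The sketch adds the leap day to February only.

```python
import calendar

# Days per month for a few netCDF calendars. Index 0 is a placeholder so
# that the list index equals the month number (1 = January).
DAYS_PER_MONTH = {
    'noleap': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
    'standard': [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
    'all_leap': [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
    '360_day': [0, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
}

# Calendars for which this sketch applies a leap-year rule (an assumption).
LEAP_AWARE = ('standard', 'gregorian', 'proleptic_gregorian')


def is_leap_year(year, calendar_name='standard'):
    """Hypothetical helper: simplified proleptic Gregorian leap rule."""
    return calendar_name in LEAP_AWARE and calendar.isleap(year)


def days_in_months(months_and_years, calendar_name='standard'):
    """Return the number of days for each (month, year) pair."""
    base = DAYS_PER_MONTH[calendar_name]
    lengths = []
    for month, year in months_and_years:
        n_days = base[month]
        # Only February gains a day in a leap year.
        if month == 2 and is_leap_year(year, calendar_name):
            n_days += 1
        lengths.append(n_days)
    return lengths
```

Calling ``days_in_months([(2, 2000), (2, 1900)], 'standard')`` distinguishes a leap February from a non-leap one, while the ``noleap`` and ``360_day`` calendars ignore the year entirely.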
Open the ``Dataset``
^^^^^^^^^^^^^^^^^^^^

.. code:: python
monthly_mean_file = '/raid2/jhamman/projects/RASM/data/processed/R1002RBRxaaa01a/lnd/monthly_mean_timeseries/R1002RBRxaaa01a.vic.hmm.197909-201212.nc'
monthly_mean_file = 'RASM_example_data.nc'
ds = xray.open_dataset(monthly_mean_file, decode_coords=False)
ds.attrs['history'] = '' # get rid of the history attribute because it's obnoxiously long
print(ds)
.. parsed-literal::
<xray.Dataset>
Dimensions: (depth: 3, time: 400, x: 275, y: 205)
Dimensions: (time: 36, x: 275, y: 205)
Coordinates:
* time (time) datetime64[ns] 1979-09-16T12:00:00 1979-10-17 ...
* depth (depth) int64 0 1 2
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ...
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ...
* time (time) datetime64[ns] 1980-09-16T12:00:00 1980-10-17 ...
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
Data variables:
Precipitation (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Evap (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Runoff (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Baseflow (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Soilw (time, depth, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan ...
Swq (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Swd (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Swnet (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Lwnet (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Lwin (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Netrad (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Swin (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Latht (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Senht (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Grdht (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Albedo (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Radt (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Surft (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Relhum (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Tair (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Tsoil (time, depth, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan ...
Wind (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan nan nan nan ...
Tair (time, y, x) float64 nan nan nan nan nan nan nan nan nan nan ...
Attributes:
title: /workspace/jhamman/processed/R1002RBRxaaa01a/lnd/temp/R1002RBRxaaa01a.vic.ha.1979-09-01.nc
institution: U.W.
source: RACM R1002RBRxaaa01a
output_frequency: daily
output_mode: averaged
convention: CF-1.4
history:
references: Based on the initial model of Liang et al., 1994, JGR, 99, 14,415- 14,429.
comment: Output from the Variable Infiltration Capacity (VIC) model.
nco_openmp_thread_number: 1
NCO: 4.3.7
history: history deleted for brevity
Now for the heavy lifting:
Expand All @@ -148,8 +132,10 @@ along the time dimension.
.. code:: python
# Make a DataArray with the number of days in each month, size = len(time)
month_length = xray.DataArray(get_dpm(ds.time.to_index(), calendar='noleap'),
month_length = xray.DataArray(get_dpm(ds.time.to_index(),
calendar='noleap'),
coords=[ds.time], name='month_length')
# Calculate the weights by grouping by 'time.season'
weights = month_length.groupby('time.season') / month_length.groupby('time.season').sum()
Expand All @@ -158,49 +144,30 @@ allong the time dimension.
# Calculate the weighted average
ds_weighted = (ds * weights).groupby('time.season').sum(dim='time')
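
The arithmetic behind the weighting can be checked on plain numbers. The following is a minimal sketch in pure Python, assuming a hypothetical ``weighted_seasonal_means`` helper and made-up values for a single ``noleap`` winter; it is not the xray code above, only the same calculation written out by hand.

```python
from collections import defaultdict

# Month number -> season label, matching the groups of 'time.season'.
SEASON_OF_MONTH = {12: 'DJF', 1: 'DJF', 2: 'DJF',
                   3: 'MAM', 4: 'MAM', 5: 'MAM',
                   6: 'JJA', 7: 'JJA', 8: 'JJA',
                   9: 'SON', 10: 'SON', 11: 'SON'}


def weighted_seasonal_means(months, values, month_lengths):
    """Return {season: mean of values weighted by days in each month}."""
    total_days = defaultdict(float)
    weighted_sum = defaultdict(float)
    for month, value, days in zip(months, values, month_lengths):
        season = SEASON_OF_MONTH[month]
        total_days[season] += days
        weighted_sum[season] += value * days
    # Dividing by the season's total days is the same as multiplying each
    # value by (days / total days) and summing.
    return {s: weighted_sum[s] / total_days[s] for s in total_days}


# Hypothetical monthly means for December, January and February.
months = [12, 1, 2]
values = [1.0, 2.0, 4.0]
month_lengths = [31, 31, 28]

weighted = weighted_seasonal_means(months, values, month_lengths)
unweighted = sum(values) / len(values)
```

Because February is shorter than December and January, its value counts for less, so ``weighted['DJF']`` comes out slightly below the simple mean in ``unweighted``.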
.. code:: python
print(ds_weighted)
.. parsed-literal::
<xray.Dataset>
Dimensions: (depth: 3, season: 4, x: 275, y: 205)
Dimensions: (season: 4, x: 275, y: 205)
Coordinates:
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ...
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ...
* depth (depth) int64 0 1 2
* season (season) object 'DJF' 'JJA' 'MAM' 'SON'
* y (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
* x (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 ...
* season (season) object 'DJF' 'JJA' 'MAM' 'SON'
Data variables:
Baseflow (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Tsoil (season, depth, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Wind (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Swin (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Swq (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Netrad (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Albedo (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Evap (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Swd (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Radt (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Lwin (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Relhum (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Soilw (season, depth, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Lwnet (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Senht (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Surft (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Latht (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Runoff (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Tair (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Grdht (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Swnet (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Precipitation (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
Tair (season, y, x) float64 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
.. code:: python
# only used for comparisons
ds_unweighted = ds.groupby('time.season').mean('time')
ds_diff = ds_weighted - ds_unweighted
.. code:: python
# Quick plot to show the results
Expand Down
13 changes: 10 additions & 3 deletions doc/index.rst
Expand Up @@ -49,20 +49,27 @@ Documentation
See also
--------

- `Stephan Hoyer's PyData talk`_ introducing the original version of xray.
- Stephan Hoyer's `SciPy2015 talk`_ introducing xray to a general audience.
- Stephan Hoyer's `2015 Unidata Users Workshop talk`_ and `tutorial`_ (`with answers`_) introducing
xray to users familiar with netCDF.
- `Nicolas Fauchereau's tutorial`_ on xray for netCDF users.

.. _Stephan Hoyer's PyData talk: https://www.youtube.com/watch?v=T5CZyNwBa9c
.. _SciPy2015 talk: https://www.youtube.com/watch?v=X0pAhJgySxk
.. _2015 Unidata Users Workshop talk: https://www.youtube.com/watch?v=J9ypQOnt5l8
.. _tutorial: https://github.com/Unidata/unidata-users-workshop/blob/master/notebooks/xray-tutorial.ipynb
.. _with answers: https://github.com/Unidata/unidata-users-workshop/blob/master/notebooks/xray-tutorial-with-answers.ipynb
.. _Nicolas Fauchereau's tutorial: http://nbviewer.ipython.org/github/nicolasfauchereau/metocean/blob/master/notebooks/xray.ipynb

Get in touch
------------

- To ask questions or discuss xray, use the `mailing list`_.
- Report bugs or view the source code `on GitHub`_.
- Report bugs, suggest feature ideas or view the source code `on GitHub`_.
- For interactive discussion, we have a chatroom `on Gitter`_.
- You can also get in touch `on Twitter`_.

.. _mailing list: https://groups.google.com/forum/#!forum/xray-dev
.. _on Gitter: https://gitter.im/xray/xray
.. _on GitHub: http://github.com/xray/xray
.. _on Twitter: http://twitter.com/shoyer

Expand Down
56 changes: 56 additions & 0 deletions doc/whats-new.rst
Expand Up @@ -9,6 +9,62 @@ What's New
import xray
np.random.seed(123456)
v0.5.2 (16 July 2015)
---------------------

This release contains bug fixes, several additional options for opening and
saving netCDF files, and a backwards incompatible rewrite of the advanced
options for ``xray.concat``.

Backwards incompatible changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The optional arguments ``concat_over`` and ``mode`` in :py:func:`~xray.concat` have
been removed and replaced by ``data_vars`` and ``coords``. The new arguments are both
more easily understood and more robustly implemented, and allowed us to fix a bug
where ``concat`` accidentally loaded data into memory. If you set values for
these optional arguments manually, you will need to update your code. The default
behavior should be unchanged.

Enhancements
~~~~~~~~~~~~

- :py:func:`~xray.open_mfdataset` now supports a ``preprocess`` argument for
preprocessing datasets prior to concatenation. This is useful if datasets
cannot be otherwise merged automatically, e.g., if the original datasets
have conflicting index coordinates (:issue:`443`).
- :py:func:`~xray.open_dataset` and :py:func:`~xray.open_mfdataset` now use a
global thread lock by default for reading from netCDF files with dask. This
avoids possible segmentation faults for reading from netCDF4 files when HDF5
is not configured properly for concurrent access (:issue:`444`).
- Added support for serializing arrays of complex numbers with ``engine='h5netcdf'``.
- The new :py:func:`~xray.save_mfdataset` function allows for saving multiple
datasets to disk simultaneously. This is useful when processing large datasets
with dask.array. For example, to save a dataset too big to fit into memory
to one file per year, we could write:

.. ipython::
:verbatim:

In [1]: years, datasets = zip(*ds.groupby('time.year'))
In [2]: paths = ['%s.nc' % y for y in years]

In [3]: xray.save_mfdataset(datasets, paths)
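
The ``zip(*...)`` idiom in the example above transposes a sequence of ``(label, group)`` pairs into one tuple of labels and one tuple of groups. A minimal sketch with the standard library, using hypothetical ``(year, value)`` records in place of a real dataset:

```python
from itertools import groupby

# Hypothetical (year, value) records, already sorted by year.
records = [(2000, 'a'), (2000, 'b'), (2001, 'c'), (2002, 'd')]

# Build (label, group) pairs, the same shape the example above unpacks.
pairs = [(year, [value for _, value in group])
         for year, group in groupby(records, key=lambda rec: rec[0])]

# zip(*pairs) splits the pairs into a tuple of labels and a tuple of groups.
years, groups = zip(*pairs)

# One output path per label, in the same order as the groups.
paths = ['%s.nc' % y for y in years]
```

Keeping ``groups`` and ``paths`` in the same order is what lets a function that takes two parallel lists write each group to its matching file.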

Bug fixes
~~~~~~~~~

- Fixed ``min``, ``max``, ``argmin`` and ``argmax`` for arrays with string or
unicode types (:issue:`453`).
- :py:func:`~xray.open_dataset` and :py:func:`~xray.open_mfdataset` support
supplying chunks as a single integer.
- Fixed a bug in serializing a scalar datetime variable to netCDF.
- Fixed a bug that could occur in serialization of 0-dimensional integer arrays.
- Fixed a bug where concatenating DataArrays was not always lazy (:issue:`464`).
- When reading datasets with h5netcdf, bytes attributes are decoded to strings.
This allows conventions decoding to work properly on Python 3 (:issue:`451`).

v0.5.1 (15 June 2015)
---------------------

Expand Down