DOC: tweak paragraph regarding cut and IntervalIndex (pandas-dev#27132)
pilkibun authored and jorisvandenbossche committed Jun 30, 2019
1 parent f58a1fe commit b870dee
Showing 1 changed file with 8 additions and 3 deletions.
11 changes: 8 additions & 3 deletions doc/source/user_guide/advanced.rst
@@ -965,21 +965,26 @@ If you select a label *contained* within an interval, this will also select the
df.loc[2.5]
df.loc[[2.5, 3.5]]
``Interval`` and ``IntervalIndex`` are used by ``cut`` and ``qcut``:
:func:`cut` and :func:`qcut` both return a ``Categorical`` object, and the bins they
create are stored as an ``IntervalIndex`` in its ``.categories`` attribute.

.. ipython:: python

   c = pd.cut(range(4), bins=2)
   c
   c.categories

Furthermore, ``IntervalIndex`` allows one to bin *other* data with these same
bins, with ``NaN`` representing a missing value similar to other dtypes.
:func:`cut` also accepts an ``IntervalIndex`` for its ``bins`` argument, which enables
a useful pandas idiom. First, we call :func:`cut` with some data and ``bins`` set to a
fixed number, to generate the bins. Then, we pass the values of ``.categories`` as the
``bins`` argument in subsequent calls to :func:`cut`, supplying new data which will be
binned into the same bins.

.. ipython:: python

   pd.cut([0, 3, 5, 1], bins=c.categories)

Any value which falls outside all bins will be assigned a ``NaN`` value.
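
The two-step idiom described above can be sketched end to end as follows. This is a minimal illustration rather than part of the commit: the variable names ``reference``, ``new_values`` and ``binned`` are hypothetical, and a plain list stands in for ``range(4)``. It also checks which positions fell outside every bin by looking at the category codes and at ``isna()``.

```python
import pandas as pd

# Step 1: let cut() derive two equal-width bins from some reference data.
reference = pd.cut([0, 1, 2, 3], bins=2)
bins = reference.categories  # an IntervalIndex holding the two bins

# Step 2: reuse exactly those bins for new data.
new_values = [0, 3, 5, 1]  # 5 lies beyond the right edge of the last bin
binned = pd.cut(new_values, bins=bins)

print(bins)
print(binned)
print(binned.codes)   # a code of -1 marks a value that matched no bin
print(binned.isna())  # True where the value fell outside every bin
```

Because the bins are passed as an ``IntervalIndex``, the result's ``.categories`` are the same intervals that the first call produced, so both results can be compared bin for bin.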

Generating ranges of intervals
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^