fix docs performance section (pytorch#1409)
Co-authored-by: Andrew Ho <[email protected]>
andrewkho and Andrew Ho authored Dec 14, 2024
1 parent 383893e commit 2707d18
Showing 1 changed file with 9 additions and 7 deletions.
16 changes: 9 additions & 7 deletions docs/source/what_is_torchdata_nodes.rst
@@ -55,7 +55,7 @@ hoops with a special sampler.

 ``torchdata.nodes`` follows a streaming data model, where operators are
 Iterators that can be combined together to define a dataloading and
-pre-proc pipeline. Samplers are still supported (see example above) and
+pre-proc pipeline. Samplers are still supported (see :ref:`migrate-to-nodes-from-utils`) and
 can be combined with a Mapper to produce an Iterator

 Multi-Datasets do not fit well with the current implementation in ``torch.utils.data``
@@ -102,12 +102,14 @@ where we showed that:

 * With GIL python, torchdata.nodes with multi-threading performs better than
   multi-processing in some scenarios, but makes features like GPU pre-proc
-  easier to perform which can boost
-
-We ran a benchmark loading the Imagenet dataset from disk,
-and manage to saturate main-memory bandwidth with Free-Threaded Python (3.13t)
-at a significantly lower CPU utilization than with multi-process workers
-(blogpost expected eary 2025). See ``examples/nodes/imagenet_benchmark.py``.
+  easier to perform, which can boost throughput for many use cases.
+
+* With No-GIL / Free-Threaded python (3.13t), we ran a benchmark loading the
+  Imagenet dataset from disk, and manage to saturate main-memory bandwidth
+  at a significantly lower CPU utilization than with multi-process workers
+  (blogpost expected eary 2025). See
+  `imagenet_benchmark.py <https://github.com/pytorch/data/blob/main/examples/nodes/imagenet_benchmark.py>`_
+  to try on your own hardware.


 Design choices