Update README.
fheinsen committed Jan 13, 2023
1 parent 2184d66 commit 222b95b
Showing 1 changed file with 10 additions and 1 deletion.
README.md
@@ -187,7 +187,16 @@
x_inp = torch.randn(n_inp, d_inp)
x_out = model(x_inp)
```

- Note: If the long input sequences have varying lengths, or if all input vectors are in the same feature space, you can set `n_inp` to -1, which can reduce memory footprint but may increase computation (see [here](#routing-sequences-of-varying-length)).
+ Note: If the long input sequences have varying lengths, or if all input vectors in each sequence share the same feature space, you can set `n_inp` to -1 to reduce parameter count and memory footprint. For example, in the snippet above you can replace `model` with:

```python
model = nn.Sequential(
Routing(n_inp=-1, n_out=n_hid, d_inp=d_inp, d_out=d_hid), # "summarize"
Routing(n_inp=-1, n_out=n_out, d_inp=d_hid, d_out=d_out), # "rewrite"
)
```

and the memory required to route 250,000 vectors of size 1024 down to 1,000 vectors of the same size, at 32-bit precision and with gradient tracking, is now only ~3.9GB on a recent Nvidia GPU (excluding ~1GB of PyTorch and CUDA overhead).
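As a rough sanity check on that figure (an illustrative back-of-the-envelope calculation, not an accounting taken from the library), the input tensor alone for that workload occupies about 1GB:

```python
# Back-of-the-envelope estimate of the input tensor's footprint:
# 250,000 vectors, each of size 1024, stored at 32-bit (4-byte) precision.
n_vecs, d_inp, bytes_per_elem = 250_000, 1024, 4

input_bytes = n_vecs * d_inp * bytes_per_elem
input_gb = input_bytes / 1e9

print(f"{input_gb:.2f} GB")  # prints "1.02 GB" -- the input tensor alone
```

The rest of the ~3.9GB goes to intermediate activations and the buffers PyTorch retains for backpropagation, which vary with the routing configuration.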


### Recurrent Routings
