fix comment
epwalsh committed Jun 21, 2023
1 parent 0929e6c commit 1cc00e6
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion olmo/data/memmap_dataset.py
@@ -22,7 +22,7 @@ class MemMapDataset(Dataset[Dict[str, Any]]):
remainder of the tokens will be ignored.
No special tokens are added to the input IDs so it's assumed that if you want
- EOS tokens between documents, for example, those will already by in the memory-mapped array.
+ EOS tokens between documents, for example, those will already be in the memory-mapped array.
:param paths: Paths to memory-mapped token arrays.
:param chunk_size: The number of tokens to chunk together into a single instance.
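The docstring in this hunk describes fixed-size chunking over a memory-mapped token array, with any trailing remainder ignored and no special tokens inserted. The sketch below illustrates that behavior only; it is not the code in `olmo/data/memmap_dataset.py`. The helper names `num_instances` and `read_chunk`, the `uint16` dtype, the file name, and the use of `0` as an EOS ID are all assumptions made for the example.

```python
import os
import tempfile

import numpy as np


def num_instances(num_tokens: int, chunk_size: int) -> int:
    # Hypothetical helper: only whole chunks count, the remainder is ignored.
    return num_tokens // chunk_size


def read_chunk(path: str, index: int, chunk_size: int, dtype=np.uint16) -> np.ndarray:
    # Hypothetical helper: open the array read-only and copy out one chunk.
    tokens = np.memmap(path, dtype=dtype, mode="r")
    if not 0 <= index < num_instances(len(tokens), chunk_size):
        raise IndexError(index)
    start = index * chunk_size
    # np.array(...) copies the slice so it no longer depends on the memmap.
    return np.array(tokens[start : start + chunk_size])


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tokens.npy.bin")  # assumed file name
    # Ten token IDs; 0 stands in for an EOS token that was written
    # between documents ahead of time (an assumption for this demo).
    np.array([5, 6, 0, 7, 8, 9, 0, 3, 4, 0], dtype=np.uint16).tofile(path)

    chunk_size = 4
    total = num_instances(10, chunk_size)
    chunks = [read_chunk(path, i, chunk_size) for i in range(total)]
```

With ten tokens and a chunk size of four, this yields two instances and leaves the last two tokens unused. The EOS IDs show up in the chunks only because they were already present in the array, which is the assumption the corrected docstring line states.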
