Merge pull request karpathy#120 from nynyg/remove_cpu_pin_mem
Pin memory only when training on GPU
karpathy authored Feb 4, 2023
2 parents 77e7e04 + b8286f3 commit dc14989
Showing 1 changed file with 6 additions and 1 deletion.
train.py (6 additions, 1 deletion)

@@ -113,7 +113,12 @@ def get_batch(split):
     x = torch.stack([torch.from_numpy((data[i:i+block_size]).astype(np.int64)) for i in ix])
     y = torch.stack([torch.from_numpy((data[i+1:i+1+block_size]).astype(np.int64)) for i in ix])
     # pin arrays x,y, which allows us to move them to GPU asynchronously (non_blocking=True)
-    x, y = x.pin_memory().to(device, non_blocking=True), y.pin_memory().to(device, non_blocking=True)
+    if "cuda" in device:
+        # GPU training
+        x, y = x.pin_memory().to(device, non_blocking=True), y.pin_memory().to(device, non_blocking=True)
+    else:
+        # CPU or MPS training
+        x, y = x.to(device), y.to(device)
     return x, y

 # init these up here, can override if init_from='resume' (i.e. from a checkpoint)
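The patched branch can be exercised outside the training loop. Below is a minimal sketch of the same device-aware transfer; the `to_device` helper and the toy `data` buffer are illustrative stand-ins, not part of nanoGPT itself:

```python
import numpy as np
import torch

def to_device(x, y, device):
    # Pinned (page-locked) host memory lets CUDA copy tensors to the GPU
    # asynchronously, which is why the CUDA path passes non_blocking=True.
    if "cuda" in device:
        x = x.pin_memory().to(device, non_blocking=True)
        y = y.pin_memory().to(device, non_blocking=True)
    else:
        # On CPU-only or MPS setups pinning buys nothing (and pin_memory()
        # can fail on builds without CUDA), so do a plain synchronous copy.
        x, y = x.to(device), y.to(device)
    return x, y

# Tiny stand-in for nanoGPT's token buffer: uint16 on disk, int64 for the model
data = np.arange(16, dtype=np.uint16)
x = torch.from_numpy(data[:8].astype(np.int64))
y = torch.from_numpy(data[1:9].astype(np.int64))
xb, yb = to_device(x, y, "cpu")
```

On a CUDA machine the first branch runs unchanged; everywhere else the fallback avoids the pin-memory error this commit fixes.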
