@bintang-aswam bintang-aswam commented Jul 10, 2025

Manually zero the gradients after updating the weights by using the machine epsilon of the standard 64-bit float.

Fixes #ISSUE_NUMBER

Description

Checklist

  • The issue that is being fixed is referred to in the description (see above "Fixes #ISSUE_NUMBER")
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request.

cc @albanD @jbschlosser

pytorch-bot bot commented Jul 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3447

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5f1fbf5 with merge base b78fc75:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@svekars svekars added the core label (Tutorials of any level of difficulty related to the core pytorch functionality) Jul 18, 2025
Contributor

@albanD albanD left a comment

Would you be able to share details on why you want to make this change?
Setting them to None is equivalent to setting them to exactly 0, which is best, no?
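
The equivalence mentioned in this comment can be checked with a small sketch. The tensor values and the helper names (`grad_after_reset`, `set_to_none`, `zero_in_place`) are illustrative assumptions, not code from the tutorial or the PR; the point is only that the gradient seen after the next backward pass is the same under both reset strategies.

```python
import torch

def grad_after_reset(reset):
    """Run two backward passes on one leaf tensor, clearing the gradient
    in between with the given strategy, and return the final gradient."""
    w = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64, requires_grad=True)
    (w ** 2).sum().backward()   # first pass: w.grad = 2 * w
    reset(w)                    # clear what was accumulated
    (3 * w).sum().backward()    # second pass: contributes 3 to every entry
    return w.grad.clone()

def set_to_none(w):
    w.grad = None               # autograd allocates a fresh grad on next backward

def zero_in_place(w):
    w.grad.zero_()              # keeps the buffer, fills it with exact zeros

g_none = grad_after_reset(set_to_none)
g_zero = grad_after_reset(zero_in_place)
print(torch.equal(g_none, g_zero))
```

In both cases the second backward pass starts from an exact zero, so nothing from the first pass leaks into the result.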

@bintang-aswam
Author

Would you be able to share details on why you want to make this change? Setting them to None is equivalent to setting them to exactly 0, which is best, no?

Answer:
@albanD, in other words, setting weight.grad = None is equivalent to calling weight.grad.zero_(). However, in terms of numerical zeroing, there are a few conceptual differences between zeroing the gradients and tensor multiplication of the loss by machine epsilon, as follows:
image
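
One numerical difference can be sketched as follows, under a loudly stated assumption: the sketch reads the proposal as multiplying the stored gradient by the float64 machine epsilon instead of clearing it. That reading, and the tensor values, are assumptions for illustration and may not match the PR's actual diff.

```python
import torch

eps = torch.finfo(torch.float64).eps    # 2.220446049250313e-16

w = torch.tensor([1.0, 2.0], dtype=torch.float64, requires_grad=True)
(w ** 2).sum().backward()               # w.grad = [2., 4.]

with torch.no_grad():
    w.grad *= eps                       # assumed reading of the proposal

residual = w.grad.clone()               # tiny, but not exactly zero
is_exact_zero = bool((residual == 0).all())
print(residual, is_exact_zero)

(3 * w).sum().backward()                # next pass accumulates onto the residual
print(w.grad - 3.0)                     # what is left over from the first pass
```

Unlike `w.grad = None` or `w.grad.zero_()`, the scaled gradient keeps a nonzero remainder that the next backward pass adds to.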

@albanD
Contributor

albanD commented Jul 28, 2025

Thanks for sharing these details.
I'm still not sure why this particular tutorial should be updated.
Maybe you want to do momentum-like updates here? If so, we could add a momentum term there.
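
One possible reading of the momentum remark, sketched under assumptions: if the stored gradient is scaled by a factor `mu` instead of being cleared, the next backward pass accumulates onto it, which gives the same update as SGD with momentum `mu`. The loss, learning rate, and function names below are hypothetical and chosen only for illustration.

```python
import torch

def manual_scaled_grad(mu, lr=0.1, steps=5):
    """Plain gradient descent, but keep mu * grad between steps instead of clearing."""
    w = torch.tensor([1.0, -2.0], dtype=torch.float64, requires_grad=True)
    for _ in range(steps):
        loss = ((w - 3.0) ** 2).sum()
        loss.backward()             # accumulates onto the retained gradient
        with torch.no_grad():
            w -= lr * w.grad        # update with the accumulated gradient
            w.grad *= mu            # retain a fraction rather than zeroing
    return w.detach().clone()

def sgd_with_momentum(mu, lr=0.1, steps=5):
    """Reference: torch.optim.SGD with an explicit momentum term."""
    w = torch.tensor([1.0, -2.0], dtype=torch.float64, requires_grad=True)
    opt = torch.optim.SGD([w], lr=lr, momentum=mu)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((w - 3.0) ** 2).sum()
        loss.backward()
        opt.step()
    return w.detach().clone()

same = torch.allclose(manual_scaled_grad(0.9), sgd_with_momentum(0.9))
print(same)
```

With `mu` set to machine epsilon the retained fraction is negligible, so an explicit momentum term with a deliberately chosen coefficient states the intent more clearly.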

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as stale.
Feel free to remove the stale label if you feel this was a mistake.
If you are unable to remove the stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the stale label (Stale PRs) Sep 27, 2025
Labels
cla signed; core (Tutorials of any level of difficulty related to the core pytorch functionality); stale (Stale PRs)
4 participants