Trainer does not save checkpoints every n steps as it should #23

Merged
5 commits merged on Mar 21, 2023

Conversation

albertogaspar
Contributor

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

What is the current behavior?

The trainer is supposed to save a new checkpoint every n steps, but it does not.

What is the new behavior?

The trainer now saves a new checkpoint every n steps.
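
For readers unfamiliar with the intended behavior, here is a minimal sketch of "save a checkpoint every n steps". It is illustrative only and is not the code changed in this PR: the names `train`, `save_every_n_steps`, and `save_checkpoint` are hypothetical, and the training step itself is left as a placeholder.

```python
from typing import Callable


def train(
    total_steps: int,
    save_every_n_steps: int,
    save_checkpoint: Callable[[int], None],
) -> None:
    """Run a placeholder training loop, checkpointing every n steps.

    All names here are hypothetical and chosen for illustration only.
    """
    for step in range(1, total_steps + 1):
        # ... one training step would run here (omitted in this sketch) ...

        # Save when the 1-based step count is a multiple of n.
        # A non-positive n is treated as "checkpointing disabled".
        if save_every_n_steps > 0 and step % save_every_n_steps == 0:
            save_checkpoint(step)


# Usage: record the steps at which a checkpoint would be written.
saved_steps = []
train(total_steps=10, save_every_n_steps=3, save_checkpoint=saved_steps.append)
print(saved_steps)  # [3, 6, 9]
```

Counting steps from 1 keeps a checkpoint from being written before any training has happened, which is a common off-by-one pitfall in this kind of loop.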

Other information

@gianlucadetommaso changed the title from "Trainer does not save checkpoints every n stesps as it should" to "Trainer does not save checkpoints every n steps as it should" on Mar 20, 2023
@gianlucadetommaso merged commit ac9a79f into awslabs:main on Mar 21, 2023
@albertogaspar deleted the fix_ckptr branch on April 16, 2023 14:56
2 participants