Hi all! I find this interesting, and I would like to participate.
However, it's unclear to me what the "goal" is. I.e., when should we stop the clock?
When we reach a certain training validation loss?
When we reach a certain generation quality, according to some fidelity metric?
Both?
Additionally, when should the clock be running? In the modded-nanogpt speedrun, we only allow the clock to run during training loops, including data fetching between steps, but not during validation. I propose we do the same as modded-nanogpt, make this rule explicit, and log everything to text files.
And IMO, it's best to have initial, downloadable benchmark logs we can compare against.
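To make the proposal concrete, here is a minimal sketch of the timing rule described above. Everything here is hypothetical (the stub functions `fetch_batch`, `train_step`, and `validate` stand in for the real training code, and this is not the actual modded-nanogpt harness): the clock accumulates wall time for training steps and the data fetching between them, but is paused while validation runs, and each validation appends a line to a plain-text log.

```python
import time

# Hypothetical stand-ins for the real training pieces.
def fetch_batch(step):
    return [step] * 4

def train_step(batch):
    return 1.0 / (batch[0] + 1)  # fake training loss

def validate():
    return 0.5  # fake validation loss


class SpeedrunClock:
    """Accumulates wall time only between start() and stop() calls."""
    def __init__(self):
        self.elapsed = 0.0
        self._t0 = None

    def start(self):
        self._t0 = time.perf_counter()

    def stop(self):
        self.elapsed += time.perf_counter() - self._t0
        self._t0 = None


def run(num_steps=5, val_every=2):
    """Run a toy loop; return (timed seconds, text log lines)."""
    clock = SpeedrunClock()
    logs = []
    clock.start()  # clock covers training steps and data fetching
    for step in range(1, num_steps + 1):
        batch = fetch_batch(step)   # timed: data fetching between steps
        loss = train_step(batch)    # timed: the training step itself
        if step % val_every == 0:
            clock.stop()            # untimed: validation is off the clock
            val_loss = validate()
            logs.append(
                f"step {step} train_time {clock.elapsed:.4f}s "
                f"train_loss {loss:.4f} val_loss {val_loss:.4f}"
            )
            clock.start()
    clock.stop()
    return clock.elapsed, logs
```

The `logs` list can then be written to a text file and published as the downloadable baseline, so every entry records the timed training seconds at each validation point.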