Exploring new optimī optimizers #819
-
Oh cool! That's a pretty nice test. Makes me think it could be useful for jamming more creativity into the model.
-
I added Lion to the mix: https://huggingface.co/mikaelh/flux-sanna-marin-v0.4-fp8-lion

The challenge with Lion is that it converges extremely fast. I ended up dropping the learning rate by 10x, increasing weight decay by 10x, increasing warmup steps to 400, and switching to a polynomial schedule. The final validation image at 6000 steps is looking good.
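Roughly what those Lion settings look like in code. This is a minimal sketch, not the actual trainer config: it assumes optimī's Lion, an AdamW baseline learning rate of 4e-4 (implied elsewhere in this thread by Adan's 1e-3 being 2.5x higher), a baseline weight decay of 0.1, and plain PyTorch schedulers standing in for whatever the trainer uses.

```python
import torch
from optimi import Lion

model = torch.nn.Linear(8, 8)  # stand-in for the LoRA parameters

optimizer = Lion(
    model.parameters(),
    lr=4e-5,           # assumed AdamW baseline of 4e-4, dropped by 10x
    weight_decay=1.0,  # assumed baseline of 0.1, increased by 10x
)

# 400 warmup steps, then polynomial decay over the remaining 5600 of 6000 steps
warmup = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.01, total_iters=400)
decay = torch.optim.lr_scheduler.PolynomialLR(optimizer, total_iters=5600, power=1.0)
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, decay], milestones=[400]
)
```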
I did a little 3-way prompt comprehension test. AdamW seems to be the winner, with Lion also performing well. Adan is the worst, so maybe I need to try adjusting some parameters. The prompts:

sanna marin playing tennis
sanna marin running a marathon
sanna marin sitting on a plane
sanna marin opening a can in the kitchen
sanna marin petting a cat
-
I managed to improve results with Adan by setting beta3 to 0.999 (instead of 0.99); a sketch of the change follows the prompt list below. Adan is still producing results that are different from AdamW and Lion with the same seed. https://huggingface.co/mikaelh/flux-sanna-marin-v0.4-fp8-adan2

I added three new prompts to the comparison:

sanna marin playing tennis
sanna marin running a marathon
sanna marin sitting on a plane
sanna marin opening a can in the kitchen
sanna marin petting a cat
sanna marin catching a red ball
sanna marin picking up a strawberry
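For reference, the Adan tweak in code. A minimal sketch, assuming optimī's Adan takes a three-element betas tuple with a default of (0.98, 0.92, 0.99); everything else is carried over from the earlier run.

```python
import torch
from optimi import Adan

model = torch.nn.Linear(8, 8)  # stand-in for the LoRA parameters

optimizer = Adan(
    model.parameters(),
    lr=1e-3,                    # same learning rate as the earlier Adan run
    betas=(0.98, 0.92, 0.999),  # beta3 raised from the 0.99 default to 0.999
)
```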
-
I've been playing around with the new optimī optimizers. One of the new optimizers is Adan, which is supposed to be superior to AdamW while using more VRAM:
https://optimi.benjaminwarner.dev/optimizers/adan/
The Adan optimizer requires a higher learning rate. I ended up setting it to 1e-3, which is 2.5 times higher than with AdamW. As expected, Adan is clearly converging faster. Given how slow Flux training is, Adan seems worth considering.
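As a rough illustration of the difference, a minimal sketch (not the actual trainer code; the optimī imports and the tiny model are stand-ins, and the 4e-4 AdamW rate is just 1e-3 divided by 2.5):

```python
import torch
from optimi import AdamW, Adan

model = torch.nn.Linear(8, 8)  # stand-in for the LoRA parameters

# AdamW baseline: 1e-3 / 2.5 = 4e-4
adamw = AdamW(model.parameters(), lr=4e-4)

# Adan wants a higher learning rate; it also keeps extra state, hence more VRAM
adan = Adan(model.parameters(), lr=1e-3)
```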
The LoRAs are available on the Hugging Face Hub:
https://huggingface.co/mikaelh/flux-sanna-marin-v0.4-fp8-adamw-stochastic
https://huggingface.co/mikaelh/flux-sanna-marin-v0.4-fp8-adan
The exact config.env settings can also be found on the Hub.