The checkpoint is stored in float32, not TF32. It should be fine to convert it to bfloat16, but a few parameters should ideally stay in float32 (one pos_frequencies tensor, and q_norm_x.weight, q_norm_y.weight, k_norm_x.weight, k_norm_y.weight in each block).
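Since float32 uses 4 bytes per parameter, a ~10B-parameter model stored in float32 comes to roughly 10e9 × 4 B ≈ 40 GB, which matches the size of dit.safetensors. Below is a minimal sketch of how such a conversion could look, assuming a single safetensors checkpoint file; the file names (dit.safetensors / dit_bf16.safetensors) and the substring-based matching on parameter names are assumptions, not the repo's actual conversion script:

```python
# Sketch: convert a float32 checkpoint to bfloat16 while keeping the
# sensitive parameters listed above in float32. File names and the
# substring-based name matching are assumptions.
import torch
from safetensors.torch import load_file, save_file

KEEP_FP32 = (
    "pos_frequencies",
    "q_norm_x.weight", "q_norm_y.weight",
    "k_norm_x.weight", "k_norm_y.weight",
)

state = load_file("dit.safetensors")  # tensors load in their stored dtype (float32)

converted = {
    name: t if any(k in name for k in KEEP_FP32) else t.to(torch.bfloat16)
    for name, t in state.items()
}

save_file(converted, "dit_bf16.safetensors")
```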
Hi,
I'd like to ask two questions:
1. The checkpoint is stored in float32, so why is inference run in bfloat16?
2. How can I run in float32? I changed model_dtype="bf16" to model_dtype="fp32" in cli.py, but it fails with: assert self.kwargs["model_dtype"] == "bf16", "FP8 is not supported for multi-GPU inference"
Hi,
The diffusion model has 10B parameters, but I found that dit.safetensors is 40GB in size. What dtype is the model stored in? TF32?
Looking forward to your feedback, thanks.