Tags: subhankar-ghosh/NeMo
Support TE-DPA For Stable Diffusion (NVIDIA#10314)
  * [SD] Add te-dpa support
    Signed-off-by: Wil Kong <[email protected]>
  * [SD] Add te-dpa support, resolve compatibility with TE-master
    Signed-off-by: Wil Kong <[email protected]>
  * [SD] Add te-dpa support, add check for attention configs.
    Signed-off-by: Wil Kong <[email protected]>
  * Fix bugs of flash-attn and dpa in SD.
    Signed-off-by: Wil Kong <[email protected]>
  * Fix the issue of DPA API change.
    Signed-off-by: Wil Kong <[email protected]>
  * Apply isort and black reformatting
    Signed-off-by: alpha0422 <[email protected]>
    Signed-off-by: Wil Kong <[email protected]>
  * [SD] TE-DPA: disable use of te-dpa in inference flow.
  ---------
  Signed-off-by: Wil Kong <[email protected]>
  Signed-off-by: alpha0422 <[email protected]>
  Co-authored-by: Mengdi Wang <[email protected]>
add manifest file (NVIDIA#10161) Signed-off-by: Oliver Koenig <[email protected]>
Add option for mutex timeout in distributed optimizer backward hook (NVIDIA#9087)
  * Tim: Add option for timeout in distopt callback mutex
    Signed-off-by: Jaemin Choi <[email protected]>
  * Replace parent's _lock
    Signed-off-by: Jaemin Choi <[email protected]>
  * Revert "Replace parent's _lock"
    This reverts commit 972d1b6.
    Signed-off-by: Jaemin Choi <[email protected]>
  * Raise RuntimeError when timeout
    Signed-off-by: Jaemin Choi <[email protected]>
  * Change RuntimeError to print
    Signed-off-by: Jaemin Choi <[email protected]>
  ---------
  Signed-off-by: Jaemin Choi <[email protected]>
  Co-authored-by: Jaemin Choi <[email protected]>
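  The commit list above describes a mutex acquired with an optional timeout, where a timeout is reported with a print rather than a RuntimeError. A minimal sketch of that pattern using only the Python standard library follows. The function name run_with_lock, its parameters, and the choice to skip the callback after a timeout are illustrative assumptions, not the actual NeMo or Apex code.

```python
import threading


def run_with_lock(lock, callback, timeout=None):
    """Run callback while holding lock, giving up after timeout seconds.

    Hypothetical helper: name, signature, and skip-on-timeout behavior
    are assumptions made for illustration.
    """
    # threading.Lock.acquire uses timeout=-1 to mean "wait indefinitely".
    acquired = lock.acquire(timeout=-1 if timeout is None else timeout)
    if not acquired:
        # Report the timeout with a message instead of raising.
        print(f"Timed out after {timeout}s waiting for the callback mutex")
        return False
    try:
        callback()
    finally:
        # Always release, even if the callback raises.
        lock.release()
    return True
```

  With timeout=None the helper blocks until the lock is free, which matches the behavior of a plain lock without the new option.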
update github raw content link (NVIDIA#8517) Signed-off-by: Chen Cui <[email protected]>
Update Apex install command in Dockerfile (NVIDIA#7794)
  * move core install to /workspace (NVIDIA#7706)
    Signed-off-by: Abhinav Khattar <[email protected]>
  * update apex install in dockerfile
    Signed-off-by: eharper <[email protected]>
  * use fetch head
    Signed-off-by: eharper <[email protected]>
  ---------
  Signed-off-by: Abhinav Khattar <[email protected]>
  Signed-off-by: eharper <[email protected]>
  Co-authored-by: Abhinav Khattar <[email protected]>
Eagerly accumulate embedding grads into fp32 buffer (NVIDIA#6958) Signed-off-by: Tim Moon <[email protected]>