Tags: pytorch/ao

ciflow/rocm/1868

Update ROCm MFMA instruction syntax in sparse Marlin MMA implementation

Modify the MFMA instruction assembly for AMD GPUs to use correct syntax and operand handling. Replace register constraints with vector register constraints and simplify the instruction format to improve compatibility and readability on ROCm platforms.
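For readers unfamiliar with the constraint change, here is a minimal, hypothetical sketch of an MFMA issued through inline assembly with vector-register ("v") operand constraints under ROCm. The instruction variant (v_mfma_f32_16x16x16f16), the fragment types, and the helper name are illustrative assumptions, not the actual sparse Marlin kernel code, and the VGPR-based form assumes a CDNA2-class target (e.g. gfx90a).

```cpp
// Hypothetical sketch, not the torchao sparse Marlin kernel: one 16x16x16 f16 MFMA
// tile issued with vector-register ("v") inline-asm constraints. Assumes a CDNA2+
// target (gfx90a-style mnemonic) where MFMA operands may live in ordinary VGPRs.
#include <hip/hip_runtime.h>

typedef float    float4_t __attribute__((ext_vector_type(4)));  // 4 x f32 accumulator fragment
typedef _Float16 half4_t  __attribute__((ext_vector_type(4)));  // 4 x f16 input fragment

__device__ inline void mfma_16x16x16_f16(const half4_t& a, const half4_t& b, float4_t& acc) {
  // "+v"/"v" ask the compiler to place each fragment in vector registers, which is
  // what the matrix instruction expects; the accumulator is read and written in place.
  asm volatile("v_mfma_f32_16x16x16f16 %0, %1, %2, %0\n\t"
               : "+v"(acc)
               : "v"(a), "v"(b));
}
```

On recent ROCm toolchains the same operation is also exposed as a compiler builtin (__builtin_amdgcn_mfma_f32_16x16x16f16), which sidesteps hand-written constraints entirely; the inline-asm form above simply mirrors the kind of operand handling the commit message describes.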

ciflow/rocm/1847

wip

v0.9.0

formatting config.py

v0.9.0-rc1

formatting config.py

v0.8.0

Verify that submodules are checked out (#1536)

v0.8.0-rc3

Revert "Remove setup changes"

This reverts commit fbe7ac2.

v0.8.0-rc2

Verify that submodules are checked out (#1536)

v0.8.0-rc1

Check binaries for release/0.8

v0.7.0-rc3

update test-infra to release version (#1391)

* update test-infra to release version

Summary:

pytorch/test-infra#6016 landed recently and is breaking our ROCm builds.

We point to a special branch of test-infra, created just before that PR, to unblock the v0.7.0 release.

Test Plan: CI

* Update .github/workflows/build_wheels_linux.yml

---------

Co-authored-by: Andrey Talman <[email protected]>

v0.7.0

Add TTFT benchmarks + update sparsity benchmarks (#1140)

This PR adds time-to-first-token (TTFT) benchmarks to torchao, and updates the benchmarking script to handle sparsity more cleanly and to use the 2:4 sparse checkpoints that are available.

It also adds padding support for int8 dynamic quantization + 2:4 sparsity, which was previously missing.
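To make the padding point concrete, here is a rough sketch of the general idea under stated assumptions: the helper name, the required row multiple (32), and the use of torch::mm as a stand-in for the quantized 2:4-sparse kernel are all illustrative, not the actual torchao implementation.

```cpp
// Hypothetical sketch, not the torchao code: zero-pad the token dimension of the
// activations so a fixed-tile 2:4-sparse int8 GEMM can run, then slice the padding
// back off the output. The alignment value (32) is an assumed requirement.
#include <torch/torch.h>

torch::Tensor sparse_mm_with_row_padding(const torch::Tensor& x,      // [m, k] activations
                                         const torch::Tensor& w_t) {  // [k, n] weight (sparse handling elided)
  constexpr int64_t kRowMultiple = 32;  // assumed kernel alignment requirement
  const int64_t m = x.size(0);
  const int64_t pad = (kRowMultiple - m % kRowMultiple) % kRowMultiple;

  // Pad spec for a 2-D tensor is {last-dim left, last-dim right, dim0 front, dim0 back}.
  torch::Tensor x_padded =
      pad > 0 ? torch::constant_pad_nd(x, {0, 0, 0, pad}, /*value=*/0) : x;

  // Placeholder for the int8-dynamic-quant + 2:4-sparse kernel call.
  torch::Tensor out = torch::mm(x_padded, w_t);

  // Drop the padded rows so callers see the original batch size [m, n].
  return out.narrow(/*dim=*/0, /*start=*/0, /*length=*/m);
}
```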