Tags · pytorch/ao

ciflow/rocm/1868

Update ROCm MFMA instruction syntax in sparse Marlin MMA implementation

Modify the MFMA instruction assembly for AMD GPUs to use correct syntax and operand handling. Replace register constraints with vector register constraints and simplify the instruction format to improve compatibility and readability on ROCm platforms.

Mar 11, 2025
3e5a411

ciflow/rocm/1847

wip

Mar 6, 2025
f580c78

v0.9.0

Formatting config.py

Feb 20, 2025
14cfbc7

v0.9.0-rc1

Formatting config.py

Feb 20, 2025
14cfbc7

v0.8.0

Verify that submodules are checked out (#1536)

Jan 15, 2025
192eed5

v0.8.0-rc3

Revert "Remove setup changes"

This reverts commit fbe7ac2.

Jan 15, 2025
9bbd9a1

v0.8.0-rc2

Verify that submodules are checked out (#1536)

Jan 15, 2025
192eed5

v0.8.0-rc1

Check binaries for release/0.8

Jan 13, 2025
a14f749

v0.7.0-rc3

update test-infra to release version (#1391)

* update test-infra to release version

Summary:

pytorch/test-infra#6016 landed recently and is breaking our ROCm builds.

We point to a special branch of test-infra created just before this PR to unblock the v0.7.0 release.

Test Plan: CI

* Update .github/workflows/build_wheels_linux.yml

---------

Co-authored-by: Andrey Talman

Dec 9, 2024
e39126a

v0.7.0

Add TTFT benchmarks + update sparsity benchmarks (#1140)

This PR adds TTFT (time-to-first-token) benchmarks to torchAO and updates the benchmarking script to handle sparsity more cleanly and to use the available 2:4 sparse checkpoints.

It also adds padding support for int8 dynamic quantization + 2:4 sparsity, which was previously missing.

Dec 5, 2024
f04aec7
