Insights: pytorch/ao
Overview
18 Pull requests merged by 11 people
-
Add a triton kernel for swizzling
#2168 merged
May 9, 2025 -
Enabling MOE Quantization using linear decomposition
#2043 merged
May 8, 2025 -
Remove broken test
#2188 merged
May 8, 2025 -
Add serialization support for AOPerModuleConfig
#2186 merged
May 8, 2025 -
Generate speedup for inference
#2151 merged
May 7, 2025 -
Fix cuda compile error with bf16
#2122 merged
May 7, 2025 -
[BE] Fix MPS experimental workflow
#2181 merged
May 7, 2025 -
Bump version to 0.12.0
#2178 merged
May 6, 2025 -
Fix linux cpu builds. Resolves nightly build for mac stops on 0422
#2170 merged
May 6, 2025 -
[reland] Fixing aliasing behavior for slice in AQT int4wo layout
#2176 merged
May 6, 2025 -
Revert "Fixing aliasing behavior for slice in AQT TensorCoreTiledLayout"
#2175 merged
May 6, 2025 -
Fixing aliasing behavior for slice in AQT TensorCoreTiledLayout
#2174 merged
May 6, 2025 -
Update ruff version in dev-requirements to match CI
#2172 merged
May 5, 2025 -
Remove fix not needed anymore after moving CUTLASS pin to v3.9.0
#2160 merged
May 3, 2025 -
Update QAT README.md
#2162 merged
May 2, 2025 -
Removes pinned version for pytest
#2158 merged
May 2, 2025 -
[MX] Remove mxfp8 kernel and rely on cublas
#2130 merged
May 2, 2025 -
Uses torch.version.cuda to compile CUDA extensions
#2163 merged
May 2, 2025
10 Pull requests opened by 10 people
-
Update utils_parallel_dequant.cuh
#2164 opened
May 2, 2025 -
metal lowbit kernels: qmv_fast optimization
#2167 opened
May 3, 2025 -
Add support for KleidiAI int4 kernels on aarch64 Linux
#2169 opened
May 4, 2025 -
tensor scaling added
#2171 opened
May 5, 2025 -
[PT2E] Fix per-tensor observer issue with varying shape & rank
#2177 opened
May 6, 2025 -
Eval hf models using lm_eval
#2179 opened
May 6, 2025 -
Set eps in end-to-end QAT flow
#2180 opened
May 6, 2025 -
[Do not Land] Re-land "Add INT8 SDPA path for CPU" (#2093)
#2183 opened
May 7, 2025 -
[Not for land] remove workaround for slow rowwise cutlass gemm
#2185 opened
May 8, 2025 -
[ONLY FOR TEST] test macos whl issue
#2187 opened
May 8, 2025
5 Issues closed by 4 people
-
nightly build for mac stops on 0422
#2157 closed
May 6, 2025 -
Torchao's CPU overhead counteracts the performance benefit of using quantization kernel.
#1930 closed
May 6, 2025 -
QAT docs
#2155 closed
May 2, 2025 -
[Doc] gemlite version
#1653 closed
May 2, 2025
2 Issues opened by 1 person
-
[float8] Support power of 2 scales with PerRow scales for inference
#2182 opened
May 7, 2025
17 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add subclass based method for inference w/ MXFP8
#2132 commented on
May 8, 2025 • 9 new comments -
Support microbenchmarking for low precision training
#2101 commented on
May 8, 2025 • 5 new comments -
Remove preserve_zero and zero_point_domain from choose_qparams_affine
#2149 commented on
May 5, 2025 • 3 new comments -
Support INT8 SDPA template for CPU
#2148 commented on
May 7, 2025 • 1 new comment -
Arm_inductor_quantizer for Pt2e quantization
#2139 commented on
May 9, 2025 • 1 new comment -
[testing]Triaging ROCm wheel build
#2161 commented on
May 9, 2025 • 0 new comments -
Implement dtensor.shard_dim_alltoall, aten.contiguous, aten.chunk
#2154 commented on
May 5, 2025 • 0 new comments -
[PT2E][X86] Migrate fusion passes in Inductor to torchao
#2140 commented on
May 7, 2025 • 0 new comments -
[CPU] enable int8_dynamic_activation_int4_weight with Int4CPULayout
#2128 commented on
May 7, 2025 • 0 new comments -
Enhance test_autoquant_compile to support ROCm
#2100 commented on
May 9, 2025 • 0 new comments -
ROCm mx-fp8 Gemm
#2066 commented on
May 6, 2025 • 0 new comments -
Feat: Implementation of the DeepSeek blockwise quantization for fp8 tensors
#1763 commented on
May 8, 2025 • 0 new comments -
MX single node performance tracker
#1768 commented on
May 8, 2025 • 0 new comments -
[PT2E] observers do not handle inputs with different shapes correctly
#2112 commented on
May 8, 2025 • 0 new comments -
KleidiAI int4 kernels not loading properly on aarch64 Linux
#2143 commented on
May 5, 2025 • 0 new comments -
QAT model drops accuracy after converting with torch.ao.quantization.convert
#2138 commented on
May 5, 2025 • 0 new comments -
[tracker] Low precision training for MoEs
#2147 commented on
May 3, 2025 • 0 new comments