Insights: intel/auto-round
Overview
5 Pull requests merged by 3 people
- [HPU] Fix compile bug when quantizing layers (#441, merged Feb 14, 2025)
- Align auto_quantizer with main branch in Transformers (#437, merged Feb 14, 2025)
- Remove gc collect in packing (#438, merged Feb 13, 2025)
- Fix nblocks issues (#432, merged Feb 11, 2025)
- Fix packing hang, torch compile, and force to fp16 at exporting (#430, merged Feb 10, 2025)
1 Pull request opened by 1 person
- Bump transformers from 4.41.0 to 4.48.0 in /examples/multimodal-modeling/Phi-3-vision (#433, opened Feb 11, 2025)
4 Issues closed by 2 people
- Problem with inference of LLAMA-3.3 70B Instruct (#436, closed Feb 13, 2025)
- Problem with inference of Deepseek V3 (#431, closed Feb 12, 2025)
- Force to fp16 when evaluating int4 models on CUDA (#427, closed Feb 10, 2025)
- Packing hang (#429, closed Feb 10, 2025)
4 Issues opened by 2 people
- Reduce the CPU usage at the packing stage for DeepSeek-V3 (#440, opened Feb 14, 2025)
- LLAMA 3.2 Vision 90B loads on A40 GPU, then fails (#439, opened Feb 13, 2025)
- Align auto_quantizer with main branch in Transformers (#435, opened Feb 12, 2025)
- Support q2-k to q4-k (#434, opened Feb 12, 2025)