Insights: intel/auto-round
Overview
5 Pull requests merged by 3 people
- [HPU] Fix compile bug when quantizing layers (#441, merged Feb 14, 2025)
- Align auto_quantizer with main branch in Transformers (#437, merged Feb 14, 2025)
- Remove gc collect in packing (#438, merged Feb 13, 2025)
- Fix nblocks issues (#432, merged Feb 11, 2025)
- Fix packing hang, torch compile, and force to fp16 at exporting (#430, merged Feb 10, 2025)
1 Pull request opened by 1 person
- Bump transformers from 4.41.0 to 4.48.0 in /examples/multimodal-modeling/Phi-3-vision (#433, opened Feb 11, 2025)
4 Issues closed by 2 people
- Problem with inference of LLAMA-3.3 70B Instruct (#436, closed Feb 13, 2025)
- Problem with inference of Deepseek V3 (#431, closed Feb 12, 2025)
- Force to fp16 when evaluating int4 models on CUDA (#427, closed Feb 10, 2025)
- Packing hang (#429, closed Feb 10, 2025)
4 Issues opened by 2 people
- Reduce the CPU usage at the packing stage for DeepSeek-V3 (#440, opened Feb 14, 2025)
- LLAMA 3.2 Vision 90B loads on A40 GPU, then fails (#439, opened Feb 13, 2025)
- Align auto_quantizer with main branch in Transformers (#435, opened Feb 12, 2025)
- Support q2-k to q4-k (#434, opened Feb 12, 2025)