Insights: HabanaAI/vllm-hpu-extension
Overview
- 0 Active issues
- 8 Merged pull requests
- 2 Open pull requests
- 0 Closed issues
- 0 New issues
8 Pull requests merged by 6 people
- Return dummy value as build for fake-hpu (#82, merged Jan 22, 2025)
- [SW-199650] Add HPU fp8 DynamicMOE Op (#81, merged Jan 22, 2025)
- [SW-216413] Remove shutdown_inc call in calibration process (#80, merged Jan 22, 2025)
- Fix condition to compile_one_hot flag (#79, merged Jan 21, 2025)
- Add version range to fix issues with one_hot on t.compile on 1.19 builds (#78, merged Jan 21, 2025)
- Capabilities overhaul (#76, merged Jan 20, 2025)
- Remove repeat KV cache (#69, merged Jan 20, 2025)
- Add error handling in unify measurements (#77, merged Jan 20, 2025)
2 Pull requests opened by 2 people
- Create CODEOWNERS (#83, opened Jan 22, 2025)
- Add change for interleave sliding window (#84, opened Jan 22, 2025)
3 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [H3C] quantization step 4 failed with Llama 3.1 70B on TP2 config (#74, commented on Jan 20, 2025 • 0 new comments)
- vLLM-Ext: Full enabling of ALiBi (#60, commented on Jan 22, 2025 • 0 new comments)
- Add exponential bucketing PoC (#61, commented on Jan 20, 2025 • 0 new comments)