Added Support for Apple Silicon #1289
base: main
Conversation
- No gguf support yet.
- Build Triton and bitsandbytes from source.
- `cmake -DCOMPUTE_BACKEND=mps -S .` for the bitsandbytes build.
Is this working?
Hi there, thank you for this. We will need a bit more time to review! :)
Hi @shashikanth-a, thank you for this. Could you please provide information about the environment and package versions you used for development?
Hey, does this work with the newly released vision support?
Currently I can run this if:
- lazy loading of model
- minor refactoring
- optimizers and lr schedulers
- gc
- should improve memory consumption
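The "lazy loading" and "gc" items above could be sketched as follows. This is a minimal illustration, not the PR's actual code; the `LazyModel` class and its stand-in loader are hypothetical.

```python
import gc

class LazyModel:
    """Defer loading model weights until first use, then free them explicitly."""

    def __init__(self, loader):
        self._loader = loader      # zero-arg callable that returns the weights
        self._weights = None       # nothing loaded yet

    @property
    def weights(self):
        if self._weights is None:  # load on first access only
            self._weights = self._loader()
        return self._weights

    def unload(self):
        """Drop the weights and force a garbage-collection pass."""
        self._weights = None
        gc.collect()

# Usage with a stand-in loader (a real one would read tensors from disk):
model = LazyModel(lambda: {"layer0": [0.0] * 1024})
_ = model.weights              # triggers the actual load
model.unload()                 # releases memory between phases
```

Deferring the load keeps peak memory lower when the model is only needed for part of the run, which matches the "should improve memory consumption" note.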
With the changes I can run this out of the box with the steps outlined above:
On an M4 Pro I'm getting around 100 t/s for llama3-8b. Can confirm it will also now work with llama-3.2-3b.
Thanks a lot - would anyone be so kind as to benchmark this against MLX itself and share results? Time taken, amount of VRAM, context length, and whether the losses match - of course that's a lot, so just timing and checking whether the losses match would be more than helpful. Thank you so much! :)
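The loss check requested above can be as simple as comparing the two runs' logged loss values element-wise. A sketch (the function name and tolerance are assumptions, not part of either codebase):

```python
def losses_match(a, b, tol=1e-3):
    """Return True if two loss curves have the same length and agree within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

# Example: compare loss logs from an MPS run and an MLX run of the same recipe.
mps_losses = [2.31, 1.87, 1.52]
mlx_losses = [2.31, 1.87, 1.53]
print(losses_match(mps_losses, mlx_losses, tol=0.02))
```

Exact equality is unrealistic across backends, so a small tolerance per step is the practical check.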
Use `cmake -DCOMPUTE_BACKEND=mps -S .` for the bitsandbytes build.
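Putting the thread's build hints together, a small backend-selection helper might look like this. The machine-type-to-backend mapping is an assumption based on the comments above (`mps` on Apple Silicon, defaulting to `cuda` elsewhere); this only prints the configure command rather than running a full build.

```shell
#!/bin/sh
# Sketch: choose the bitsandbytes COMPUTE_BACKEND for the current machine.
pick_backend() {
    case "$1" in
        arm64-Darwin) echo mps ;;   # Apple Silicon macOS
        *)            echo cuda ;;  # default: NVIDIA (assumption)
    esac
}

BACKEND=$(pick_backend "$(uname -m)-$(uname -s)")
# Configure bitsandbytes from a source checkout with the chosen backend:
echo "cmake -DCOMPUTE_BACKEND=${BACKEND} -S ."
```

Run inside a bitsandbytes source checkout, replacing `echo` with the real `cmake` invocation once the printed command looks right.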