Stars
3 stars written in Jupyter Notebook
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads