Because the draft phase uses top-k sampling, once a previous draft token is rejected, the sampling probability of the current draft token is 1. This still satisfies the theoretical guarantees of speculative sampling, so the method remains lossless while being more efficient. You can also refer to #50, which includes a validation that the output distribution is unchanged.
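To make the argument above concrete, here is a minimal sketch of sequential verification with a draft probability of 1. It is not the repository's code: the function names, the toy distribution, and the candidate order are assumptions chosen for illustration. Each candidate x is accepted with probability p(x) / 1; on rejection, p(x) is set to zero and the target distribution is renormalized before the next candidate is tried. If every candidate is rejected, a token is sampled from what remains.

```python
import random


def verify_candidates(p, candidates, rng):
    """Sample one token by checking draft candidates in order (hypothetical helper).

    p          -- target-model probabilities, indexed by token id
    candidates -- distinct draft token ids, each treated as proposed with q = 1
    """
    cur = list(p)
    for x in candidates:
        # Acceptance probability is p(x) / q(x) with q(x) = 1.
        if rng.random() < cur[x]:
            return x
        # Rejected: remove x from the target distribution and renormalize.
        cur[x] = 0.0
        total = sum(cur)
        cur = [v / total for v in cur]
    # All candidates rejected: sample from the residual distribution.
    return rng.choices(range(len(cur)), weights=cur)[0]


def exact_output_distribution(p, candidates):
    """Closed-form distribution of the token returned by verify_candidates."""
    cur = list(p)
    out = [0.0] * len(p)
    reach = 1.0  # probability that every earlier candidate was rejected
    for x in candidates:
        out[x] += reach * cur[x]
        reach *= 1.0 - cur[x]
        cur[x] = 0.0
        total = sum(cur)
        if total == 0.0:
            return out  # x carried all remaining mass, nothing is left over
        cur = [v / total for v in cur]
    for tok, v in enumerate(cur):
        out[tok] += reach * v
    return out


if __name__ == "__main__":
    target = [0.5, 0.3, 0.15, 0.05]  # toy target distribution (assumed)
    drafts = [1, 0]                  # toy candidate order (assumed)
    print(exact_output_distribution(target, drafts))
```

Under these assumptions the exact output distribution equals the target distribution regardless of which candidates are proposed or in what order, which is the losslessness property the reply refers to.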
Thank you for the detailed explanation! Since the speculative decoding method in this implementation differs from the approach outlined in the paper, would it be possible for you to help me correct the following algorithm diagram? Do I understand the current implementation correctly?
This would help clarify the modifications and make it easier for everyone to understand the design choices.
Hi team,
Thanks for the great work!
I have a question about the evaluate_posterior function, especially about this line:
EAGLE/eagle/model/utils.py, line 387 in 35c78f6
Although it seems like a stricter version, it doesn't align with the original speculative sampling method. May I ask why?