[Question]: How does the token-level question-aware compression work? #141

Closed
acnagle opened this issue Apr 30, 2024 · 1 comment

acnagle commented Apr 30, 2024

Describe the issue

I saw #103 asked a similar question, but I'm not sure I understand how this works with respect to Equation 5 from the first LLMLingua paper. If I have a query with condition_in_question=after and condition_compare=True, then my understanding is that the probability of a compressed segment will not actually be conditioned on the query, since the context (the thing we are compressing) comes before the query. I know this is probably not actually a problem in the code implementation, but I don't fully understand the implementation, and I'm trying to connect what the paper says with what the code is doing.

I see that Equation 3 in the LongLLMLingua paper uses contrastive perplexity to score each token, but are we still first segmenting the context and then pruning tokens from the context in a similar way to the original LLMLingua paper? Based on Equation 3, I'm confused about how to properly condition on the query. So, regardless of what condition_in_question is, do we put the query in the LLM's context before the rest of the context in order to compute the perplexity of each token $x_i$?

Any help is greatly appreciated!

acnagle added the question label Apr 30, 2024
iofu728 self-assigned this May 7, 2024

iofu728 commented May 7, 2024

Hi @acnagle, thank you for your question. At the segment level, we only use the condition_in_question parameter, which specifies whether the question is positioned before or after the context.
At the token level, we only use the condition_compare parameter to choose between using perplexity or conditional perplexity.

Therefore, with condition_in_question=after and condition_compare=True, the segment-level compression, nominally $P(context|question)$, is not actually conditioned on the question, since the question is placed after the context.
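
For reference, the way these two parameters are passed looks roughly like the LongLLMLingua example in this repository's README; the context and question below are made up for illustration, and argument details may differ slightly across versions (e.g., older releases used ratio where newer ones use rate):

```python
from llmlingua import PromptCompressor

# Loads the default small LM used for scoring (a ~7B model by default,
# so this downloads weights on first use).
llm_lingua = PromptCompressor()

context = [
    "The 2020 Summer Olympics were held in Tokyo, Japan.",
    "The 2016 Summer Olympics were held in Rio de Janeiro, Brazil.",
]
question = "Where were the 2020 Olympics held?"

compressed = llm_lingua.compress_prompt(
    context,
    question=question,
    rate=0.55,                      # keep roughly 55% of the tokens
    condition_in_question="after",  # segment level: question positioned after the context
    condition_compare=True,         # token level: use contrastive (conditional) perplexity
    rank_method="longllmlingua",
)
print(compressed["compressed_prompt"])
```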

Yes, we still segment the context and use a method similar to Equation (3). Also, the condition_in_question parameter does not take effect at the token level; token-level compression is controlled solely by condition_compare.
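
To make the token-level step concrete, here is a minimal sketch of the contrastive-perplexity idea behind Equation (3), not this repository's actual implementation: score each context token by how much prepending the question lowers its surprisal, roughly $s_i = ppl(x_i|x_{<i}) - ppl(x_i|x^{que}, x_{<i})$. The gpt2 model and the token_nll helper are stand-ins chosen for illustration; in this sketch the question is simply placed before the context in the LM's input for the conditional term:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for the small scoring LM LLMLingua uses
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_nll(prefix: str, text: str) -> torch.Tensor:
    """Per-token negative log-likelihood of `text`, conditioned on `prefix`."""
    # Prepend BOS so every `text` token gets a prediction, even with an empty prefix.
    prefix_ids = tokenizer(tokenizer.bos_token + prefix, return_tensors="pt").input_ids
    text_ids = tokenizer(text, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, text_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict the token at position t + 1.
    nll = torch.nn.functional.cross_entropy(
        logits[:, :-1, :].transpose(1, 2), input_ids[:, 1:], reduction="none"
    )
    # Keep only the positions that correspond to `text` tokens.
    return nll[0, prefix_ids.shape[1] - 1 :]

question = "Where were the 2020 Olympics held?"
context = "The 2020 Summer Olympics took place in Tokyo, Japan."

# Contrastive score per context token: how much does conditioning on the
# question reduce the token's surprisal? Higher = more question-relevant,
# so those tokens are the ones kept when pruning.
scores = token_nll("", context) - token_nll(question + "\n", context)
for tok_id, s in zip(tokenizer(context).input_ids, scores.tolist()):
    print(f"{tokenizer.decode([tok_id])!r}: {s:+.3f}")
```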

acnagle closed this as completed Aug 2, 2024