[Question]: How does the token-level question-aware compression work? #141

Closed
acnagle opened this issue Apr 30, 2024 · 1 comment

acnagle commented Apr 30, 2024

Describe the issue

I saw #103 asked a similar question, but I'm not sure I understand how this works with respect to Equation 5 from the first LLMLingua paper. If I have a query with condition_in_question=after and condition_compare=True, then my understanding is that the probability of a compressed segment will not actually be conditioned on the query, since the context (the thing we are compressing) comes before the query. I know this is probably not actually a problem in the code implementation, but I don't fully understand the implementation, and I'm trying to connect what the paper says with what the code is doing.

I see that Equation 3 in the LongLLMLingua paper uses contrastive perplexity to score each token, but are we still first segmenting the context and then pruning tokens from the context in a similar way to the original LLMLingua paper? Based on Equation 3, I'm confused about how to properly condition on the query. So, regardless of what condition_in_question is, do we put the query in the LLM's context before the rest of the context in order to compute the perplexity of each token $x_i$?

Any help is greatly appreciated!

acnagle added the question label Apr 30, 2024
iofu728 self-assigned this May 7, 2024

iofu728 commented May 7, 2024

Hi @acnagle, thank you for your question. At the segment level, we only use the condition_in_question parameter, which specifies whether the question is positioned before or after the context.
At the token level, we only use the condition_compare parameter to choose between using perplexity or conditional perplexity.

Therefore, with condition_in_question=after and condition_compare=True, the segment-level compression, nominally $P(context|question)$, is not actually conditioned on the question, since the question is placed after the context.
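
For reference, the way these two parameters are passed looks roughly like the LongLLMLingua example in this repository's README; the context and question below are made up for illustration, and argument details may differ slightly across versions (e.g., older releases used ratio where newer ones use rate):

```python
from llmlingua import PromptCompressor

# Loads the default small LM used for scoring (a ~7B model by default,
# so this downloads weights on first use).
llm_lingua = PromptCompressor()

context = [
    "The 2020 Summer Olympics were held in Tokyo, Japan.",
    "The 2016 Summer Olympics were held in Rio de Janeiro, Brazil.",
]
question = "Where were the 2020 Olympics held?"

compressed = llm_lingua.compress_prompt(
    context,
    question=question,
    rate=0.55,                      # keep roughly 55% of the tokens
    condition_in_question="after",  # segment level: question positioned after the context
    condition_compare=True,         # token level: use contrastive (conditional) perplexity
    rank_method="longllmlingua",
)
print(compressed["compressed_prompt"])
```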

Yes, we still segment the context and use a method similar to Equation (3). Also, the condition_in_question parameter does not take effect at the token level; token-level compression is controlled solely by condition_compare.
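
To make the token-level step concrete, here is a minimal sketch of the contrastive-perplexity idea behind Equation (3), not this repository's actual implementation: score each context token by how much prepending the question lowers its surprisal, roughly $s_i = ppl(x_i|x_{<i}) - ppl(x_i|x^{que}, x_{<i})$. The gpt2 model and the token_nll helper are stand-ins chosen for illustration; in this sketch the question is simply placed before the context in the LM's input for the conditional term:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for the small scoring LM LLMLingua uses
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_nll(prefix: str, text: str) -> torch.Tensor:
    """Per-token negative log-likelihood of `text`, conditioned on `prefix`."""
    # Prepend BOS so every `text` token gets a prediction, even with an empty prefix.
    prefix_ids = tokenizer(tokenizer.bos_token + prefix, return_tensors="pt").input_ids
    text_ids = tokenizer(text, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, text_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict the token at position t + 1.
    nll = torch.nn.functional.cross_entropy(
        logits[:, :-1, :].transpose(1, 2), input_ids[:, 1:], reduction="none"
    )
    # Keep only the positions that correspond to `text` tokens.
    return nll[0, prefix_ids.shape[1] - 1 :]

question = "Where were the 2020 Olympics held?"
context = "The 2020 Summer Olympics took place in Tokyo, Japan."

# Contrastive score per context token: how much does conditioning on the
# question reduce the token's surprisal? Higher = more question-relevant,
# so those tokens are the ones kept when pruning.
scores = token_nll("", context) - token_nll(question + "\n", context)
for tok_id, s in zip(tokenizer(context).input_ids, scores.tolist()):
    print(f"{tokenizer.decode([tok_id])!r}: {s:+.3f}")
```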

acnagle closed this as completed Aug 2, 2024