[Feature] Support G-Pass@k and LiveMathBench #1772

Open
wants to merge 14 commits into
base: main

Conversation

jnanliu
Contributor

@jnanliu jnanliu commented Dec 20, 2024

Motivation

Support evaluation with the G-Pass@k metric and update the LiveMathBench configurations.

Modification

  • Implement GPassKEvaluator, which supports all reasoning tasks through the abstract preprocess, group, and reduce functions.
  • Support loading LiveMathBench from Hugging Face.
  • Implement LiveMathBenchEvaluator, which inherits from GPassKEvaluator for mathematical reasoning tasks and supports resuming evaluation from a checkpoint.
  • Modify turbomind_with_tf_above_v4_33.py to ensure that gen_cfg of model_cfg is passed into the lmdeploy pipeline.
  • Print the gen_config of the inference backend for ease of debugging.
  • Modify openai_api.py to print the URL that caused an error for ease of debugging.
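
The description names G-Pass@k but does not define it. As a rough sketch only, assuming the commonly cited hypergeometric form of the metric (the probability that at least ceil(tau * k) of k samples, drawn without replacement from n generations of which c are correct, are correct), the per-question score could be computed as below. The function name g_pass_at_k and its signature are hypothetical and are not the GPassKEvaluator API from this PR.

```python
from math import ceil, comb


def g_pass_at_k(n: int, c: int, k: int, tau: float) -> float:
    """Hypothetical helper: G-Pass@k at threshold tau for one question.

    Assumed definition (not taken from this PR): the chance that at least
    ceil(tau * k) of k generations sampled without replacement from n
    generations, c of which are correct, are correct.
    """
    if not (0 <= c <= n and 1 <= k <= n and 0.0 <= tau <= 1.0):
        raise ValueError("require 0 <= c <= n, 1 <= k <= n, 0 <= tau <= 1")
    # Small epsilon guards against float noise in tau * k before rounding up.
    # At least one correct sample is always required, so tau near 0
    # reduces to ordinary Pass@k.
    min_correct = max(ceil(tau * k - 1e-9), 1)
    # Hypergeometric tail: sum over every feasible count j of correct samples.
    favourable = sum(
        comb(c, j) * comb(n - c, k - j)
        for j in range(min_correct, min(c, k) + 1)
    )
    return favourable / comb(n, k)
```

In a group-and-reduce design like the one described above, a value of this kind would presumably be computed per question from its n generations and then averaged across questions; that wiring is also an assumption here.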

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

gen_config['temperature'] = temperature
# gen_config['top_k'] = 40
# gen_config['temperature'] = temperature
pass # use the parameters passed from gen_config
Collaborator


Will this modification break backward compatibility (BC)? @MaiziXiao

Collaborator


The old implementation overwrites top_k and temperature when 'do_sample', 'top_k', and 'temperature' are all set. This change ensures that 'top_k' and 'temperature' are not overwritten.

@jnanliu We should remove these lines here
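
The behavior discussed in this thread can be illustrated with a minimal sketch. The functions old_merge and new_merge are hypothetical stand-ins written for illustration; they are not the actual code in turbomind_with_tf_above_v4_33.py, and the surrounding control flow is assumed from the diff lines and the comment above.

```python
def old_merge(user_cfg: dict, do_sample: bool, temperature: float = 1.0) -> dict:
    """Hypothetical model of the old behavior: sampling forces fixed values."""
    cfg = dict(user_cfg)  # copy so the caller's dict is not mutated
    if do_sample:
        # User-supplied top_k and temperature are silently replaced.
        cfg['top_k'] = 40
        cfg['temperature'] = temperature
    return cfg


def new_merge(user_cfg: dict, do_sample: bool, temperature: float = 1.0) -> dict:
    """Hypothetical model of the new behavior: user values are left alone."""
    cfg = dict(user_cfg)
    # Nothing is overwritten; the parameters passed in gen_config are used as-is.
    return cfg


user_cfg = {'do_sample': True, 'top_k': 5, 'temperature': 0.3}
overwritten = old_merge(user_cfg, do_sample=True)
preserved = new_merge(user_cfg, do_sample=True)
```

Under these assumptions, a caller who enabled sampling without setting top_k used to receive top_k = 40 and now receives no top_k from this code path, which is the kind of difference the backward-compatibility question is about.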


4 participants