forked from stanford-crfm/helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).

danielz02/helm

Welcome! This repository contains all the assets for the CRFM benchmarking project, which includes the following features:

  • Collection of datasets in a standard format (e.g., NaturalQuestions)
  • Collection of models accessible via a unified API (e.g., GPT-3, MT-NLG, OPT, BLOOM)
  • Collection of metrics beyond accuracy (efficiency, bias, toxicity, etc.)
  • Collection of perturbations for evaluating robustness and fairness (e.g., typos, dialect)
  • Modular framework for constructing prompts from datasets
  • Proxy server for managing accounts and providing a unified interface to access models
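
To make two of the features above more concrete (constructing prompts from datasets, and perturbing inputs with typos), here is a minimal, self-contained sketch. It does not use this repository's API: `Instance`, `build_prompt`, and `typo_perturbation` are hypothetical names chosen for illustration, and the adjacent-letter swap is just one simple way a typo perturbation could work.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One dataset example in a simple standard format (hypothetical)."""
    question: str
    answer: str


def build_prompt(train: list[Instance], test_question: str) -> str:
    """Build a few-shot prompt: in-context examples followed by the query."""
    blocks = [f"Question: {ex.question}\nAnswer: {ex.answer}" for ex in train]
    blocks.append(f"Question: {test_question}\nAnswer:")
    return "\n\n".join(blocks)


def typo_perturbation(text: str, prob: float, seed: int = 0) -> str:
    """Swap adjacent letter pairs with probability `prob` to simulate typos."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < prob:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    train = [Instance("What is the capital of France?", "Paris")]
    question = typo_perturbation("What is the capital of Italy?", prob=0.2)
    print(build_prompt(train, question))
```

A robustness evaluation along these lines would compare a model's output on the original prompt with its output on the perturbed one; keeping the perturbation seeded makes that comparison repeatable.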

To read more:

Languages

  • Python 95.2%
  • JavaScript 3.5%
  • Jupyter Notebook 0.6%
  • HTML 0.5%
  • Shell 0.1%
  • CSS 0.1%