forked from stanford-crfm/helm

Holistic Evaluation of Text-to-Image Models (HEIM), a fork of HELM for evaluating text-to-image models (paper coming soon).


Holistic Evaluation of Text-To-Image Models

Significant effort has recently gone into developing text-to-image generation models, which take textual prompts as input and generate images. As these models see wide use in real-world applications, there is an urgent need to comprehensively understand their capabilities and risks. However, existing evaluations focus primarily on image-text alignment and image quality. To address this limitation, we introduce a new benchmark, Holistic Evaluation of Text-To-Image Models (HEIM).

We identify 12 aspects that are important in real-world model deployment:

  • image-text alignment
  • image quality
  • aesthetics
  • originality
  • reasoning
  • knowledge
  • bias
  • toxicity
  • fairness
  • robustness
  • multilinguality
  • efficiency

By curating scenarios encompassing these aspects, we evaluate state-of-the-art text-to-image models using this benchmark. Unlike previous evaluations that focused on alignment and quality, HEIM significantly improves coverage by evaluating all models across all 12 aspects. Our results reveal that no single model excels across the board; different models demonstrate strengths in different aspects.

This repository contains the code used to produce the results on the website and in the paper. To get started, refer to the documentation.
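As a rough illustration only: since HEIM is a fork of HELM, a typical evaluation run would likely follow the upstream HELM CLI pattern. The package extra name, run-spec string, model identifier, and suite name below are assumptions, not taken from this repository; consult the documentation for the authoritative commands.

```shell
# Hypothetical quickstart, modeled on the upstream HELM CLI.
# Extra name, run spec, model, and suite are illustrative assumptions;
# see the project documentation for the real invocation.

# Install the package with text-to-image dependencies (assumed extra name).
pip install "crfm-helm[heim]"

# Evaluate one model on one scenario (assumed run-spec syntax).
helm-run --run-specs "mscoco:model=huggingface/dreamlike-photoreal-v2-0" \
         --suite my-heim-suite --max-eval-instances 10

# Aggregate the results for the web frontend.
helm-summarize --suite my-heim-suite
```

The `--max-eval-instances` flag caps the number of evaluated prompts, which keeps a first smoke-test run cheap before scaling up to the full benchmark.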
