
Tags: ray-project/llmperf


v2.0


Verified: this commit was created on GitHub.com and signed with GitHub's verified signature (the key has expired).
LLMPerfV2 (#19)

* LLMPerfV2

The latest version of LLMPerf brings a suite of updates for deeper, more customizable benchmarking of LLM inference. These updates include:

- Expanded metrics with quantile distributions (P25–P99): comprehensive data representation for deeper insights.
- Customizable benchmarking parameters: tailor parameters to fit specific use-case scenarios.
- Introduction of load tests and correctness tests: assess performance and accuracy under stress.
- Broad compatibility: supports a range of products including [Anyscale Endpoints](https://www.anyscale.com/endpoints), [OpenAI](https://openai.com/blog/openai-api), [Anthropic](https://docs.anthropic.com/claude/reference/getting-started-with-the-api), [together.ai](http://together.ai/), [Fireworks.ai](https://app.fireworks.ai/), [Perplexity](https://www.perplexity.ai/), [Hugging Face](https://huggingface.co/inference-endpoints), [Lepton AI](https://www.lepton.ai/docs/overview/model_apis), and the various APIs supported by the [LiteLLM project](https://litellm.ai/).
- Easy addition of new LLMs via the LLMClient API.
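To make the quantile-distribution metrics concrete, here is a minimal, self-contained sketch of how per-request latencies can be reduced to a P25–P99 summary of the kind LLMPerf reports. The `percentile` and `summarize` helpers are illustrative assumptions, not the project's actual implementation:

```python
def percentile(samples, p):
    """Linear-interpolation percentile (p in [0, 100]) over a list of numbers.

    Illustrative helper, not LLMPerf's internal implementation.
    """
    xs = sorted(samples)
    if not xs:
        raise ValueError("no samples")
    # Fractional rank into the sorted samples, then interpolate
    # between the two neighboring values.
    k = (len(xs) - 1) * p / 100
    lo, hi = int(k), min(int(k) + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)


def summarize(latencies_s):
    """Return a P25/P50/P75/P90/P95/P99 summary for a latency sample."""
    return {f"p{p}": percentile(latencies_s, p) for p in (25, 50, 75, 90, 95, 99)}


# Example: end-to-end latencies (seconds) collected from a benchmark run.
latencies = [0.8, 1.1, 0.9, 1.4, 2.3, 1.0, 1.2, 3.1]
print(summarize(latencies))
```

Reporting quantiles rather than a single mean is what makes tail behavior (P95/P99) visible, which is usually the interesting part of an LLM serving benchmark.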

Signed-off-by: Avnish Narayan <[email protected]>