A collection of tools to measure inference latency for foundation models in Amazon Bedrock and OpenAI. Reports time to first token and total time; a minimal measurement sketch is shown after the list below.
- bedrock-latency-benchmark.ipynb - Measure LLM latency across scenarios such as:
  - Different numbers of input/output tokens
  - Comparing latency across models from Amazon Bedrock and OpenAI
  - Latency for the same model across AWS Regions
- Measure latency for text-to-image models - Coming soon.
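The sketch below shows one way to capture time to first token (TTFT) and total time for a streaming Amazon Bedrock call using boto3's `converse_stream`; it is not the notebook's exact code, and the model ID, prompt, and region list are illustrative.

```python
import time
import boto3

def measure_latency(model_id: str, prompt: str, region: str = "us-east-1") -> dict:
    """Stream a single request and record time to first token and total time."""
    client = boto3.client("bedrock-runtime", region_name=region)
    start = time.perf_counter()
    response = client.converse_stream(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    ttft = None
    for event in response["stream"]:
        # The first content delta marks the time to first token.
        if "contentBlockDelta" in event and ttft is None:
            ttft = time.perf_counter() - start
    total = time.perf_counter() - start
    return {"region": region, "time_to_first_token_s": ttft, "total_time_s": total}

# Example: compare the same model across two regions (regions are illustrative).
for region in ["us-east-1", "us-west-2"]:
    print(measure_latency("anthropic.claude-3-haiku-20240307-v1:0",
                          "Write a haiku about clouds.", region=region))
```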
- Clone the repository: `git clone https://github.com/gilinachum/bedrock-latency`
- Open the relevant notebook.