inference-serving

Popular repositories

  1. energy-inference

    Forked from grantwilkins/energy-inference

    Code for an MPhil thesis at the University of Cambridge

    Python · 1

  2. vllm

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  3. Sequence-Scheduling

    Forked from zhengzangw/Sequence-Scheduling

    PyTorch implementation of the paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline"

    Python

  4. LLM-serving-with-proxy-models

    Forked from James-QiuHaoran/LLM-serving-with-proxy-models

    Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction

    Jupyter Notebook

  5. pytorch-transformer

    Forked from hkproj/pytorch-transformer

    A PyTorch implementation of the Transformer from "Attention Is All You Need"

    Jupyter Notebook

  6. text-generation-inference

    Forked from huggingface/text-generation-inference

    Large Language Model Text Generation Inference

    Python

Repositories

The organization has 22 repositories in total.

People

This organization has no public members.
