inference-serving
Popular repositories
- energy-inference (Public, forked from grantwilkins/energy-inference)
Code for MPhil Thesis at University of Cambridge
Python 1
- vllm (Public, forked from vllm-project/vllm)
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
- Sequence-Scheduling (Public, forked from zhengzangw/Sequence-Scheduling)
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
Python
- LLM-serving-with-proxy-models (Public, forked from James-QiuHaoran/LLM-serving-with-proxy-models)
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction
Jupyter Notebook
- pytorch-transformer (Public, forked from hkproj/pytorch-transformer)
Attention is all you need implementation
Jupyter Notebook
- text-generation-inference (Public, forked from huggingface/text-generation-inference)
Large Language Model Text Generation Inference
Python
Repositories
- energy-inference (Public, forked from grantwilkins/energy-inference)
Code for MPhil Thesis at University of Cambridge
- ni-science-festival (Public)
- TAO-Amodal (Public, forked from WesleyHsieh0806/TAO-Amodal)
Official Code for Tracking Any Object Amodally
- LLM-serving-with-proxy-models (Public, forked from James-QiuHaoran/LLM-serving-with-proxy-models)
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction
- deepstream_python_apps (Public, forked from NVIDIA-AI-IOT/deepstream_python_apps)
DeepStream SDK Python bindings and sample applications
- Programming-Massively-Parallel-Processors (Public, forked from R100001/Programming-Massively-Parallel-Processors)
- jetson-inference (Public, forked from dusty-nv/jetson-inference)
Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
- vision-transformer-from-scratch (Public, forked from tintn/vision-transformer-from-scratch)
A Simplified PyTorch Implementation of Vision Transformer (ViT)
- ATS (Public, forked from adaptivetokensampling/ATS)
Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral Presentation)
People
This organization has no public members. You must be a member to see who’s a part of this organization.