InferenceNexus

Popular repositories

  1. text-generation-inference Public

    Forked from huggingface/text-generation-inference

    Large Language Model Text Generation Inference

    Python 1

  2. T-MAC Public

    Forked from microsoft/T-MAC

    Low-bit LLM inference on CPU with lookup table

    C++ 1

  3. ipex-llm Public

    Forked from intel/ipex-llm

    Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc,…

    Python 1

  4. litellm Public

    Forked from BerriAI/litellm

    Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

    Python 1

  5. litgpt Public

    Forked from Lightning-AI/litgpt

    20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

    Python 1

  6. inference-benchmarker Public

    Forked from huggingface/inference-benchmarker

    Inference server benchmarking tool

    Rust 1

Repositories

Showing 10 of 52 repositories
  • triton-server Public Forked from triton-inference-server/server

    The Triton Inference Server provides an optimized cloud and edge inferencing solution.

    Python 0 BSD-3-Clause 1,550 0 0 Updated Jan 29, 2025
  • litgpt Public Forked from Lightning-AI/litgpt

    20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

    Python 1 Apache-2.0 1,194 0 0 Updated Jan 24, 2025
  • litellm Public Forked from BerriAI/litellm

    Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

    Python 1 2,251 0 0 Updated Jan 24, 2025
  • FastChat Public Forked from lm-sys/FastChat

    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

    Python 0 Apache-2.0 4,786 0 0 Updated Jan 14, 2025
  • ncnn Public Forked from Tencent/ncnn

    ncnn is a high-performance neural network inference framework optimized for the mobile platform

    C++ 0 4,262 0 0 Updated Jan 13, 2025
  • onnxruntime Public Forked from microsoft/onnxruntime

    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

    C++ 0 MIT 3,130 0 0 Updated Jan 13, 2025
  • optimum-intel Public Forked from huggingface/optimum-intel

    🤗 Optimum Intel: Accelerate inference with Intel optimization tools

    Jupyter Notebook 0 Apache-2.0 128 0 0 Updated Jan 12, 2025
  • ipex-llm Public Forked from intel/ipex-llm

    Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.

    Python 1 Apache-2.0 1,361 0 0 Updated Jan 12, 2025
  • llama-box Public Forked from gpustack/llama-box

    LM inference server implementation based on llama.cpp.

    C++ 1 MIT 13 0 0 Updated Jan 12, 2025
  • mlc-llm Public Forked from mlc-ai/mlc-llm

    Universal LLM Deployment Engine with ML Compilation

    Python 1 Apache-2.0 1,716 0 0 Updated Jan 12, 2025

People

This organization has no public members. You must be a member to see who’s a part of this organization.
