flow Public
Forked from lmnr-ai/flow
A lightweight task engine for building stateful AI agents that prioritizes simplicity and flexibility.
Python Apache License 2.0 Updated Dec 18, 2024
vllm-kvcompress Public
Forked from IsaacRe/vllm-kvcompress
KV cache compression for high-throughput LLM inference
Python Apache License 2.0 Updated Oct 2, 2024
InfiniGen Public
Forked from snu-comparch/InfiniGen
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
Python Apache License 2.0 Updated Jul 10, 2024
flux Public
Forked from bytedance/flux
A fast communication-overlapping library for tensor parallelism on GPUs.
C++ Apache License 2.0 Updated Jun 14, 2024
NvidiaTransformerEngine Public
Forked from NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
Python Apache License 2.0 Updated Jun 4, 2024
long-context-attention Public
Forked from feifeibear/long-context-attention
Sequence Parallel Attention for Long Context LLM Model Training and Inference
Python Updated May 28, 2024
llm_long_context_bench202405 Public
Forked from SomeoneKong/llm_long_context_bench202405
Python Apache License 2.0 Updated May 28, 2024
cutlass-flash-attention Public
Forked from 66RING/tiny-flash-attention
flash attention tutorial written in python, triton, cuda, cutlass
Cuda Updated May 10, 2024
MiniGPT4-video Public
Forked from Vision-CAIR/MiniGPT4-video
Python BSD 3-Clause "New" or "Revised" License Updated Apr 14, 2024
MiniGPT-4 Public
Forked from Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Python BSD 3-Clause "New" or "Revised" License Updated Apr 1, 2024
MetaGPT-agent Public
Forked from geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Python MIT License Updated Mar 22, 2024
multimodal-ai-jina Public
Forked from jina-ai/serve
☁️ Build multimodal AI applications with cloud-native stack
Python Apache License 2.0 Updated Mar 20, 2024
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Mar 14, 2024
text2video-generative-models Public
Forked from Stability-AI/generative-models
Generative Models by Stability AI
Python MIT License Updated Feb 21, 2024
Latte Public
Forked from Vchitect/Latte
The official implementation of Latte: Latent Diffusion Transformer for Video Generation.
Python MIT License Updated Feb 21, 2024
langchain Public
Forked from langchain-ai/langchain
⚡ Building applications with LLMs through composability ⚡
Python MIT License Updated Jan 21, 2024
autogen Public
Forked from microsoft/autogen
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
Jupyter Notebook Creative Commons Attribution 4.0 International Updated Jan 21, 2024
mistral-src Public
Forked from mistralai/mistral-inference
Reference implementation of Mistral AI 7B v0.1 model.
Jupyter Notebook Apache License 2.0 Updated Dec 23, 2023
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Dec 18, 2023
llm-benchmark-test Public
Forked from xzzWZY/open-framework-measurement
include LLM open framework measurement
Python Updated Dec 4, 2023
tgi-benchmarking Public
Forked from robertgshaw2-neuralmagic/tgi-benchmarking
Benchmarking LLMs on GPUs
Jupyter Notebook Updated Nov 20, 2023
generative-ai-for-beginners Public
Forked from microsoft/generative-ai-for-beginners
12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Jupyter Notebook MIT License Updated Nov 17, 2023
blora-text-generation-inference Public
Forked from robertgshaw2-neuralmagic/blora-text-generation-inference
Batched LORA + Continuous Batching
Python Apache License 2.0 Updated Sep 28, 2023
BLoRA-TGI-with-python-server Public
Forked from robertgshaw2-neuralmagic/BLoRA-TGI
Batched Lora + Continuous Batching
Python Updated Sep 13, 2023
tvm-mlc-llm Public
Forked from mlc-ai/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Python Apache License 2.0 Updated Aug 13, 2023
gpu-profiling Public
Forked from robertgshaw2-neuralmagic/gpu-profiling
GPU Profiling
Jupyter Notebook Updated Aug 9, 2023
leetcode-master Public
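Forked from youngyangyang04/leetcode-master
《代码随想录》 LeetCode problem-solving guide: a recommended order for working through 200 classic problems, with 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that studying algorithms is no longer bewildering! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀
Shell Updated Aug 7, 2023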
CPlusPlus-Tutorial Public
Forked from Light-City/CPlusPlusThings
C++ Tutorial
C++ Updated Jul 28, 2023
ComputeLibrary Public
Forked from ARM-software/ComputeLibrary
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
C++ MIT License Updated Jul 21, 2023