Lists (7)
💭 Chain of Thought
Let's think step by step.
🔥 ChatGPT and Beyond
All kinds of GPTs and GPT-derived applications.
🚀 Efficient Deep Learning
Model compression, acceleration, parameter-efficient fine-tuning, etc.
🔬 Medical Image Analysis
AI-assisted deep learning models for medical image analysis.
🧐 Mixture of Experts
Two heads are better than one.
🚢 Pretrained Models
Vision, language, audio, and multimodal pretrained models.
⚡ Training Infrastructures
Frameworks for efficiently training or serving deep learning models.
Starred repositories
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
A generative world for general-purpose robotics & embodied AI learning.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive evaluation benchmark for long-context language models
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Landmark Attention: Random-Access Infinite Context Length for Transformers
A data annotation toolbox that supports image, audio, and video data.
The Open-Source Data Annotation Platform
A high-quality, one-stop open-source data extraction tool that converts PDF to Markdown and JSON.
[NeurIPS'24 Spotlight] Speeds up long-context LLM inference by computing attention approximately with dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an A100 whil…
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Agentic components of the Llama Stack APIs
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Disaggregated serving system for Large Language Models (LLMs).
[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.