Northwestern Polytechnical University
Suzhou
LLM
Modeling, training, eval, and inference code for OLMo
Universal LLM Deployment Engine with ML Compilation
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
High-Resolution Image Synthesis with Latent Diffusion Models
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge managemen…
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
A high-throughput and memory-efficient inference and serving engine for LLMs
Real-time Speech-Text Foundation Model Toolkit (wip)
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inte…
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
Accessible large language models via k-bit quantization for PyTorch.
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM