Stars
The official repository for paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"
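🌞 CareGPT (关怀GPT) is a medical large language model that also brings together dozens of publicly available medical fine-tuning datasets and openly available medical LLMs, covering LLM training, evaluation, and deployment to accelerate the development of medical LLMs. Medical LLM, Open Source Driven for a Healthy Future.
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.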
Bringing BERT into modernity via both architecture changes and scaling
A generative world for general-purpose robotics & embodied AI learning.
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.
Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings" published at Odyssey 2024
A comprehensive template for aligning large language models (LLMs) using Reinforcement Learning from Human Feedback (RLHF), transfer learning, and more. Build your own customizable LLM alignment so…
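WiNGPT is a GPT-based large language model for the medical vertical domain. It aims to integrate professional medical knowledge, healthcare information, and data to provide the healthcare industry with intelligent services such as medical question answering, diagnostic support, and medical knowledge lookup, improving the efficiency of diagnosis and treatment and the quality of healthcare services.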
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
A collection of datasets for the purpose of emotion recognition/detection in speech.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Turn an online textbook into an exam-friendly, offline, searchable PDF
Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NeurIPS 2024]
Official Implementation (PyTorch) of "Constant Acceleration Flow", NeurIPS 2024
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Consistency Distillation with Target Timestep Selection and Decoupled Guidance