Starred repositories
[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, and exciting long2short methods for large reasoning models. It contains papers, code, datasets, evaluations, and analyses.
Witness the aha moment of VLM with less than $3.
A fork to add multimodal model training to open-r1
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
HunyuanVideo: A Systematic Framework For Large Video Generation Model
This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".
This is the repository for the Tool Learning survey.
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
Frontier Multimodal Foundation Models for Image and Video Understanding
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥
The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on linear attention
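100+ eye-catching big-data visualization dashboard HTML5 templates, covering industries such as community, property management, government affairs, transportation, and finance and banking. The newest, largest, most complete, and most striking collection of big-data visualization templates on the web. Updated on an ongoing basis.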
Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.