- I2R
- Singapore
- UTC +08:00
- binwang.xyz
- https://orcid.org/0000-0001-9760-8343
Stars
An Open-source RL System from ByteDance Seed and Tsinghua AIR
Understanding R1-Zero-Like Training: A Critical Perspective
A collection of recent open-source math datasets for training and evaluating Math LLMs
A Survey on Efficient Reasoning for LLMs
Fully open data curation for reasoning models
Fully open reproduction of DeepSeek-R1
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
Wan: Open and Advanced Large-Scale Video Generative Models
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Neural Code Intelligence Survey 2024; Reading lists and resources
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
verl: Volcano Engine Reinforcement Learning for LLMs
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
Latest Advances on System-2 Reasoning
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation,…
The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
[ArXiv 2024] Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models
This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov
A series of technical reports on Slow Thinking with LLMs
Frontier Multimodal Foundation Models for Image and Video Understanding
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization