Stars
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Robust Speech Recognition via Large-Scale Weak Supervision
StairNet series: deep learning for real-time stair detection
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
🚀🚀 "Large Model": train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
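"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment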
❤️ Emotional First Aid Dataset: psychological counseling Q&A and a chatbot corpus
Fast and accurate automatic speech recognition (ASR) for edge devices
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Transformation-Equivariant 3D Object Detection for Autonomous Driving
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
A modular graph-based Retrieval-Augmented Generation (RAG) system
LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain
Master the command line, in one page