Lists (32)
ALT: Automatic Lyrics Transcription
ASR
Attention
Audio Separation: Speech and Music Separation
Audio Synthesis
Biology
Computer Vision: Computer Vision Tasks
Continual Learning
Data Engineering
Data Testing
Datasets
Federated Learning
Finance
FrontEnd
GNN
Hugo
k8s
Knowledge Graph
ML
MLOps
MLTemplate
Music
N-shot
NLP
Recommender
Roadmap
Self-Supervised
Synthetic Data
System Design
TTS
VPN
Vue
Starred repositories
AWS-native chatbot using Bedrock + Claude (+Nova and Mistral)
Official repository of ’Visual-RFT: Visual Reinforcement Fine-Tuning’
Open-Sora: Democratizing Efficient Video Production for All
SGLang is a fast serving framework for large language models and vision language models.
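Real-time STT connected to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS, with services called across networks through a web page to achieve real-time conversation.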
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
The official Python library for the OpenAI API
No fortress, purely open ground. OpenManus is Coming.
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Tools for merging pretrained large language models.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Build resilient language agents as graphs.
Systematic evaluation framework that automatically rates overthinking behavior in large language models.
Official Repo for Open-Reasoner-Zero
Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Solve Visual Understanding with Reinforced VLMs
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
DeepSeek Coder: Let the Code Write Itself
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
React app for inspecting, building and debugging with the Realtime API
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)