- Shanghai
Stars
YOLOv12: Attention-Centric Real-Time Object Detectors
SpatialLM: Large Language Model for Spatial Understanding
Model Context Protocol Servers
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
An autoregressive character-level language model for making more things
🍰 Desktop utility to download images/videos/music/text from various websites, and more.
Robust Speech Recognition via Large-Scale Weak Supervision
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large language model course series
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Build smaller, faster, and more secure desktop and mobile applications with a web frontend.
This repository aims at providing examples to illustrate ros2_control and ros2_controllers
Simple, unified interface to multiple Generative AI providers
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Image composition toolbox: everything you want to know about image composition or object insertion
The Robot Operating System is a meta operating system for robots.
A ready-to-use translation and OCR tool developed with WPF