Lists (20)
AIGC&LLM: AIGC and LLM
Algorithm: Algorithm
ASR: Speech recognition
ChatGLM: ChatGLM
ChatGPT: ChatGPT
CV: Computer Vision.
DB: Database information
DL: Deep Learning projects.
Docker: Docker
English
Github: GitHub-related
ML: Machine learning
NLP: NLP related works and ideas.
Open-S Framework: Open Source Framework
Program
Python: Python language.
RS: Recommender systems
Search: Search and retrieval
TOOLS
WEB
Stars
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
Multilingual Voice Understanding Model
End-to-End Object Detection with Transformers
A generative world for general-purpose robotics & embodied AI learning.
App-Controller: Allow users to manipulate your App with natural language
[CVPR'25] Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System
Let your Claude be able to think
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
Generate high-definition short videos with one click using large AI models.
TensorRT Extension for Stable Diffusion Web UI
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
TransNet V2: Shot Boundary Detection Neural Network
A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or expire obfuscated scripts.
Python code encryption and license control for Python code
Bringing Old Photo Back to Life (CVPR 2020 oral)
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
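Many images are hosted overseas (gcr, for example) and download slowly within China, so acceleration is needed. Committed to providing a stable, reliable, and secure container image service that connects the whole world.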
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
This repository contains the code for the FastAPI authentication API and test cases.
[NeurIPS D&B Track 2024] Official implementation of HumanVid
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
A modular graph-based Retrieval-Augmented Generation (RAG) system