Stars
5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports a local knowledge base and tools via Model Context Protocol servers.
Convert Markdown to a Zhihu-compatible format.
Pioneering Multimodal Reasoning with CoT
Explore the Multimodal “Aha Moment” on 2B Model
Fully open reproduction of DeepSeek-R1
Witness the aha moment of VLM with less than $3.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based reasoning MLLMs here!
Official implementation of the paper "Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting".
R1-onevision, a visual language model capable of deep CoT reasoning.
Assistant tools for attention visualization in deep learning
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
verl: Volcano Engine Reinforcement Learning for LLMs
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
[AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation
🚀 [Large Models] Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour!
Diffusion attentive attribution maps for interpreting Stable Diffusion.
ManimML is a project focused on providing animations and visualizations of common machine learning concepts with the Manim Community Library.
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Inference and training library for high-quality TTS models.
Liyulingyue / DesktopPet
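Forked from llq20133100095/DeskTopPet. A desktop pet program that now seems to have evolved into a desktop sticky-notes app. For the sticky-notes program, see the develop-todolist branch.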