Starred repositories
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
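A minimal usage sketch of EasyOCR's Python API (the language list and the image filename are illustrative assumptions, not part of the repo description):

```python
import easyocr

# Load detection + recognition models for English and Arabic (downloads weights on first run).
reader = easyocr.Reader(['en', 'ar'])

# readtext returns a list of (bounding_box, text, confidence) tuples for a hypothetical image file.
for bbox, text, confidence in reader.readtext('sign.png'):
    print(text, confidence)
```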
Python sample code for robotics algorithms.
A generative world for general-purpose robotics & embodied AI learning.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
An open-source tool-augmented conversational language model from Fudan University
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
An open source implementation of CLIP.
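A minimal zero-shot image-text matching sketch with open_clip (the model name, pretrained tag, image path, and prompts are illustrative assumptions):

```python
import torch
import open_clip
from PIL import Image

# Build a CLIP model with its matching image preprocessing and text tokenizer.
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32', pretrained='laion2b_s34b_b79k')
tokenizer = open_clip.get_tokenizer('ViT-B-32')

image = preprocess(Image.open('cat.jpg')).unsqueeze(0)          # hypothetical local image
text = tokenizer(['a photo of a cat', 'a photo of a dog'])      # candidate captions

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings and score caption similarity with a softmax over cosine similarities.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability of each caption matching the image
```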
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
A collaboration-friendly studio for NeRFs
A collection of libraries to optimise AI model performance
Large World Model -- Modeling Text and Video with Millions of Tokens of Context
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI in robotics & AV labs. C…
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Enjoy the magic of Diffusion models!
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
The official repo of Qwen-VL (通义千问-VL), the chat & pretrained large vision-language model proposed by Alibaba Cloud.
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek3, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework.
FFCV: Fast Forward Computer Vision (and other ML workloads!)
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Lumina-T2X is a unified framework for Text to Any Modality Generation
PyTorch pre-trained model for real-time interest point detection, description, and sparse tracking (https://arxiv.org/abs/1712.07629)
✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
VideoSys: An easy and efficient system for video generation
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning