Starred repositories

🔨 Useful research tools for AI

2,463 353 Updated Jun 10, 2024

An Open-source Toolkit for LLM Development

Python 2,751 175 Updated Jan 13, 2025

ComfyUI's ControlNet Auxiliary Preprocessors

Python 2,585 228 Updated Oct 28, 2024

Train high-quality text-to-image diffusion models in a data & compute efficient manner

Python 473 36 Updated Jan 17, 2025

The best OSS video generation models

Python 2,827 291 Updated Jan 8, 2025

IP Adapter Instruct

Python 194 4 Updated Aug 10, 2024

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source data extraction tool.

Python 25,293 1,914 Updated Jan 27, 2025

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 5,821 489 Updated Jan 17, 2025

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Jupyter Notebook 3,862 325 Updated Jan 13, 2025

Paragraph-by-paragraph close reading of classic and new deep learning papers

28,019 2,493 Updated Nov 17, 2024

Reading papers with Mu Li

156 23 Updated Jan 8, 2025

A family of lightweight multimodal models.

Python 982 74 Updated Nov 18, 2024

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Python 21,639 3,160 Updated Jan 19, 2025

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Python 1,251 152 Updated Dec 19, 2024

VideoSys: An easy and efficient system for video generation

Python 1,903 129 Updated Jan 1, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,773 602 Updated May 31, 2024

Python 239 30 Updated Dec 10, 2022

Large-scale text-video dataset. 10 million captioned short videos.

Python 620 39 Updated Aug 14, 2024

Character Animation (AnimateAnyone, Face Reenactment)

Python 3,297 258 Updated May 31, 2024

Official implementation of the paper "AnyText: Multilingual Visual Text Generation And Editing"

Python 4,503 290 Updated Jun 21, 2024

Master C++ 👊. A study repository for C++ Primer, 5th Edition (Chinese edition), including notes and answers to the end-of-chapter exercises.

C++ 8,133 1,990 Updated Sep 12, 2024

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,808 147 Updated Dec 30, 2024

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Python 761 45 Updated Jul 29, 2024

Python 10,061 1,298 Updated Feb 1, 2025

[CSUR] A Survey on Video Diffusion Models

1,928 96 Updated Dec 9, 2024

Generative Models by Stability AI

Python 25,215 2,791 Updated Sep 4, 2024

✨✨Latest Advances on Multimodal Large Language Models

13,751 886 Updated Jan 28, 2025

[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale

Python 202 5 Updated Feb 27, 2024

AAAI 2024: Visual Instruction Generation and Correction

Python 91 3 Updated Feb 4, 2024