
[ECCV 2024 Oral] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction

Python 40 7 Updated Aug 13, 2024

Video-Inpaint-Anything: This is the inference code for our paper CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.

Python 254 3 Updated Sep 24, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 21,769 2,109 Updated Aug 9, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python 11,292 1,009 Updated Oct 6, 2024

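An introductory Chinese-language tutorial for PyTorch Lightning; please credit the source when reposting. (It was originally written just for fun; working through the MNIST example before getting started is recommended.)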

Jupyter Notebook 181 18 Updated Dec 6, 2020

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Jupyter Notebook 7,903 838 Updated Jul 26, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,188 277 Updated May 4, 2024

Focus on prompting and generating

Python 40,522 5,653 Updated Aug 21, 2024

[TMLR] CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

Python 8 Updated Mar 25, 2024

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Python 13,276 3,379 Updated Sep 20, 2024

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Jupyter Notebook 506 26 Updated Jul 1, 2024

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Python 1,061 54 Updated Jul 17, 2024

A collection of diffusion model papers categorized by subarea

1,169 59 Updated Oct 5, 2024

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Python 309 20 Updated Jul 17, 2024

High-fidelity performance metrics for generative models in PyTorch

Python 984 62 Updated Jan 25, 2024

CLIP-based aesthetics predictor inspired by the interface of 🤗 huggingface transformers.

Python 28 Updated Jun 14, 2024

The state-of-the-art VSR

104 5 Updated Jul 19, 2023
Python 8 3 Updated Jan 26, 2019
Python 2 Updated Jan 15, 2022

[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Python 692 41 Updated Jun 3, 2023

animatediff prompt travel

Python 1,188 105 Updated Jan 13, 2024

Using Low-rank adaptation to quickly fine-tune diffusion models.

Jupyter Notebook 6,991 481 Updated Mar 22, 2024

GitHub repository for my ICCV 2017 paper: "Localizing Moments in Video with Natural Language"

OpenEdge ABL 188 44 Updated Oct 31, 2020

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

Python 1,280 106 Updated Jul 14, 2024
Python 1 2 Updated Sep 20, 2019

PyTorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Python 4,921 1,097 Updated Jan 15, 2024

PyTorch structural similarity (SSIM) loss

Python 1,879 364 Updated Feb 22, 2024

Finetune ModelScope's Text To Video model using Diffusers 🧨

Python 657 104 Updated Dec 14, 2023