carpedkm
83 stars written in Python

Interact with your documents using the power of GPT, 100% privately, no data leaks

Python 54,779 7,364 Updated Nov 13, 2024

Official Code for DragGAN (SIGGRAPH 2023)

Python 35,820 3,455 Updated May 18, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 23,016 2,264 Updated Dec 27, 2024

Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory

Python 20,182 1,431 Updated Jan 9, 2025

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python 11,856 1,040 Updated Dec 31, 2024

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 10,207 953 Updated Jan 8, 2025

🐍 Geometric Computer Vision Library for Spatial AI

Python 10,125 980 Updated Jan 6, 2025

Simple, unified interface to multiple Generative AI providers

Python 9,655 869 Updated Jan 5, 2025

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".

Python 6,243 408 Updated Dec 27, 2024

[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation

Python 5,492 458 Updated Sep 9, 2024

Count the MACs / FLOPs of your PyTorch model.

Python 4,935 529 Updated Jul 8, 2024

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Python 4,645 353 Updated Jul 10, 2024

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 4,315 372 Updated Dec 22, 2024

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 3,951 278 Updated Oct 5, 2024

[WIP] Layer Diffusion for WebUI (via Forge)

Python 3,934 337 Updated Aug 30, 2024

Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models

Python 3,020 267 Updated Oct 22, 2024

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Python 2,921 183 Updated Oct 31, 2024

FLOPs counter for convolutional networks in the PyTorch framework

Python 2,847 306 Updated Sep 27, 2024

[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

Python 2,703 218 Updated Sep 8, 2024

Next-Token Prediction is All You Need

Python 1,957 77 Updated Oct 24, 2024

A curated list of image inpainting and video inpainting papers and resources

Python 1,956 263 Updated Nov 6, 2024

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Python 1,729 85 Updated Oct 31, 2024

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 1,724 67 Updated Jan 2, 2025

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,533 94 Updated Dec 11, 2024

[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"

Python 1,244 112 Updated Dec 20, 2023

Paint by Example: Exemplar-based Image Editing with Diffusion Models

Python 1,140 100 Updated Nov 28, 2023

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,126 86 Updated Oct 21, 2024

A minimal and universal controller for FLUX.1.

Python 1,049 65 Updated Jan 9, 2025

A family of lightweight multimodal models.

Python 970 73 Updated Nov 18, 2024

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention

Python 809 63 Updated Jun 2, 2024