Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,451 465 Updated Feb 12, 2025

Taited / clip-score

Quick scripts to calculate CLIP text-image similarity

Python 211 18 Updated Nov 26, 2024

MrGiovanni / ScaleMAI

19 Updated Jan 11, 2025

Chenglin-Yang / 1.58bit.flux

257 1 Updated Dec 31, 2024

baaivision / DIVA

[ICLR 2025] Diffusion Feedback Helps CLIP See Better

Python 254 13 Updated Jan 23, 2025

kohya-ss / sd-scripts

Python 5,680 928 Updated Feb 11, 2025

bghira / SimpleTuner

A general fine-tuning kit geared toward diffusion models.

Python 2,077 197 Updated Feb 10, 2025

Genesis-Embodied-AI / Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python 23,780 2,031 Updated Feb 12, 2025

SCAI-JHU / MuMA-ToM

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind

Python 19 1 Updated Jan 23, 2025

MattWallingford / 360-1M

Python 47 1 Updated Feb 7, 2025

beichenzbc / Long-CLIP

[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Python 745 38 Updated Aug 13, 2024

FoundationVision / VAR

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,540 431 Updated Jan 12, 2025

tonychenxyz / emoknob

This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen, Run Chen, and Julia Hirschberg.

Python 66 7 Updated Oct 3, 2024

facebookresearch / scenescript

Public code release associated with SceneScript.

Python 139 11 Updated Oct 4, 2024

facebookresearch / projectaria_eyetracking

Project Aria Social Eye Tracking Model

Python 29 3 Updated Nov 22, 2024

Stability-AI / stable-fast-3d

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

Python 1,353 157 Updated Jan 22, 2025

facebookresearch / projectaria_tools

projectaria_tools is a C++/Python open-source toolkit to interact with Project Aria data

C++ 542 74 Updated Feb 12, 2025

eldar / flash3d

Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image

Python 175 15 Updated Nov 27, 2024

Beckschen / genex

Generative World Explorer

127 8 Updated Nov 26, 2024

CompVis / stable-diffusion

A latent text-to-image diffusion model

Jupyter Notebook 69,530 10,311 Updated Jun 18, 2024

UMass-Foundation-Model / CHAIC

[NeurIPS D&B Track 2024] Source code for the paper "Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge"

Python 13 Updated Nov 5, 2024

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 40,207 4,926 Updated Feb 12, 2025

Pepper-lll / LMforImageGeneration

Codebase for the paper "Elucidating the design space of language models for image generation"

Python 45 1 Updated Nov 17, 2024

Tiezheng_Zhang ollie-ztz

Stars

cfeng16 / GPS2Pix

aishwaryanr / awesome-generative-ai-guide

Saiyan-World / goku

SooLab / CGFormer

rese1f / aurora

HITsz-TMG / FilmAgent

lambert-x / VideoAuteur

NVIDIA / Cosmos