ollie-ztz
  • Baltimore
  • 20:00 (UTC -05:00)

Highlights

  • Pro

12 Updated Jan 22, 2025

A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!

10,651 2,245 Updated Feb 12, 2025

Video Generation Foundation Models: https://saiyan-world.github.io/goku/

Python 1,390 92 Updated Feb 12, 2025

The official PyTorch implementation of the CVPR 2023 paper "Contrastive Grouping with Transformer for Referring Image Segmentation".

Python 48 3 Updated Apr 17, 2024

[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark

Python 72 4 Updated Jan 22, 2025

Resources of our paper "FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces". New versions in the making!

Python 837 115 Updated Feb 12, 2025

VideoAuteur: Towards Long Narrative Video Generation

30 Updated Jan 13, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,451 465 Updated Feb 12, 2025

Quick scripts to calculate CLIP text-image similarity

Python 211 18 Updated Nov 26, 2024
19 Updated Jan 11, 2025

[ICLR 2025] Diffusion Feedback Helps CLIP See Better

Python 254 13 Updated Jan 23, 2025
Python 5,680 928 Updated Feb 11, 2025

A general fine-tuning kit geared toward diffusion models.

Python 2,077 197 Updated Feb 10, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 23,780 2,031 Updated Feb 12, 2025

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind

Python 19 1 Updated Jan 23, 2025
Python 47 1 Updated Feb 7, 2025

[ECCV 2024] Official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Python 745 38 Updated Aug 13, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,540 431 Updated Jan 12, 2025

This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen, Run Chen, and Julia Hirschberg.

Python 66 7 Updated Oct 3, 2024

Public code release associated with SceneScript.

Python 139 11 Updated Oct 4, 2024

Project Aria Social Eye Tracking Model

Python 29 3 Updated Nov 22, 2024

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

Python 1,353 157 Updated Jan 22, 2025

projectaria_tools is a C++/Python open-source toolkit to interact with Project Aria data

C++ 542 74 Updated Feb 12, 2025

Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image

Python 175 15 Updated Nov 27, 2024

Generative World Explorer

127 8 Updated Nov 26, 2024

A latent text-to-image diffusion model

Jupyter Notebook 69,530 10,311 Updated Jun 18, 2024

[NeurIPS D&B Track 2024] Source code for the paper "Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge"

Python 13 Updated Nov 5, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 40,207 4,926 Updated Feb 12, 2025

Codebase for the paper "Elucidating the design space of language models for image generation"

Python 45 1 Updated Nov 17, 2024