

Repository for Quantifying Valence and Arousal in Text with Multilingual Pre-trained Transformers

Python 26 Updated Feb 26, 2023

This repository contains the codebase for MovieCLIP: Visual Scene Recognition in Movies

Python 36 4 Updated Oct 1, 2023

Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]

Python 169 28 Updated Sep 21, 2022

Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"

183 10 Updated Sep 1, 2024

Localization of Knowledge in Text-to-Image Models

Python 6 Updated Oct 8, 2024
Jupyter Notebook 4 2 Updated Oct 9, 2024

This is the official repository for our ECCV 2022 paper, "The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing"

Python 49 2 Updated Nov 28, 2022

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)

Python 238 15 Updated Jul 1, 2024
Jupyter Notebook 19 Updated Sep 28, 2023

[CVPR 2023] Code for "Learning Emotion Representations from Verbal and Nonverbal Communication"

Python 45 7 Updated Feb 19, 2025

Zero-shot Image-to-Image Translation [SIGGRAPH 2023]

Python 1,096 80 Updated Oct 16, 2024

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,122 209 Updated Feb 22, 2025

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 1,876 273 Updated Feb 22, 2025

The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models".

Python 37 1 Updated Sep 11, 2024

Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" (TMLR 2024)

Jupyter Notebook 549 41 Updated Oct 29, 2024

[ICLR 2024] Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"

Python 811 57 Updated Oct 12, 2023

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

4,038 230 Updated Feb 21, 2025
Python 2 Updated Nov 1, 2024

Multi-Layer Sparse Autoencoders (ICLR 2025)

Python 18 Updated Feb 11, 2025

Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR '25)

Python 59 6 Updated Jan 27, 2025

Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices (ACML 2023)

Python 12 1 Updated May 7, 2024

Personal implementation of ASIF by Antonio Norelli

Jupyter Notebook 25 Updated May 24, 2024

ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).

Jupyter Notebook 208 22 Updated Feb 21, 2025
Python 8 1 Updated Aug 26, 2024

This repository contains information on the creation, evaluation, and benchmark models for the L+M-24 Dataset. L+M-24 will be featured as the shared task at The Language + Molecules Workshop at ACL…

Python 26 2 Updated Jan 23, 2025

Official implementation of our paper: Towards Robust and Reproducible Active Learning using Neural Networks, accepted at CVPR 2022.

Jupyter Notebook 67 7 Updated Aug 16, 2023

Reading list for research topics in multimodal machine learning

6,271 868 Updated Aug 20, 2024
JavaScript 262 35 Updated Jan 24, 2025

An efficient speech separation method

Python 271 33 Updated Apr 11, 2024