- Tamr
- Cambridge, Boston
- https://www.linkedin.com/in/trangnguyen17/
Highlights
Stars
- Codebase for "Quantifying Valence and Arousal in Text with Multilingual Pre-trained Transformers"
- Codebase for MovieCLIP: Visual Scene Recognition in Movies
- Story-Based Retrieval with Contextual Embeddings. The largest freely available movie video dataset. [ACCV'20]
- Official repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
- Localization of Knowledge in Text-to-Image Models
- Official repository for the ECCV 2022 paper "The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing"
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (TMLR 2024)
- [CVPR 2023] Code for "Learning Emotion Representations from Verbal and Nonverbal Communication"
- Zero-shot Image-to-Image Translation [SIGGRAPH 2023]
- Accelerating the development of large multimodal models (LMMs) with the one-click evaluation module lmms-eval
- Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
- Official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models"
- Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" (TMLR 2024)
- [ICLR 2024] Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
- A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
- Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR '25)
- Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices (ACML 2023)
- Personal implementation of ASIF by Antonio Norelli
- ViT Prisma: a mechanistic interpretability library for Vision Transformers (ViTs)
- Creation, evaluation, and benchmark models for the L+M-24 Dataset, featured as the shared task at The Language + Molecules Workshop at ACL…
- Official implementation of "Towards Robust and Reproducible Active Learning Using Neural Networks" (CVPR 2022)
- Reading list for research topics in multimodal machine learning