Stars
[CVPR 2023] Official repository of the paper "Fine-tuned CLIP models are efficient video learners".
Extensible, parallel implementations of t-SNE
zer0int / Long-CLIP
Forked from beichenzbc/Long-CLIP. Scripts for use with Long-CLIP, including fine-tuning Long-CLIP.
What do CLIP Vision Transformers learn? Feature visualization can show you! (See the gradient-ascent sketch after this list.)
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
A new multi-task learning framework using Vision Transformers
Attention visualization in CLIP
Plotting heatmaps from the self-attention of the [CLS] token in the last layer (see the attention-heatmap sketch after this list).
[ICCV 2021, Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…
Visualising High-Dimensional Data using t-SNE (see the t-SNE sketch after this list).
Collection of common code shared among different research projects in FAIR's computer vision team.
[ICLR 2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
[AAAI 2024] Official implementation of the paper TGP-T
[CVPR 2022] Official implementation for "Knowledge Distillation with the Reused Teacher Classifier".
[ICLR 2022] code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383
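Sketches for three of the techniques starred above follow. First, a hedged sketch of feature visualization for a CLIP ViT (cf. the "What do CLIP Vision Transformers learn?" entry): gradient ascent on the input image to maximize one channel of an intermediate layer. The Hugging Face CLIP model, the layer index, and the channel number are illustrative assumptions, not that repository's actual method.

```python
# Hedged sketch of feature visualization for a CLIP ViT: optimize the input
# image by gradient ascent to maximize one (arbitrarily chosen) channel of an
# intermediate encoder layer. Layer 8 and channel 42 are illustrative only.
import torch
from transformers import CLIPVisionModel

model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32").eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the image is optimized, not the weights

acts = {}
def hook(_module, _inputs, output):
    acts["h"] = output[0]  # layer hidden states: (batch, tokens, dim)

handle = model.vision_model.encoder.layers[8].register_forward_hook(hook)

img = torch.randn(1, 3, 224, 224, requires_grad=True)  # start from noise
opt = torch.optim.Adam([img], lr=0.05)

for _ in range(200):
    opt.zero_grad()
    model(pixel_values=img)
    loss = -acts["h"][0, :, 42].mean()  # ascend on channel 42 across tokens
    loss.backward()
    opt.step()

handle.remove()
# `img` now roughly shows what channel 42 responds to; real feature
# visualization adds input normalization, jitter, and other regularizers.
```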
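Next, a minimal sketch of the [CLS]-attention heatmap described under "Attention visualization in CLIP", assuming the Hugging Face transformers CLIP implementation (the starred repo may use the original OpenAI codebase); the image path is a placeholder.

```python
# Minimal sketch: plot the last-layer self-attention from the [CLS] token of
# a CLIP ViT as a heatmap, averaged over attention heads.
import torch
import matplotlib.pyplot as plt
from PIL import Image
from transformers import CLIPProcessor, CLIPVisionModel

model = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg").convert("RGB")  # placeholder image path
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# attentions: tuple of (batch, heads, tokens, tokens); take the last layer,
# average over heads, and read the [CLS] row (index 0), dropping CLS->CLS.
attn = out.attentions[-1].mean(dim=1)[0, 0, 1:]
side = int(attn.numel() ** 0.5)  # 7x7 patch grid for ViT-B/32 at 224 px
plt.imshow(attn.reshape(side, side).numpy(), cmap="viridis")
plt.title("CLS attention, last layer")
plt.axis("off")
plt.show()
```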
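Finally, a minimal t-SNE sketch, assuming scikit-learn rather than the parallel implementation starred above; the features and labels are random placeholders standing in for real embeddings.

```python
# Minimal sketch: project high-dimensional features (e.g. CLIP image
# embeddings) to 2-D with t-SNE and plot them colored by class label.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.random.randn(500, 512)          # placeholder for real embeddings
labels = np.random.randint(0, 10, size=500)   # placeholder class labels

coords = TSNE(n_components=2, perplexity=30, init="pca",
              random_state=0).fit_transform(features)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, s=8, cmap="tab10")
plt.title("t-SNE of feature embeddings")
plt.show()
```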