Stars
[NeurIPS 2022] Geometric order learning for rank estimation
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (accepted at NeurIPS 2023)
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
A curated list of resources on temporal action localization/detection and related areas (e.g. temporal action proposal).
Awesome papers & datasets specifically focused on long-term videos.
Writing clean and optimized Python code
PyTorch implementation of "Grid anchor based image cropping"
Code for the Active Speakers in Context Paper (CVPR2020)
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)
The AVA dataset densely annotates 80 atomic visual actions in 351k movie clips with actions localized in space and time, resulting in 1.65M action labels with multiple labels per human occurring fr…
An implementation of local windowed attention for language modeling
TensorFlow and MATLAB code for "RealVAD: A Real-world Dataset for Voice Activity Detection" and "Voice Activity Detection by Upper Body Motion Analysis and Unsupervised Domain Adaptation"
A curated list of different papers and datasets in various areas of audio-visual processing
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
This is an official implementation for "Video Swin Transformers".
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
Implementation of Asymmetric-TriTraining in TensorFlow
The official project for CVPR19 paper: Domain-Symmetric Networks for Adversarial Domain Adaptation
Author's implementation of the paper "Deep Relative Attributes" (ACCV 2016)
Methods and Implementations of Deep Clustering