
Mamba SSM architecture

Python 13,896 1,199 Updated Jan 18, 2025

ECCV 2022

Python 6 Updated Oct 21, 2022

[NeurIPS 2022] Geometric order learning for rank estimation

Python 15 3 Updated Jan 11, 2024

Source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (accepted at NeurIPS 2023)

Python 24 6 Updated Jun 28, 2024

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)

Python 141 15 Updated Jul 25, 2024

89 7 Updated Oct 19, 2022

A curated list of resources on temporal action localization/detection and related areas (e.g. temporal action proposal).

574 65 Updated Sep 22, 2022

Awesome papers & datasets specifically focused on long-term videos.

243 11 Updated Nov 15, 2024

Writing clean and optimized Python code

Jupyter Notebook 545 124 Updated Jan 31, 2025

PyTorch implementation of "Grid anchor based image cropping"

Python 123 19 Updated Dec 28, 2024

Code for the Active Speakers in Context Paper (CVPR2020)

Python 53 13 Updated May 19, 2021

Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)

Python 65 9 Updated Oct 29, 2023

The AVA dataset densely annotates 80 atomic visual actions in 351k movie clips with actions localized in space and time, resulting in 1.65M action labels with multiple labels per human occurring fr…

318 29 Updated Feb 9, 2022

An implementation of local windowed attention for language modeling

Python 410 43 Updated Jan 16, 2025

TensorFlow and MATLAB code for "RealVAD: A Real-world Dataset for Voice Activity Detection" and "Voice Activity Detection by Upper Body Motion Analysis and Unsupervised Domain Adaptation"

Python 5 5 Updated Jul 8, 2020

A curated list of different papers and datasets in various areas of audio-visual processing

690 69 Updated Jan 30, 2024

Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset

Python 59 7 Updated Jan 18, 2022

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Python 1,569 320 Updated Sep 25, 2024

Active Speaker Detection

Jupyter Notebook 19 4 Updated Jun 19, 2020

Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"

Python 111 26 Updated Nov 16, 2020

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 4,936 477 Updated Dec 26, 2024

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Python 341 79 Updated Oct 23, 2023

This is an official implementation for "Video Swin Transformers".

Python 1,489 202 Updated Mar 8, 2023

Python 20 4 Updated Nov 2, 2018

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

Python 6,081 657 Updated Dec 11, 2024

Implementation of Asymmetric-TriTraining in TensorFlow

Python 25 6 Updated Apr 24, 2018

The official project for CVPR19 paper: Domain-Symmetric Networks for Adversarial Domain Adaptation

Python 85 16 Updated Mar 29, 2021

Python 33 5 Updated Jul 28, 2021

Author's implementation of the paper "Deep Relative Attributes" (ACCV 2016)

Jupyter Notebook 42 12 Updated Sep 4, 2017

Methods and Implementations of Deep Clustering

2,909 418 Updated Aug 25, 2024