- Hangzhou, Zhejiang, China
Advanced models
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
A latent text-to-image diffusion model
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
PyTorch implementations of Generative Adversarial Networks.
Visual Speech Recognition for Multiple Languages
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Code and models for evaluating a state-of-the-art lip reading network
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASSP'20 Lipreading using Temporal Convolutional Networks
Implementation of "High Speed and Robust RGB-Thermal Tracking via Dual Attentive Stream Siamese Network" in PyTorch.
Add bisenetv2. My implementation of BiSeNet
This is the official PyTorch implementation of "Rethinking the necessity of image fusion in high-level vision tasks: A practical infrared and visible image fusion network based on progressive semantic …
Infrared and visible image fusion using a deep learning framework (PyTorch)
[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention
The papers and results about RGB-T fusion tracking
A resource collection of RGBT Salient Object Detection
PyTorch implementation of Google AI's 2018 BERT
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
RetinaFace: Deep Face Detection Library for Python
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[ICCV 2023] Official implementation of the paper: "DIRE for Diffusion-Generated Image Detection"
This repo contains the code for 1D tokenizer and generator
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
Liquid: Language Models are Scalable and Unified Multi-modal Generators