Xi'an Jiaotong University
- Xi'an, China
- https://www.xjtu.edu.cn/
Stars
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
OpenMMLab Self-Supervised Learning Toolbox and Benchmark
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv.org/abs/2212.04500)
Code Release for MViTv2 on Image Recognition.
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
Implementation of ViViT: A Video Vision Transformer
An effective multimodal representation and fusion method for multimodal intent recognition
Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
A course on aligning smol models.
Instruction Tuning with GPT-4
Code and documentation to train Stanford's Alpaca models, and generate the data.
A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Learn OpenCV : C++ and Python Examples
This repository aims at providing efficient CNNs for Audio Tagging. We provide AudioSet pre-trained models ready for downstream training and extraction of audio embeddings.
Wrapper to expose Kinect for Windows v2 API in Python
An opinionated list of awesome Python frameworks, libraries, software and resources.
😎 Awesome lists about all kinds of interesting topics
🪄 Create rich visualizations with AI
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2