Beijing Jiaotong University
- Beijing, China
Starred repositories
Extreme Image Compression using Fine-tuned VQGAN Models (DCC 2024)
TensorFlow implementation of EGIC (EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation, ECCV 2024)
Official implementation of "Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model" (WACV 2024).
An open-source implementation for fine-tuning the Qwen2-VL series by Alibaba Cloud.
✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
Awesome-LLM: a curated list of Large Language Models
PyTorch implementation of the paper "You Can Mask More For Extremely Low-Bitrate Image Compression".
Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
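A collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.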
✨✨Latest Advances on Multimodal Large Language Models
Vector (and Scalar) Quantization, in PyTorch
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
Unofficial ArcFace implementation in TensorFlow 2.0+ (ResNet50, MobileNetV2). "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", published in CVPR 2019. With Colab.
Research code for pixel-based encoders of language (PIXEL)
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
(TPAMI 2024) A Survey on Open Vocabulary Learning
Official repository for the ICCV 2023 paper "Global Features are All You Need for Image Retrieval and Reranking"
MLCD & UNICOM : Large-Scale Visual Representation Model
All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
A PyTorch implementation of a continuously rate-adjustable learned image compression framework.
PyTorch implementation of the NeurIPS'22 paper "Selective compression learning of latent representations for variable-rate image compression"
A collection of tools for neural compression enthusiasts.