nbgao

Keep studying! Keep thinking!

14 followers · 46 following

Media Intelligence Laboratory (MIL@HDU)
Hangzhou, Zhejiang

Stars

nbgao / VLP

Visual-Language Pretraining

1 Updated Jul 21, 2021

MILVLG / rosita

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

Python 56 13 Updated Jun 13, 2023

alibaba / AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Python 2,039 303 Updated Mar 19, 2024

yuewang-cuhk / awesome-vision-language-pretraining-papers

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

1,152 104 Updated Aug 19, 2022

huggingface / accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 8,511 1,049 Updated Mar 21, 2025

microsoft / Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

Python 14,497 2,104 Updated Jul 24, 2024

AliaksandrSiarohin / first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation

Jupyter Notebook 14,769 3,249 Updated Nov 14, 2024

openai / DALL-E

PyTorch package for the discrete VAE used for DALL·E.

Python 10,831 1,936 Updated Jan 31, 2024

lucidrains / DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Python 5,605 637 Updated Feb 17, 2024

openai / CLIP

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 27,999 3,503 Updated Jul 23, 2024

CCYChongyanChen / VQA_AlgorithmDatasets

38 5 Updated Jan 20, 2023

Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 21,150 4,220 Updated Mar 18, 2025

google-research / vision_transformer

Jupyter Notebook 11,071 1,355 Updated Mar 6, 2025

nbgao / mt-captioning

Forked from MILVLG/mt-captioning

A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning

Python 1 Updated Sep 4, 2020

nbgao / bottom-up-attention.pytorch

Forked from MILVLG/bottom-up-attention.pytorch

A PyTorch reimplementation of bottom-up-attention models

Jupyter Notebook 1 Updated Sep 1, 2020

floodsung / Deep-Reasoning-Papers

Recent Papers including Neural Symbolic Reasoning, Logical Reasoning, Visual Reasoning, planning and any other topics connecting deep learning and reasoning

309 36 Updated May 30, 2022