National University of Singapore
Singapore
http://www.comp.nus.edu.sg/~xiaojun/
Stars
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
NexP: A Beginner-Friendly Toolkit for Designing and Conducting Controlled Experiments
Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos
A list of papers about data quality in Large Language Models (LLMs)
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Learn to Watch TV: Multimodal Dialogue Understanding and Response Prediction
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | A Chinese-English bilingual multimodal large model series built on the CPM base model
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
DSIR large-scale data selection framework for language model training
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
🦜🔗 Build context-aware reasoning applications
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Hadoop MapReduce training of modified Kneser-Ney smoothed language models
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
Fengshenbang-LM (封神榜大模型) is an open-source system of large models led by the Cognitive Computing and Natural Language Research Center at IDEA, serving as infrastructure for Chinese AIGC and cognitive intelligence.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
This is a collection of our NAS and Vision Transformer work.
Using VideoBERT to tackle video prediction
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'2022: BBTv2: Towards a Gradient-Free Future with Large Language Models