Stars
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
Mosaic IT: Enhancing Instruction Tuning with Data Mosaics
FuxiaoLiu / awesome-Large-MultiModal-Hallucination
Forked from xieyuquanxx/awesome-Large-MultiModal-Hallucination. 😎 An up-to-date & curated list of awesome LMM hallucination papers, methods & resources.
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
[arXiv] Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes the Lead.
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Open-source datasets for paper "Fairness in Graph Mining: A Survey".
Open-source code for "Graph Neural Networks with Adaptive Frequency Response Filter".
Open-source code for the paper "EDITS: Modeling and Mitigating Data Bias for Graph Neural Networks".
Open-source Library PyGDebias: Graph Datasets and Fairness-Aware Graph Mining Algorithms
The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity".
An automatic MLLM hallucination detection framework
CoNLI: a plug-and-play framework for ungrounded hallucination detection and reduction
Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.
The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?"
Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"
The repository for a survey on image-text multimodal models.
A Survey on multimodal learning research.
Project for the paper "Instruction Tuning for Large Language Models: A Survey".
[EACL'23] COVID-VTS: Fact Extraction and Verification on Short Video Platforms