Stars
A foundation model for knowledge graph reasoning
An implementation of "M3DOCRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding" by Jaemin Cho, Debanjan Mahata, Ozan Irsoy, Yujie He, and Mohit Bansal (UNC Chape…
Graph Foundation Model for Retrieval Augmented Generation
A pure-Python utility to extract text and images from docx files.
GraphRAG-survey: A curated list of resources on graph-based retrieval-augmented generation.
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
A curated list of retrieval-augmented generation (RAG) in large language models
GraphRAG using Local LLMs - Features robust API and multiple apps for Indexing/Prompt Tuning/Query/Chat/Visualizing/Etc. This is meant to be the ultimate GraphRAG/KG local LLM app.
Official style files for papers submitted to venues of the Association for Computational Linguistics
🚀🚀 Train a 26M-parameter GPT completely from scratch in just 2 hours!
✨✨ Latest Advances on Multimodal Large Language Models
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Multimodal image + text captioning for 416k figures from arXiv. Uses CLIP + SciBERT + GPT-2 in an encoder-decoder architecture. CS224N final project.
Generating figures from research papers, using textual captions from the paper.
Awesome-RAG: A collection of typical RAG papers and systems.
RAG that intelligently adapts to your use case, data, and queries
A simple, easy-to-hack GraphRAG implementation
Standalone evaluation scripts and starter code for the ICDAR 2023 DUDE competition
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)
Official implementation of "Free Lunch: Frame-level Contrastive Learning with Text Perceiver for Robust Scene Text Recognition in Lightweight Models" in PyTorch.