NanoRAG: A Simple Implementation of a Retrieval-Augmented Generation System
RAG (Retrieval-Augmented Generation) is a widely used technique in NLP (Natural Language Processing) and a highly popular topic in both academia and industry. However, existing RAG frameworks (such as LangChain) involve significant engineering overhead and a large number of configuration options, which can be challenging for beginners. The goal of this project is therefore to build a RAG system from scratch using the simplest and most classic techniques, providing code implementations for the key steps only.
The two most critical components of RAG are the Retriever and the Generator. The Retriever returns the top-n most relevant documents from the database for the input question and hands them over to the Generator. The Generator then produces the answer conditioned on the question and those top-n documents. For now, we focus on plain text only.
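As a sketch of this two-component structure (the names `Retriever`, `Generator`, `RAGPipeline`, `retrieve`, and `generate` below are illustrative, not this project's actual API):

```python
from dataclasses import dataclass
from typing import List, Protocol


class Retriever(Protocol):
    def retrieve(self, question: str, top_n: int) -> List[str]:
        """Return the top-n most relevant documents for the question."""
        ...


class Generator(Protocol):
    def generate(self, question: str, docs: List[str]) -> str:
        """Produce an answer conditioned on the question and retrieved docs."""
        ...


@dataclass
class RAGPipeline:
    retriever: Retriever
    generator: Generator
    top_n: int = 5

    def answer(self, question: str) -> str:
        # Retrieve supporting documents, then generate an answer from them.
        docs = self.retriever.retrieve(question, self.top_n)
        return self.generator.generate(question, docs)
```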
There is a large body of work on retrieval. The retriever in this project uses the most basic approach, DPR (Dense Passage Retrieval), and the implementation follows that work.
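A minimal DPR retrieval sketch, assuming the pretrained DPR encoders published on Hugging Face (`facebook/dpr-question_encoder-single-nq-base` and `facebook/dpr-ctx_encoder-single-nq-base`) and the `transformers` library; a real system would precompute and index the passage embeddings (e.g. with FAISS) rather than encode them per query:

```python
from typing import List

import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

ctx_tokenizer = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_encoder = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")


@torch.no_grad()
def retrieve(question: str, passages: List[str], top_n: int = 5) -> List[str]:
    # Encode all passages into dense vectors (precomputed/indexed in practice).
    ctx_inputs = ctx_tokenizer(passages, padding=True, truncation=True, return_tensors="pt")
    ctx_emb = ctx_encoder(**ctx_inputs).pooler_output          # (num_passages, hidden)

    # Encode the question with the separate question encoder.
    q_inputs = q_tokenizer(question, return_tensors="pt")
    q_emb = q_encoder(**q_inputs).pooler_output                # (1, hidden)

    # DPR scores relevance as the dot product of question and passage embeddings.
    scores = torch.matmul(q_emb, ctx_emb.T).squeeze(0)         # (num_passages,)
    top_idx = torch.topk(scores, k=min(top_n, len(passages))).indices
    return [passages[i] for i in top_idx]
```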
With the emergence of LLMs, decoder-only generators have become the norm. This project implements generators based on OpenAI's GPT (closed source) and LLaMA (open source).
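A hedged sketch of a GPT-based generator using the OpenAI Python SDK: the retrieved documents are concatenated into the prompt before calling the chat completions API. The prompt template and the model name `gpt-4o-mini` are illustrative choices, not necessarily what this project uses:

```python
from typing import List

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate(question: str, docs: List[str], model: str = "gpt-4o-mini") -> str:
    # Concatenate the top-n retrieved documents into a single context block.
    context = "\n\n".join(f"[Doc {i + 1}] {doc}" for i, doc in enumerate(docs))
    prompt = (
        "Answer the question using only the documents below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```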
- Retriever implementation.
- Concat docs with Retriever inference.
- Generator implementation.
- Update to a recent wiki dump.
- Multi-modal input retrieval.