NanoRAG: A Simple Implementation of a Retrieval-Augmented Generation System
RAG (Retrieval-Augmented Generation) is a widely used technique in NLP (Natural Language Processing) and a highly popular topic in both academia and industry. However, existing RAG frameworks (such as LangChain) involve significant engineering overhead and a large number of configuration options, which can be challenging for beginners. The goal of this project is therefore to build a RAG system from scratch using the simplest and most classic techniques, providing code implementations for the key steps only.
The two most critical components of RAG are the Retriever and the Generator. The Retriever returns the top-n most relevant documents from the database for the input question and hands them over to the Generator. The Generator then produces the answer conditioned on the question and those top-n documents. For now, we focus on plain text only.
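As a sketch of this two-component structure (the names `Retriever`, `Generator`, `RAGPipeline`, `retrieve`, and `generate` below are illustrative, not this project's actual API):

```python
from dataclasses import dataclass
from typing import List, Protocol


class Retriever(Protocol):
    def retrieve(self, question: str, top_n: int) -> List[str]:
        """Return the top-n most relevant documents for the question."""
        ...


class Generator(Protocol):
    def generate(self, question: str, docs: List[str]) -> str:
        """Produce an answer conditioned on the question and retrieved docs."""
        ...


@dataclass
class RAGPipeline:
    retriever: Retriever
    generator: Generator
    top_n: int = 5

    def answer(self, question: str) -> str:
        # Retrieve supporting documents, then generate an answer from them.
        docs = self.retriever.retrieve(question, self.top_n)
        return self.generator.generate(question, docs)
```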
There is a large body of work on retrieval. The retriever in this project uses the most basic approach, DPR (Dense Passage Retrieval), and the implementation follows that work.
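A minimal DPR retrieval sketch, assuming the pretrained DPR encoders published on Hugging Face (`facebook/dpr-question_encoder-single-nq-base` and `facebook/dpr-ctx_encoder-single-nq-base`) and the `transformers` library; a real system would precompute and index the passage embeddings (e.g. with FAISS) rather than encode them per query:

```python
from typing import List

import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

ctx_tokenizer = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_encoder = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")


@torch.no_grad()
def retrieve(question: str, passages: List[str], top_n: int = 5) -> List[str]:
    # Encode all passages into dense vectors (precomputed/indexed in practice).
    ctx_inputs = ctx_tokenizer(passages, padding=True, truncation=True, return_tensors="pt")
    ctx_emb = ctx_encoder(**ctx_inputs).pooler_output          # (num_passages, hidden)

    # Encode the question with the separate question encoder.
    q_inputs = q_tokenizer(question, return_tensors="pt")
    q_emb = q_encoder(**q_inputs).pooler_output                # (1, hidden)

    # DPR scores relevance as the dot product of question and passage embeddings.
    scores = torch.matmul(q_emb, ctx_emb.T).squeeze(0)         # (num_passages,)
    top_idx = torch.topk(scores, k=min(top_n, len(passages))).indices
    return [passages[i] for i in top_idx]
```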
With the emergence of LLMs, decoder-only generators have become the norm. This project implements generators based on OpenAI's GPT (closed source) and LLaMA (open source).
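A hedged sketch of a GPT-based generator using the OpenAI Python SDK: the retrieved documents are concatenated into the prompt before calling the chat completions API. The prompt template and the model name `gpt-4o-mini` are illustrative choices, not necessarily what this project uses:

```python
from typing import List

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate(question: str, docs: List[str], model: str = "gpt-4o-mini") -> str:
    # Concatenate the top-n retrieved documents into a single context block.
    context = "\n\n".join(f"[Doc {i + 1}] {doc}" for i, doc in enumerate(docs))
    prompt = (
        "Answer the question using only the documents below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```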
- Retriever implementation.
- Concat docs with Retriever inference.
- Generator implementation.
- Update to a recent wiki dump.
- Multi-modal input retrieval.