Skip to content

waetr/KET-RAG

Repository files navigation

KET-RAG: Knowledge-Enhanced Text Retrieval Augmented Generation

KET-RAG is a powerful and flexible framework for retrieval-augmented generation (RAG) enhanced with knowledge graphs. This project allows for structured document indexing and efficient LLM-based answer generation.

Overview

KET-RAG balances retrieval quality and efficiency with a multi-granular indexing framework consisting of:

  • Knowledge Graph Skeleton (SkeletonRAG): Selects key text chunks via PageRank and extracts structured knowledge using LLMs.
  • Text-Keyword Bipartite Graph (KeywordRAG): Links keywords to text chunks, mimicking knowledge graph relationships with minimal cost.

During retrieval, KET-RAG integrates information from both entity and keyword channels, enabling efficient and high-quality LLM-based answer generation. Experiments show that KET-RAG significantly reduces indexing costs while improving retrieval and generation quality, making it a practical solution for large-scale RAG applications.

Prerequisites

Ensure you have Python >=3.10 installed.

Installation

Install dependencies using Poetry:

pip install poetry
poetry install

Setup

Using the folder ragtest-musique as an example, follow these steps:

1. Initialize the Project

python -m graphrag init --root ragtest-musique/

This command sets up the necessary file structure and configurations.

2. Tune Prompts

python -m graphrag prompt-tune --root ragtest-musique/ --config ragtest-musique/settings.yaml --discover-entity-types

Adjust prompts for better retrieval.

3. Build the Index

Before running this step, modify settings.yaml to set the appropriate parameters as needed, based on our paper.

python -m graphrag index --root ragtest-musique/

This process creates an indexed structure for retrieval.

Generate Context and Answers

Environment Setup

Before executing the scripts, set up your API key:

export GRAPHRAG_API_KEY=your_api_key_here

1. Generate Context

To generate a single context file:

python indexing_sket/create_context.py ragtest-musique/ keyword 0.5

Parameters:

  • First argument: Root directory of the project
  • Second argument: Context-building strategy (text, keyword, or skeleton)
  • Third argument: Context threshold theta (range: 0.0-1.0)

2. Generate Answers

To generate answers for all context files in the output directory:

python indexing_sket/llm_answer.py ragtest-musique/

Acknowledgments

This project builds upon Microsoft's GraphRAG (version 0.4.1), licensed under the MIT License.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published