KET-RAG is a framework for retrieval-augmented generation (RAG) enhanced with knowledge graphs. It supports structured document indexing and efficient LLM-based answer generation.
KET-RAG balances retrieval quality and efficiency with a multi-granular indexing framework consisting of:
- Knowledge Graph Skeleton (SkeletonRAG): Selects key text chunks via PageRank and extracts structured knowledge using LLMs.
- Text-Keyword Bipartite Graph (KeywordRAG): Links keywords to text chunks, mimicking knowledge graph relationships with minimal cost.
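To illustrate the PageRank-based selection in SkeletonRAG, here is a minimal sketch in plain Python. It is not the project's implementation: the graph that PageRank runs on (a toy chunk-to-chunk adjacency map here), the `budget` parameter, and the function names are all assumptions made for illustration.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power-iteration PageRank over a graph given as {node: [neighbors]}."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            neighbors = adjacency[node]
            if not neighbors:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
                continue
            share = damping * rank[node] / len(neighbors)
            for neighbor in neighbors:
                new_rank[neighbor] += share
        rank = new_rank
    return rank


def select_skeleton_chunks(adjacency, budget):
    """Return the top-ranked chunk ids; `budget` is a hypothetical fraction of chunks to keep."""
    rank = pagerank(adjacency)
    k = max(1, round(budget * len(adjacency)))
    return sorted(rank, key=rank.get, reverse=True)[:k]


# Toy graph (assumed): chunk c0 is linked to every other chunk.
chunk_graph = {
    "c0": ["c1", "c2", "c3"],
    "c1": ["c0"],
    "c2": ["c0"],
    "c3": ["c0"],
}
selected = select_skeleton_chunks(chunk_graph, budget=0.25)
print(selected)
```

In this sketch only the selected chunks would be passed to the LLM for knowledge extraction, which is where the indexing cost savings would come from.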
During retrieval, KET-RAG integrates information from both entity and keyword channels, enabling efficient and high-quality LLM-based answer generation. Experiments show that KET-RAG significantly reduces indexing costs while improving retrieval and generation quality, making it a practical solution for large-scale RAG applications.
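The keyword channel can be pictured with a small sketch. The code below builds a keyword-to-chunk bipartite index and ranks chunks by how many query keywords they share. The tokenizer, the stopword list, the scoring rule, and all function names are assumptions for illustration only; the project's actual keyword extraction and the merging with the entity channel are not shown here.

```python
from collections import defaultdict

# Tiny illustrative stopword list (assumed, not the project's).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "and", "to", "by"}


def extract_keywords(text):
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def build_bipartite_index(chunks):
    """Map each keyword to the set of chunk ids that contain it."""
    index = defaultdict(set)
    for chunk_id, text in chunks.items():
        for keyword in extract_keywords(text):
            index[keyword].add(chunk_id)
    return index


def retrieve(index, query, top_k=2):
    """Rank chunks by the number of query keywords they share."""
    scores = defaultdict(int)
    for keyword in extract_keywords(query):
        for chunk_id in index.get(keyword, ()):
            scores[chunk_id] += 1
    # Sort by descending score, then by chunk id for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_k]


chunks = {
    "c0": "Marie Curie was born in Warsaw.",
    "c1": "Warsaw is the capital of Poland.",
    "c2": "Pierre Curie studied crystals.",
}
index = build_bipartite_index(chunks)
hits = retrieve(index, "Where was Marie Curie born?")
print(hits)
```

No LLM call is needed to build this index, which matches the "minimal cost" idea behind the keyword channel.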
Ensure you have Python >=3.10 installed.
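If you want to check the interpreter version from a script, a minimal sketch follows. The helper name `meets_requirement` is hypothetical and not part of this project.

```python
import sys

REQUIRED = (3, 10)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= required


if not meets_requirement():
    found = sys.version.split()[0]
    print(f"Python {REQUIRED[0]}.{REQUIRED[1]}+ is required, found {found}")
```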
Install dependencies using Poetry:
pip install poetry
poetry install
Using the folder ragtest-musique as an example, follow these steps:
python -m graphrag init --root ragtest-musique/
This command sets up the necessary file structure and configurations.
python -m graphrag prompt-tune --root ragtest-musique/ --config ragtest-musique/settings.yaml --discover-entity-types
This command tunes the prompts for better retrieval.
Before running the indexing step below, edit settings.yaml to set the appropriate parameters as needed, based on our paper.
python -m graphrag index --root ragtest-musique/
This process creates an indexed structure for retrieval.
Before executing the scripts, set up your API key:
export GRAPHRAG_API_KEY=your_api_key_here
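A script can fail early with a clear message when the key is missing. The following is a minimal sketch; `get_api_key` is a hypothetical helper, not a function provided by this project or by GraphRAG.

```python
import os


def get_api_key(env=os.environ):
    """Read GRAPHRAG_API_KEY from the environment, failing loudly if absent."""
    key = env.get("GRAPHRAG_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GRAPHRAG_API_KEY is not set; export it before running the scripts."
        )
    return key
```

Calling `get_api_key()` at the top of a wrapper script surfaces a missing key before any indexing or answer-generation work starts.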
To generate a single context file:
python indexing_sket/create_context.py ragtest-musique/ keyword 0.5
- First argument: Root directory of the project
- Second argument: Context-building strategy (text, keyword, or skeleton)
- Third argument: Context threshold theta (range: 0.0-1.0)
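To generate several context files in one go, the arguments above can be validated and assembled from Python. This is a sketch under assumptions: `build_context_command` is a hypothetical helper, the theta values are arbitrary examples, and the commands are only printed here rather than executed.

```python
STRATEGIES = ("text", "keyword", "skeleton")


def build_context_command(root, strategy, theta):
    """Validate the three arguments and return the command as an argument list."""
    if strategy not in STRATEGIES:
        raise ValueError(f"strategy must be one of {STRATEGIES}, got {strategy!r}")
    if not 0.0 <= theta <= 1.0:
        raise ValueError(f"theta must be within 0.0-1.0, got {theta}")
    return ["python", "indexing_sket/create_context.py", root, strategy, str(theta)]


# Example sweep over three arbitrary thresholds for the keyword strategy.
commands = [
    build_context_command("ragtest-musique/", "keyword", theta)
    for theta in (0.2, 0.5, 0.8)
]
for command in commands:
    print(" ".join(command))
    # To actually run it: subprocess.run(command, check=True)
```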
To generate answers for all context files in the output directory:
python indexing_sket/llm_answer.py ragtest-musique/
This project builds upon Microsoft's GraphRAG (version 0.4.1), licensed under the MIT License.