MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot

✅ This paper is accepted by The Web Conference (WWW) 2025!

💻 This is the official implementation for our accepted paper MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot.

Authors

Xuejiao Zhao*, Siyan Liu*, Su-Yin Yang, Chunyan Miao**

Nanyang Technological University | Tan Tock Seng Hospital | Woodlands Health

* Both authors contributed equally to the paper

** Corresponding author

🔥 News

[2025.03.24] MedRAG has recently drawn interest from media outlets and bloggers, including Medium, AI Era, and CSDN. Thanks for all the support; we are continuing to improve! 🙏
[2025.03.12] We released the official generated diagnostic knowledge graph for the DDXPlus dataset. Give it a try!
[2025.02.25] The Chinese Demo of MedRAG is now available on Bilibili, and the English Demo is available on YouTube.
[2025.02.10] MedRAG was reported by 'Quantum Heart' on RedNote.
[2025.02.04] We release the official implementation of MedRAG.
[2025.01.20] MedRAG has been accepted by WWW'25. Please check the latest paper version on ArXiv.

Overview

Figure 1: The overall framework of MedRAG.

MedRAG is designed to enhance Retrieval-Augmented Generation (RAG) models by integrating Knowledge Graph (KG)-elicited reasoning, specifically for the medical domain. It helps healthcare professionals generate diagnosis and treatment recommendations based on patient manifestations, improving diagnostic accuracy and reducing the risk of misdiagnosis, particularly for diseases with similar manifestations.

Key features of MedRAG include:

Knowledge Graph-Enhanced Reasoning: Integrates a diagnostic knowledge graph to improve the reasoning ability of the RAG model.
Accurate Diagnostic Support: Provides specific diagnostic insights and personalized treatment recommendations, even for complex or similar diseases.
Follow-Up Question Generation: Proactively generates relevant follow-up questions to clarify ambiguous patient information and enhance decision-making.
Evaluated on Real-World and Public Datasets: Demonstrated superior performance on the public DDXPlus dataset and a private chronic pain diagnostic dataset (CPDD) compared to existing RAG models.

Core Design of MedRAG: Knowledge Graph-Elicited Reasoning

The MedRAG approach addresses the following key challenges:

Knowledge Graph Construction: Uses hierarchical aggregation to build a disease knowledge graph that captures the complex relationships between diseases, categories, and their manifestations.
RAG-Based Reasoning: Combines EHR retrieval with diagnostic knowledge graph reasoning to enhance diagnostic accuracy.
Personalized Diagnostic Suggestions: Integrates multi-level information to provide personalized treatment and follow-up questions for doctors.
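
As a rough illustration of the first two points, the sketch below stores a toy hierarchy (category, subcategory, disease, manifestations) and ranks diseases by how many of a patient's manifestations they share. Everything here is hypothetical: the names `TOY_KG`, `flatten`, and `rank_diseases`, the category labels, and the toy feature sets are made up for illustration and are not clinical data. Plain set overlap stands in for whatever retrieval and aggregation `KG_Retrieve.py` and `main_MedRAG.py` actually perform.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy hierarchy (not clinical data):
# category -> subcategory -> disease -> set of manifestations.
TOY_KG: Dict[str, Dict[str, Dict[str, Set[str]]]] = {
    "back pain": {
        "lumbar spine disorders": {
            "lumbar canal stenosis": {"leg pain", "relief when sitting"},
            "sciatica": {"leg pain", "worse when sitting"},
        },
    },
}


def flatten(kg: Dict[str, Dict[str, Dict[str, Set[str]]]]) -> List[Tuple[str, str, str, Set[str]]]:
    """Return one (category, subcategory, disease, manifestations) row per disease."""
    rows = []
    for category, subcategories in kg.items():
        for subcategory, diseases in subcategories.items():
            for disease, manifestations in diseases.items():
                rows.append((category, subcategory, disease, manifestations))
    return rows


def rank_diseases(kg, patient: Set[str]) -> List[Tuple[str, int]]:
    """Rank diseases by the number of manifestations shared with the patient."""
    scored = [(disease, len(feats & patient)) for _, _, disease, feats in flatten(kg)]
    # Highest overlap first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


ranking = rank_diseases(TOY_KG, {"leg pain", "worse when sitting"})
print(ranking)
```

In this toy setup the shared manifestation alone cannot separate the two diseases; only the sitting-related feature does, which is the kind of distinction the knowledge graph is meant to surface.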

Dataset

The full MedRAG test set, including raw image data and annotations, can be downloaded from the links below. Due to the large size of the dataset, a lighter version is also available for quicker testing.

Download Full DDXPlus: A large-scale, synthesized EHR dataset widely recognized for offering complex, diverse medical diagnosis cases. It includes comprehensive patient data such as socio-demographic information, underlying diseases, symptoms, and antecedents.
CPDD: A private EHR dataset for chronic pain management from our partner hospital, Tan Tock Seng Hospital in Singapore.

Usage

To use MedRAG, follow these steps:

Set up the repository and dependencies
Clone this repository to your local machine and install the dependencies listed in requirements.txt:
```
git clone https://github.com/SNOWTEAM2023/MedRAG.git

cd MedRAG
pip install -r requirements.txt
```
Modify Tokens
To use your own OpenAI and Hugging Face API tokens, replace the placeholders in authentication.py with your actual tokens. The relevant fields in the code have been left blank for this purpose.
```
# Replace with your OpenAI API token
api_key = "your_openai_api_token"   

# Replace with your Hugging Face API token
hf_token = "your_huggingface_api_token"
```
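
If you prefer not to write tokens into a source file, one possible alternative is to read them from environment variables. The sketch below is not how the repository does it; `load_token` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `HF_TOKEN` are assumptions you can rename freely.

```python
import os


def load_token(env_name: str, placeholder: str) -> str:
    """Return the token stored in the environment, or the placeholder if unset."""
    return os.environ.get(env_name, placeholder)


# Assumed environment variable names; adjust them to your own setup.
api_key = load_token("OPENAI_API_KEY", "your_openai_api_token")
hf_token = load_token("HF_TOKEN", "your_huggingface_api_token")
```

With this approach the tokens stay out of version control, and the placeholders make a missing variable easy to spot.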
Run the main.py script
Once the paths and tokens have been updated, run the main.py file to start the program:
```
python main.py
```

Experimental Results

Main results

Figure 1: Results of quantitative performance comparison.

Our proposed MedRAG achieved the best or second-best (with only one exception) performance across multiple metrics on all datasets. Accuracy on the $L3$ metric is the best indicator of MedRAG's performance, as higher specificity increases diagnostic difficulty. MedRAG outperformed the second-best scores on the CPDD and DDXPlus datasets.
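
To make the level-wise metrics concrete, here is a minimal sketch of per-level accuracy. It assumes that each diagnosis is a triple of labels ordered from least to most specific ($L1$, $L2$, $L3$) and that a prediction counts as correct at a level when its label at that level matches the gold label. Both the layout and the function `level_accuracy` are assumptions for illustration; the code in the `metrics` folder may define the metrics differently, and the labels below are made up.

```python
from typing import List, Sequence, Tuple

# Assumed layout: (L1 label, L2 label, L3 label), from least to most specific.
Label = Tuple[str, str, str]


def level_accuracy(preds: Sequence[Label], golds: Sequence[Label]) -> List[float]:
    """Return [L1, L2, L3] accuracy as the fraction of matching labels per level."""
    if len(preds) != len(golds) or not golds:
        raise ValueError("preds and golds must be non-empty and of equal length")
    return [
        sum(p[level] == g[level] for p, g in zip(preds, golds)) / len(golds)
        for level in range(3)
    ]


# Made-up labels, purely for illustration.
golds = [("A", "A1", "A1x"), ("A", "A2", "A2x"), ("B", "B1", "B1x"), ("B", "B1", "B1y")]
preds = [("A", "A1", "A1x"), ("A", "A1", "A1y"), ("B", "B1", "B1y"), ("A", "A1", "A1x")]

print(level_accuracy(preds, golds))  # [0.75, 0.5, 0.25]
```

Accuracy drops as the level becomes more specific in this toy example, which mirrors why the most specific level is the hardest one to score well on.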

Figure 2: Performance of MedRAG on different LLM backbones with and without KG-elicited reasoning.

We evaluate KG-elicited reasoning on different LLM backbones, including both open-source and closed-source models. The results demonstrate that including KG-elicited reasoning significantly enhances diagnostic accuracy across $L1$, $L2$, and $L3$ for all backbone LLMs, compared to the same models without it.

Additional Visualizations

Clustering result

The result of disease clustering in CPDD.

Diseases knowledge graph

The result of hierarchical aggregation in DDXPlus.

The result of hierarchical aggregation in CPDD.

Diagnostic differences augmentation

Diagnostic difference example.

While lumbar canal stenosis and sciatica share some similar features, the critical distinguishing factor lies in the response to sitting. In lumbar canal stenosis, features are typically alleviated when sitting, whereas in sciatica, sitting tends to exacerbate the discomfort.
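
The example above can be sketched in code: given candidate diseases that share some features, find the features unique to each candidate and turn any that have not yet been observed into follow-up questions. This is only an illustration under stated assumptions; `CANDIDATES`, `distinguishing_features`, and `follow_up_questions` are hypothetical names, the feature sets are simplified toy data, and MedRAG itself generates follow-up questions with an LLM guided by the knowledge graph rather than with a template like this.

```python
from typing import Dict, List, Set

# Simplified toy feature sets based on the example above (not clinical data).
CANDIDATES: Dict[str, Set[str]] = {
    "lumbar canal stenosis": {"leg pain", "relief when sitting"},
    "sciatica": {"leg pain", "worse when sitting"},
}


def distinguishing_features(candidates: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each disease to the features that no other candidate shares."""
    result = {}
    for name, feats in candidates.items():
        others = set().union(*(f for n, f in candidates.items() if n != name))
        result[name] = sorted(feats - others)
    return result


def follow_up_questions(candidates: Dict[str, Set[str]], observed: Set[str]) -> List[str]:
    """Ask about every distinguishing feature that has not been observed yet."""
    questions = []
    for name, feats in distinguishing_features(candidates).items():
        for feat in feats:
            if feat not in observed:
                questions.append(f"Does the patient report '{feat}'? (would suggest {name})")
    return questions


for question in follow_up_questions(CANDIDATES, {"leg pain"}):
    print(question)
```

Because "leg pain" is common to both candidates, it yields no question; only the sitting-related features do.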

📖 Citation

If you find our work useful, please consider citing our paper:

@inproceedings{zhao2025medrag,
  title={MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot},
  author={Zhao, Xuejiao and Liu, Siyan and Yang, Su-Yin and Miao, Chunyan},
  booktitle={THE WEB CONFERENCE 2025}
}
