The first ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue
OphGLM aims to enhance ophthalmic diagnostics by integrating visual and language models, improving human-computer interaction and clinical applicability. With the introduction of the FundusTuning-CN dataset, we hope to demonstrate promising advancements in fundus disease classification and interactive capabilities, paving the way for future developments in this field.
Constructing a fine-tuning dataset for large language models targeting specific diseases, from both basic-knowledge and dialogue perspectives:
Illustration of the Dynamic Label Pairing Strategy:
Base LLM and Pre-trained Model:
We provide sample data in this repository, including historical ophthalmology doctor-patient dialogues from 2010 to 2020 and a fine-tuning data sample in JSON format.
To build a fine-tuning dataset for LLMs targeting specific diseases, we recommend collecting data from two sources: foundational background knowledge and doctor-patient dialogues, reflecting a clinical application perspective. The main difficulty is that for specific diseases, especially rare ones, doctor-patient dialogue data is very scarce.
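As a point of reference, the snippet below builds a tiny two-record sample combining both sources (a background-knowledge instruction and a dialogue turn) and writes it to JSON. The `instruction`/`input`/`output` field names follow a common instruction-tuning convention and are only an assumption; the released fine-tuning sample may use a different schema.

```python
import json

# Hypothetical two-record fine-tuning sample: one background-knowledge instruction
# and one doctor-patient dialogue turn. The field names follow a common
# instruction-tuning convention, not the exact schema of the released sample.
samples = [
    {
        "instruction": "What fundus findings are typical of diabetic retinopathy?",
        "input": "",
        "output": "Microaneurysms, dot-and-blot hemorrhages, hard exudates, and "
                  "neovascularization in the proliferative stage.",
    },
    {
        "instruction": "Reply to the patient as an ophthalmologist.",
        "input": "Patient: My vision has been blurry for two weeks and I have diabetes.",
        "output": "Doctor: Blurry vision with diabetes warrants a dilated fundus examination "
                  "to check for diabetic retinopathy, along with a review of your recent "
                  "blood glucose control.",
    },
]

with open("finetuning_sample.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```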
Step 1: Constructing the Classification Model
Leveraging the ODIR5K Fundus Image Dataset
- Selected images for Diabetic Retinopathy (DR), Age-related Macular Degeneration (AMD), Glaucoma, Myopia, and Cataracts from the ODIR5K dataset.
Employing ConvNext as Image Encoder
- Used ConvNext for image encoding, pretrained on a multi-disease classification task (see the pretraining sketch below).
Link: ODIR5K
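A minimal sketch of the Step 1 pretraining stage, assuming PyTorch with torchvision's ConvNeXt-Tiny and a five-way multi-label head for the selected fundus diseases; the data loading, label layout, and hyperparameters are placeholders rather than the released training code.

```python
import torch
import torch.nn as nn
from torchvision.models import convnext_tiny, ConvNeXt_Tiny_Weights

NUM_DISEASES = 5  # DR, AMD, glaucoma, myopia, cataract

# Start from ImageNet weights and replace the classification head.
model = convnext_tiny(weights=ConvNeXt_Tiny_Weights.IMAGENET1K_V1)
model.classifier[2] = nn.Linear(model.classifier[2].in_features, NUM_DISEASES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()  # ODIR5K images can carry multiple disease labels

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One multi-label classification step on a batch of fundus images."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)            # (B, NUM_DISEASES)
    loss = criterion(logits, labels)  # labels: (B, NUM_DISEASES) multi-hot floats
    loss.backward()
    optimizer.step()
    return loss.item()
```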
Step 2: Collecting and Building LLM Fine-tuning Datasets
Fundus Instruction Set
- Gathered information from web data and knowledge graphs, categorized into five subsets:
- Visual Diagnostic Instructions
- Causes and Symptoms
- Diagnosis and Examination
- Treatment and Prevention
- Prognosis and Lifestyle
Fundus Conversation Set
- Assembled fundus-related conversations, covering scenarios with both rich and limited ophthalmic knowledge (a subset-merging sketch follows below).
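A small merging sketch, assuming each subset is stored as its own JSON file of instruction records; the file names below are hypothetical and the released FundusTuning-CN layout may differ.

```python
import json
from pathlib import Path

# Hypothetical file layout; the released dataset may use different names.
SUBSETS = [
    "visual_diagnostic_instructions.json",
    "causes_and_symptoms.json",
    "diagnosis_and_examination.json",
    "treatment_and_prevention.json",
    "prognosis_and_lifestyle.json",
    "fundus_conversations.json",
]

def build_finetuning_set(data_dir: str, out_path: str) -> None:
    """Merge the instruction and conversation subsets into one fine-tuning file."""
    records = []
    for name in SUBSETS:
        with open(Path(data_dir) / name, encoding="utf-8") as f:
            records.extend(json.load(f))
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```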
Step 3: OphGLM Architecture
Components
- Includes an Image Encoder, Text Encoder, Fusion Module, and a Large Language Model (LLM).
Encoders and LLM Details
- Used BERT as the text encoder, ConvNext as the image encoder, and ChatGLM-6B as the LLM.
OphGLM Fine-tuning Process
Pretraining the Image Encoder
- Pretrained the image encoder on a multi-disease classification task.
Tuning the Fusion Module
- Trained the fusion module on a visual question-answering task, restricting parameter updates to this module.
Fine-tuning the LLM
- Applied supervised fine-tuning to the LLM using image-text and plain-text data to enhance multimodal comprehension (a structural sketch follows below).
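A structural sketch of how these components could fit together, assuming PyTorch and Hugging Face `transformers`. The fusion design (cross-attention plus a projection into the LLM embedding space), the Chinese BERT checkpoint, and the dimensions are illustrative assumptions, not the released implementation; only the frozen-encoder, fusion-only update of the second stage is taken from the description above.

```python
import torch
import torch.nn as nn
from torchvision.models import convnext_tiny
from transformers import AutoModel

class FusionModule(nn.Module):
    """Illustrative fusion: cross-attend text tokens over image tokens, then project to the LLM width."""
    def __init__(self, img_dim=768, txt_dim=768, llm_dim=4096):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, txt_dim)
        self.cross_attn = nn.MultiheadAttention(txt_dim, num_heads=8, batch_first=True)
        self.to_llm = nn.Linear(txt_dim, llm_dim)

    def forward(self, img_feats, txt_feats):
        # img_feats: (B, N_img, img_dim) flattened ConvNext feature map; txt_feats: (B, N_txt, txt_dim)
        img_tokens = self.img_proj(img_feats)
        fused, _ = self.cross_attn(query=txt_feats, key=img_tokens, value=img_tokens)
        return self.to_llm(fused)  # (B, N_txt, llm_dim), fed to the LLM alongside text embeddings

# Components named above; the checkpoints here are assumptions.
image_encoder = convnext_tiny(weights=None).features           # fundus image encoder backbone
text_encoder = AutoModel.from_pretrained("bert-base-chinese")  # BERT text encoder
llm = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
fusion = FusionModule()

# "Tuning the Fusion Module" stage: freeze everything except the fusion module.
for module in (image_encoder, text_encoder, llm):
    for p in module.parameters():
        p.requires_grad = False
optimizer = torch.optim.AdamW(fusion.parameters(), lr=1e-4)
```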
2024.9.30 The core code and sample data have been uploaded! 🚩