Medperf is an open benchmarking platform for medical artificial intelligence using Federated Evaluation.
Inside this repo you can find all important pieces for running MedPerf. In its current state, it includes:
-
Backend server implemented in django. Can be found inside the
server
folder -
Command Line Interface for interacting with the server. Can be found inside the
cli
folder.
In order to run MedPerf locally, you must host the server in your machine, and install the CLI.
-
MedPerf has some dependencies that must be installed by the user before being able to run. This are mlcube and the required runners (right now there's docker and singularity runners). Depending on the runner you're going to use, you also need to download the runner engine. For this demo, we will be using Docker, so make sure to get the Docker Engine
pip install mlcube mlcube-docker mlcube-singularity
-
To host the server, please follow the instructions inside the
server/README.md
file. -
To install the CLI, please follow the instructions inside the
cli/README.md
file.
The server comes with prepared users and cubes for demonstration purposes. A toy benchmark was created beforehand for benchmarking XRay models. To execute it you need to:
-
The toy benchmark uses the TorchXRayVision library behind the curtain for both data preparation and model implementations. To run the benchmark, you need to have a compatible dataset. The supported dataset formats are:
- RSNA_Pneumonia
- CheX
- NIH
- NIH_Google
- PC
- COVID19
- SIIM_Pneumothorax
- VinBrain
- NLMTB
As an example, we're going to use the CheXpert Dataset for the rest of this guide. You can get it here. Even though you could use any version of the dataset, we're going to be using the downsample version for this demo. Once you retrieve it, keep track of where it is located on your system. For this demonstration, we're going to assume the dataset was unpacked to this location:
~/CheXpert-v1.0-small
We're going to be using the validation split
-
If you followed the server hosting instructions, then your instance of the server already has some toy users to play with. The CLI needs to be authenticated with a user to be able to execute commands and interact with the server. For this, you can run:
medperf login
And provide
testdataowner
as user andtest
as password. You only need to authenticate once. All following commands will be authenticated with that user. -
Benchmarks will usually require a data owner to generate a new version of the dataset that has been preprocessed for a specific benchmark. The command to do that has the following structure
medperf dataset create -b <BENCHMARK_UID> -d <PATH_TO_DATASET> -l <PATH_TO_LABELS>
for the CheXpert dataset, this would be the command to execute:
medperf dataset create -b 1 -d ~/CheXpert-v1.0-small -l ~/CheXpert-v1.0-small
Where we're executing the benchmark with UID
1
, since is the first and only benchmark in the server. By doing this, the CLI retrieves the data preparation cube from the benchmark and processes the raw dataset. You will be prompted for additional information and confirmations for the dataset to be prepared and registered onto the server. -
Once the dataset is prepared and registered, you can execute the benchmark with a given model mlcube. The command to do this has the following structure
medperf run -b <BENCHMARK_UID> -d <DATA_UID> -m <MODEL_UID>
For this demonstration, you would execute the following command:
medperf run -b 1 -d 63a -m 2
Given that the prepared dataset was assigned the UID of 63a. You can find out what UID your prepared dataset has with the following command:
medperf dataset ls
Additional models have been provided to the benchmark, this is the list of models you can execute:
- 2: CheXpert DenseNet Model
- 4: ResNet Model
- 5: NIH DenseNet Model
During model execution, you will be asked for confirmation of uploading the metrics results to the server.
A test.sh
script is provided for automatically running the whole demo on a public mock dataset.
- It is assumed that the
medperf
command is already installed (See instructions oncli/README.md
) and that all dependencies for the server are also installed (See instructions onserver/README.md
). mlcube
command is also required (See instructions oncli/README.md
)- The docker engine must be running
- A connection to internet is required for retrieving the demo dataset and mlcubes
Once all the requirements are met, running sh test.sh
will:
- cleanup any leftover medperf-related files (WARNING! Running this will delete the medperf workspace, along with prepared datasets, cubes and results!)
- Instantiate and seed the server using
server/seed.py
- Retrieve the demo dataset
- Run the CLI demo using
cli/cli.sh
- cleanup temporary files