Healthi Subnet


Introduction

This repository hosts the source code for the Healthi subnet, which operates atop the Bittensor network. The primary goal of this subnet is to utilize AI models for predictive diagnostics based on electronic health records (EHRs).

In the rapidly advancing field of healthcare technology, AI integration is transforming preventive medicine, particularly through predictive diagnostics. The increasing availability of patient data, especially EHRs, presents a significant opportunity to leverage AI for predicting health outcomes. This subnet on the Bittensor network rewards miners based on their AI models' performance in clinical prediction tasks, such as disease forecasting using EHRs. Our network aims to utilize these high-performing AI models developed by miners to improve patient outcomes, enhance healthcare delivery, and promote personalized clinical risk management.

Quickstart

This repository requires Python 3.10 or higher and Ubuntu 22.04/Debian 12.
We recommend creating a Python virtual environment (venv); using conda may cause errors.
Installation (omit the first line if Bittensor is already installed):

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/opentensor/bittensor/master/scripts/install.sh)"
$ sudo apt update && sudo apt install jq npm python3.10-dev python3.10-venv git && sudo npm install pm2 -g && pm2 update
$ git clone https://github.com/Healthi-Labs/healthi-subnet.git
$ cd healthi-subnet
$ python3 -m venv .venv

If you are not familiar with Bittensor, you should first perform the following activities:

Subnet register

Mainnet:

btcli subnet register --netuid 34 --wallet.name {cold_wallet_name} --wallet.hotkey {hot_wallet_name}

Testnet:

btcli subnet register --netuid 133 --wallet.name {cold_wallet_name} --wallet.hotkey {hot_wallet_name} --subtensor.network test

Note

Validators need to establish an internet connection with the miner. This requires ensuring that the port specified in --axon.port is reachable on the virtual machine via the internet. This involves either opening the port on the firewall or configuring port forwarding.
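As a rough way to confirm that the port is open, the sketch below attempts a TCP connection to it. The function name `is_port_reachable` is illustrative and not part of this repository. Run it from a machine outside your network while the miner is running, otherwise the result says little about reachability from the internet.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (hypothetical address; use your server's public IP and --axon.port):
# is_port_reachable("203.0.113.10", 12345)
```

A result of `False` can mean the firewall blocks the port, port forwarding is missing, or nothing is listening on the port yet.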

If you want to run on testnet, set netuid to 133, and add --validator_min_stake 0.0 --subtensor.network test.

Run miner:

$ cd healthi-subnet
$ source .venv/bin/activate
$ bash scripts/run.sh \
--name healthi_miner \
--install_only 0 \
--max_memory_restart 10G \
--branch main \
--netuid 34 \
--profile miner \
--wallet.name {cold_wallet_name} \
--wallet.hotkey {hot_wallet_name} \
--axon.port 12345

Run validator (the validator updates automatically):

$ cd healthi-subnet
$ source .venv/bin/activate
$ bash scripts/run.sh \
--name healthi_validator \
--install_only 0 \
--max_memory_restart 5G \
--netuid 34 \
--profile validator \
--wallet.name {cold_wallet_name} \
--wallet.hotkey {hot_wallet_name}

To verify that your miner is responding to queries from validators, run the following command (if you have only recently started the miner, allow a few minutes):

cat ~/.pm2/logs/healthi-miner-out.log | grep "SUCCESS"
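If you prefer a count over raw log lines, the same filter can be sketched in Python. The helper name `count_matching_lines` is an assumption for illustration; the log path is the one used in the command above.

```python
from pathlib import Path


def count_matching_lines(log_path: str, needle: str = "SUCCESS") -> int:
    """Count the lines in log_path that contain needle (the same filter as grep)."""
    path = Path(log_path).expanduser()  # expands a leading "~"
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if needle in line)


# Example call:
# count_matching_lines("~/.pm2/logs/healthi-miner-out.log")
```

A count that stays at zero several minutes after startup suggests that validators cannot reach the miner.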

FAQ

How does rewarding work?

Miners are rewarded based on the accuracy of their predictions for future health conditions derived from analyses of electronic health record (EHR) sequences. The top 20% of miners receive significantly higher rewards than the rest.
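The exact reward formula is not described here. The sketch below only illustrates how the top 20% of miners could be identified from accuracy scores. The function `top_fraction`, the miner IDs, and the score values are all hypothetical.

```python
import math


def top_fraction(scores: dict[str, float], fraction: float = 0.2) -> list[str]:
    """Return the miner IDs whose scores rank in the top `fraction`, best first."""
    if not scores:
        return []
    # Round up so that at least one miner is always selected.
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical scores for ten miners; two of them fall in the top 20%.
example_scores = {f"miner_{i}": i / 10 for i in range(10)}
print(top_fraction(example_scores))
```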

What is the expected data input and output as a miner?

As a miner, your input will consist of sequences of Electronic Health Records (EHR) encoded with International Statistical Classification of Diseases and Related Health Problems (ICD-10) codes. In the following example, the patient visited the hospital twice, receiving two diagnoses each time:

Example Input:

[['D693', 'I10'], ['Z966', 'A047']]
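To make the nested-list structure concrete, the sketch below checks the shape of such an input and summarizes it. The function `summarize_ehr` and the regular expression are assumptions for illustration: the pattern is a loose match for ICD-10 codes written without a dot, not the validation used by the subnet.

```python
import re

# Loose pattern for dot-free ICD-10 codes such as "I10" or "D693" (an assumption).
ICD10_PATTERN = re.compile(r"^[A-Z][0-9][0-9A-Z]{1,5}$")


def summarize_ehr(sequence: list[list[str]]) -> dict:
    """Check that every code looks like ICD-10 and return simple counts."""
    for visit in sequence:
        for code in visit:
            if not ICD10_PATTERN.match(code):
                raise ValueError(f"unexpected code format: {code!r}")
    return {
        "visits": len(sequence),  # one inner list per hospital visit
        "codes_per_visit": [len(visit) for visit in sequence],
        "unique_codes": sorted({code for visit in sequence for code in visit}),
    }


example = [["D693", "I10"], ["Z966", "A047"]]
print(summarize_ehr(example))
```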

The current disease prediction task involves estimating the likelihood of developing each of the following 14 diseases within one year. The output should be an array or list of 14 probabilities, in the order listed below:

Hypertension
Diabetes
Asthma
Chronic Obstructive Pulmonary Disease
Atrial Fibrillation
Coronary Heart Disease
Stroke
Anxiety and Depression
Dementia
Myocardial Infarction
Chronic Kidney Disease
Thyroid Disorder
Heart Failure
Cancer

Example Output:

[0.0027342219837009907, 0.012263162061572075, 0.01795087940990925, 0.016055596992373466, 0.010267915204167366, 0.0002267731324536726, 0.02317667566239834, 0.39082783460617065, 0.017462262883782387, 0.033581722527742386, 0.014757075347006321, 0.03425902500748634, 0.015123098157346249, 0.028889883309602737]
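Because the output is positional, it is easy to misalign a probability with its disease. The sketch below pairs each value with its label and checks the length and value range. The helper `label_predictions` is illustrative and not part of the repository; the disease order is taken from the list above.

```python
DISEASES = [
    "Hypertension",
    "Diabetes",
    "Asthma",
    "Chronic Obstructive Pulmonary Disease",
    "Atrial Fibrillation",
    "Coronary Heart Disease",
    "Stroke",
    "Anxiety and Depression",
    "Dementia",
    "Myocardial Infarction",
    "Chronic Kidney Disease",
    "Thyroid Disorder",
    "Heart Failure",
    "Cancer",
]


def label_predictions(probabilities: list[float]) -> dict[str, float]:
    """Map a 14-element probability list to disease names, validating it first."""
    if len(probabilities) != len(DISEASES):
        raise ValueError(f"expected {len(DISEASES)} values, got {len(probabilities)}")
    for value in probabilities:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"probability out of range: {value}")
    return dict(zip(DISEASES, probabilities))
```

Applied to the example output above, the largest value (about 0.39) is the eighth entry, which corresponds to Anxiety and Depression.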

Compute Requirements

The computational requirements for participating as a miner or validator in our subnet are minimal. Our subnet does not necessitate GPU capabilities and runs effectively on a virtual private server (VPS) with 4 virtual CPUs and 16 GB RAM. Although miners are permitted to use GPU resources, achieving higher rewards within our subnet depends more on developing superior predictive models than on computational power.

What is the data source, and how do we prevent data exploitation?

Our data originates from authentic inpatient records, which are anonymized using Generative Adversarial Networks (GANs) to preserve the original data distributions while ensuring patient confidentiality. To prevent data exploitation and enhance security, our API continuously generates unique, synthetic electronic health record sequences for validators, protecting against replay attacks.

How can I train my model?

Our base model is a small Transformer model equipped with a customized tokenizer for electronic health record (EHR) data. We recommend miners train their model based on our tokenizer. Training data is available at this link. Miners are also encouraged to use their own sourced EHR data for training.
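For readers new to tokenizing EHR data, the toy sketch below shows the general idea of turning visit sequences into integer IDs. It is not the repository's tokenizer: the special tokens `[PAD]`, `[UNK]`, and `[SEP]`, and the functions `build_vocab` and `encode`, are assumptions chosen for illustration.

```python
def build_vocab(patients: list[list[list[str]]]) -> dict[str, int]:
    """Assign an integer ID to every code seen in the training patients."""
    vocab = {"[PAD]": 0, "[UNK]": 1, "[SEP]": 2}  # hypothetical special tokens
    for patient in patients:
        for visit in patient:
            for code in visit:
                vocab.setdefault(code, len(vocab))
    return vocab


def encode(patient: list[list[str]], vocab: dict[str, int]) -> list[int]:
    """Flatten one patient's visits into IDs, closing each visit with [SEP]."""
    ids: list[int] = []
    for visit in patient:
        # Codes not seen during vocabulary building map to [UNK].
        ids.extend(vocab.get(code, vocab["[UNK]"]) for code in visit)
        ids.append(vocab["[SEP]"])
    return ids


patients = [[["D693", "I10"], ["Z966", "A047"]]]
vocab = build_vocab(patients)
print(encode(patients[0], vocab))
```

Marking visit boundaries keeps the visit structure available to a sequence model. Whether the provided tokenizer does this is not stated here.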

Validator minimum staking requirements

Validators need to stake at least 10,000 TAO on the mainnet to query our data API. Testnet validators do not have access to the data API, but they can still acquire data locally for testing purposes. Locally obtained data carries significantly less weight than data from the API.

How is the synthetic EHR data generator built?

The synthetic EHR data generator is built following a method detailed in a publication, which can be found here. We utilized datasets from both the US and the UK. The US hospital data is sourced from MIMIC-IV. The UK data is sourced from THIN. Due to the sensitivity of health data, models trained on EHR data cannot be made publicly available. This restriction is in place to prevent the potential reconstruction of the original dataset by malicious parties, similar to attempts to reverse engineer the training sets of large language models. As an example, you can find our data generator trained on synthetic EHR data here.
