Multistate competing risk models in Python
Read the Docs
For details, read the JOSS paper
Hagai Rossman, Ayya Keshet, Malka Gorfine 2022
PyMSM
is a Python package for fitting competing risks and multistate models, with a simple API which allows user-defined model, predictions at a single or population sample level, statistical summaries and figures.
Features include:
- Fit a Competing risks Multistate model based on survival analysis (time-to-event) models.
- Deals with right censoring, competing events, recurrent events, left truncation, and time-dependent covariates.
- Run Monte-carlo simulations for paths emitted by the trained model and extract various summary statistics and plots.
- Load or configure a pre-defined model and run path simulations.
- Modularity and compatibility for different time-to-event models such as Survival Forests and other custom models.
pip install pymsm
Requires Python >=3.8.
Alternatively if you want to work with the latest development version, you can also directly install it from GitHub. To do that, you will need to
- Clone the repository to
REPO_FOLDER
(choose your own location) - Got to the location of the repository
cd $REPO_FOLDER
(Note! Not the pymsm folder, but the one above) - Run
pip install -e pymsm
. This imports the package for your Python interpreter
# Load data (See Rotterdam example for full details)
from pymsm.datasets import prep_rotterdam
dataset, states_labels = prep_rotterdam()
# Define terminal states
terminal_states = [3]
#Init MultistateModel
from pymsm.multi_state_competing_risks_model import MultiStateModel
multi_state_model = MultiStateModel(dataset,terminal_states)
# Fit model to data
multi_state_model.fit()
# Run Monte-Carlo simulation and sample paths
mcs = multi_state_model.run_monte_carlo_simulation(
sample_covariates = dataset[0].covariates.values,
origin_state = 1,
current_time = 0,
max_transitions = 2,
n_random_samples = 10,
print_paths=True)
stateDiagram-v2
s1 : (1) Primary surgery
s2 : (2) Disease recurrence
s3 : (3) Death
s1 --> s2: 1518
s1 --> s3: 195
s2 --> s3: 1077
Multi-state data are common, and could be used to describe trajectories in diverse health applications; such as describing a patient's progression through disease stages or a patient’s path through different hospitalization states. When faced with such data, a researcher or clinician might seek to characterize the possible transitions between states, their occurrence probabilities, or to predict the trajectory of future patients - all conditioned on various baseline and time-varying individual covariates. By fitting a multi-state model, we can learn the hazard for each specific transition, which would later be used to predict future paths. Predicting paths could be used at a single patient level, for example predict how long until a cancer patient will be relapse-free given his current health status, or at what probability will a patient end a trajectory at any of the possible states; and at the population level, for example predicting how many patients which arrive at the emergency-room will need to be admitted, given their covariates.
PyMSM is a Python package for fitting multi-state models, with a simple API which allows user-defined models, predictions at a single or population sample level, and statistical summaries and figures. Features of this software include:
- Fitting a Competing risks Multistate model based on various types of survival analysis (time-to-event) such as Cox proportional hazards models or machine learning models, while taking into account right censoring, competing events, recurrent events, left truncation, and time-dependent covariates.
- Running Monte-carlo simulations (in parallel computation) for paths emitted by the trained model and extracting various summary statistics and plots.
- Loading or configuring a pre-defined model and generating simulated data in terms of random paths using model parameters, which could be highly useful as a research tool.
- Modularity and compatibility for different time-to-event models such as Survival Forests and other custom ML models provided by the user.
The package is designed to allow modular usage by both experienced researchers and non-expert users. In addition to fitting a multi-state model for a given data - PyMSM allows the user to simulate trajectories, thus creating a multi-state data-set, from a predefined model. This could be a valuable research tool - both for sharing sensitive simulated individual data and as a tool for any downstream task which needs individual trajectories.
If you found this library useful in academic research, please cite:
@article{Rossman2022, doi = {10.21105/joss.04566},
url = {https://doi.org/10.21105/joss.04566},
year = {2022},
author = {Hagai Rossman and Ayya Keshet and Malka Gorfine},
title = {PyMSM: Python package for Competing Risks and Multi-State models for Survival Data},
journal = {Journal of Open Source Software} }
Also consider starring the project on GitHub
This project is based on methods first introduced by the authors of Roimi et. al. 2021.
Original R code by Jonathan Somer, Asaf Ben Arie, Rom Gutman, Uri Shalit & Malka Gorfine available here.
Also see Rossman & Meir et. al. 2021 for an application of this model on COVID-19 hospitalizations data.