Implementation of Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS.
A generalized and unsupervised mixture variational model with a multi-armed deep neural network, to jointly infer the discrete type and continuous type-specific variability. This framework can be applied to analysis of both, uni-modal and multi-modal datasets. It outperforms comparable models in inferring interpretable discrete and continuous representations of cellular identity, and uncovers novel biological insights. MMIDAS can thus help researchers identify more robust cell types, study cell type-dependent continuous variability, interpret such latent factors in the feature domain, and study multi-modal datasets.
-
Allen Institute Patch-seq data
The electrophysiological features have been computed following the approach in cplAE_MET/preproc/data_proc_E.py
- Python >= 3.9
- PyTorch >= 2.0
- CUDA enabled computing device
Creat a new conda environment and activate it.
conda create -n mmidas python=3.9
conda activate mmidas
Clone the repository and change your working directory to the newly cloned repository's path. Then install all required packages listed in the requirement.txt
file.
cd <directory you want to place the repo>
git clone https://github.com/AllenInstitute/MMIDAS
cd MMIDAS
python -m pip install -r requirements.txt
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 cpuonly -c pytorch
Generally, PyTorch is compatible with NVIDIA GPUs that support CUDA, as CUDA provides the GPU acceleration process. There has been progress in making PyTorch compatible with Apple's chips, but for now the focus is on CUDA-supported NVIDIA GPUs.
Use nvidia-smi
command to get the installed CUDA version.
Select the appropriate PyTorch installation from here, matching your CUDA version.
# CUDA 12.1
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia
Work through the provided notebooks
in the following sequential:
- Data Preparation: Prepare the data file for training and testing (here for Mouse Smart-seq data)
- Training: Once the data is ready, train a MMIDAS model with 2 mixture variational arms.
- Evaluation: Test the trained model(s) and identify categorical variables representing cell types using the proposed pruning approach in the simplex.
- Clusterability: Compare the accuracy and Silhouette score of identified clusters to the suggested t-type.
- State trvaersal analysis: Explore the role of genes in encoding the continuous variability inferred by the model.