sciPENN (single cell imputation Prediction Embedding Neural Network) is a joint deep learning computational tool that is useful for analyses of single-cell RNA-seq data. The sciPENN method's repository can be found here.
This repository is dedicated to providing the code used to perform all evaluations in the sciPENN paper. It includes code used to generate results for sciPENN, and for the competing methods:
- totalVI
- Seurat 4
It is recommended that the user proceed as follows.
- Clone this repository to their local machine.
- Download the data from Box.
- Install all necessary packages.
- Run all evaluation notebooks.
- Run notebooks to generate all figures.
Clone this repository to your local machine using the standard procedure.
Download the data from Box, and place them into the currently empty data folder.
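Before moving on, it can help to confirm that the data folder is no longer empty. The snippet below is a minimal sketch, not part of the repository: `data_is_populated` is a hypothetical helper, and it assumes the command is run from the root of the cloned repository, where the `data` folder lives. It checks only that the folder contains something, not that the right files are present.

```shell
# Hypothetical helper (not part of the repository): succeeds only if the
# given directory exists and contains at least one entry.
data_is_populated() {
    dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Assumes the current directory is the root of the cloned repository.
if data_is_populated data; then
    echo "data folder is populated"
else
    echo "data folder is missing or empty; download the files from Box first"
fi
```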
The user will need to install Anaconda and two conda environments containing many dependencies.
First, install Anaconda if you do not already have it, so that you can access conda commands in the terminal.
Next, use scipenn_env.yml to set up the "scipenn_env" environment. This environment is needed for all python notebooks. Also, you will need to run an extra command in order to make this conda environment accessible from jupyter.
To do this, cd into the cloned "sciPENN_codes" repository. Once in this directory, run the following two commands.
$ conda env create -f scipenn_env.yml
$ python -m ipykernel install --user --name=sciPENN_env
The user will also need to install the "r40seurat40" environment, which is needed for all R notebooks. This installation process is much slower and more complicated, since setting up R environments is generally trickier. This environment can be set up by running the following commands one by one.
$ conda create --name r40seurat40
$ conda activate r40seurat40
$ conda config --add channels conda-forge
$ conda config --set channel_priority strict
$ conda install -c conda-forge r-base
$ conda install -c conda-forge python-igraph
$ conda install -c conda-forge r-hdf5r
$ conda install jupyter
$ R
> install.packages('Seurat')
> install.packages("remotes")
> remotes::install_github("mojaveazure/seurat-disk")
> install.packages('reticulate')
> install.packages('IRkernel')
> IRkernel::installspec(name = 'r40seurat40', displayname = 'r40seurat40')
> quit()
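Once both environments are set up, one way to confirm that jupyter can see the two kernels is to inspect the output of `jupyter kernelspec list`. The following is a minimal sketch under stated assumptions: `kernel_listed` is a hypothetical helper, not something the repository provides. The comparison is case-insensitive because the install command above registers the Python kernel as `sciPENN_env` while the rest of this document refers to it as `scipenn_env`.

```shell
# Hypothetical helper (not part of the repository): reads the output of
# `jupyter kernelspec list` on stdin and succeeds if a kernel with the
# given name appears in the first column (case-insensitive match).
kernel_listed() {
    awk -v want="$1" 'tolower($1) == tolower(want) { found = 1 } END { exit !found }'
}

# Only query jupyter if it is available in the active environment.
if command -v jupyter >/dev/null 2>&1; then
    for name in scipenn_env r40seurat40; do
        if jupyter kernelspec list | kernel_listed "$name"; then
            echo "kernel $name is registered"
        else
            echo "kernel $name was not found"
        fi
    done
else
    echo "jupyter is not on PATH; activate an environment that provides it"
fi
```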
Next, it is recommended that the user run all of the evaluation notebooks. Before opening jupyter to run the notebooks, the user should activate an environment that has jupyter installed; either scipenn_env or r40seurat40 will work. The following command will activate the scipenn_env environment.
$ conda activate scipenn_env
Then, open jupyter. The user can use either jupyter notebook or jupyter lab. The following command will open jupyter notebook.
$ jupyter notebook
It is recommended that the user first run the sciPENN notebooks. Simply open each of the following notebooks in jupyter, set the active kernel in jupyter to "scipenn_env", and then run all cells. Repeat this for every notebook listed below.
- pbmc_to_malt sciPENN.ipynb
- Monocyte sciPENN.ipynb
- PBMC_to_H1N1 sciPENN.ipynb
- PBMC_to_H1N1 sciPENN - Runtime.ipynb
- PBMC_to_PBMC sciPENN.ipynb
- Covid_to_Covid sciPENN_Integrated.ipynb.ipynb
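As an alternative to opening each notebook by hand, the notebooks can in principle be executed from the terminal with `jupyter nbconvert`. The sketch below is an assumption about how one might do this, not the procedure used in the paper: `run_notebook` and the `DRY_RUN` variable are hypothetical names, and the loop assumes it is run from the directory that contains the notebooks. It is shown in dry-run mode, which only prints what would be executed.

```shell
# Hypothetical helper (not part of the repository): execute one notebook in
# place with a named kernel. With DRY_RUN=1 it only reports what it would do.
run_notebook() {
    kernel="$1"
    notebook="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'would execute "%s" with kernel %s\n' "$notebook" "$kernel"
        return 0
    fi
    # timeout=-1 disables the per-cell time limit, since evaluations may run long.
    jupyter nbconvert --to notebook --execute --inplace \
        --ExecutePreprocessor.timeout=-1 \
        --ExecutePreprocessor.kernel_name="$kernel" "$notebook"
}

# Dry run over two of the sciPENN notebooks listed above.
DRY_RUN=1
for nb in "Monocyte sciPENN.ipynb" "PBMC_to_PBMC sciPENN.ipynb"; do
    run_notebook scipenn_env "$nb"
done
```

Setting `DRY_RUN=0` would run the notebooks for real. The same helper could be called with `r40seurat40` as the kernel name for the R notebooks.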
Next, it is recommended that the user run all notebooks that evaluate totalVI. Simply open each of the following notebooks in jupyter, set the active kernel in jupyter to "scipenn_env", and then run all cells. Repeat this for every notebook listed below.
- PBMC_to_Malt TotalVI.ipynb
- Monocyte TotalVI.ipynb
- PBMC_to_H1N1 TotalVI.ipynb
- PBMC_to_H1N1 TotalVI - Runtime.ipynb
- PBMC_to_H1N1 TotalVI_Quantiles.ipynb
- PBMC_to_PBMC TotalVI.ipynb
- Covid_to_Covid TotalVI_Integrated.ipynb
Lastly, it is recommended that the user run the Seurat 4 notebooks. Simply open each of the following notebooks in jupyter, set the active kernel in jupyter to "r40seurat40", and then run all cells. Repeat this for every notebook listed below.
- PBMC_to_Malt seurat4.ipynb
- Monocyte seurat4.ipynb
- PBMC_to_H1N1 seurat4.ipynb
- PBMC_to_H1N1 seurat4 runtime.ipynb
- PBMC_to_PBMC seurat4.ipynb
In this last step, the user runs notebooks that use the saved results of the previous notebooks to generate the final figures. The notebooks generate results for each dataset or analysis one at a time. Make sure to set the active kernel in jupyter to "scipenn_env".