exVariance

A tool for integrated analysis of liquid biopsy sequencing data.

Installation

Best Practice

Clone the source code from GitHub and install the dependencies listed below:

  git clone https://github.com/ShangZhang/exVariance.git

Dependencies:

Anaconda3/Miniconda3, conda version later than 4.8.4
Python version later than 3.8.3
Snakemake version=5.14.0
R version=3.6.3
R packages
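
The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of exVariance: the helper names `version_ge` and `check_min_version` are hypothetical, it relies on the `-V` (version sort) option of GNU `sort`, and it treats "later than" as "greater than or equal to", which is an assumption.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_min_version NAME INSTALLED MINIMUM: print a one-line verdict and
# return non-zero when the installed version is older than the minimum.
check_min_version() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
        return 1
    fi
}

# Example: compare the installed conda (if any) with the minimum listed above.
# `conda --version` prints a line such as "conda 4.8.4".
if command -v conda >/dev/null 2>&1; then
    check_min_version conda "$(conda --version | awk '{print $2}')" 4.8.4 || true
fi
```

The same helper can be called with the Python minimum, for example `check_min_version Python 3.9.1 3.8.3`.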

How to install all the dependencies:

Install Anaconda3/Miniconda3 and Python
```
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
```
- While the installation script runs, follow the on-screen prompts and press the Enter key to scroll.
- Make sure to answer yes when asked whether to prepend Miniconda3 to PATH.
- Close your terminal and open a new one; conda should now be available. Test it by entering:
```
conda update conda
```
  - Press y to confirm the conda updates.

Install Mamba

The default conda solver is slow and sometimes has trouble resolving packages pinned to specific versions. We therefore recommend installing Mamba as a drop-in replacement:
```
conda install -c conda-forge mamba
```

Install Snakemake 5.14.0 and R 3.6.3

mamba create -n exvariance4 -c conda-forge -c bioconda snakemake=5.14.0 r-base=3.6.3 -y

Install related R packages (best practice)

mamba install -n exvariance4 -c r -c conda-forge -c bioconda -c eugene_t r-argparse r-clustersim r-ggpubr bioconductor-scater bioconductor-scran bioconductor-singlecellexperiment bioconductor-sva bioconductor-edger bioconductor-ruvseq r-kbet r-devtools -y

## continue to install other tools in R
conda activate exvariance4
R
> library(usethis)
> library(devtools)
> devtools::install_github(c("hemberg-lab/scRNA.seq.funcs","theislab/kBET"),host="https://api.github.com")

OR

conda activate exvariance4
R
> install.packages(c("argparse","clusterSim","ggpubr","BiocManager","devtools"))
> library(usethis)
> library(devtools)
> BiocManager::install(c("scater","scran","SingleCellExperiment","sva","edgeR","RUVSeq"))
> devtools::install_github(c("hemberg-lab/scRNA.seq.funcs"),host="https://api.github.com")
> devtools::install_github(c("theislab/kBET"),host="https://api.github.com")

Docker image

For easy installation, you can use the exVariance Docker image, which has all dependencies installed:

  docker pull <exVariance_image>

Dependencies:

1. Docker version >= 19.03.4

Singularity image

Alternatively, you can use Singularity or udocker to run the container on Linux kernels older than 3, or if you don't have permission to use Docker.

Download References

exVariance depends on reference files, which are available for the supported species listed below: hg38

To extract these files: tar -xzf hg19.tar.gz OR tar -xzf mm9.tar.gz

Usage

Help message

Run exVariance --help to get the usage:

usage: exVariance [-h] --user_config_file USER_CONFIG_FILE

                  [--cluster]
                  [--cluster-config CLUSTER_CONFIG]
                  [--cluster-command CLUSTER_COMMAND]
                  [--singularity SINGULARITY]
                  [--singularity-wrapper-dir SINGULARITY_WRAPPER_DIR]

                  { RNA_seq_pre_process,RNA_seq_exp_matrix,
                    RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
                    RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
                    DNA_seq_ctDNA_mutation,DNA_seq_NP,
                    DNA_meth_WGBS,DNA_meth_RRBS,
                    DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
                    DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
                    }

exVariance is a tool for integrated analysis of the liquid biopsy sequencing data.

positional arguments:
  { RNA_seq_pre_process,RNA_seq_exp_matrix,
    RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
    RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
    DNA_seq_ctDNA_mutation,DNA_seq_NP,
    DNA_meth_WGBS,DNA_meth_RRBS,
    DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
    DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
    }

optional arguments:
  -h, --help            show this help message and exit
  --user_config_file USER_CONFIG_FILE, -u USER_CONFIG_FILE
                        the user config file

  --cluster             submit to cluster
  --cluster-config CLUSTER_CONFIG
                        cluster configuration file

  --cluster-command CLUSTER_COMMAND
                        command for submitting job to cluster (default read
                        from {config_dir}/cluster_command.txt
  --singularity SINGULARITY
                        singularity image file
  --singularity-wrapper-dir SINGULARITY_WRAPPER_DIR
                        directory for singularity wrappers


For additional help or support, please visit https://github.com/ShangZhang/exVariance

Input files

RNA-seq related examples can be found in the demo directory, with the following structure:

    ./demo/*/
    |-- config
    |   |-- default_config.yaml
    |   |-- <data_name>.yaml
    |   |-- dapars_configure.txt
    |   `-- RNAEditor_configure.txt
    |-- data
    |   |-- fastq/
    |   |-- sample_ids.txt
    |   |-- sample_classes.txt
    |   |-- compare_groups.yaml
    |   `-- batch_info.txt
    |-- output
    `-- summary

Examples for the other analyses can be found in the demo directory, with the following structure:

    ./demo/*/
    |-- config
    |   |-- default_config.yaml
    |   `-- <data_name>.yaml
    |-- data
    |   |-- fastq/
    |   `-- sample_ids.txt
    |-- output
    `-- summary

Note:

config/default_config.yaml: the default configuration file. Do not change its contents unless you understand the settings.

config/<data_name>.yaml: the user-defined configuration file, which specifies the paths used by the analysis.

data/fastq/: directory containing the sample files, suffixed with 'fastq', 'fasta.gz' or 'fastq.gz'.

data/sample_ids.txt: table of sample names (the file names with the suffix 'fastq', 'fasta.gz' or 'fastq.gz' removed).

output/: the output directory.

summary/: contains the summary files.
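
As the notes above describe, sample_ids.txt holds the file names from data/fastq/ with their suffix removed. The sketch below derives such a list from a directory; the function name `list_sample_ids` is hypothetical, and the one-name-per-line layout of sample_ids.txt is an assumption, since the exact format is not specified here. It does not handle the paired-end `_1`/`_2` naming.

```shell
# Print one sample name per line for every read file in a directory,
# stripping the suffixes .fastq.gz, .fasta.gz and .fastq.
# Files with any other suffix are ignored.
list_sample_ids() {
    for path in "$1"/*; do
        [ -f "$path" ] || continue
        name="$(basename "$path")"
        case "$name" in
            *.fastq.gz) echo "${name%.fastq.gz}" ;;
            *.fasta.gz) echo "${name%.fasta.gz}" ;;
            *.fastq)    echo "${name%.fastq}" ;;
        esac
    done | sort -u
}

# Usage (paths are illustrative):
#   list_sample_ids data/fastq > data/sample_ids.txt
```

It is worth comparing the result with the sample_ids.txt files shipped in the demo directory before relying on it.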

You can create your own data directory with the above directory structure. Multiple datasets can be put in the same directory by replacing "example" with your own dataset names.
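
A new dataset directory with the layout shown above could be scaffolded roughly as follows. This is only a sketch: `init_dataset` is a hypothetical helper, and it creates empty directories only. The configuration files and sample tables still have to be written by hand or copied from the demo directory.

```shell
# init_dataset BASE NAME: create the directory skeleton for a dataset
# called NAME under BASE and print the resulting dataset path.
init_dataset() {
    dataset_dir="$1/$2"
    mkdir -p "$dataset_dir/config" \
             "$dataset_dir/data/fastq" \
             "$dataset_dir/output" \
             "$dataset_dir/summary"
    echo "$dataset_dir"
}

# Usage (names are illustrative):
#   init_dataset ./demo my_dataset
```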

Run

For RNA-seq related analysis

exVariance -u <USER_CONFIG_FILE> RNA_seq_pre_process

exVariance -u <USER_CONFIG_FILE> RNA_seq_exp_matrix

exVariance -u <USER_CONFIG_FILE> RNA_seq_fusion_transcripts

exVariance -u <USER_CONFIG_FILE> RNA_seq_RNA_editing

exVariance -u <USER_CONFIG_FILE> RNA_seq_SNP

exVariance -u <USER_CONFIG_FILE> RNA_seq_APA

exVariance -u <USER_CONFIG_FILE> RNA_seq_AS

For DNA-methylation related analysis

exVariance -u <USER_CONFIG_FILE> DNA_meth_WGBS

exVariance -u <USER_CONFIG_FILE> DNA_meth_RRBS

exVariance -u <USER_CONFIG_FILE> DNA_meth_Seal_seq

exVariance -u <USER_CONFIG_FILE> DNA_meth_Methyl-cap_seq

exVariance -u <USER_CONFIG_FILE> DNA_meth_MeDIP_seq

exVariance -u <USER_CONFIG_FILE> DNA_meth_MCTA_seq

For DNA-seq related analysis

exVariance -u <USER_CONFIG_FILE> DNA_seq_ctDNA_mutation

exVariance -u <USER_CONFIG_FILE> DNA_seq_NP
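
Several of the commands above can be chained in a small wrapper. The sketch below is an illustration only: `run_modules` and the `EXVARIANCE_CMD` variable are hypothetical names, and the order in which modules are listed is left to the user, because this document does not state which modules depend on each other.

```shell
# run_modules CONFIG MODULE...: run each module in turn with the same user
# config file and stop at the first failure.
# EXVARIANCE_CMD may be overridden, e.g. with "echo exVariance" for a dry run.
run_modules() {
    config="$1"
    shift
    for module in "$@"; do
        echo "== $module ==" >&2
        # Intentionally unquoted so that a multi-word override is split into words.
        ${EXVARIANCE_CMD:-exVariance} -u "$config" "$module" || return 1
    done
}

# Usage (the config path is illustrative):
#   run_modules config/my_dataset.yaml RNA_seq_pre_process RNA_seq_exp_matrix
```

Setting `EXVARIANCE_CMD="echo exVariance"` prints the commands that would be run without executing them.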

Output and Summary

For RNA-seq related analysis

Rule Graph

For RNA-seq related analysis

pre process

expression matrix

including filtering, imputation, normalization, and batch-effect removal

fusion transcript

SNP

RNA editing

TCR analysis

For DNA methylation related analysis

DNA_meth_WGBS,DNA_meth_RRBS

DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,DNA_meth_MeDIP_seq

DNA_meth_MCTA_seq

For DNA-seq related analysis

DNA-seq np

Change Log

v1.0.0

Release exVariance

v1.0.1

Fix some bugs

Details

For paired-end analysis, the fastq files must end with _1.fastq.gz and _2.fastq.gz, and these suffixes must be omitted from the names in the sample_ids.txt file.
You also need to create the following three directories: summary, output, and temp.
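
The paired-end naming rule can be verified before a run. The following sketch assumes that sample_ids.txt lists one sample name per line without the `_1`/`_2` part; `check_paired_fastq` is a hypothetical helper, not an exVariance command.

```shell
# check_paired_fastq FASTQ_DIR IDS_FILE: for every sample ID in IDS_FILE,
# check that <ID>_1.fastq.gz and <ID>_2.fastq.gz both exist in FASTQ_DIR.
# Prints each missing file and returns non-zero if any file is absent.
check_paired_fastq() {
    fastq_dir="$1"
    missing=0
    while IFS= read -r id || [ -n "$id" ]; do
        [ -n "$id" ] || continue          # skip blank lines
        for mate in 1 2; do
            f="$fastq_dir/${id}_${mate}.fastq.gz"
            if [ ! -f "$f" ]; then
                echo "missing: $f"
                missing=1
            fi
        done
    done < "$2"
    return "$missing"
}

# Usage (paths are illustrative):
#   check_paired_fastq data/fastq data/sample_ids.txt
```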

System Requirements:

Some of the tools that exVariance uses, e.g. STAR, are very memory-intensive programs. We therefore recommend the following system requirements for exVariance:

Minimal system requirements:

We recommend that you run exVariance on a server that has at least 48GB of RAM. This allows for a single-threaded exVariance run (on human samples).

Recommended system requirements:

We recommend that you have at least 64GB of RAM and at least a 4-core CPU if you want to run exVariance in multi-threaded mode (which will speed up the workflow significantly).
Our own servers have 64GB of RAM and 16 cores.
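
A host can be compared against these figures with a short script. This is a rough sketch for Linux only: the helper names are hypothetical, memory is read from /proc/meminfo, and the core count comes from `getconf _NPROCESSORS_ONLN`.

```shell
# Total physical memory in whole GiB, read from /proc/meminfo (Linux only).
total_mem_gib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Number of CPU cores currently online.
cpu_cores() {
    getconf _NPROCESSORS_ONLN
}

# classify_host MEM_GIB CORES: compare a machine with the thresholds above.
classify_host() {
    if [ "$1" -ge 64 ] && [ "$2" -ge 4 ]; then
        echo "recommended"
    elif [ "$1" -ge 48 ]; then
        echo "minimal"
    else
        echo "below-minimum"
    fi
}

# Usage:
#   classify_host "$(total_mem_gib)" "$(cpu_cores)"
```

The kernel usually reports slightly less memory than is physically installed, so a machine sold with 64GB may fall just under the 64 threshold; treat the result as approximate.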


