A tool for integrated analysis of liquid biopsy sequencing data
- Installation
- Download References
- Usage
- System Requirements
- Copyright and License Information
- Citation
Clone the source code from GitHub and install the dependencies listed below:
git clone https://github.com/ShangZhang/exVariance.git
- Anaconda3/Miniconda3 (conda version later than 4.8.4)
- Python version later than 3.8.3
- Snakemake version=5.14.0
- R version=3.6.3
- R packages
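The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of exVariance: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes that "later than" may be read as "at least" the stated version.

```python
import re

# Minimum versions taken from the dependency list above.
MINIMUMS = {"conda": (4, 8, 4), "python": (3, 8, 3)}


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(tool, version_text):
    """Return True if version_text is at least the listed minimum for tool.

    Assumption: "later than X" in the list is treated as "X or newer".
    """
    return parse_version(version_text) >= MINIMUMS[tool]
```

For example, passing the output of `conda --version` (such as `conda 4.10.3`) to `meets_minimum("conda", ...)` tells you whether the installed conda is recent enough.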
- Install Anaconda3/Miniconda3 and Python
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
- Whilst running the installation script, follow the commands listed on screen, and press the enter key to scroll.
- Make sure to answer yes when asked if you want to prepend Miniconda3 to PATH.
- Close your terminal, open a new one and you should now have Conda working! Test by entering:
conda update conda
- Press y to confirm the conda updates
- Install Mamba
The default conda solver is somewhat slow and sometimes has trouble resolving packages pinned to specific versions. We therefore recommend installing Mamba as a drop-in replacement:
conda install -c conda-forge mamba
- Install Snakemake 5.14.0 and R 3.6.3
mamba create -n exvariance4 -c conda-forge -c bioconda snakemake=5.14.0 r-base=3.6.3 -y
- Install related R packages
Best practice:
mamba install -c r -c conda-forge -c bioconda -c eugene_t r-argparse r-clustersim r-ggpubr bioconductor-scater bioconductor-scran bioconductor-singlecellexperiment bioconductor-sva bioconductor-edger bioconductor-ruvseq r-kbet r-devtools -y
## continue to install other tools in R
conda activate exvariance4
R
> library(usethis)
> library(devtools)
> devtools::install_github(c("hemberg-lab/scRNA.seq.funcs","theislab/kBET"),host="https://api.github.com")
OR
conda activate exvariance4
R
> install.packages(c("argparse","clusterSim","ggpubr","BiocManager","devtools"))
> library(usethis)
> library(devtools)
> BiocManager::install(c("scater","scran","SingleCellExperiment","sva","edgeR","RUVSeq"))
> devtools::install_github(c("hemberg-lab/scRNA.seq.funcs"),host="https://api.github.com")
> devtools::install_github(c("theislab/kBET"),host="https://api.github.com")
For easier installation, you can use the exVariance Docker image, which has all dependencies installed:
docker pull <exVariance_image>
- dependencies
- docker version>=19.03.4
Alternatively, you can use singularity or udocker to run the container if your Linux kernel is older than version 3 or if you don't have permission to use docker.
exVariance depends on reference files, which can be found for the supported species listed below:
- hg38
To unpack these files:
tar -xzf hg19.tar.gz
OR
tar -xzf mm9.tar.gz
Run exVariance --help to get the usage:
usage: exVariance [-h] --user_config_file USER_CONFIG_FILE
[--cluster]
[--cluster-config CLUSTER_CONFIG]
[--cluster-command CLUSTER_COMMAND]
[--singularity SINGULARITY]
[--singularity-wrapper-dir SINGULARITY_WRAPPER_DIR]
{ RNA_seq_pre_process,RNA_seq_exp_matrix,
RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
DNA_seq_ctDNA_mutation,DNA_seq_NP,
DNA_meth_WGBS,DNA_meth_RRBS,
DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
}
exVariance is a tool for integrated analysis of the liquid biopsy sequencing data.
positional arguments:
{ RNA_seq_pre_process,RNA_seq_exp_matrix,
RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
DNA_seq_ctDNA_mutation,DNA_seq_NP,
DNA_meth_WGBS,DNA_meth_RRBS,
DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
}
optional arguments:
-h, --help show this help message and exit
--user_config_file USER_CONFIG_FILE, -u USER_CONFIG_FILE
the user config file
--cluster submit to cluster
--cluster-config CLUSTER_CONFIG
cluster configuration file
--cluster-command CLUSTER_COMMAND
command for submitting job to cluster (default read
from {config_dir}/cluster_command.txt
--singularity SINGULARITY
singularity image file
--singularity-wrapper-dir SINGULARITY_WRAPPER_DIR
directory for singularity wrappers
For additional help or support, please visit https://github.com/ShangZhang/exVariance
RNA-seq related examples can be found in the demo directory, which has the following structure:
./demo/*/
|-- config
| |-- default_config.yaml
| |-- <data_name>.yaml
| |-- dapars_configure.txt
| `-- RNAEditor_configure.txt
|-- data
| |-- fastq/
| |-- sample_ids.txt
| |-- sample_classes.txt
| |-- compare_groups.yaml
| `-- batch_info.txt
|-- output
`-- summary
Other related examples can be found in the demo directory, which has the following structure:
./demo/*/
|-- config
| |-- default_config.yaml
| `-- <data_name>.yaml
|-- data
| |-- fastq/
| `-- sample_ids.txt
|-- output
`-- summary
Note:
- config/default_config.yaml: the default configuration file. Do not change its contents unless you understand them.
- config/<data_name>.yaml: the user-defined configuration file, which specifies the paths to use.
- data/fastq/: directory containing the sample files, whose names end in 'fastq', 'fasta.gz' or 'fastq.gz'.
- data/sample_ids.txt: table of sample names (with the suffix 'fastq', 'fasta.gz' or 'fastq.gz' removed).
- output/: the output directory.
- summary/: contains the summary files.
You can create your own data directory with the above directory structure. Multiple datasets can be put in the same directory by replacing "example" with your own dataset names.
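Creating the directory skeleton by hand is error-prone, so a small script can help. The sketch below is an illustration only: `scaffold_dataset` is a hypothetical helper, not something exVariance ships. It assumes each dataset lives in its own sub-directory of a common root, and it adds the `temp` directory that the notes further down ask for.

```python
from pathlib import Path

# Sub-directories taken from the demo layout above; "temp" is listed in
# the notes later in this README rather than in the demo tree.
SUBDIRS = ("config", "data/fastq", "output", "summary", "temp")


def scaffold_dataset(root, data_name):
    """Create the directory skeleton for one dataset and return its path."""
    base = Path(root) / data_name
    for sub in SUBDIRS:
        (base / sub).mkdir(parents=True, exist_ok=True)
    # Empty placeholder; fill in the sample names by hand afterwards.
    (base / "data" / "sample_ids.txt").touch()
    return base
```

Calling `scaffold_dataset("demo", "my_dataset")` would create `demo/my_dataset/` with the sub-directories listed above. The configuration files still have to be written by hand.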
exVariance -u <USER_CONFIG_FILE> RNA_seq_pre_process
exVariance -u <USER_CONFIG_FILE> RNA_seq_exp_matrix
exVariance -u <USER_CONFIG_FILE> RNA_seq_fusion_transcripts
exVariance -u <USER_CONFIG_FILE> RNA_seq_RNA_editing
exVariance -u <USER_CONFIG_FILE> RNA_seq_SNP
exVariance -u <USER_CONFIG_FILE> RNA_seq_APA
exVariance -u <USER_CONFIG_FILE> RNA_seq_AS
exVariance -u <USER_CONFIG_FILE> DNA_meth_WGBS
exVariance -u <USER_CONFIG_FILE> DNA_meth_RRBS
exVariance -u <USER_CONFIG_FILE> DNA_meth_Seal_seq
exVariance -u <USER_CONFIG_FILE> DNA_meth_Methyl-cap_seq
exVariance -u <USER_CONFIG_FILE> DNA_meth_MeDIP_seq
exVariance -u <USER_CONFIG_FILE> DNA_meth_MCTA_seq
exVariance -u <USER_CONFIG_FILE> DNA_seq_ctDNA_mutation
exVariance -u <USER_CONFIG_FILE> DNA_seq_NP
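When several of the commands above need to be run for the same dataset, they can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_command` and `run_steps` are hypothetical helpers, the sub-command names and the `-u` and `--cluster` options are copied from the usage text, and the order in which steps should run is left to the caller.

```python
import subprocess

# Sub-commands copied from the usage text of `exVariance --help`.
STEPS = (
    "RNA_seq_pre_process", "RNA_seq_exp_matrix",
    "RNA_seq_fusion_transcripts", "RNA_seq_RNA_editing",
    "RNA_seq_SNP", "RNA_seq_APA", "RNA_seq_AS",
    "DNA_seq_ctDNA_mutation", "DNA_seq_NP",
    "DNA_meth_WGBS", "DNA_meth_RRBS",
    "DNA_meth_Seal_seq", "DNA_meth_Methyl-cap_seq",
    "DNA_meth_MeDIP_seq", "DNA_meth_MCTA_seq",
)


def build_command(user_config_file, step, cluster=False):
    """Return the argument list for one exVariance invocation."""
    if step not in STEPS:
        raise ValueError(f"unknown exVariance step: {step!r}")
    cmd = ["exVariance", "-u", str(user_config_file)]
    if cluster:
        cmd.append("--cluster")
    cmd.append(step)
    return cmd


def run_steps(user_config_file, steps):
    """Run the given steps one after another, stopping at the first failure."""
    for step in steps:
        subprocess.run(build_command(user_config_file, step), check=True)
```

For instance, `run_steps("config/my_dataset.yaml", ["RNA_seq_pre_process", "RNA_seq_exp_matrix"])` would invoke the two sub-commands in turn. The file name here is a placeholder.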
including filter, imputation, normalization, batch removing
- Release exVariance
- Fix some bugs
- For paired-end analysis, the fastq files should end with _1.fastq.gz and _2.fastq.gz, and the suffix should be omitted from the names listed in the sample_ids.txt file.
- You need to create the following 3 directories: summary, output and temp.
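The paired-end naming rule can be applied automatically when building sample_ids.txt. The sketch below is hypothetical and not part of exVariance: the helper names are invented, and it assumes sample_ids.txt holds one sample name per line, which this README does not state explicitly.

```python
from pathlib import Path

# Paired-end suffixes named in the note above.
PAIR_SUFFIXES = ("_1.fastq.gz", "_2.fastq.gz")


def paired_sample_ids(fastq_dir):
    """Return sorted sample names for which both mate files are present."""
    mates = {}
    for path in Path(fastq_dir).iterdir():
        for suffix in PAIR_SUFFIXES:
            if path.name.endswith(suffix):
                sample = path.name[: -len(suffix)]
                mates.setdefault(sample, set()).add(suffix)
    complete = set(PAIR_SUFFIXES)
    return sorted(s for s, found in mates.items() if found == complete)


def write_sample_ids(fastq_dir, out_file):
    """Write one sample name per line (assumed format) and return the names."""
    ids = paired_sample_ids(fastq_dir)
    Path(out_file).write_text("".join(f"{s}\n" for s in ids))
    return ids
```

Samples with only one mate file are skipped rather than reported, so it is worth comparing the result against the contents of the fastq directory.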
Some of the tools that exVariance uses, e.g. STAR, are very memory-intensive programs. We therefore recommend the following system requirements for exVariance:
We recommend that you run exVariance on a server that has at least 48GB of RAM. This will allow for a single-threaded exVariance run (on human samples).
We recommend that you have at least 64GB of RAM and at least a 4-core CPU if you want to run exVariance in multi-threaded mode (which will speed up the workflow significantly).
Our own servers have 64GB of RAM and 16 cores.
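The recommendations above can be turned into a quick pre-flight check. This is a rough sketch only: `recommended_mode` and `detect_resources` are hypothetical names, the thresholds are the ones quoted above, and the detection function assumes a Linux host where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`.

```python
import os


def recommended_mode(ram_gb, cores):
    """Map available RAM (GB) and CPU cores to the mode suggested above."""
    if ram_gb >= 64 and cores >= 4:
        return "multi-threaded"
    if ram_gb >= 48:
        return "single-threaded"
    return "below recommendation"


def detect_resources():
    """Return (total RAM in GiB, CPU count); Linux-only assumption."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return ram_bytes / 1024 ** 3, os.cpu_count() or 1
```

A machine sold with 64GB of RAM usually reports slightly less to the operating system, so treat a result just under a threshold as a borderline case rather than a hard failure.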
Copyright (C) Lu Lab @ Tsinghua University, Beijing, China 2020 All rights reserved