exVariance

A tool for integrated analysis of liquid biopsy sequencing data.




Installation

Best Practice

Clone the source code from GitHub and install the dependencies listed below:

  git clone https://github.com/ShangZhang/exVariance.git

Dependencies:

  1. Anaconda3/Miniconda3, conda version later than 4.8.4
  2. Python version later than 3.8.3
  3. Snakemake version=5.14.0
  4. R version=3.6.3
  5. R packages
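
The version requirements above can be checked programmatically. The following is a minimal sketch, not part of exVariance: the helper names `version_tuple`, `meets_minimum` and `python_ok` are hypothetical, and "later than" is assumed to include the stated version itself.

```python
import re
import sys

# Minimum versions taken from the dependency list above.
MIN_CONDA = (4, 8, 4)
MIN_PYTHON = (3, 8, 3)


def version_tuple(text):
    """Extract the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found, minimum):
    """Return True if version tuple `found` is at least `minimum`.

    Assumption: the stated version itself is treated as acceptable.
    """
    return found >= minimum


def python_ok():
    """Check the running interpreter against the minimum Python version."""
    return meets_minimum(tuple(sys.version_info[:3]), MIN_PYTHON)
```

For example, the text printed by `conda --version` can be passed to `version_tuple` and compared against `MIN_CONDA` with `meets_minimum`.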

How to install all the dependencies:

  1. Install Anaconda3/Miniconda3 and Python
    wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh
    
    • Whilst running the installation script, follow the commands listed on screen, and press the enter key to scroll.
    • Make sure to answer yes when asked if you want to prepend Miniconda3 to PATH.
    • Close your terminal, open a new one and you should now have Conda working! Test by entering:
      conda update conda
      
      • Press y to confirm the conda updates
  2. Install Mamba. The default conda solver is rather slow and sometimes has trouble resolving packages pinned to specific versions. We therefore recommend installing Mamba as a drop-in replacement:
    conda install -c conda-forge mamba
  3. Install Snakemake 5.14.0 and R 3.6.3
    mamba create -n exvariance4 -c conda-forge -c bioconda snakemake=5.14.0 r-base=3.6.3 -y
    
  4. Install related R packages (best practice)
    ## install into the exvariance4 environment created in step 3
    mamba install -n exvariance4 -c r -c conda-forge -c bioconda -c eugene_t r-argparse r-clustersim r-ggpubr bioconductor-scater bioconductor-scran bioconductor-singlecellexperiment bioconductor-sva bioconductor-edger bioconductor-ruvseq r-kbet r-devtools -y
    ## continue to install other tools in R
    conda activate exvariance4
    R
    > library(usethis)
    > library(devtools)
    > devtools::install_github(c("hemberg-lab/scRNA.seq.funcs","theislab/kBET"),host="https://api.github.com")
    Alternatively, install all R packages from within R:
    conda activate exvariance4
    R
    > install.packages(c("argparse","clusterSim","ggpubr","BiocManager","devtools"))
    > library(devtools)
    > BiocManager::install(c("scater","scran","SingleCellExperiment","sva","edgeR","RUVSeq"))
    > devtools::install_github(c("hemberg-lab/scRNA.seq.funcs"),host="https://api.github.com")
    > devtools::install_github(c("theislab/kBET"),host="https://api.github.com")

Docker image

For easy installation, you can use the exVariance Docker image, which has all dependencies installed:

  docker pull <exVariance_image>
  • Dependencies
    1. Docker version >= 19.03.4

Singularity image

Alternatively, you can use Singularity or udocker to run the container on Linux kernel < 3, or if you do not have permission to use Docker.

Download References

exVariance depends on reference files, which are available for the supported species listed below: hg38

To unzip these files: tar -xzf hg19.tar.gz OR tar -xzf mm9.tar.gz

Usage

Help message

Run exVariance --help to get the usage:

usage: exVariance [-h] --user_config_file USER_CONFIG_FILE

                  [--cluster]
                  [--cluster-config CLUSTER_CONFIG]
                  [--cluster-command CLUSTER_COMMAND]
                  [--singularity SINGULARITY]
                  [--singularity-wrapper-dir SINGULARITY_WRAPPER_DIR]

                  { RNA_seq_pre_process,RNA_seq_exp_matrix,
                    RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
                    RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
                    DNA_seq_ctDNA_mutation,DNA_seq_NP,
                    DNA_meth_WGBS,DNA_meth_RRBS,
                    DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
                    DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
                    }

exVariance is a tool for integrated analysis of the liquid biopsy sequencing data.

positional arguments:
  { RNA_seq_pre_process,RNA_seq_exp_matrix,
    RNA_seq_fusion_transcripts,RNA_seq_RNA_editing,
    RNA_seq_SNP,RNA_seq_APA,RNA_seq_AS,
    DNA_seq_ctDNA_mutation,DNA_seq_NP,
    DNA_meth_WGBS,DNA_meth_RRBS,
    DNA_meth_Seal_seq,DNA_meth_Methyl-cap_seq,
    DNA_meth_MeDIP_seq,DNA_meth_MCTA_seq
    }

optional arguments:
  -h, --help            show this help message and exit
  --user_config_file USER_CONFIG_FILE, -u USER_CONFIG_FILE
                        the user config file

  --cluster             submit to cluster
  --cluster-config CLUSTER_CONFIG
                        cluster configuration file

  --cluster-command CLUSTER_COMMAND
                        command for submitting job to cluster (default read
                        from {config_dir}/cluster_command.txt
  --singularity SINGULARITY
                        singularity image file
  --singularity-wrapper-dir SINGULARITY_WRAPPER_DIR
                        directory for singularity wrappers


For additional help or support, please visit https://github.com/ShangZhang/exVariance
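
To make the help message above easier to read, the following sketch rebuilds an equivalent command-line parser with Python's `argparse`. It is a reconstruction from the printed help text only, not the exVariance source code; the destination name `step` and the function name `build_parser` are hypothetical.

```python
import argparse

# Analysis steps copied from the help message above.
STEPS = [
    "RNA_seq_pre_process", "RNA_seq_exp_matrix",
    "RNA_seq_fusion_transcripts", "RNA_seq_RNA_editing",
    "RNA_seq_SNP", "RNA_seq_APA", "RNA_seq_AS",
    "DNA_seq_ctDNA_mutation", "DNA_seq_NP",
    "DNA_meth_WGBS", "DNA_meth_RRBS",
    "DNA_meth_Seal_seq", "DNA_meth_Methyl-cap_seq",
    "DNA_meth_MeDIP_seq", "DNA_meth_MCTA_seq",
]


def build_parser():
    """Build a parser that accepts the same options as the help message."""
    parser = argparse.ArgumentParser(
        prog="exVariance",
        description="exVariance is a tool for integrated analysis of "
                    "the liquid biopsy sequencing data.",
    )
    # Exactly one analysis step is given as the positional argument.
    parser.add_argument("step", choices=STEPS, help="analysis step to run")
    # The user config file is the only required option.
    parser.add_argument("--user_config_file", "-u", required=True,
                        help="the user config file")
    parser.add_argument("--cluster", action="store_true",
                        help="submit to cluster")
    parser.add_argument("--cluster-config",
                        help="cluster configuration file")
    parser.add_argument("--cluster-command",
                        help="command for submitting job to cluster")
    parser.add_argument("--singularity", help="singularity image file")
    parser.add_argument("--singularity-wrapper-dir",
                        help="directory for singularity wrappers")
    return parser
```

Parsing `["-u", "config/demo.yaml", "RNA_seq_SNP"]` with this parser yields the chosen step and the config path, and rejects any step name that is not in the list.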

Input files

RNA-seq related examples can be found in the demo directory with the following structure:

    ./demo/*/
    |-- config
    |   |-- default_config.yaml
    |   |-- <data_name>.yaml
    |   |-- dapars_configure.txt
    |   `-- RNAEditor_configure.txt
    |-- data
    |   |-- fastq/
    |   |-- sample_ids.txt
    |   |-- sample_classes.txt
    |   |-- compare_groups.yaml
    |   `-- batch_info.txt
    |-- output
    `-- summary

Other related examples can be found in the demo directory with the following structure:

    ./demo/*/
    |-- config
    |   |-- default_config.yaml
    |   `-- <data_name>.yaml
    |-- data
    |   |-- fastq/
    |   `-- sample_ids.txt
    |-- output
    `-- summary

Note:

  • config/default_config.yaml: the default configuration file. Do not change its content unless you understand the settings.
  • config/<data_name>.yaml: the user-defined configuration file, which specifies the paths used by the analysis.
  • data/fastq/: directory containing the sample files, named with the suffix 'fastq', 'fasta.gz' or 'fastq.gz'.
  • data/sample_ids.txt: table of sample names (without the suffix 'fastq', 'fasta.gz' or 'fastq.gz').
  • output/: the output directory.
  • summary/: contains the summary files.

You can create your own data directory with the above directory structure. Multiple datasets can be put in the same directory by replacing "example" with your own dataset names.
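
As an illustration of the layout described above, here is a minimal sketch that creates an empty dataset skeleton. The function name `create_dataset_skeleton` is hypothetical, and the sketch only creates the directories and empty placeholder files; `default_config.yaml` is assumed to be copied from the demo directory by hand.

```python
from pathlib import Path


def create_dataset_skeleton(root, data_name):
    """Create the directory layout shown above under `root`.

    Only directories and empty placeholder files are created; the
    configuration content still has to be filled in manually.
    """
    root = Path(root)
    for sub in ("config", "data/fastq", "output", "summary"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    # Empty user config and sample list, to be edited afterwards.
    for rel in (f"config/{data_name}.yaml", "data/sample_ids.txt"):
        (root / rel).touch(exist_ok=True)
    return root
```

Calling `create_dataset_skeleton("demo/my_dataset", "my_dataset")` would create `demo/my_dataset/config/my_dataset.yaml` and the other entries; existing files are left untouched.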

Run

exVariance analysis

For RNA-seq related analysis

exVariance -u <USER_CONFIG_FILE> RNA_seq_pre_process
exVariance -u <USER_CONFIG_FILE> RNA_seq_exp_matrix

exVariance -u <USER_CONFIG_FILE> RNA_seq_fusion_transcripts

exVariance -u <USER_CONFIG_FILE> RNA_seq_RNA_editing

exVariance -u <USER_CONFIG_FILE> RNA_seq_SNP

exVariance -u <USER_CONFIG_FILE> RNA_seq_APA

exVariance -u <USER_CONFIG_FILE> RNA_seq_AS

For DNA-methylation related analysis

exVariance -u <USER_CONFIG_FILE> DNA_meth_WGBS

exVariance -u <USER_CONFIG_FILE> DNA_meth_RRBS

exVariance -u <USER_CONFIG_FILE> DNA_meth_Seal_seq

exVariance -u <USER_CONFIG_FILE> DNA_meth_Methyl-cap_seq

exVariance -u <USER_CONFIG_FILE> DNA_meth_MeDIP_seq

exVariance -u <USER_CONFIG_FILE> DNA_meth_MCTA_seq

For DNA-seq related analysis

exVariance -u <USER_CONFIG_FILE> DNA_seq_ctDNA_mutation

exVariance -u <USER_CONFIG_FILE> DNA_seq_NP
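
When several steps are run against the same configuration file, the command lines above can be generated rather than typed one by one. The sketch below only builds and prints the commands; the helper names `build_commands` and `render` are hypothetical, and the order of steps is whatever the caller supplies.

```python
import shlex


def build_commands(user_config, steps):
    """Return one exVariance argument list per step, in the given order."""
    return [["exVariance", "-u", str(user_config), step] for step in steps]


def render(commands):
    """Format argument lists as shell-safe command lines, one per line."""
    return "\n".join(shlex.join(cmd) for cmd in commands)
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the sequence.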

Output and Summary

For RNA-seq related analysis

[Figure: exVariance RNA-seq output]

Rule Graph

For RNA-seq related analysis

  • Pre-process [figure: rulegraph_RNA_seq_pre_process_pe]
  • Expression matrix, including filtering, imputation, normalization and batch removal [figure: rulegraph_RNA_seq_exp_matrix_pe]
  • Fusion transcript
  • SNP [figure: rulegraph_RNA_seq_SNP]
  • RNA editing
  • TCR analysis [figure: rulegraph_RNA_seq_SNP]

For DNA methylation related analysis

  • DNA_meth_WGBS, DNA_meth_RRBS [figure: wgbs_rrbs_pe]
  • DNA_meth_Seal_seq, DNA_meth_Methyl-cap_seq, DNA_meth_MeDIP_seq [figure: seal_methyl-cap_medip_pe]
  • DNA_meth_MCTA_seq [figure: mcta_pe]

For DNA-seq related analysis

  • DNA-seq NP [figure: dna-seq_np_pe]

Change Log

v1.0.0

  • Release exVariance

v1.0.1

  • Fix some bugs

Details

  1. For paired-end analysis, the fastq files must end with _1.fastq.gz and _2.fastq.gz, and the sample names in sample_ids.txt must be written without these suffixes.
  2. You need to create the following 3 directories: summary, output and temp.
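
The paired-end naming rule in point 1 can be verified before starting a run. The following is a minimal sketch under the assumption that sample_ids.txt holds one sample name per line; the helper names `read_sample_ids` and `check_paired_end` are hypothetical and not part of exVariance.

```python
from pathlib import Path

# Suffixes required for paired-end input, as stated above.
SUFFIXES = ("_1.fastq.gz", "_2.fastq.gz")


def read_sample_ids(path):
    """Read sample IDs, one per line (assumed format), skipping blank lines."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def check_paired_end(fastq_dir, sample_ids):
    """Return a list of problems; an empty list means nothing was found."""
    problems = []
    fastq_dir = Path(fastq_dir)
    for sample_id in sample_ids:
        # IDs must be written without the file suffix.
        if sample_id.endswith((".fastq", ".fastq.gz")):
            problems.append(f"{sample_id}: sample ID still carries a file suffix")
            continue
        # Both mates must be present for every sample.
        for suffix in SUFFIXES:
            if not (fastq_dir / f"{sample_id}{suffix}").is_file():
                problems.append(f"{sample_id}: missing {sample_id}{suffix}")
    return problems
```

A typical call would be `check_paired_end("data/fastq", read_sample_ids("data/sample_ids.txt"))`, with any returned messages fixed before launching the workflow.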

System Requirements:

Some of the tools that exVariance uses, e.g. STAR, are very memory-intensive. Therefore we recommend the following system requirements for exVariance:

Minimal system requirements:

We recommend that you run exVariance on a server that has at least 48 GB of RAM. This will allow for a single-threaded exVariance run (on human samples).

Recommended system requirements:

We recommend that you have at least 64 GB of RAM and at least a 4-core CPU if you want to run exVariance in multi-threaded mode (which will speed up the workflow significantly).
Our own servers have 64 GB of RAM and 16 cores.
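
The thresholds above can be turned into a quick pre-flight check. This is a sketch only: the labels "recommended", "minimal" and "insufficient" and the function names are hypothetical, and `detect_system` relies on `os.sysconf` values that are available on Linux.

```python
import os

# Thresholds taken from the requirements stated above.
MIN_RAM_GB = 48
REC_RAM_GB = 64
REC_CORES = 4


def classify_system(ram_gb, cores):
    """Classify a machine against the minimal and recommended requirements."""
    if ram_gb >= REC_RAM_GB and cores >= REC_CORES:
        return "recommended"   # suitable for multi-threaded runs
    if ram_gb >= MIN_RAM_GB:
        return "minimal"       # suitable for single-threaded runs
    return "insufficient"


def detect_system():
    """Return (RAM in GiB, CPU count) for the current Linux machine.

    Note: the operating system may report slightly less memory than the
    nominal installed amount.
    """
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return ram_bytes / 1024 ** 3, os.cpu_count() or 1
```

On a Linux server, `classify_system(*detect_system())` gives a rough indication of whether a single-threaded or multi-threaded run is advisable.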

Copyright and License Information

Copyright (C) Lu Lab @ Tsinghua University, Beijing, China 2020 All rights reserved

Citation
