samtools v1.2
sambamba v0.5.9
bedtools v2.17.0
seqtk subseq v1.0
FASTX Toolkit v0.0.14
samblaster v0.1.24 (the source code was modified to redirect the FASTQ output to stdout; replace the original samblaster.cpp with the version provided in this repo before compiling)
bwa v0.7.15
R packages:
GenomicRanges v1.26.4
stringr v1.2.0
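
As a quick sanity check before running the pipeline, the sketch below reports which of the command-line tools listed above cannot be found on PATH. The executable names are assumptions based on each tool's usual installation, and `missing_tools` is a hypothetical helper, not part of this repo. It does not verify version numbers or R packages, and FASTX Toolkit is left out because it installs several separate executables.

```shell
#!/usr/bin/env bash
# Sketch only: report required tools that are not on PATH.
# Executable names are assumed from each tool's usual installation.

# Print the name of every argument that is not an executable on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools samtools sambamba bedtools seqtk samblaster bwa Rscript)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
else
    echo "All listed tools were found on PATH."
fi
```

Finding a tool on PATH does not confirm that it is the version listed above, so the versions still need to be checked by hand.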
16-core (or greater) Intel processor
At least 128GB RAM
- An asterisk indicates that the step is run twice, once for each pseudo-haplotype
- Place all the source code files, along with hg38_primary_header_meaning.txt, segdups.bedpe, and sv_blacklist.bed, in the same directory (segdups.bedpe and sv_blacklist.bed are provided by 10x Genomics)
- NUI_config.sh must be edited manually before each run
- After all samples have been processed individually, the results can be merged into a unified, non-redundant list of NUIs by running combine_metaNUI.R
- Translocated sequences are removed during this merging step
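
The directory layout described in the notes above can be checked with a short sketch like the following. It lists any of the three support files named above that are absent from a given directory. `missing_files` and `pipeline_dir` are hypothetical names used only for illustration; the pipeline's own scripts can be added to the list in the same way.

```shell
#!/usr/bin/env bash
# Sketch only: list required support files that are absent from a directory.

# Usage: missing_files DIR FILE...
# Prints the name of each FILE that does not exist inside DIR.
missing_files() {
    local dir=$1 f
    shift
    for f in "$@"; do
        [ -e "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# pipeline_dir is a hypothetical variable; it defaults to the current directory.
pipeline_dir=${1:-.}
missing_files "$pipeline_dir" \
    hg38_primary_header_meaning.txt segdups.bedpe sv_blacklist.bed
```

No output means all three files were found in the directory.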