VCF Creation Issue #14
Comments
Dear @Briteguy and @rebecca810,
Using the test data supplied in the [...]. However, when invoking scramble.sh (bioconda build 1.0.1) on a custom capture sequencing sample aligned with bwa mem to hs37d5 (numeric identifiers for autosomes), the error "Error: subscript contains invalid names" was raised:
scramble.sh --out-name v3 --cluster-file v3_clusters.txt --ref hs37d5.fa --eval-meis --eval-dels
The following output files were created and seem to be fine:
v3_MEIs.txt
v3_PredictedDeletions.txt
The following log was produced:
Running sample: v3_clusters.txt
Running scramble with options:
blastRef : hs37d5.fa
clusterFile : v3_clusters.txt
deletions : TRUE
indelScore : 80
INSTALL.DIR : /opt/conda/miniconda3/envs/hum-analysis_21-q1_mei-detection/share/scramble/bin
mei.refs : /opt/conda/miniconda3/envs/hum-analysis_21-q1_mei-detection/share/scramble/resources/MEI_consensus_seqs.fa
meis : TRUE
meiScore : 50
minDelLen : 50
nCluster : 5
outFilePrefix : v3
pctAlign : 90
polyAdist : 100
polyAFrac : 0.75
Useful Functions Loaded
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Loading required package: XVector
Attaching package: ‘Biostrings’
The following object is masked from ‘package:base’:
strsplit
Done analyzing l1
Done analyzing sva
Done analyzing alu
Done analyzing l1
Done analyzing sva
Done analyzing alu
Sample had 14 MEI(s)
Done analyzing MEIs
214 clusters out of 497 were removed due to simple sequence
BLAST Database error: No alias or index file found for nucleotide database [/ramdisk/HUM/bwa_index/hs37d5.fa] in search path [/tmp/Rtmpy3iGE6::]
Number of alignments meeting thresholds: 283
Number of best alignments: 0
[1] "Two-End-Deletions: Working on contig 1"
[1] "Two-End-Deletions: Working on contig 10"
[1] "Two-End-Deletions: Working on contig 11"
[1] "Two-End-Deletions: Working on contig 12"
[1] "Two-End-Deletions: Working on contig 13"
[1] "Two-End-Deletions: Working on contig 14"
[1] "Two-End-Deletions: Working on contig 15"
[1] "Two-End-Deletions: Working on contig 16"
[1] "Two-End-Deletions: Working on contig 17"
[1] "Two-End-Deletions: Working on contig 18"
[1] "Two-End-Deletions: Working on contig 19"
[1] "Two-End-Deletions: Working on contig 2"
[1] "Two-End-Deletions: Working on contig 20"
[1] "Two-End-Deletions: Working on contig 21"
[1] "Two-End-Deletions: Working on contig 22"
[1] "Two-End-Deletions: Working on contig 3"
[1] "Two-End-Deletions: Working on contig 4"
[1] "Two-End-Deletions: Working on contig 5"
[1] "Two-End-Deletions: Working on contig 6"
[1] "Two-End-Deletions: Working on contig 7"
[1] "Two-End-Deletions: Working on contig 8"
[1] "Two-End-Deletions: Working on contig 9"
[1] "Two-End-Deletions: Working on contig GL000220.1"
[1] "Two-End-Deletions: Working on contig hs37d5"
[1] "Two-End-Deletions: Working on contig X"
[1] "finished one end dels"
Sample had 0 deletions
Done analyzing deletions
Warning message:
In predict.BLAST(bl, seq, BLAST_args = "-dust no") :
BLAST did not return a match!
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Error: subscript contains invalid names
Execution halted
What does the predict.BLAST error mean?
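The predict.BLAST warning appears to follow from the "BLAST Database error" line earlier in the log: BLAST found no alias or index file for the reference, so no alignments came back. A minimal sketch for checking this before a run is given here. It assumes that, for a nucleotide database, BLAST looks for `<reference>.nal` (alias) or `<reference>.nin` (index) next to the path passed via `--ref`. The function name `has_blast_nucl_db` is made up for illustration and is not part of SCRAMble.

```python
from pathlib import Path

# Assumption: a nucleotide BLAST database is located via an alias file (.nal)
# or an index file (.nin) whose name is the reference path plus that suffix.
BLAST_NUCL_SUFFIXES = (".nal", ".nin")


def has_blast_nucl_db(reference):
    """Return True if <reference>.nal or <reference>.nin exists as a file."""
    return any(
        Path(str(reference) + suffix).is_file()
        for suffix in BLAST_NUCL_SUFFIXES
    )
```

Calling `has_blast_nucl_db("/ramdisk/HUM/bwa_index/hs37d5.fa")` on the machine that produced the log above would show whether the database files are really missing, or only missing from the location BLAST searched.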
I identified the problem in make.vcf.R. Pull request will follow.
It seems that SCRAMble uses BLAST when writing the VCF output, so a BLAST database has to be built for the reference first:
makeblastdb -in YOUR_REFERENCE.fasta -dbtype nucl -parse_seqids
My log:
I used the following command to generate the BLAST reference files (*.nhr, *.nin, and *.nsq) for VCF creation, for both GRCh37 and GRCh38:
makeblastdb -in file.fasta -input_type fasta -dbtype nucl
However, when I run the cluster analysis, I get the following error:
Done analyzing MEIs
Writing VCF file
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Error: subscript contains invalid names
Execution halted
This occurs with either reference (37 or 38).
Any ideas how I could go about troubleshooting this step?
Thanks in advance
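The two makeblastdb commands quoted in this thread differ in one flag: the suggested one passes `-parse_seqids`, the one used here does not. The thread does not confirm that this resolves the "subscript contains invalid names" error, so the sketch below only shows how the suggested command could be rebuilt and rerun from a script. The helper names `build_makeblastdb_cmd` and `make_blast_db` are hypothetical. The flags are the ones quoted in the thread.

```python
import subprocess


def build_makeblastdb_cmd(reference_fasta, parse_seqids=True):
    """Build the makeblastdb argument list using the flags quoted in this thread."""
    cmd = ["makeblastdb", "-in", reference_fasta, "-dbtype", "nucl"]
    if parse_seqids:
        # Flag from the suggested command; absent from the command used above.
        cmd.append("-parse_seqids")
    return cmd


def make_blast_db(reference_fasta):
    """Run makeblastdb (must be on PATH); raises CalledProcessError on failure."""
    subprocess.run(build_makeblastdb_cmd(reference_fasta), check=True)
```

For example, `make_blast_db("file.fasta")` would rebuild the database with `-parse_seqids`, after which the cluster analysis can be rerun to see whether the error persists.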