Commit

updated documentation for softmask support
Ron Schwessinger authored and Ron Schwessinger committed Sep 4, 2020
1 parent 3128756 commit 5e9b627
Showing 3 changed files with 7 additions and 1 deletion.
2 changes: 2 additions & 0 deletions tutorials/tutorial_predict_and_plot.Rmd
@@ -102,6 +102,7 @@ python ../tensorflow1_version/run_deploy_shape_deepCregr.py --input example_regi
--name_tag predict \ # name tag to add to output files
--model ./model_deepCregr_5kb_GM12878_primary/model \ # trained deepC model downloaded and extracted
--genome ./hg19_chr17_fasta_for_test/hg19_chr17.fa \ # link to whole genome or chromosome-wise fasta file (needs a fasta index) or test chr17 fasta file downloaded and extracted
--use_softmasked=False \ # whether to include soft-masked (lowercase) bases from the fasta file; default=False
--bp_context 1005000 \ # bp context (1 Mb + bin.size)
--add_window 500000 \ # how many bp to add to either side of the specified window
--num_classes 201 \ # The number of classes corresponds to the number of outputs (output bins of the vertical pole) (201 for 5kb models; 101 for 10kb models)
@@ -114,6 +115,7 @@ python ../tensorflow1_version/run_deploy_shape_deepCregr.py --input example_regi
--name_tag predict \
--model ./model_deepCregr_5kb_GM12878_primary/model \
--genome ./hg19_chr17_fasta_for_test/hg19_chr17.fa \
--use_softmasked=False \
--bp_context 1005000 \
--add_window 500000 \
--num_classes 201 \
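The comments in the command above tie `--bp_context` and `--num_classes` to the bin size: the context is 1 Mb plus one bin, and the tutorial lists 201 classes for 5kb models and 101 for 10kb models. The following is a minimal sketch of that arithmetic. The helper name `deploy_settings` is hypothetical, and the formula for `num_classes` is an assumption chosen because it reproduces the two values the tutorial states; it is not taken from the deepC source.

```python
# 1 Mb of sequence context, as stated in the tutorial comment on --bp_context.
CONTEXT_BP = 1_000_000


def deploy_settings(bin_size):
    """Return (bp_context, num_classes) for a given bin size in bp.

    Hypothetical helper for illustration only.
    """
    # Tutorial comment: bp context is 1 Mb + bin size.
    bp_context = CONTEXT_BP + bin_size
    # Assumed formula: it matches 201 (5kb models) and 101 (10kb models).
    num_classes = CONTEXT_BP // bin_size + 1
    return bp_context, num_classes


for bin_size in (5_000, 10_000):
    bp_context, num_classes = deploy_settings(bin_size)
    print(f"bin size {bin_size}: --bp_context {bp_context} --num_classes {num_classes}")
```

For any other bin size, check the values against the model you downloaded rather than relying on this formula.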
4 changes: 3 additions & 1 deletion tutorials/tutorial_predict_and_plot.html
@@ -420,6 +420,7 @@ <h3>Running the deepC prediction</h3>
--name_tag predict \ # name tag to add to output files
--model ./model_deepCregr_5kb_GM12878_primary/model \ # trained deepC model downloaded and extracted
--genome ./hg19_chr17_fasta_for_test/hg19_chr17.fa \ # link to whole genome or chromosome-wise fasta file (needs a fasta index) or test chr17 fasta file downloaded and extracted
--use_softmasked=False \ # whether to include soft-masked (lowercase) bases from the fasta file; default=False
--bp_context 1005000 \ # bp context (1 Mb + bin.size)
--add_window 500000 \ # how many bp to add to either side of the specified window
--num_classes 201 \ # The number of classes corresponds to the number of outputs (output bins of the vertical pole) (201 for 5kb models; 101 for 10kb models)
@@ -432,6 +433,7 @@ <h3>Running the deepC prediction</h3>
--name_tag predict \
--model ./model_deepCregr_5kb_GM12878_primary/model \
--genome ./hg19_chr17_fasta_for_test/hg19_chr17.fa \
--use_softmasked=False \
--bp_context 1005000 \
--add_window 500000 \
--num_classes 201 \
@@ -611,7 +613,7 @@ <h3>Adding the source HiC data</h3>
# make sure the start and end positions match the binning of the Hi-C data
# use custom floor/ceiling functions for your bin size
binned.genome &lt;- getBinnedChrom(chr=&quot;chr17&quot;, start=custom_floor(limit1-window.size, bin.size), end=custom_ceil(limit2+window.size, bin.size), window=window.size, step=bin.size)</code></pre>
- <pre><code>## [1] &quot;bedtools makewindows -b tempfile_for_bedtools_call_2020-07-31_09:55:04_0.327818183694035.bed -w 1005000 -s 5000 &gt;tempfile_from_bedtools_call_2020-07-31_09:55:04_0.327818183694035.bed&quot;</code></pre>
+ <pre><code>## [1] &quot;bedtools makewindows -b tempfile_for_bedtools_call_2020-09-04_12:02:11_0.325754458550364.bed -w 1005000 -s 5000 &gt;tempfile_from_bedtools_call_2020-09-04_12:02:11_0.325754458550364.bed&quot;</code></pre>
<pre class="r"><code>binned.genome &lt;- as.tibble(binned.genome)
names(binned.genome) &lt;- c(&quot;chr&quot;, &quot;start&quot;, &quot;end&quot;)

2 changes: 2 additions & 0 deletions tutorials/tutorial_train_a_model.md
@@ -74,6 +74,7 @@ seed_file='./saved_conv_weights_dhw_5layer_1k_pool.npz' #trained filters phase I
shuffle=True
store_dtype='bool' # how to store the sequence
whg_fasta='./hg19.fa' # link to whole genome fasta file for retrieving the sequences; has to be indexed
use_softmasked=False # whether to use soft-masked (lowercase) bases from the fasta file. Default=False

# if multiple GPUs present select a single one to run training on
# and not block the remaining
@@ -111,6 +112,7 @@ python ${SCRIPT_PATH}/run_training_deepCregr.py \
--shuffle=${shuffle} \
--store_dtype ${store_dtype} \
--whg_fasta ${whg_fasta} \
--use_softmasked=${use_softmasked} \
--gpu ${GPU}

```
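The `use_softmasked` option documented in this commit concerns soft-masked bases, which a fasta file marks by writing them in lowercase. The sketch below shows one way such a switch could act during one-hot encoding. It is an illustration under stated assumptions, not the deepC implementation: the function `one_hot` is hypothetical, and encoding excluded bases as all-False rows (the same as `N`) is an assumed behaviour.

```python
BASES = "ACGT"


def one_hot(seq, use_softmasked=False):
    """One-hot encode a DNA string as a list of 4-element boolean rows.

    Assumption for illustration: with use_softmasked=False, lowercase
    (soft-masked) bases are left as all-False rows, like N.
    """
    encoded = []
    for base in seq:
        row = [False] * len(BASES)
        # Skip soft-masked bases unless the caller opts in to using them.
        if base.isupper() or use_softmasked:
            idx = BASES.find(base.upper())
            if idx >= 0:  # N and other symbols stay all-False
                row[idx] = True
        encoded.append(row)
    return encoded


seq = "ACgtN"  # "gt" is soft-masked
masked = one_hot(seq)                        # lowercase bases ignored
unmasked = one_hot(seq, use_softmasked=True)  # lowercase bases encoded
print(sum(map(sum, masked)), sum(map(sum, unmasked)))
```

Boolean rows are used here only because the training config stores sequences with `store_dtype='bool'`; how deepC actually treats masked positions should be checked in its source.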
