Commit

update doc
shenwei356 committed Jun 13, 2023
1 parent d4e915d commit 5a0dc93
Showing 4 changed files with 10 additions and 11 deletions.
2 changes: 1 addition & 1 deletion docs/tutorial/detecting-pathogens/index.md
@@ -80,7 +80,7 @@ Creating the taxid mapping file (only needed for multiple reference genomes).

 ## Searching reads against the KMCP database

-kmcp search -d refs.kmcp/ sample_1.fq.gz sample_2.fq.gz -o sample.kmcp.tsv.gz
+kmcp search -w -d refs.kmcp/ sample_1.fq.gz sample_2.fq.gz -o sample.kmcp.tsv.gz

 ## Profiling

3 changes: 2 additions & 1 deletion docs/tutorial/index.md
@@ -1,4 +1,5 @@
 # Tutorials

 - [Taxonomic profiling](profiling)
-- [Sequence and genome searching](searching)
+- [Detecting specific pathogens](detecting-pathogens)
+- [Sequence and genome searching](searching)
14 changes: 6 additions & 8 deletions docs/tutorial/profiling/index.md
@@ -22,6 +22,7 @@ For example, removing adapters and trimming using [fastp](https://github.com/Ope
 fastp -i in_1.fq.gz -I in_2.fq.gz \
 -o out_1.fq.gz -O out_2.fq.gz \
 -l 75 -q 20 -W 4 -M 20 -3 20 --thread 32 \
+--trim_poly_g --poly_g_min_len 10 --low_complexity_filter \
 --html out.fastp.html

 ### Step 2. Removing host reads
@@ -30,24 +31,21 @@ Tools:

 - [bowtie2](https://github.com/BenLangmead/bowtie2) is [recommended](https://doi.org/10.1099/mgen.0.000393) for removing host reads.
 - [samtools](https://github.com/samtools/samtools) is also used for processing reads mapping file.
-- [pigz](https://zlib.net/pigz/) is a parallel implementation of `gzip`, which is much faster than `gzip`.

 Host reference genomes:

 - Human: [CHM13](https://github.com/marbl/CHM13). We also provide a database of CHM13 for fast removing human reads.

 Building the index (~60min):

-bowtie2-build --threads 32 GCA_009914755.3_CHM13_T2T_v1.1_genomic.fna.gz chm13
+bowtie2-build --threads 32 GCA_009914755.4_T2T-CHM13v2.0_genomic.fna.gz chm13v2.0

 Mapping and removing mapped reads:

-index=~/ws/db/bowtie2/chm13
+index=~/ws/db/bowtie2/chm13v2.0

-bowtie2 --threads 32 -x $index -1 in_1.fq.gz -2 in_2.fq.gz \
-| samtools view -buS -f 4 - \
-| samtools fastq - \
-| gzip -c > sample.fq.gz
+bowtie2 --threads 32 -x $index -1 in_1.fq.gz -2 in_2.fq.gz \
+| samtools fastq -f 4 -o sample.fq.gz -

 ### Step 3. Searching

@@ -209,7 +207,7 @@ Demo result:

 #### Searching on a computer cluster

-Update: We recommend analyzing one sample using one computer node, which is easier to setup up.
+<p style="color:Tomato;">Update: We recommend analyzing one sample using one computer node, which is easier to setup up.</p>


 Here, we split genomes of GTDB into 16 partitions and build a database for
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -7,8 +7,8 @@ nav:
 - Usage: usage.md
 - Tutorials:
 - Taxonomic profiling: tutorial/profiling/index.md
-- Sequence and genome searching: tutorial/searching/index.md
 - Detecting specific pathogens: tutorial/detecting-pathogens/index.md
+- Sequence and genome searching: tutorial/searching/index.md
 - Benchmarks:
 - Taxonomic profiling: benchmark/profiling/index.md
 - Sequence and genome searching: benchmark/searching/index.md