Documentation update
art-egorov committed Nov 4, 2024
1 parent 77dff1f commit 063ba73
Showing 9 changed files with 21 additions and 7 deletions.
12 changes: 9 additions & 3 deletions docs/ExampleDrivenGuide/cmd_guide.md
@@ -208,18 +208,24 @@ lovis4u -gff lovis4u_data/guide/gff_files -hl --set-category-colour -c A4p2 --ru

In addition to visualisation, *hmmscan* folder with search results is saved to the output directory. As you can see, LoVis4u replaces category and name attributes of CDSs that have hits with search. You can keep default names (labels) using `-kdn`, `--keep-default-name` parameter and default category with `-kdc`, `--keep-default-category` option. Also, if you want to show all labels for proteins with hits (for instance, DruM2 label is shown only for the first occurrence in the figure above) you can use `-salq`, `--show-all-labels-for-query` parameter.

In addition to visualisation, the *hmmscan* folder with search results is saved to the output directory. LoVis4u replaces the category and name attributes of CDSs that have hits. You can keep the default names (labels) using the `-kdn`, `--keep-default-name` parameter and the default category with the `-kdc`, `--keep-default-category` option. Also, if you want to show all labels for the proteins with hits (for instance, the DruM2 label is shown only for the first occurrence in the figure above), you can use the `-salq`, `--show-all-labels-for-query` parameter.

**Selecting defence system database**

Since for the defence systems we have two databases: PADLOC and DefenseFinder, a user can specify which one to use for annotation, while by default both are used. To do that you can use `-dm`, `--defence-models` parameter with one of the three option: *PADLOC*, *DefenseFinder* or *both*. In case a protein has a hit to both databases, target with lowest e-value is kept.
P2 phage is most suitable for demonstration of this parameter since *Tin* proteins model can be found only in PADLOC database, while *Old* protein has a lowest e-value for DefenseFinder database model. To choose only PADLOC database models for search you can use `-dm PADLOC`:

Since we have two defence system databases, PADLOC and DefenseFinder, it is possible to specify which to use for annotation (by default both are used). This is done with the `-dm`, `--defence-models` parameter, which takes one of three options: PADLOC, DefenseFinder or both. If a protein has a hit in both databases, the target with the lowest e-value is kept.
The P2 phage with its Tin/Old hotspot defence island is a good example for demonstrating this parameter, since the Tin protein model is only in the PADLOC database, while the Old protein has the lowest e-value for the DefenseFinder database model. To choose only PADLOC database models for search you can use `-dm PADLOC`:


```sh
lovis4u -gff lovis4u_data/guide/gff_files/NC_001895.1.gff --set-category-colour -c A4p2 \
--run-hmmscan -dm PADLOC -o lovis4u_hmmscan_PADLOC
```
![f4_hmmscan_p](cmd_guide/img/lovis4u_hmmscan_padloc.png){loading=lazy width="100%"}
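
The selection rule described above (restrict hits to the chosen database, then keep the lowest e-value target per protein) can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not LoVis4u's actual implementation; the `Hit` class, the `keep_best_hits` function and all e-values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A single search hit (hypothetical structure, for illustration only)."""
    protein_id: str
    database: str   # "PADLOC" or "DefenseFinder"
    target: str
    evalue: float


def keep_best_hits(hits, defence_models="both"):
    """Return one hit per protein: the lowest e-value among the allowed databases."""
    if defence_models == "both":
        allowed = {"PADLOC", "DefenseFinder"}
    else:
        allowed = {defence_models}
    best = {}
    for hit in hits:
        if hit.database not in allowed:
            continue  # database excluded by the -dm style choice
        current = best.get(hit.protein_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.protein_id] = hit
    return best


# Made-up e-values that mimic the P2 situation described in the text:
# Tin has a model only in PADLOC, Old scores better in DefenseFinder.
hits = [
    Hit("tin", "PADLOC", "Tin", 1e-40),
    Hit("old", "PADLOC", "Old", 1e-30),
    Hit("old", "DefenseFinder", "Old", 1e-50),
]

for mode in ("both", "PADLOC", "DefenseFinder"):
    chosen = keep_best_hits(hits, mode)
    print(mode, {pid: hit.database for pid, hit in chosen.items()})
```

With these toy values, restricting to DefenseFinder leaves the Tin protein without any annotation, which matches the behaviour the guide describes.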

Similarly, you can choose to use DefenseFinder models only with `-dm DefenseFinder`. And as we mentioned, in that case annotation of Tin protein is absent.
Similarly, you can choose to use DefenseFinder models only with `-dm DefenseFinder`. As mentioned above, in that case the Tin annotation is absent.

```sh
lovis4u -gff lovis4u_data/guide/gff_files/NC_001895.1.gff --set-category-colour -c A4p2 \
@@ -230,9 +236,9 @@ lovis4u -gff lovis4u_data/guide/gff_files/NC_001895.1.gff --set-category-colour

**How to use your own HMM models**

LoVis4u also allows to use your own HMM models. You can specify them using `-hmm, --add-hmm-models <folder_path [name]>` parameter. Folder should contain files in HMMER format (one file per model). Usage: `-hmm path [name]`. Specifying name is optional, by default it will be taken from them folder name. If you want to add multiple hmm databases you can use this argument several times: `-hmm path1 [name1] -hmm path2 [name2] ....`.
LoVis4u also allows the use of your own HMM models. You can specify these using `-hmm, --add-hmm-models <folder_path [name]>`. The folder should contain files in HMMER format (one file per model). Usage: `-hmm path [name]`. Specifying the name is optional; by default it will be taken from the folder name. If you want to add multiple HMM databases you can use this argument multiple times: `-hmm path1 [name1] -hmm path2 [name2] ...`.

Finally, if you want to force to search only against your models excluding default set, you can add `-omh`, `--only-mine-hmms` parameter in addition to `-hmm` option.
Finally, if you want to search only against your models, excluding the default set, you can add the `-omh, --only-mine-hmms` parameter in addition to the `-hmm` option.
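
To make the repeated `-hmm path [name]` pattern concrete, here is a small Python sketch that assembles such a command line. The helper `build_hmm_args` and the folder paths are hypothetical, and including `--run-hmmscan` alongside `-hmm` is an assumption carried over from the earlier examples in this guide rather than something this section states.

```python
import shlex


def build_hmm_args(models, only_mine=False):
    """Build arguments for one or more custom HMM folders.

    models: list of (folder_path, name) pairs; name may be None, in which
    case the name is left out and taken from the folder name by default.
    """
    args = ["--run-hmmscan"]  # assumption: search is enabled as in earlier examples
    for path, name in models:
        args.append("-hmm")
        args.append(path)
        if name is not None:
            args.append(name)
    if only_mine:
        args.append("-omh")  # search only against the custom models
    return args


# Hypothetical model folders, for illustration only.
command = ["lovis4u", "-gff", "lovis4u_data/guide/gff_files"] + build_hmm_args(
    [("my_models/defence", "my_defence"), ("my_models/other", None)],
    only_mine=True,
)
print(shlex.join(command))
```

The printed string can be pasted into a shell; the second folder is given without a name to show the optional form.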

## Other LoVis4u features

Binary file modified docs/img/lovis4u_pipeline.png
Binary file added docs/img/lovis4u_pipeline_old1.png
4 changes: 4 additions & 0 deletions lovis4u/DataProcessing.py
@@ -631,6 +631,8 @@ def save_locus_annotation_table(self) -> None:
        try:
            if not os.path.exists(self.prms.args["output_dir"]):
                os.mkdir(self.prms.args["output_dir"])
            locus_ids = [locus.seq_id for locus in self.loci]
            self.locus_annotation = self.locus_annotation.loc[self.locus_annotation.index.isin(locus_ids)]
            file_path = os.path.join(self.prms.args["output_dir"], "locus_annotation_table.tsv")
            self.locus_annotation.to_csv(file_path, sep="\t", index_label="sequence_id")
            if self.prms.args["verbose"]:
@@ -651,6 +653,8 @@ def save_feature_annotation_table(self) -> None:
        try:
            if not os.path.exists(self.prms.args["output_dir"]):
                os.mkdir(self.prms.args["output_dir"])
            feature_ids = [feature.feature_id for locus in self.loci for feature in locus.features]
            self.feature_annotation = self.feature_annotation.loc[self.feature_annotation.index.isin(feature_ids)]
            file_path = os.path.join(self.prms.args["output_dir"], "feature_annotation_table.tsv")
            self.feature_annotation.to_csv(file_path, sep="\t", index_label="feature_id")
            if self.prms.args["verbose"]:
1 change: 1 addition & 0 deletions lovis4u/lovis4u_data/A4L.cfg
@@ -51,6 +51,7 @@ hmm_config_names = hmm_defence_df,hmm_defence_padloc,hmm_virulence,hmm_anti_defe
database_names = defence (DefenseFinder),defence (PADLOC),virulence,anti-defence,AMR
defence_models = both
only_mine_hmms = False
hmm_models = False

;[Paths]
palette = {internal}/palette.txt
5 changes: 3 additions & 2 deletions lovis4u/lovis4u_data/A4p1.cfg
@@ -32,8 +32,8 @@ mmseqs_s = 7
;[pyhmmer parameters and additional annotation]
run_hmmscan_search = False
hmmscan_evalue = 1e-3
hmmscan_query_coverage_cutoff = 0.8
hmmscan_hmm_coverage_cutoff = 0.7
hmmscan_query_coverage_cutoff = 0.7
hmmscan_hmm_coverage_cutoff = 0.65
update_protein_name_with_target_name = True
update_category_with_database_name = True
show_label_on_first_occurrence_for_query_proteins = True
@@ -51,6 +51,7 @@ hmm_config_names = hmm_defence_df,hmm_defence_padloc,hmm_virulence,hmm_anti_defe
database_names = defence (DefenseFinder),defence (PADLOC),virulence,anti-defence,AMR
defence_models = both
only_mine_hmms = False
hmm_models = False

;[Paths]
palette = {internal}/palette.txt
3 changes: 2 additions & 1 deletion lovis4u/lovis4u_data/A4p2.cfg
@@ -51,6 +51,7 @@ hmm_config_names = hmm_defence_df,hmm_defence_padloc,hmm_virulence,hmm_anti_defe
database_names = defence (DefenseFinder),defence (PADLOC),virulence,anti-defence,AMR
defence_models = both
only_mine_hmms = False
hmm_models = False

;[Paths]
palette = {internal}/palette.txt
@@ -127,7 +128,7 @@ groups_stroke_colours_alpha = 1
feature_label_font_size = 6.5
feature_label_gap = 0.7
feature_label_font_face = regular
gap_between_regions = 6
gap_between_regions = 10

;[Homology_track]
homology_fill_colour = lightgrey
1 change: 1 addition & 0 deletions lovis4u/lovis4u_data/standard.cfg
@@ -51,6 +51,7 @@ hmm_config_names = hmm_defence_df,hmm_defence_padloc,hmm_virulence,hmm_anti_defe
database_names = defence (DefenseFinder),defence (PADLOC),virulence,anti-defence,AMR
defence_models = both
only_mine_hmms = False
hmm_models = False

;[Paths]
palette = {internal}/palette.txt
2 changes: 1 addition & 1 deletion setup.py
@@ -18,7 +18,7 @@ def package_files(directory):
extra_files.append("../docs/pypi.md")

setuptools.setup(name="lovis4u",
version="0.0.11",
version="0.0.11.1",
python_requires='>=3.8',
description="Loci Visualisation Tool.",
url="https://art-egorov.github.io/lovis4u/",
