Select AutoPVS1 transcript as final variant annotation #145

Merged on Sep 8, 2023 (3 commits)

rjcorb (Collaborator) commented on Sep 8, 2023

Purpose/implementation Section

What feature is being added or bug is being addressed?

Closes #144. This PR modifies the AutoGVP and annotation-filtering scripts so that the AutoPVS1 transcript annotation is retained as the final output annotation for each variant.

What was your approach?

  • Modified 01-annotate_variants_CAVATICA_input.R and annotate_variants_custom_input.R to retain the AutoPVS1 Feature column.
  • Modified 04-filter_gene_annotations.R to filter the annotated VCF down to only the vcf_id-Feature pairs present in the AutoGVP output.
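
For reviewers who want a quick picture of the filtering step, here is a minimal shell sketch of the idea. It is not the code in 04-filter_gene_annotations.R (which is R); the function name `filter_by_pairs` and the file layout (tab-separated, with vcf_id in column 1 and Feature in column 2 of both files) are assumptions made for illustration only.

```shell
# Hypothetical helper, for illustration only: keep the header and every row of
# an annotation table whose (vcf_id, Feature) pair also appears in the pairs file.
# Assumed layout: both files are tab-separated, vcf_id in column 1, Feature in column 2.
filter_by_pairs() {
  # $1 = pairs file (e.g. taken from AutoGVP output), $2 = annotation table
  awk -F'\t' '
    NR == FNR { keep[$1 FS $2] = 1; next }   # first file: remember each pair
    FNR == 1 || (($1 FS $2) in keep)         # second file: print header and matching rows
  ' "$1" "$2"
}
```

Calling `filter_by_pairs pairs.tsv annotations.tsv` would print the annotation header followed by only the rows for transcripts selected in the pairs file.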

What GitHub issue does your pull request address?

#144

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Please run the shell script on both the PBTA and custom test files:

bash run_autogvp.sh --workflow="cavatica" \
--vcf=input/test_pbta.single.vqsr.filtered.vep_105.vcf \
--filter_criteria='INFO/AF>=0.2 INFO/DP>=15 (gnomad_3_1_1_AF_non_cancer<0.01|gnomad_3_1_1_AF_non_cancer=".")' \
--intervar=input/test_pbta.hg38_multianno.txt.intervar \
--multianno=input/test_pbta.hg38_multianno.txt \
--autopvs1=input/test_pbta.autopvs1.tsv \
--outdir=../results \
--out="test_pbta"

bash run_autogvp.sh --workflow="custom" \
--vcf=input/test_VEP.vcf \
--clinvar=input/clinvar.vcf.gz \
--intervar=input/test_VEP.hg38_multianno.txt.intervar \
--multianno=input/test_VEP.vcf.hg38_multianno.txt \
--autopvs1=input/test_autopvs1.txt \
--outdir=../results \
--out="test_custom"

Is there anything that you want to discuss further?

There are rare instances (I believe only in the custom test files) in which variants are annotated as intergenic by VEP but have a transcript annotation from AutoPVS1. Because the AutoPVS1 transcript is not found in the VEP VCF file, the annotation columns for these variants are NA in the final output. I plan to run this on larger data sets to determine whether this happens only with intergenic variants, in which case we can annotate them as such in the final output.
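
One way to survey how often this happens on a larger data set is to list the vcf_id-Feature pairs that have no matching row in the VEP-derived annotation table. The sketch below is a hypothetical check, not part of this PR; `find_unmatched_pairs` and the assumed file layout (tab-separated with a header row, vcf_id in column 1, Feature in column 2) are illustrative.

```shell
# Hypothetical check, for illustration only: print the (vcf_id, Feature) rows of
# the pairs file that have no matching row in the annotation table.
# Assumed layout: both files are tab-separated with a header row,
# vcf_id in column 1 and Feature in column 2.
find_unmatched_pairs() {
  # $1 = pairs file (e.g. taken from AutoGVP output), $2 = annotation table
  awk -F'\t' '
    NR == FNR { if (FNR > 1) seen[$1 FS $2] = 1; next }  # annotation table: record pairs, skip header
    FNR > 1 && !(($1 FS $2) in seen)                     # pairs file: print rows with no match
  ' "$2" "$1"
}
```

Each printed row would correspond to a variant whose annotation columns end up as NA, so the list can be cross-checked against the VEP consequence for those variants.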

Documentation Checklist

  • The function has examples to showcase the usage

naqvia (Collaborator) left a comment

The logic and code look good to me, and it runs as expected. I appreciate your use of glue::glue, and I like the idea of using an input file, output_colnames.tsv. I am assuming this would be the default version and that we will eventually allow users to define their own file? Not crucial at the moment, but if so, we'd probably need to amend the README!

rjcorb (Collaborator, Author) commented on Sep 8, 2023

Hmm, good point... I suppose we could add instructions for modifying output_colnames. It's pretty comprehensive now, but there may be other annotations we're not capturing here.

rjcorb merged commit 0fc4932 into main on Sep 8, 2023
1 check passed
rjcorb deleted the rjcorb/144-select-autopvs1-transcript branch on September 8, 2023, 16:04
Linked issue: Feature request: use AutoPVS1 transcript annotation in final output