Graph of recombination events algorithm #142

Open
valery-shap opened this issue Apr 11, 2022 · 4 comments

Comments

@valery-shap

Hello,

I have data from a Gubbins GFF file with coordinates (Start, Stop, and snp_count). I've downloaded this data and got the graph with peaks at the bottom. Could you please explain what the y-value is and how it is calculated? I tried to reproduce the same distribution and got a different result.

Best regards,
Valery

@jameshadfield
Owner

Hi Valery, the y-axis is the number of recombination events observed which cover that particular position in the genome. Note that an event may involve many strains but only be counted once (the red blocks indicate events which are inferred on ancestral nodes and therefore involve more than one tip).
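
For anyone trying to reproduce the line graph outside phandango, here is a minimal sketch of that definition, not phandango's actual code. It assumes each feature row of the Gubbins GFF is one recombination event, with start and end in columns 4 and 5 (1-based, inclusive). The function names `parse_gff_intervals` and `event_depth` and the toy rows are made up for illustration.

```python
def parse_gff_intervals(gff_text):
    """Return (start, end) for every feature row; each row is assumed to be one event."""
    intervals = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF header/comment lines
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a feature row
        intervals.append((int(fields[3]), int(fields[4])))
    return intervals


def event_depth(intervals, genome_length):
    """Number of events covering each 1-based position (index 0 is unused)."""
    diff = [0] * (genome_length + 2)
    for start, end in intervals:
        start = max(start, 1)
        end = min(end, genome_length)
        if start > end:
            continue
        diff[start] += 1      # event begins covering at `start`
        diff[end + 1] -= 1    # and stops covering after `end`
    depth = [0] * (genome_length + 1)
    running = 0
    for pos in range(1, genome_length + 1):
        running += diff[pos]
        depth[pos] = running
    return depth


# Toy input: three hypothetical events on a 10 bp genome.
toy_rows = [
    ["SEQ", "GUBBINS", "CDS", "2", "5", "0.000", ".", "0", 'node="N1->N2"'],
    ["SEQ", "GUBBINS", "CDS", "4", "8", "0.000", ".", "0", 'node="N2->A"'],
    ["SEQ", "GUBBINS", "CDS", "4", "6", "0.000", ".", "0", 'node="N2->B"'],
]
toy_gff = "##gff-version 3\n" + "\n".join("\t".join(r) for r in toy_rows)
toy_depth = event_depth(parse_gff_intervals(toy_gff), genome_length=10)
```

The value at each position depends only on how many event intervals overlap it, not on `snp_count` or on how many strains each event involves.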

@valery-shap
Author

valery-shap commented Apr 12, 2022

Sorry, one more clarifying question. By the number of recombination events, do you mean exactly the number of strains that were detected in the regions of recombination at that position? That can't be the case, because I have nearly 300 isolates and there are positions where all isolates are involved, yet the maximum value on the graph is close to 100. Is the count perhaps done within some window, or is it a relative value?
I had supposed that the y-axis is the mean number of SNPs, calculated as follows:

  1. Compute the SNP density within each recombination region, so that every position within the region has this density.
  2. Sum these densities across the isolates that have this recombination region.
  3. Take the summed density at every position.

@jameshadfield
Owner

> Do you mean exactly the number of strains that were detected in the regions of recombination on that position?

No, the number of recombination events. If a recombination event is inferred to happen at an ancestral node then one event will involve multiple strains. The Gubbins paper details how these are inferred, phandango just visualises them.

> I supposed that y-axis is the mean number of snps, that was counted:

No. It is the number of recombination events.
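
To illustrate the difference between counting events and counting strains, here is a small sketch under stated assumptions. It assumes the last GFF column holds `key="value"` attributes and that a `taxa` attribute lists the strains involved in the event, separated by whitespace. The helper names and toy rows are hypothetical, and this is not phandango's code.

```python
import re


def parse_attributes(attr_field):
    """Parse key="value" pairs from the last GFF column (assumed format)."""
    return dict(re.findall(r'(\w+)="([^"]*)"', attr_field))


def events_and_strains_at(position, gff_rows):
    """Return (number of events, number of distinct strains) covering a position."""
    events = 0
    strains = set()
    for row in gff_rows:
        fields = row.split("\t")
        start, end = int(fields[3]), int(fields[4])
        if start <= position <= end:
            events += 1  # one row is counted as one event
            strains.update(parse_attributes(fields[8]).get("taxa", "").split())
    return events, len(strains)


# Toy rows: one event on an ancestral branch (three strains), one on a tip.
toy_rows = [
    "\t".join(["SEQ", "GUBBINS", "CDS", "100", "200", "0.000", ".", "0",
               'node="N1->N2";taxa="A B C";snp_count="12"']),
    "\t".join(["SEQ", "GUBBINS", "CDS", "150", "250", "0.000", ".", "0",
               'node="N2->D";taxa="D";snp_count="7"']),
]
at_120 = events_and_strains_at(120, toy_rows)  # (1, 3): one event, three strains
at_160 = events_and_strains_at(160, toy_rows)  # (2, 4): two events, four strains
```

In this toy case position 120 is covered by three strains but only one event, which is why the plotted value can be far below the number of isolates involved.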

@valery-shap
Author

OK, thank you, I'll try. So it is specific terminology from Gubbins. Returning to the output GFF file and the Phandango graph: does it count exactly the number of rows where the value in the last column, "node=", is an ancestral node?
