homozygous reference positions for FP assessment #10

Open
5mec opened this issue Apr 15, 2020 · 1 comment
Comments

5mec commented Apr 15, 2020

Hi there,

I'm looking for the data described in your Genome Res. 2017 paper as:
“… we identified 2,737,246,156 positions that are homozygous reference across the pedigree. These positions can be used to calculate false positive rates when assessing variant calling pipelines.”

Could you please direct me to the correct file for hg38?

From the description of the Confident Regions at https://github.com/Illumina/PlatinumGenomes/wiki/Confident-regions I can't tell whether this is the homozygous reference data, because the first and second paragraphs on that page are confusing when read together.

Thanks for your help

helen

Member

blmoore commented Apr 15, 2020

Confident regions contain both homozygous reference regions and the validated variant sites (i.e. records in the truthset VCFs). So if you subtract the NA12878 + NA12877 truthset records from the confident region BED files, you'll be left with just the hom-ref regions.
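To illustrate the subtraction, here is a minimal Python sketch working on in-memory intervals. It is not the project's own tooling: the function names `vcf_record_span` and `subtract_sites` are hypothetical, and removing the full REF allele span of each record (rather than just POS) is an assumption on my part. Regions are taken to be BED-style 0-based half-open intervals, and VCF POS is 1-based.

```python
from collections import defaultdict


def vcf_record_span(pos, ref):
    """Convert a 1-based VCF POS and REF allele to a 0-based half-open span.

    Assumption: the whole REF allele span is removed, not only POS.
    """
    start = pos - 1
    return start, start + len(ref)


def subtract_sites(regions, sites):
    """Remove variant spans from confident regions.

    regions: iterable of (chrom, start, end), 0-based half-open (BED style).
    sites:   iterable of (chrom, start, end) spans to remove.
    Returns the sorted list of remaining (chrom, start, end) intervals.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in sites:
        by_chrom[chrom].append((start, end))
    for spans in by_chrom.values():
        spans.sort()

    remaining = []
    for chrom, start, end in sorted(regions):
        cursor = start  # first position not yet emitted or removed
        for s, e in by_chrom.get(chrom, []):
            if e <= cursor:
                continue  # span lies entirely before the cursor
            if s >= end:
                break  # sorted spans: nothing further can overlap
            if s > cursor:
                remaining.append((chrom, cursor, s))
            cursor = max(cursor, e)
            if cursor >= end:
                break
        if cursor < end:
            remaining.append((chrom, cursor, end))
    return remaining


if __name__ == "__main__":
    # Toy data, not taken from the Platinum Genomes files.
    confident = [("chr1", 100, 200)]
    truth = [
        ("chr1", *vcf_record_span(150, "A")),    # SNV at POS 150
        ("chr1", *vcf_record_span(180, "ACG")),  # deletion-like record, 3 bp REF
    ]
    hom_ref = subtract_sites(confident, truth)
    print(hom_ref)
    print(sum(end - start for _, start, end in hom_ref))  # 96 positions
```

For whole-genome files, an interval tool such as `bedtools subtract` is probably the more practical route. The sketch only shows the logic, and the records from both truthset VCFs would need to be passed in together as `sites`.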

hg38 bed files are here:
