Issue with running multi-plot option #1

Open
tania-k opened this issue Nov 19, 2022 · 2 comments

tania-k commented Nov 19, 2022

Hello,
Would you mind clarifying what the input list for the multi-population plotting option should contain?
When I pass a file with a single column holding just my species IDs (which are also present in my VCF file), I receive this error:

FileList Format wrong, should be (Two columns) :
Stat.FilePath.stat.gz PopulationIDA

What do the "Stat.FilePath.stat.gz" files look like? Are these gzipped DNA files for each species, listed with the path where they are found?

Thank you for your time.
Tania


tania-k commented Nov 19, 2022

For anyone who ends up here looking for the same answer:
the instructions are buried in the output of the help option.

2). Multiple populations
This is a common situation in LD decay analysis. For example, suppose the VCF file contains 50 samples (wild1, wild2, wild3...wild25, cul1, cul2, cul3...cul25).
To compare the LD decay of the two groups (wild vs cultivation), first put the sample names of each group into its own list file; names in a column or in a row are both ok.

         ./bin/PopLDdecay -InVCF  In.vcf.gz  -OutStat  wild.stat.gz  -SubPop wildName.list
         ./bin/PopLDdecay -InVCF  In.vcf.gz  -OutStat   cul.stat.gz  -SubPop culName.list
         #   create muti.list manually (format in Note B below)
         perl bin/Plot_MutiPop.pl -inList  muti.list  -output  OutputPrefix

Note:
  A. The <wildName.list> can be written as follows (column or row is ok):
                  wild1
                  wild2
                  ...
                  wild25
  B. The <muti.list> file has two columns, the file path of the population result and the population flag, such as:
                  /ifshk7/BC_PS/Lddecay/wild.stat.gz   wild
                  /ifshk7/BC_PS/Lddecay/cul.stat.gz    cultivation
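
As a minimal sketch of Note B, the list can be written and checked from the shell before it is passed to Plot_MutiPop.pl. This is not part of the help text: the temporary directory and the variable names `workdir` and `bad_lines` are placeholders, and the check assumes the two columns are separated by whitespace, as in the example above.

```shell
#!/bin/sh
# Hypothetical working directory; point this at the folder holding your *.stat.gz files.
workdir=$(mktemp -d)

# One line per population: path to its stat file, then the population flag.
printf '%s %s\n' "$workdir/wild.stat.gz" wild        >  "$workdir/muti.list"
printf '%s %s\n' "$workdir/cul.stat.gz"  cultivation >> "$workdir/muti.list"

# Count the lines that do not have exactly two whitespace-separated fields.
# A one-column list of sample names would be counted here.
bad_lines=$(awk 'NF != 2 { n++ } END { print n + 0 }' "$workdir/muti.list")
echo "lines with the wrong number of columns: $bad_lines"
```

A one-column file of sample IDs belongs to -SubPop, not to -inList, which is why the original attempt produced the "FileList Format wrong" message.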

3). One population with multi-chr
One population with multiple chromosome VCF files. For example, if there are 3 chromosome VCF files (Chr1, Chr2 and Chr3) as the input:

        ./bin/PopLDdecay -InVCF  Chr1.vcf.gz  -OutStat  Chr1.stat.gz
        ./bin/PopLDdecay -InVCF  Chr2.vcf.gz  -OutStat  Chr2.stat.gz
        ./bin/PopLDdecay -InVCF  Chr3.vcf.gz  -OutStat  Chr3.stat.gz
        ls  `pwd`/Chr*.stat.gz   > chr.list
        perl bin/Plot_OnePop.pl -inList  chr.list  -output  OutputPrefix

Note:
A. The per-chromosome statistics files can be calculated in parallel.
B. The file list stores only the file paths, which differs from the multi-population list.
C. It generates the file 'OutputPrefix.bin', the summary statistics file of all chromosomes, in the same format as the per-chromosome statistics files.
D. The <chr.list> file can be generated with the command above: ls `pwd`/Chr*.stat.gz > chr.list
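
Notes B and D can be sketched as follows: a loop over a glob writes one absolute path per line, which is the one-column form that Plot_OnePop.pl expects. The empty files created here are stand-ins so the sketch runs without real PopLDdecay output, and `workdir` and `extra_cols` are hypothetical names, not something from the help text.

```shell
#!/bin/sh
# Hypothetical working directory with stand-in (empty) statistics files.
workdir=$(mktemp -d)
for chr in Chr1 Chr2 Chr3; do
    : > "$workdir/$chr.stat.gz"
done

# One absolute path per line, no population flag.
for f in "$workdir"/Chr*.stat.gz; do
    printf '%s\n' "$f"
done > "$workdir/chr.list"

# Count the lines that carry anything other than a single field.
extra_cols=$(awk 'NF != 1 { n++ } END { print n + 0 }' "$workdir/chr.list")
echo "lines with extra columns: $extra_cols"
```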

4). Multiple populations with multi-chr
Multiple populations with multiple chromosome VCF files. For example, if there are 2 chromosome VCF files (Chr1, Chr2) as the input:

         ./bin/PopLDdecay -InVCF  Chr1.vcf.gz  -OutStat  W.Chr1.stat.gz -SubPop wildName.list
         ./bin/PopLDdecay -InVCF  Chr2.vcf.gz  -OutStat  W.Chr2.stat.gz -SubPop wildName.list
         ./bin/PopLDdecay -InVCF  Chr1.vcf.gz  -OutStat  C.Chr1.stat.gz -SubPop culName.list
         ./bin/PopLDdecay -InVCF  Chr2.vcf.gz  -OutStat  C.Chr2.stat.gz -SubPop culName.list
         ls  `pwd`/W.Chr*.stat.gz   > W.chr.list
         perl bin/Plot_OnePop.pl -inList  W.chr.list  -output  Wild.cat
         ls  `pwd`/C.Chr*.stat.gz   > C.chr.list
         perl bin/Plot_OnePop.pl -inList  C.chr.list  -output  Cul.cat
         perl bin/Plot_MutiPop.pl -inList  muti.list  -output  OutputPrefix

Note:
 A. The <muti.list> file has two columns, the file path of the population result and the population flag, such as:
                  /ifshk7/BC_PS/Lddecay/Wild.cat.bin    wild
                  /ifshk7/BC_PS/Lddecay/Cul.cat.bin     cultivation
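
With more populations or chromosomes, the four PopLDdecay calls above become repetitive. A rough sketch of generating them in a loop is shown here as a dry run that only prints the commands; the `prefix:list` pairing and the variable names are my own convention, while the flags and file names are the ones from the example.

```shell
#!/bin/sh
# Dry run: collect the commands as text instead of executing them.
cmds=$(
    for pop in W:wildName.list C:culName.list; do
        prefix=${pop%%:*}     # output prefix, e.g. W
        sublist=${pop#*:}     # sample list for -SubPop, e.g. wildName.list
        for chr in Chr1 Chr2; do
            echo "./bin/PopLDdecay -InVCF $chr.vcf.gz -OutStat $prefix.$chr.stat.gz -SubPop $sublist"
        done
    done
)
printf '%s\n' "$cmds"
```

Once the printed commands look right, they can be piped to `sh` or run one by one; the Plot_OnePop.pl and Plot_MutiPop.pl steps then follow as in the example.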

@hewm2008
Owner

Thanks for using it.
