-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ATAC-Seq parameters #195
Comments
I Think it's better to continue this thread because my question is quite very close and the issue is still open I read that -f BAMPE was a good pratice . But then should it be used with --nomodel option ? and also should we call broad or narrow peaks ? For what it worth, this is what I'm using until now on my bam without duplicates:
|
answer my own question :I think with -f BAMPE , no need --nomodel option, it doesn't use it by default when BAMPE is set. |
From an experimental point of view, the 5' end of the fragments is the main interest, so it is better to treat them as single end and perform peak calling using '-f BAM --nomodel --shift -100 --extsize 200' See this thread: https://groups.google.com/d/msg/macs-announcement/4OCE59gkpKY/v9Tnh9jWriUJ |
why --shift -100 --extsize 200 ? |
The --shift -100 --extsize 200 option centers a 200 bp window on the Tn5 binding site, which is more accurate for ATAC-seq data (the same option is also applicable for DNase-seq data). The 5' ends of reads represent the Tn5 (and DNase I) cut sites; so the 5' ends of reads represent the most accessible regions. If you have paired-end ATAC-seq reads, many of your read pairs will span at least one nucleosome. If you look at plots of distribution of fragment length, you should see many fragments that don't span nucleosomes (less 150-200 bp), but you will also see many fragments that do span nucleosomes (>200-300 bp). Therefore, you don't want to run MACS2 with the default model building parameters meant for ChIP-seq. These model building parameters assume that the binding site is in the middle of the fragment, which is accurate for ChIP-seq, but not for ATAC-seq. If you use the ChIP-seq options for ATAC-seq, many of your peaks will be centered on nucleosomes instead of centered on the accessible regions. The reason why a 200 bp window is often used (--shift 100 --extsize 200), is that it is reasonable to assume that ATAC-seq peaks are at least 200 bp long because this is about the size of a nucleosome-free region with a single nucleosome removed. Some people may use --shift -75 --extsize 150, with the assumption that the length of an accessible region with a single nucleosome removed is about 150 bp, which is also reasonable. |
@kwcurrin Thank you for your answer. macs2 callpeak -t in.bam -n in -g hs -q 0.05 --nomodel -f BAMPE -B --broad |
Hi,
I have gone through the other two ATAC-seq issues. Accordingly, I tried one run with BAMPE and one without. here are the codes.
macs2 callpeak -t raw/dH1f_1.bam -n dh1f_1.broad -g hs -q 0.05 --nomodel --extsize 200 -f BAMPE -B --broad
callpeak -t raw/dh1f_2.bam -n dh1f_2.broad -g hs -q 0.05 --nomodel --shift -100 --extsize 200 -B --broad
Then I wanted to see how close their results are and the length of broadPeak files are as
1232 dh1f_1.broad_peaks.broadPeak 92323 dh1f.broad_peaks.broadPeak
There is a huge difference and I cannot make sure what parameters to use.
The text was updated successfully, but these errors were encountered: