The N50 is very short from soil sample #259

Open
lulunisrna opened this issue Jan 21, 2020 · 2 comments

Comments

@lulunisrna

lulunisrna commented Jan 21, 2020

Dear voutcn,

I have a problem with my MEGAHIT results. The N50 is very short, around 450-550 bp. My sample is soil from a plantation. I have read the same issue on this page, where you advised using --min-count 1, but it doesn't work for me. I have also tried running the assembly with --kmin-1pass, but the N50 is still too short, around 500 bp. Now I'm trying --presets meta-large for this assembly, and I hope I will get a good result. If the result is still bad, do you have any advice for fixing this problem? Thank you.
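For anyone comparing runs with different parameters, here is a minimal sketch of how N50 can be computed from a contig FASTA file. The helper names `n50` and `fasta_lengths` are made up for this illustration and are not part of MEGAHIT; the definition used is the usual one (the contig length at which the running total of lengths, sorted longest first, reaches half of the total assembly size).

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to half of the
        # assembly size defines the N50.
        if running >= half:
            return length
    return 0


def fasta_lengths(path):
    """Yield the sequence length of each record in a FASTA file."""
    length = None
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line.strip())
    if length is not None:
        yield length
```

Something like `n50(fasta_lengths("megahit_out/final.contigs.fa"))` would then report the value for one run; the output directory name here is only an assumption about how the assembly was launched.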

@franciscozorrilla

Hi, I've also been having a hard time finding suitable parameters for my soil datasets (#254). Did you manage to improve your N50 somehow?

@voutcn
Owner

voutcn commented Feb 23, 2020

Soil samples are hard to assemble for two reasons:

  1. Very high biodiversity (too many microorganisms), with many of them sequenced at very low depth
  2. Some dominant microorganisms can be sequenced at extremely high depth, which introduces a lot of sequencing errors

There is no solution to the first problem other than sequencing a lot more data.
For the second problem, normalization may help. See #239 (comment).
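To illustrate what normalization means here, below is a toy sketch of one common approach, digital normalization by median k-mer abundance: a read is kept only while the median count of its k-mers, among the reads kept so far, is still below a cutoff. This is an assumption about the general technique, not necessarily what the linked #239 comment recommends. The function names and the default `k` and `cutoff` values are hypothetical, and the sketch counts forward-strand k-mers in an in-memory dictionary, whereas dedicated tools such as khmer's normalize-by-median.py or BBNorm are built for real data sizes.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq (forward strand only)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads, k=20, cutoff=20):
    """Keep a read only if its median k-mer count so far is below cutoff.

    Toy illustration: k and cutoff are arbitrary example values, and
    reverse complements and paired reads are ignored.
    """
    counts = Counter()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read is shorter than k, nothing to score
        # Counter returns 0 for unseen k-mers without inserting them.
        if median(counts[w] for w in words) < cutoff:
            kept.append(read)
            counts.update(words)
    return kept
```

With ten copies of one read plus a single rare read and `cutoff=3`, only three copies of the abundant read are kept while the rare read survives, which is the intended effect: depth on dominant organisms is capped without discarding low-depth ones.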


3 participants