Trying to understand the effects of varying inputs #249

Open
Laura-Alex opened this issue Oct 29, 2019 · 0 comments

Laura-Alex commented Oct 29, 2019

Hello,

I am probably using MEGAHIT in a way it wasn't designed for: my input is FASTA sequences that were first sorted out of metagenomic data by spacegraphcats. I am also passing everything with -r, even though some of my reads are paired, because I haven't yet figured out how to supply them as pairs. I'd like to understand the tool better. My issue is as follows. When my input is very small (say, 5 sequences, of which 4 cover my region of interest), I get the contig that I want (or at least I assume it's correct, because it's homologous to the other genes on my list). However, when I run it on a larger dataset that includes those same sequences, I no longer get that contig. It doesn't even seem to appear in 'intermediate_contigs'. I suspect this might be because I am trying to recover uncommon paralogues of low-abundance organisms.
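
One way to check the low-abundance suspicion might be to count how often each k-mer of the expected contig occurs in the input sequences. The sketch below is only an illustration in plain Python, not part of MEGAHIT: the function names, the choice of k, and the idea that a weakly supported k-mer could explain the missing contig are all assumptions.

```python
from collections import Counter

# Hypothetical helpers for illustration; none of this is MEGAHIT code.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(_COMPLEMENT.get(base, "N") for base in reversed(seq.upper()))


def canonical(kmer):
    """Treat a k-mer and its reverse complement as the same k-mer."""
    return min(kmer, revcomp(kmer))


def kmers(seq, k):
    """Yield the canonical k-mers of seq, left to right."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield canonical(seq[i:i + k])


def kmer_support(contig, reads, k):
    """For each k-mer position along contig, count its occurrences in reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return [counts[km] for km in kmers(contig, k)]
```

Calling `kmer_support(contig, reads, k)` with the contig from the small run and the sequences from the large run gives one count per position. A position with a very low count would be consistent with the low-abundance explanation, though it would not prove it.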

What is causing this, and is there a way to adjust the settings to do what I want? Help would be appreciated!
