This is an interesting one! I suspect the problem comes from the fact that Filtlong currently reads through the input file twice: once to score the reads and once to output them. I did it that way so that Filtlong doesn't have to keep the read sequences in memory, just info about the reads. So a 10 GB read file should work on a machine with 8 GB of memory.
I think that with your process substitution, Filtlong is reading through the reads fine the first time through but is getting nothing the second time through. I'm no Linux whiz, but I don't see a way around this one.
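A minimal sketch of that suspected mechanism, not Filtlong's actual code: the hypothetical `two_pass` helper below reopens its input path twice and counts lines each time, once on a regular file and once on a pipe. The toy FASTQ record and all names are made up for illustration. Bash process substitution typically hands the program a pipe-backed path such as `/dev/fd/63`, which should behave like the piped case here.

```sh
#!/bin/sh
# Hypothetical helper (not part of Filtlong): reopen "$1" twice and count
# its lines each time, mimicking a tool that makes two passes over its input.
two_pass() {
    first=$(wc -l < "$1" | tr -d ' ')
    second=$(wc -l < "$1" | tr -d ' ')
    echo "pass1=$first pass2=$second"
}

# Emit one made-up FASTQ record (4 lines) as toy input.
make_reads() {
    printf '@read1\nACGT\n+\nIIII\n'
}

# Regular file: it can be reopened, so both passes see all 4 lines.
tmp=$(mktemp)
make_reads > "$tmp"
from_file=$(two_pass "$tmp")
rm -f "$tmp"

# Pipe: the first pass drains it, so the second pass hits end-of-file at once.
from_pipe=$(make_reads | two_pass /dev/stdin)

echo "file: $from_file"
echo "pipe: $from_pipe"
```

On a typical Linux system the file case reports 4 lines on both passes, while the pipe case reports 4 lines and then 0. That would match a run that scores the reads but writes nothing.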
The best solution that comes to mind is to have Filtlong accept multiple input files in a single command. Then you could achieve what you want without process substitution.
I tried the following, which seems to process the reads OK but doesn't output anything:
filtlong -1 R1.fastq.gz --min_length 1000 --target_bases 100000 <(cat *.fq)
I can cat the fq files into a temporary file first, but that takes forever because the libraries are many GB each, and I'd just delete the tmp file immediately anyway.
The other alternative is to run Filtlong separately for each library, but reading in the references also takes a long time for each one. It would be nice if it output data when using process substitution on the input.
Thanks