A metagenomic and isolate assembly and analysis pipeline built with AMOS

metAMOS v0.34 README
Last updated: November 22nd 2011

NEWS:

*PhyloSift now supported
*meta-IDBA now supported
*Velvet-SC now supported

> SUMMARY
        * A/ HARDWARE REQUIREMENTS
        * B/ SOFTWARE REQUIREMENTS
        * C/ INSTALLING metAMOS
        * D/ QUICK START
        * E/ EXAMPLE OUTPUT
        * F/ CONTACT
        * G/ CITE

----------------------------------------------------------------------------------
A/ HARDWARE REQUIREMENTS

metAMOS was designed to work on any standard 64-bit Linux
environment. To use metAMOS for tutorial/teaching purposes, a minimum
of 8 GB of RAM is required. To get started on real data sets, a
minimum of 32 GB of RAM is recommended, and anywhere from 64-512 GB
may be necessary for larger datasets. In our experience, for most
50-60 million read datasets, 64 GB is a good place to start (for
reference, 68 GB of memory is available on a High Memory Instance at
Amazon Elastic Compute Cloud).
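
As a rough aid, the thresholds above can be checked against a machine
before starting a run. The following is a minimal sketch, not part of
metAMOS: the function names and tier labels are hypothetical, and it
assumes a Linux system that exposes /proc/meminfo.

```python
# Illustrative only: the thresholds come from the paragraph above; the
# function names and tier labels are hypothetical, not part of metAMOS.

def total_ram_gb(meminfo_path="/proc/meminfo"):
    """Return total RAM in GB, read from a Linux meminfo-style file."""
    with open(meminfo_path) as handle:
        for line in handle:
            if line.startswith("MemTotal:"):
                kib = int(line.split()[1])  # the value is reported in kB
                return kib / (1024.0 * 1024.0)
    raise RuntimeError("MemTotal not found in %s" % meminfo_path)


def memory_tier(ram_gb):
    """Map a RAM size in GB onto the usage tiers described above."""
    if ram_gb < 8:
        return "below the tutorial minimum"
    if ram_gb < 32:
        return "tutorial/teaching only"
    if ram_gb < 64:
        return "small real datasets"
    return "typical 50-60 million read datasets and larger"
```

Calling memory_tier(total_ram_gb()) gives a quick answer on the
current machine. Note that MemTotal is usually slightly lower than the
installed RAM, so a machine sold with exactly 8 GB may report just
under that figure.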

----------------------------------------------------------------------------------
B/ SOFTWARE REQUIREMENTS

The main prerequisites are python2.6+ and AMOS (available from
http://amos.sf.net). Once python2.6+ and AMOS are installed, there
should not be any other major prerequisites, as almost everything
that is needed is distributed with metAMOS inside the /Utilities
directory. However, metAMOS can incorporate some software into its
pipeline that we are not allowed to distribute, such as MetaGeneMark.
To get a license to use MetaGeneMark, please visit:
http://exon.gatech.edu/license_download.cgi.

----------------------------------------------------------------------------------
C/ INSTALLING metAMOS

To download the software, go to https://github.com/treangen/metAMOS
and click on Downloads. Once downloaded, simply unpack the files and
open the metAMOS directory. Once inside the metAMOS directory, run:

python INSTALL.py

This will download and install any external dependencies (you can
decline them by answering NO). The downloads may take minutes or
hours depending on your connection speed.

----------------------------------------------------------------------------------
D/ QUICK START

Before you get started using metAMOS, a brief review of its design
will help clarify its intended use. metAMOS has two main components:

1) initPipeline.py
2) runPipeline.py

The first component, initPipeline.py, is for creating new projects and
initializing sequence libraries. Currently, interleaved and
non-interleaved fasta, fastq, and SFF files are supported.

usage info:

(non-interleaved fastq, single library)
initPipeline.py -1 file.fastq.1 -2 file.fastq.2 -d projectDir -i 300:500 -q

(non-interleaved fasta, single library)
initPipeline.py -1 file.fasta.1 -2 file.fasta.2 -d projectDir -i 300:500 -f

(interleaved fastq, single library)
initPipeline.py -m file.fastq.12 -d projectDir -i 300:500 -q

(interleaved fastq, multiple libraries)
initPipeline.py -m file.fastq.12,file2.fastq.12 -d projectDir -i 300:500,1000:2000 -q

(interleaved fastq, multiple libraries, existing assembly)
initPipeline.py -m file.fastq.12,file2.fastq.12 -c file.contig.fa -d projectDir -i 300:500,1000:2000 -q
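
When projects are created from a script, the comma-separated library
and insert-size lists in the examples above are easy to get out of
step. The sketch below assembles the argument list from Python data
structures. The helper name init_pipeline_args is hypothetical, and it
uses only the flags shown in the usage examples (-1, -2, -m, -c, -d,
-i, -q, -f); SFF input is left out because its flag is not shown
above.

```python
# Illustrative helper, not part of metAMOS. It only reproduces the
# command lines shown in the usage examples above.

def init_pipeline_args(project_dir, inserts, interleaved=None, mates=None,
                       contigs=None, fmt="fastq"):
    """Build an argument list for initPipeline.py.

    inserts     -- list of (min, max) insert sizes, one pair per library
    interleaved -- list of interleaved read files (passed with -m)
    mates       -- one (file1, file2) pair (passed with -1 and -2)
    contigs     -- optional existing assembly (passed with -c)
    fmt         -- "fastq" (-q) or "fasta" (-f)
    """
    if (interleaved is None) == (mates is None):
        raise ValueError("give either interleaved files or one mate pair")
    if fmt not in ("fastq", "fasta"):
        raise ValueError("fmt must be 'fastq' or 'fasta'")

    args = ["initPipeline.py"]
    if interleaved is not None:
        if len(interleaved) != len(inserts):
            raise ValueError("need one insert-size range per library")
        args += ["-m", ",".join(interleaved)]
    else:
        if len(inserts) != 1:
            raise ValueError("a single mate pair takes one insert-size range")
        args += ["-1", mates[0], "-2", mates[1]]
    if contigs is not None:
        args += ["-c", contigs]
    args += ["-d", project_dir]
    args += ["-i", ",".join("%d:%d" % (lo, hi) for lo, hi in inserts)]
    args.append("-q" if fmt == "fastq" else "-f")
    return args
```

The returned list can be joined with spaces for display or handed to
subprocess.check_call, assuming initPipeline.py is executable and on
the PATH.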

The second component, runPipeline.py, takes a project directory as
input and runs the following steps by default:

1. Preprocess
2. Assemble
3. FindORFs
4. FindRepeats
5. Annotate
6. Scaffold
7. Propagate 
8. FindScaffoldORFs
9. Classify 
10. Postprocess

usage info:

usage: runPipeline.py [options] -d projectdir (required)
options:  -a <assembler> -k <kmer size> -f (forcestep) -s (skipstep) -p <num threads>  -v (verbose?) -t (filter reads?)


For example, to enable meta-IDBA as the assembler:

-a metaidba

And to use PhyloSift to annotate:

-c phylosift
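
The options above can be combined in a single call. As a minimal
sketch (the helper name run_pipeline_args is hypothetical, and only
the -d, -a, -c, -k and -p options mentioned above are used), a script
could build the command like this:

```python
# Illustrative helper, not part of metAMOS. Options left as None are
# omitted so that runPipeline.py falls back to its own defaults.

def run_pipeline_args(project_dir, assembler=None, annotator=None,
                      kmer=None, threads=None):
    """Build an argument list for runPipeline.py."""
    args = ["runPipeline.py", "-d", project_dir]
    if assembler is not None:
        args += ["-a", assembler]      # e.g. "metaidba"
    if annotator is not None:
        args += ["-c", annotator]      # e.g. "phylosift"
    if kmer is not None:
        args += ["-k", str(kmer)]
    if threads is not None:
        args += ["-p", str(threads)]
    return args
```

For example, run_pipeline_args("projectDir", assembler="metaidba",
annotator="phylosift") produces the equivalent of
"runPipeline.py -d projectDir -a metaidba -c phylosift".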

Any single step in the pipeline can be skipped by passing the
following parameter to runPipeline:

-n,--skipsteps=Step1,..

metAMOS reruns steps based on timestamp information, so if the input
files for a step in the pipeline haven't changed since the last run,
the step will be skipped automatically. However, you can force any
step in the pipeline to run by passing the following parameter to
runPipeline:

-f,--force=Step1,..
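
The timestamp rule described above works much like make. The sketch
below illustrates the general idea only and is not the actual metAMOS
implementation: the function names are hypothetical, and letting an
explicit skip take precedence over a force request is an assumption.

```python
import os

# Illustration of timestamp-based reruns; not the metAMOS source code.

def step_is_stale(inputs, outputs):
    """Return True if any output is missing or older than an input."""
    if not outputs or not all(os.path.exists(p) for p in outputs):
        return True                      # nothing produced yet
    if not inputs:
        return False                     # nothing that could have changed
    newest_input = max(os.path.getmtime(p) for p in inputs)
    oldest_output = min(os.path.getmtime(p) for p in outputs)
    return newest_input > oldest_output


def should_run(step, inputs, outputs, forced=(), skipped=()):
    """Decide whether a step runs, honoring skip and force lists."""
    if step in skipped:                  # assumption: skip wins over force
        return False
    if step in forced:
        return True
    return step_is_stale(inputs, outputs)
```

With this rule, touching a read file after an assembly has finished
makes the Assemble step stale again, while an unchanged project
directory results in no work on the next run.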

Upon completion, all of the final results will be stored in the
Postprocess/out directory. A third component, createReport.py, takes
this directory (or multiple Postprocess/out directories) as input and
generates an HTML page with summary statistics and a few static
plots.

----------------------------------------------------------------------------------
E/ EXAMPLE OUTPUT

http://www.cbcb.umd.edu/software/metamos/report.krona.html

Krona publication: Ondov BD, Bergman NH, Phillippy AM. Interactive
metagenomic visualization in a Web browser. BMC Bioinformatics. 2011
Sep 30;12:385. PMID: 21961884

----------------------------------------------------------------------------------
F/ CONTACT

To report bugs or send complaints and feature requests, contact:

Todd Treangen: [email protected]
Sergey Koren: [email protected]

----------------------------------------------------------------------------------
G/ CITE

Coming soon!