Commit
Showing 7 changed files with 132 additions and 62 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
|
||
# Description of the data for the MetaPathways 2.0 data product | ||
|
||
1. Sample_Name/preprocessed/ # the preprocessed folder | ||
Sample_Name.fasta : the preprocessed FASTA sequences that remain after filtering the input sequences
by the length cutoffs and removing ambiguous base pairs | ||
Sample_Name.mapping.txt : maps the original names of the input sequences to uniform names
in the format Sample_Name_x, where
x is the contig number | ||
|
||
|
||
2. Sample_Name/orf_prediction/: this folder contains the output of Prodigal | ||
Sample_Name.fna : Open Reading Frames (ORFs) detected by Prodigal | ||
Sample_Name.faa : the translated ORFs | ||
Sample_Name.gff : a GFF file containing the ORF information produced by Prodigal | ||
Sample_Name.qced.faa : these are QCed amino acid sequences from file Sample_Name.faa after removing | ||
sequences that are too short. The names of the ORFs are in the format | ||
Sample_Name_x_y, where | ||
x is the contig number and y is the ORF number in contig Sample_Name_x | ||
Sample_Name.unannot.gff : created from the Sample_Name.gff file after removing the filtered sequences | ||
|
||
3. Sample_Name/blast_results/: this folder contains the BLAST or LAST results | ||
Sample_Name.dbname.LASTout : tabular format of the LAST results of the Sample_Name.qced.faa file against the | ||
database dbname | ||
Sample_Name.dbname.BLASTout : tabular format of the BLAST results of the Sample_Name.qced.faa file against the | ||
database dbname | ||
Sample_Name.dbname.BLASTout.parsed.txt : tabular format of the parsed BLAST results of the Sample_Name.qced.faa file against the | ||
database dbname | ||
Sample_Name.dbname.LASTout.parsed.txt : tabular format of the parsed LAST results of the Sample_Name.qced.faa file against the | ||
database dbname | ||
|
||
4. Sample_Name/genbank/: this folder contains the annotated GFF and GenBank files | ||
Sample_Name.annot.gff : the annotated file created from Sample_Name.unannot.gff (folder 2 above) by combining
the annotations created from the parsed LAST/BLAST files | ||
Sample_Name.gbk : the GenBank file created by putting the annotations into it | ||
|
||
5. Sample_Name/mltreemap_calculations/: folder for the MLTreeMap result. Currently, it is empty | ||
|
||
|
||
6. Sample_Name/ptools/: this folder holds the input that Pathway Tools uses to create the PGDB | ||
|
||
7. Sample_Name/results/: this is the results folder | ||
annotation_table : folder that has the annotation information tables | ||
megan : folder to drop the megan input file | ||
mltreemap : mltreemap summary results | ||
pgdb : the zipped PGDB file that needs to be unzipped in the ptools-local/pgdb/user/ folder | ||
Sample_Name.pathways.txt : provides the list of pathways in the column format | ||
PWY_NAME PWY_COMMON_NAME NUM_REACTIONS NUM_COVERED_REACTIONS ORF_COUNT | ||
note that at the end of each line (after the column ORF_COUNT) all the ORFs names are appended | ||
rRNA : the rRNA scan stats file | ||
sequin : not used currently | ||
tRNA : the tRNA scan stats file | ||
|
||
8. Sample_Name/run_statistics/: this folder has information about the run stats | ||
Sample_Name.amino.stats : the amino acid stats file before and after filtering the translated ORFs | ||
Sample_Name.contig.lengths.txt : the contig length distribution file | ||
Sample_Name.nuc.stats : the nucleotide sequence stats file before and after filtering the translated ORFs | ||
Sample_Name.orf.lengths.txt : the ORF length stats file | ||
run_parameters.txt : this file stores the parameters that were used for the run |
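
The naming scheme Sample_Name_x_y and the column layout of Sample_Name.pathways.txt described above can be read programmatically. The following is a minimal Python sketch, not part of MetaPathways itself. It assumes that x and y are integers, that the pathways file is tab-delimited, and that the ORF names appended after ORF_COUNT are separate tab-delimited fields; the description above only states the column order, so adjust the delimiter if the real files differ. The names parse_orf_name, parse_pathway_line, OrfId and PathwayRow are hypothetical helpers introduced for illustration.

```python
import re
from typing import List, NamedTuple

# Assumption: an ORF name ends in "_<contig>_<orf>" with integer x and y.
# The sample name itself may contain underscores, so only the last two
# underscore-separated fields are treated as numbers.
_ORF_NAME = re.compile(r"^(?P<sample>.+)_(?P<contig>\d+)_(?P<orf>\d+)$")


class OrfId(NamedTuple):
    sample: str
    contig: int  # x in Sample_Name_x_y
    orf: int     # y in Sample_Name_x_y


class PathwayRow(NamedTuple):
    pwy_name: str
    pwy_common_name: str
    num_reactions: int
    num_covered_reactions: int
    orf_count: int
    orfs: List[str]  # ORF names appended after the ORF_COUNT column


def parse_orf_name(name: str) -> OrfId:
    """Split an ORF name of the form Sample_Name_x_y into its parts."""
    match = _ORF_NAME.match(name)
    if match is None:
        raise ValueError(f"not in Sample_Name_x_y format: {name!r}")
    return OrfId(match.group("sample"),
                 int(match.group("contig")),
                 int(match.group("orf")))


def parse_pathway_line(line: str, sep: str = "\t") -> PathwayRow:
    """Parse one data line of Sample_Name.pathways.txt.

    Assumption: fields are separated by `sep` (tab by default) and the
    ORF names follow the five fixed columns as extra fields.
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}")
    name, common, n_rxn, n_cov, n_orf = fields[:5]
    orfs = [field for field in fields[5:] if field]
    return PathwayRow(name, common, int(n_rxn), int(n_cov), int(n_orf), orfs)
```

A header line (PWY_NAME PWY_COMMON_NAME ...) would fail the integer conversion, so skip it before calling parse_pathway_line. Comparing orf_count with the length of orfs is a cheap sanity check on the delimiter assumption.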