NeuroPeptideSearch

Note: this is currently a beta version...

NPSearch (NeuroPeptideSearch) is a program that searches for potential neuropeptide precursors based on the motifs commonly found in neuropeptides. Ideally, the input is transcriptome or protein data, since there are no introns to worry about. The program produces a list of sequences that contain all of these common motifs; these candidates then need further analysis. The default output is a fasta file and a Word document in which the signal peptide and potential neuropeptide cleavage sites are colour-coded.

Note: For this program to work, you will need to obtain a copy of SignalP 4.1 from CBS at "http://www.cbs.dtu.dk/cgi-bin/nph-sw_request?signalp" and link it to the program. Alternatively, you can supply an existing SignalP output text file as input to the program.
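To illustrate how the signal peptide and the cleavage motifs combine, here is a minimal Ruby sketch. It assumes the cleavage position has already been read from SignalP output (parsing is not shown). The method name `mark_candidate` and the example sequence are hypothetical and are not taken from the gem's source.

```ruby
# Hypothetical sketch; the cleavage position would come from SignalP output,
# whose parsing is not shown here.

# Returns the sequence with "-" inserted at the signal peptide cleavage site,
# or nil when no motif occurs in the part after the signal peptide.
def mark_candidate(sequence, signal_length, motif)
  signal = sequence[0, signal_length]
  mature = sequence[signal_length..-1]
  return nil if mature.nil? || mature !~ motif

  "#{signal}-#{mature}"
end

motif = /KR|RR|R..R/
puts mark_candidate('MKTLLVLAVCLAAPSGKRFLG', 12, motif)
```

A sequence whose only motif hit lies inside the signal peptide is rejected, because the motif is only looked for after the cleavage site.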

** Currently only supported on Mac OS & Linux (and Windows through Cygwin)

Outputs

The default output is just a fasta file and a colour-formatted Word document. However, all intermediate files can also be written out by using the "-a" option.

protein.aa              The genome translated in all 6 frames (* represents stop codons)
orf.fa                  All possible open reading frames extracted from protein.aa (i.e. from any methionine residue to a stop codon).
orf_condensed.fa        Open reading frames that are longer than 10 residues.
signalp_out.txt         The SignalP results for each sequence in orf_condensed.fa.
signalp_seq.fa          All sequences that have a signal peptide (with "-" marking the signal peptide cleavage site) AND at least one neuropeptide cleavage site after the signal peptide cleavage site.
output.fa               The final output fasta file, with duplicate entries removed (i.e. those that have different identifiers but the same sequence).
output.docx             output.fa as a Word document, with the signal peptide in blue and the motif in red.
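The ORF extraction, length filter and duplicate removal described in the table above can be sketched in Ruby as follows. This is an illustration under stated assumptions, not the gem's actual code: the method names are hypothetical, the length filter is assumed to be a strict "longer than", and an ORF is assumed to require a terminating stop codon.

```ruby
# Illustrative sketch only; method names are hypothetical, not the gem's API.

# Find every ORF in a translated frame: each methionine (M) up to the next
# stop codon ("*"). The lookahead lets overlapping ORFs be reported.
def extract_orfs(translated_frame)
  translated_frame.scan(/(?=(M[^*]*)\*)/).flatten
end

# Keep ORFs longer than min_length residues (assumed strict ">").
def condense(orfs, min_length = 10)
  orfs.select { |orf| orf.length > min_length }
end

# Drop records whose sequence was already seen under another identifier.
def remove_duplicates(records)
  records.uniq { |_id, sequence| sequence }.to_h
end

frame = 'AAMKTLLVLAVCLAMSR*GGMK*MPQRSTVWYACDEFGH*'
long_orfs = condense(extract_orfs(frame))
puts long_orfs

records = { 'seq1' => 'MKTLLVLAVCLA', 'seq2' => 'MKTLLVLAVCLA', 'seq3' => 'MPQRSTVWYACD' }
puts remove_duplicates(records).keys.inspect
```

When two identifiers share a sequence, this sketch keeps the first identifier encountered; which one the program itself keeps is not stated here.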

Installation

Simply open the Terminal and type:

$ gem install np_search

Usage

Usage: $ np_search [Options] -m [Motif] -t [Input Type] -i [Input File] -o [Output Folder Name]

Where:

Mandatory Options:

-m, --motif [Query Motif]        The query motif to be searched for.
                                 The period "." can be used to denote any character. Multiple
                                 motifs can be combined by placing a pipe character ("|")
                                 between them and enclosing the whole query in quotation marks,
                                 e.g. "KR|RR|R..R"
                                 Advanced Users: Regular expressions are supported in the motif.
-t, --input_type [type]          The type of data in the input query file. The only two options
                                 available are "genetic" and "protein".
-i, --input [file]               The input file. This can be a relative or a full path.
-o, --output [folder name]       The path to the output folder. This will be created if the folder does not already exist.

Optional Options:

-c, --cut_off N                  Changes the minimum Open Reading Frame length from the default of 10 amino acid residues to N amino acid residues.
-s, --signalp_file [file]        Supply the output file of SignalP (version 4.x) to the script.
                                 Otherwise the script will try to run the external SignalP program itself.
-a, --output_all                 Outputs all possible files.
-e, --extract_orf                Only extracts the Open Reading Frames.
-v, --verbose                    Provides more information on each step taken in this program.
-h, --help                       Display this screen
    --version                    Shows version
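To show how a motif query such as "KR|RR|R..R" behaves as a regular expression, the following Ruby sketch scans a made-up protein sequence and lists each hit. It only illustrates the regex semantics ("." as any character, "|" as alternation); how np_search applies the motif internally is not shown, and the example sequence is hypothetical.

```ruby
# Hypothetical illustration; np_search may apply the motif differently.
motif = Regexp.new('KR|RR|R..R')

sequence = 'MSPLAKRGGFRAARLLRR'

# Collect the 0-based start position and matched text of each hit.
hits = []
sequence.scan(motif) do
  match = Regexp.last_match
  hits << [match.begin(0), match[0]]
end

hits.each { |position, text| puts "#{text} at position #{position}" }
```

Note that `scan` reports non-overlapping matches from left to right, so a hit that overlaps an earlier one is not listed separately.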

Examples

Help can be accessed easily, directly from the command line:

$ np_search -h

Example 1

$ np_search -v -a -c 25 -m neuro_clv -t genetic -i genetic_data.fa -o starfish

The Example Explained:

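-v                  = Optional - Runs the verbose option
-a                  = Optional - Runs the output_all option (outputs all possible files)
-c 25               = Optional - Sets the minimum Open Reading Frame length to 25 amino acid residues
-m neuro_clv        = Mandatory - The query motif to be searched for
-t genetic          = Mandatory - Describes the type of input data
                      (can be either "genetic" or "protein")
-i genetic_data.fa  = Mandatory - The input file
-o starfish         = Mandatory - The output folder (created if it does not already exist)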

Further information on running the script on test material is provided within the test suite download.

Contributing

Fork it
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request

yannickwurm/NeuroPeptideSearch
