Skip to content

Python prototype of a novel transcription factor complex prediction algorithm

Notifications You must be signed in to change notification settings

edeltoaster/DACO-algorithm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


        DACO: DOMAIN-AWARE COHESIVENESS OPTIMIZATION ALGORITHM
		(prototype by Thorsten Will, 2014; update of May 2016)

USAGE:

python DACO.py [PPIN] [TFs] [PAIR-THRESHOLD] [DEPTH] [OUT-FILE] , with:

PPIN : a protein-protein interaction network in the simple input format (SIF) with the layout 'ProteinA ProteinB weight'. The weight is expected to be represented as the probability of an interaction (float between 0 and 1). Proteins are expected to be UniProt accession numbers, like 'P12345'.

TFs : a textfile with a list of transcription factors of the organism line by line.

PAIR-THRESHOLD : Probability threshold for initial pair seed construction.

DEPTH : maximal complex size (to keep combinatorial explosion low).

OUT-FILE : output will be written to this file. Complex candidates are stored line by line, members are comma-separated.

Examples of input files for yeast can be found in the 'example_inputs/' folder.

Yeast data of the Dec. 2013 release of UniProt (and Pfam/InterPro at that time) is precached to speed-up the data retrieval. Without this preloading, the domain-domain interaction network construction can take some time (depending on the size of the network).

To reset the precached yeast data, just delete the following files in the 'data/' folder: ipro_mapping.data - relating InterPro and Pfam IDs pfam.data - associated Pfam IDs per protein up_entry_data.data - Uniprot data per protein up_names.data - UniProt and name conversion data

Updates/Fixes: - URL of Pfam changed

About

Python prototype of a novel transcription factor complex prediction algorithm

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages