Attention: This repo is currently under construction

Introduction

Biological networks have a key role in drug discovery for disease treatement. The study of biological networks and identifying critical nodes can facilitate identification of the key targets in treating diseases. However, efficiently finding critical nodes with low cost is challenging. Additionally different experimental works exist to identify critical nodes which are expensive and time consuming. The development of analytics including statistics and machine learning has revolutionized critical node detections [1]. The Critical Node Detection Problem (CNDP) is the optimizaiton problem that consists in finding a set of nodes, the deletion of which maximally degrades network connectivity according to some predefined connectivity metrices[2]. This project is to find a minimum cardinality set of proteins whose removal would optimally dismantle the interactions in the protein- protein network and help neutralize harmful organisms such as bacteria or viruses. Finding key players or nodes in protein-protein interaction networks can have great implications for structure based strategies of drug design. We use Finder [2] a novel frame work that combines techniques from Graph Neural Net (GNN) and Reinforcement Learning (RL) to achieve fast and effective CNDP. The original finder is writen with tensor flow (TF). We wrote Finder with pytorch to implement for our work. And then we applied a serials of ehancements, particularly for human protein-protein networks.

Reference

[1] X.Liu et al., "Coputational methods for identifying the critical nodes in biological networks", vol.21, no.2, pp. 486-497, 2020.

[2] C. Fan, L. Zeng, Y. Sun, and Y.Y. Liu, Finder supplementary, vol. 2, no.6. 2020.

Data

Data source (raw)

Title: The Human Reference Protein Interactome Mapping Project

link: http://www.interactome-atlas.org/download

Seed model:

FINDER (FInding key players in complex Networks through DEep Reinforcement learning)

This is a PyTorch implementation of FINDER, as described in the paper:

Fan, C., Zeng, L., Sun, Y and Liu Y-Y. Finding key players in complex networks through deep reinforcement learning. Nat Mach Intell (2020).

Overview

Finding an optimal set of nodes, called key players, whose activation (or removal) would maximally enhance (or degrade) certain network functionality, is a fundamental class of problems in network science. Potential applications include network immunization, epidemic control, drug design, and viral marketing. Due to their general NP-hard nature, those problems typically cannot be solved by exact algorithms with polynomial time complexity. Many approximate and heuristic strategies have been proposed to deal with specific application scenarios. Yet, we still lack a unified framework to efficiently solve this class of problems. Here we introduce a deep reinforcement learning framework FINDER, which can be trained purely on small synthetic networks generated by toy models and then applied to a wide spectrum of influencer finding problems. Extensive experiments under various problem settings demonstrate that FINDER significantly outperforms existing methods in terms of solution quality. Moreover, it is several orders of magnitude faster than existing methods for large networks. The presented framework opens up a new direction of using deep learning techniques to understand the organizing principle of complex networks, which enables us to design more robust networks against both attacks and failures.

Repo Contents

code: source code of FINDER for the following four cases in the paper.
- FINDER_CN: source code for the Critical Node (CN) problem under the node-unweighted scenario.
- FINDER_CN_cost: source code for the Critical Node (CN) problem under the node-weighted (including degree-based costs and random costs) scenarios.
- FINDER_ND: source code for the Network Dismantling (ND) problem under the node-unweighted scenario.
- FINDER_ND_CM: source code for the Network Dismantling (ND) problem under the node-unweighted scenario, with configuraion model.
- FINDER_ND_cost: source code for the Network Dismantling (ND) problem under the node-weighted (including degree-based costs and random costs) scenarios.
- FINDER_ND_cost_CM: source code for the Network Dismantling (ND) problem under the node-weighted (including degree-based costs and random costs) scenarios, with configuration model.
- [old_FINDER_CN_cost_tf],[old_FINDER_CN_tf],[old_FINDER_ND_cost_tf],[old_FINDER_ND_tf], old tensor flow versions
results: results obtained by FINDER for all the cases, which should be the same as are reported in the paper.

System Requirements

Software dependencies and operating systems

Software dependencies

Look at reuirements.txt

Operating systems

The package development version is tested on Linux and Windows 10 operating systems. The developmental version of the package has been tested on the following systems:

Linux: Ubuntu 18.04
Windows: 10

The pip package should be compatible with Windows, and Linux operating systems.

Before setting up the FINDER users should have gcc version 7.4.0 or higher.

Hardware Requirements

The FINDER model requires a standard computer with enough RAM and GPU to support the operations defined by a user. For minimal performance, this will be a computer with about 4 GB of RAM and 16GB of GPU. For optimal performance, we recommend a computer with the following specs:

RAM: 16+ GB
CPU: 4+ cores, 3.3+ GHz/core GPU: 16+ GB

The runtimes below are generated using a computer with the recommended specs (16 GB RAM, 4 [email protected] GHz) and internet of speed 25 Mbps.

Installation Guide

For pytorch:

We are currently working on the pytorch implementation for FINDER_ND_cost. (look under FINDER_torch.pyx and train_torch.py)

Instructions

First install all the above required packages, which are contained in the requirements.txt file

pip install -r requirements.txt

Make all the file

python setup.py build_ext -i

Typical install time

It took about 15 mins to install all the required packages, and about 1 mins to make all the files.

Reproduction instructions

Instructions to run

Train the model,

CUDA_VISIBLE_DEVICES=gpu_id python train_torch.py

Modify the hyper-parameters in FINDER_torch.pyx to tune the model, and make files after the the modification.

For old tensorflow models:

CUDA_VISIBLE_DEVICES=gpu_id python train.py

Modify the hyper-parameters in FINDER.pyx to tune the model, and make files after the the modification.

Test synthetic data,

CUDA_VISIBLE_DEVICES=-1 python testSynthetic.py (do not use GPU for test)

Using the well-trained model (stored in ./models), you can obtain the results reported in the paper.

Test real data,

CUDA_VISIBLE_DEVICES=-1 python testReal.py (do not use GPU for test)

Using the well-trained model (stored in ./models), you can obtain the results reported in the paper.

Expected output

The experimental results are saved in the results folder, which contains four subfolders, each of which corresponds to one model, and the synthetic and real results are separated into two different subfolders for the sake of clearity.

Expected run time

Trainning takes about 10 hours for each model.

Test takes about <2 minutes for ND models, 3~4 minutes for ND_cost models.

Basebline methods implementation

We compared with HDA, HBA, HCA, HPRA, CI, MinSum, BPD, GND and RatioCut, which are state-of-the-art baselines for network key players finding methods.

We ran HDA, HBA, HCA, and HPRA with Networkx 2.0, and for HDA in large real-world networks, we instead used SNAP, a general-purpose, high-performance system for graph analysis.

http://snap.stanford.edu/snap

We used the source codes released online, and adopted the best parameter settings for each method. For RatioCut, we modified the traditional RatioCut method based on the GND implementation.

https://github.com/zhfkt/ComplexCi (CI)
https://github.com/abraunst/decycler (MinSum)
http://power.itp.ac.cn/~zhouhj/codes.html (BPD)
https://github.com/hcmidt/corehd (CoreHD)
https://github.com/renxiaolong/Generalized-Network-Dismantling (GND and RatioCut)

Reproducing the results that reported in the FINDER paper

Here is the link to the dataset that was used in the paper, including: 1) real data: nine different test data collected from the SNAP repository; 2) synthetic data: generated by Barab'{a}si-Albert (BA) model.

https://drive.google.com/open?id=1HAxIUsgOPYXHDikmlTIKIXunGWWLXbr2

Reference

Please cite our work if you find our code/paper is useful to your work.

@article{fan2020finding,
   title={Finding key players in complex networks through deep reinforcement learning},
   author={Fan, Changjun and Zeng, Li and Sun, Yizhou and Liu, Yang-Yu},
   journal={Nature Machine Intelligence},
   pages={1--8},
   year={2020},
   publisher={Nature Publishing Group}
 }

Name		Name	Last commit message	Last commit date
Latest commit History 314 Commits
Baseline_results		Baseline_results
Presentations_and_reports		Presentations_and_reports
code		code
data		data
environment		environment
paper		paper
.gitignore		.gitignore
Baselines.ipynb		Baselines.ipynb
Baselines_ND.ipynb		Baselines_ND.ipynb
CN_analytics.ipynb		CN_analytics.ipynb
Colab_runs-CM_sample.ipynb		Colab_runs-CM_sample.ipynb
Colab_runs-Data_preprocess.ipynb		Colab_runs-Data_preprocess.ipynb
Colab_runs.ipynb		Colab_runs.ipynb
Final_graphs.ipynb		Final_graphs.ipynb
LICENSE		LICENSE
README.md		README.md
requirements.sh		requirements.sh
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Attention: This repo is currently under construction

Introduction

Data

Data source (raw)

Seed model:

FINDER (FInding key players in complex Networks through DEep Reinforcement learning)

Contents

Overview

Repo Contents

System Requirements

Software dependencies and operating systems

Software dependencies

Operating systems

Hardware Requirements

Installation Guide

For pytorch:

Instructions

Typical install time

Reproduction instructions

Instructions to run

Expected output

Expected run time

Basebline methods implementation

Reproducing the results that reported in the FINDER paper

Reference

About

Uh oh!

Releases

Packages

Languages

License

faraz2023/FINDER-pytorch

Folders and files

Latest commit

History

Repository files navigation

Attention: This repo is currently under construction

Introduction

Data

Data source (raw)

Seed model:

FINDER (FInding key players in complex Networks through DEep Reinforcement learning)

Contents

Overview

Repo Contents

System Requirements

Software dependencies and operating systems

Software dependencies

Operating systems

Hardware Requirements

Installation Guide

For pytorch:

Instructions

Typical install time

Reproduction instructions

Instructions to run

Expected output

Expected run time

Basebline methods implementation

Reproducing the results that reported in the FINDER paper

Reference

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages