Merge pull request FederatedAI#4289 from FederatedAI/feature-1.9.0-mo…
…del-merge

Feature 1.9.0 model merge
mgqa34 authored Aug 29, 2022
2 parents 01950ac + e9bb7e2 commit 5295347
Showing 5 changed files with 257 additions and 314 deletions.
6 changes: 5 additions & 1 deletion doc/tutorial/README.md
@@ -2,7 +2,7 @@

Here we provide tutorials on running FATE jobs:

Submiting jobs with `Pipeline` is recommended, here are some `Jupyter Notebook` with guided instructions:
Submitting jobs with `Pipeline` is recommended, here are some `Jupyter Notebook` with guided instructions:

- [Upload Data with FATE-Pipeline](pipeline/pipeline_tutorial_upload.ipynb)
- [Train & Predict Hetero SecureBoost with FATE-Pipeline](pipeline/pipeline_tutorial_hetero_sbt.ipynb)
@@ -24,3 +24,7 @@ Models can be pushlished with FATE Serving to `Serving without FATE`:
And for those who want to run jobs in batches, i.e. to run algorithm tests, try using `fate_test`:

- [FATE-Test Tutorial](fate_test_tutorial.md)

To merge models from different roles and export them in sklearn/LightGBM or PMML format, please refer to the tutorial below:

- [Guide to Merging FATE Model](./model_merge.md)
252 changes: 252 additions & 0 deletions doc/tutorial/model_merge.md
@@ -0,0 +1,252 @@
# Guide to Merging FATE Model

## 1. Overview

Trained FATE models from the `host` and `guest` parties of a federated job may be merged locally and reused in other frameworks.
Currently, FATE supports merging and exporting models from `HeteroLR`, `HeteroSSHELR` and `HeteroSecureBoost` into PMML, or into sklearn Logistic Regression / LightGBM objects.
Below we walk through the general process of merging a Hetero Logistic Regression model using the FATE-Flow Client command line tool.

## 2. Install FATE-Flow Client

The `flow` command line tool is distributed along with [FATE-Client](https://pypi.org/project/fate-client/).
Once `FATE-Client` is installed, a command line entry point named `flow` is available:

```commandline
pip install fate_client
flow --help
```
```
Usage: flow [OPTIONS] COMMAND [ARGS]...
Fate Flow Client
Options:
-h, --help Show this message and exit.
Commands:
checkpoint Checkpoint Operations
component Component Operations
data Data Operations
init Flow CLI Init Command
job Job Operations
key Key Operations
model Model Operations
privilege Privilege Operations
provider Component Provider Operations
queue Queue Operations
resource Resource Manager
server FATE Flow Server Operations
service FATE Flow External Service Operations
table Table Operations
tag Tag Operations
task Task Operations
template Template Operations
test FATE Flow Test Operations
tracking Component Operations
```

To use FATE-Flow Client in a terminal, we first need to specify which `FATE Flow Service` to connect to.
Assuming we have a FATE Flow Service at 172.15.0.1:9380, run either of the following commands to initialize FATE-Flow Client: the first reads the service address from the configuration file, the second passes the address directly:

```bash
flow init -c /data/projects/fate/conf/service_conf.yaml
flow init --ip 172.15.0.1 --port 9380
```

In the following steps, we assume that `host` is the role that performs the merge.

## 3. Prepare Model

For `host` to obtain the `guest`'s trained Hetero Logistic Regression model, the model must first be exported on `guest` and then loaded on `host`.
For models trained in standalone mode, you may skip this step.

### 3.1 Export Model (Cluster mode)

Below is an example model export configuration, adapted from [here](https://github.com/FederatedAI/FATE-Flow/blob/main/examples/model/export_model.json):

```json
{
  "role": "guest",
  "party_id": 9999,
  "model_id": "arbiter-10000#guest-9999#host-10000#model",
  "model_version": "202208291446341887230",
  "output_path": "/data/projects/fate/examples/export_model"
}
```
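The same configuration can be generated from a script, which may help when several model versions need to be exported. This is a minimal sketch that assumes the five fields shown above are all the export command needs; `build_export_conf` is a hypothetical helper and not part of FATE-Flow Client.

```python
import json


def build_export_conf(role, party_id, model_id, model_version, output_path):
    """Return a dict with the same fields as the example export_model.json."""
    return {
        "role": role,
        "party_id": party_id,
        "model_id": model_id,
        "model_version": model_version,
        "output_path": output_path,
    }


conf = build_export_conf(
    role="guest",
    party_id=9999,
    model_id="arbiter-10000#guest-9999#host-10000#model",
    model_version="202208291446341887230",
    output_path="/data/projects/fate/examples/export_model",
)

# Write the configuration file that is passed to `flow model export -c`.
with open("export_model.json", "w") as f:
    json.dump(conf, f, indent=2)
```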

Run the following command to export the specified model:

```commandline
flow model export -c export_model.json
```

If the export succeeds, `flow` returns a success message like this:

```
{
  "retcode": 0,
  "file": "/data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip",
  "retmsg": "download successfully, please check /data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip"
}
```
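When the export is scripted, the response can be checked before the archive is copied to the other party. The sketch below assumes the response is JSON with the `retcode`, `file` and `retmsg` fields shown above; `exported_archive` is a hypothetical helper name.

```python
import json


def exported_archive(response_text):
    """Return the archive path from an export response, or raise on failure."""
    response = json.loads(response_text)
    if response.get("retcode") != 0:
        raise RuntimeError(f"model export failed: {response.get('retmsg')}")
    return response["file"]


# Sample response taken from the output above (retmsg omitted for brevity).
sample = """
{
  "retcode": 0,
  "file": "/data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip"
}
"""

print(exported_archive(sample))
```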

### 3.2 Import Model (Cluster mode)

If the FATE model was trained in distributed mode, where `guest` and `host` each have FATE installed on a different machine,
the exported model must be imported (registered) on the merging role's system before the models can be merged.

Here is an example model import configuration,
adapted from [here](https://github.com/FederatedAI/FATE-Flow/blob/main/examples/model/import_model.json),
assuming the `guest` model exported in the previous step has been copied to the directory "/data/projects/fate/temp" on the `host` machine:

```json
{
  "role": "guest",
  "party_id": 9999,
  "model_id": "arbiter-10000#guest-9999#host-10000#model",
  "model_version": "202208291446341887230",
  "file": "/data/projects/fate/temp/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip"
}
```
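The fields of the import configuration mirror the archive file name, so they can be derived from it rather than typed by hand. The sketch below assumes the naming pattern `<role>#<party_id>#<model_id>_<model_version>.zip`, which is inferred from the example file name and is not a documented contract; `import_conf_from_archive` is a hypothetical helper.

```python
import json
import os


def import_conf_from_archive(archive_path):
    """Build an import config from an archive named like the example above.

    Assumed pattern (inferred, not documented):
    <role>#<party_id>#<model_id>_<model_version>.zip
    """
    name = os.path.basename(archive_path)
    if not name.endswith(".zip"):
        raise ValueError("expected a .zip archive")
    stem = name[: -len(".zip")]
    # The model id itself contains '#', so split off only the first two fields.
    role, party_id, rest = stem.split("#", 2)
    # The model version follows the last underscore.
    model_id, model_version = rest.rsplit("_", 1)
    return {
        "role": role,
        "party_id": int(party_id),
        "model_id": model_id,
        "model_version": model_version,
        "file": archive_path,
    }


archive = ("/data/projects/fate/temp/"
           "guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip")
import_conf = import_conf_from_archive(archive)
print(json.dumps(import_conf, indent=2))
```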

Run the following command to import the specified model:

```commandline
flow model import -c import_model.json
```

If the import succeeds, `flow` returns a success message like this:

```
{
  "data": {
    "job_id": "202208291521250963160",
    "model_id": "arbiter-10000#guest-9999#host-10000#model",
    "model_version": "202208291446341887230",
    "party_id": "9999",
    "role": "guest"
  },
  "retcode": 0,
  "retmsg": "success"
}
```

## 4. Merge Model

### Hetero Logistic Regression Example

Once the model is ready, run the following command to merge the models into a sklearn Logistic Regression model:

```commandline
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 --guest-party-id 9999 --host-party-ids 10000 --component-name hetero_lr_0 --model-type lr --output-format sklearn --target-name y --include-guest-coef --output-path model_merge_sklearn
```
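When the merge is driven from a script, building the command as an argument list avoids shell quoting issues with the `#` characters in the model id. The sketch below uses only the options listed in the `--help` output of `flow component hetero-model-merge`; `merge_command` is a hypothetical helper, and actually running the command requires an initialized FATE-Flow Client.

```python
import shlex


def merge_command(model_id, model_version, guest_party_id, host_party_ids,
                  component_name, model_type, output_format, output_path,
                  target_name=None, host_rename=False, include_guest_coef=False):
    """Build the argument list for `flow component hetero-model-merge`."""
    argv = [
        "flow", "component", "hetero-model-merge",
        "--model-id", model_id,
        "--model-version", model_version,
        "--guest-party-id", str(guest_party_id),
        # Multiple host party ids are separated by commas.
        "--host-party-ids", ",".join(str(p) for p in host_party_ids),
        "--component-name", component_name,
        "--model-type", model_type,
        "--output-format", output_format,
    ]
    if target_name is not None:
        argv += ["--target-name", target_name]
    argv.append("--host-rename" if host_rename else "--no-host-rename")
    argv.append("--include-guest-coef" if include_guest_coef
                else "--no-include-guest-coef")
    argv += ["--output-path", output_path]
    return argv


argv = merge_command(
    model_id="arbiter-10000#guest-9999#host-10000#model",
    model_version="202208291446341887230",
    guest_party_id=9999,
    host_party_ids=[10000],
    component_name="hetero_lr_0",
    model_type="lr",
    output_format="sklearn",
    output_path="model_merge_sklearn",
    target_name="y",
    include_guest_coef=True,
)

# Print a shell-safe version; pass argv to subprocess.run(argv, check=True) to execute.
print(shlex.join(argv))
```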

We may then load and predict with the obtained sklearn model:

```python
import base64
import json
import pandas
import pickle

filename = "model_merge_sklearn"
# Load host and guest features, indexed by sample id.
test_data_h = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_host.csv",
                              header=0, index_col="id")
test_data_g = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_guest.csv",
                              header=0, index_col="id")
# Drop the label column, then join guest and host features side by side.
test_data_g = test_data_g.loc[:, test_data_g.columns != 'y']
test_data = pandas.concat((test_data_g, test_data_h), axis=1)

# The output file is read as a JSON string holding a base64-encoded pickle.
with open(filename, "r") as f:
    loaded_model = pickle.loads(base64.b64decode(json.load(f)))
predict_score = loaded_model.predict_proba(test_data)
```
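The loading code implies that the output file is a JSON string containing a base64-encoded pickle. The round trip below illustrates that encoding with a plain dict standing in for the sklearn model. The writer side is an assumption inferred from the loader, not FATE's actual export code, and both helper names are hypothetical.

```python
import base64
import json
import pickle


def dump_as_b64_json(obj, path):
    """Pickle obj, base64-encode the bytes, and store them as one JSON string."""
    encoded = base64.b64encode(pickle.dumps(obj)).decode("ascii")
    with open(path, "w") as f:
        json.dump(encoded, f)


def load_from_b64_json(path):
    """Inverse of dump_as_b64_json; same steps as the loading code above."""
    with open(path, "r") as f:
        return pickle.loads(base64.b64decode(json.load(f)))


# A plain dict stands in for the model so the sketch has no dependencies.
stand_in = {"coef": [0.1, -0.2], "intercept": 0.5}
dump_as_b64_json(stand_in, "stand_in_model.json")
restored = load_from_b64_json("stand_in_model.json")
```

Since `pickle.loads` can execute arbitrary code during deserialization, only load merged model files that come from a trusted source.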

### HeteroSecureBoost Example

If the federated model is a HeteroSecureBoost model, run this command to merge the model and export it in LightGBM format:

```commandline
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 \
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format lgb --target-name y --output-path ./lgb.txt
```

If you want a model in PMML format, use this command:

```commandline
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 \
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format pmml --target-name y --output-path ./lgb.txt
```

Note that the model merge function offers a `host_rename` option (`--host-rename` on the command line). When it is enabled, host features are renamed by appending a sitename
suffix to the original feature name:

```commandline
Host party id is 9998, sitename is host_9998, original feature is x1
x1 -> x1, default setting
x1 -> x1_host_9998, when host_rename is enabled
```
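The naming rule can be written as a one-line function. This is only an illustration of the rule shown above, not FATE's implementation, and `rename_host_feature` is a hypothetical name.

```python
def rename_host_feature(feature, host_party_id, host_rename=False):
    """Return the host feature name as it would appear in the merged model."""
    if not host_rename:
        # Default setting: the original feature name is kept.
        return feature
    # With host_rename enabled: append the sitename "host_<party_id>".
    return f"{feature}_host_{host_party_id}"


default_name = rename_host_feature("x1", 9998)
renamed = rename_host_feature("x1", 9998, host_rename=True)
```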

Use this command to rename host features in the output:

```commandline
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 --host-rename \
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format lgb --target-name y --output-path ./lgb.txt
```

We may then load the obtained LightGBM model and predict with it. To get correct predictions,
the features in the prediction data must be in the same order as in the merged tree model.

```python
import pandas
import lightgbm as lgb

filename = "./lgb.txt"
# Load host and guest features, indexed by sample id.
test_data_h = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_host.csv",
                              header=0, index_col="id")
test_data_g = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_guest.csv",
                              header=0, index_col="id")
# Drop the label column, then join guest and host features side by side.
test_data_g = test_data_g.loc[:, test_data_g.columns != 'y']
test_data = pandas.concat((test_data_g, test_data_h), axis=1)

# Read the merged model text, closing the file before building the Booster.
with open(filename, 'r') as f:
    bst = lgb.Booster(model_str=f.read())

pred_rs = bst.predict(test_data)
```
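Because feature order matters, it may be worth checking the data columns against the model before predicting. The sketch below works on plain lists of names so that it runs without LightGBM; `align_columns` is a hypothetical helper. With a real model, the feature list could be taken from `bst.feature_name()`.

```python
def align_columns(data_columns, model_features):
    """Return the model's feature order, raising if the data lacks any feature."""
    available = set(data_columns)
    missing = [name for name in model_features if name not in available]
    if missing:
        raise KeyError(f"features missing from prediction data: {missing}")
    return list(model_features)


# Illustrative names only; real names come from the data and the merged model.
data_columns = ["x1", "x0", "x2"]
model_features = ["x0", "x1", "x2"]
ordered = align_columns(data_columns, model_features)

# With pandas and LightGBM this would be used as:
#   ordered = align_columns(test_data.columns, bst.feature_name())
#   pred_rs = bst.predict(test_data[ordered])
```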

For more information on command options, check the `help` output:

```commandline
flow component hetero-model-merge --help
```
```
Usage: flow component hetero-model-merge [OPTIONS]
- DESCRIPTION:
Merge a hetero model.
- USAGE:
flow component hetero-model-merge --model-id guest-9999#host-9998#model --model-version 202208241838502253290 --guest-party-id 9999 --host-party-ids 9998,9997 --component-name hetero_secure_boost_0 --model-type secureboost --output-format pmml --target-name y --no-host-rename --no-include-guest-coef --output-path model.xml
Options:
--model-id TEXT Model id. [required]
--model-version TEXT Model version. [required]
-gid, --guest-party-id TEXT A valid party id. [required]
-hids, --host-party-ids TEXT Multiple party ids, use a comma to separate
each one. [required]
-cpn, --component-name TEXT A valid component name. [required]
--model-type TEXT [required]
--output-format TEXT [required]
--target-name TEXT
--host-rename / --no-host-rename
--include-guest-coef / --no-include-guest-coef
-o, --output-path PATH User specifies output directory path.
[required]
-h, --help Show this message and exit.
```

