forked from FederatedAI/FATE
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request FederatedAI#4289 from FederatedAI/feature-1.9.0-mo…
…del-merge Feature 1.9.0 model merge
- Loading branch information
Showing
5 changed files
with
257 additions
and
314 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,252 @@ | ||
# Guide to Merging FATE Model | ||
|
||
## 1. Overview | ||
|
||
Trained FATE models from federated job of `host` and `guest` may be merged locally and reused in other frameworks. | ||
Currently, FATE supports merging and exporting models from `HeteroLR`, `HeteroSSHELR` and `HeteroSecureBoost` into PMML or sklearn Logistic Regression Model/lightGBM objects. | ||
Below we walk through the general process of merging a Hetero Logistic Regression model using FATE-Flow Client command line tool. | ||
|
||
## 2. Install FATE-Flow Client | ||
|
||
`flow` command line tool is distributed along with [FATE-Client](https://pypi.org/project/fate-client/). | ||
Once `FATE-Client` installed, one can find a cmd enterpoint named `flow`: | ||
|
||
```commandline | ||
pip install fate_client | ||
flow --help | ||
``` | ||
``` | ||
Usage: flow [OPTIONS] COMMAND [ARGS]... | ||
Fate Flow Client | ||
Options: | ||
-h, --help Show this message and exit. | ||
Commands: | ||
checkpoint Checkpoint Operations | ||
component Component Operations | ||
data Data Operations | ||
init Flow CLI Init Command | ||
job Job Operations | ||
key Key Operations | ||
model Model Operations | ||
privilege Privilege Operations | ||
provider Component Provider Operations | ||
queue Queue Operations | ||
resource Resource Manager | ||
server FATE Flow Server Operations | ||
service FATE Flow External Service Operations | ||
table Table Operations | ||
tag Tag Operations | ||
task Task Operations | ||
template Template Operations | ||
test FATE Flow Test Operations | ||
tracking Component Operations | ||
``` | ||
|
||
To use FATE-Flow Client in terminal, we need to first specify which `FATE Flow Service` to connect to. | ||
Assume we have a FATE Flow Service at 172.15.0.1:9380, then execute the following command to initialize FATE-Flow Client: | ||
|
||
```bash | ||
flow init -c /data/projects/fate/conf/service_conf.yaml | ||
flow init --ip 172.15.0.1 --port 9380 | ||
``` | ||
|
||
In the following steps, we assume that `host` is the role executes merging action. | ||
|
||
## 3. Prepare Model | ||
|
||
To obtain `guest's` trained Hetero Logistic Regression, we need to first make `host` load exported model. | ||
For models trained in standalone mode, you may skip this step. | ||
|
||
### 3.1 Export Model(Cluster mode) | ||
|
||
Below is an example model exportation configuration, adopted from [here](https://github.com/FederatedAI/FATE-Flow/blob/main/examples/model/export_model.json): | ||
|
||
```json | ||
{ | ||
"role": "guest", | ||
"party_id": 9999, | ||
"model_id": "arbiter-10000#guest-9999#host-10000#model", | ||
"model_version": "202208291446341887230", | ||
"output_path": "/data/projects/fate/examples/export_model" | ||
} | ||
``` | ||
|
||
Run the following command to export specified model: | ||
|
||
```commandline | ||
flow model export -c export_model.json | ||
``` | ||
|
||
If model exportation succeeds, `flow` will return success message like this: | ||
|
||
``` | ||
{ | ||
"retcode": 0, | ||
"file": "/data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip", | ||
"retmsg": "download successfully, please check /data/projects/fate/examples/model_export/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip" | ||
} | ||
``` | ||
|
||
### 3.2 Import Model(Cluster mode) | ||
|
||
If the FATE model is trained in distributed mode, where `guest` and `host` have FATE each installed in different machines, | ||
exported model needs to be imported/registered on merging role's system first before merging models. | ||
|
||
Here is an example configuration for model importation | ||
adopted from [here](https://github.com/FederatedAI/FATE-Flow/blob/main/examples/model/import_model.json), | ||
assuming we have put the exported `guest's` model from above step to directory "/data/projects/fate/temp" on `host's` machine: | ||
|
||
```json | ||
{ | ||
"role": "guest", | ||
"party_id": 9999, | ||
"model_id": "arbiter-10000#guest-9999#host-10000#model", | ||
"model_version": "202208291446341887230", | ||
"file": "/data/projects/fate/temp/guest#9999#arbiter-10000#guest-9999#host-10000#model_202208291446341887230.zip" | ||
} | ||
``` | ||
|
||
Run the following command to import specified model: | ||
|
||
```commandline | ||
flow model import -c import_model.json | ||
``` | ||
|
||
If model importation succeeds, `flow` will return success message like this: | ||
|
||
``` | ||
{ | ||
"data": { | ||
"job_id": "202208291521250963160", | ||
"model_id": "arbiter-10000#guest-9999#host-10000#model", | ||
"model_version": "202208291446341887230", | ||
"party_id": "9999", | ||
"role": "guest" | ||
}, | ||
"retcode": 0, | ||
"retmsg": "success" | ||
} | ||
``` | ||
|
||
## 4. Merge Model | ||
|
||
### Hetero Logistic Regression Example | ||
|
||
Once model is ready, run the following command to merge models into a sklearn Logistic Regression model: | ||
|
||
```commandline | ||
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 --guest-party-id 9999 --host-party-ids 10000 --component-name hetero_lr_0 --model-type lr --output-format sklearn --target-name y --include-guest-coef --output-path model_merge_sklearn | ||
``` | ||
|
||
We may then load and predict with the obtained sklearn model: | ||
|
||
```python | ||
import base64 | ||
import json | ||
import pandas | ||
import pickle | ||
|
||
filename = "model_merge_sklearn" | ||
test_data_h = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_host.csv", | ||
header=0, index_col="id") | ||
test_data_g = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_guest.csv", | ||
header=0, index_col="id") | ||
test_data_g = test_data_g.loc[:, test_data_g.columns != 'y'] | ||
test_data = pandas.concat((test_data_g, test_data_h), axis=1) | ||
|
||
with open(filename, "r") as f: | ||
loaded_model = pickle.loads(base64.b64decode(json.load(f))) | ||
predict_score = loaded_model.predict_proba(test_data) | ||
``` | ||
|
||
### HeteroSecureBoost Example | ||
|
||
If the federated model is a HeteroSecureBoost Model, run this command to merge model and export it in LightGBM format. | ||
|
||
```commandline | ||
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 | ||
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format lgb --target-name y --output-path ./lgb.txt | ||
``` | ||
|
||
If you want a model in PMML format, use this command: | ||
|
||
```commandline | ||
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 | ||
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format pmml --target-name y --output-path ./lgb.txt | ||
``` | ||
|
||
Please notice that the model merge function offers a parameter 'host_rename'. Host features will be renamed by adding a sitename | ||
suffix to the original feature name once this option is on: | ||
|
||
```commandline | ||
Host party id is 9998, sitename is host_9998, origin feature is x1 | ||
x1 -> x1, default setting | ||
x1 -> x1_host_9998, when enable host_rename | ||
``` | ||
|
||
Use this command to rename host names when output | ||
|
||
```commandline | ||
flow component hetero-model-merge --model-id arbiter-10000#guest-9999#host-10000#model --model-version 202208291446341887230 --host-rename | ||
--guest-party-id 9999 --host-party-ids 10000 --component-name hetero_secureboost_0 --model-type secureboost --output-format lgb --target-name y --output-path ./lgb.txt | ||
``` | ||
|
||
We may then load and predict with the obtained lightgbm model. Remember that to get the correct predicted result, | ||
the order of features in predicted data should be in the same order as the merged tree model. | ||
|
||
```python | ||
import json | ||
import pandas | ||
import lightgbm as lgb | ||
|
||
filename = "./lgb.txt" | ||
test_data_h = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_host.csv", | ||
header=0, index_col="id") | ||
test_data_g = pandas.read_csv("/data/projects/fate/examples/data/breast_hetero_guest.csv", | ||
header=0, index_col="id") | ||
test_data_g = test_data_g.loc[:, test_data_g.columns != 'y'] | ||
test_data = pandas.concat((test_data_g, test_data_h), axis=1) | ||
|
||
bst = lgb.Booster(model_str=open(filename, 'r').read()) | ||
|
||
pred_rs = bst.predict(test_data) | ||
``` | ||
|
||
For more information on command options, please check `help` menu: | ||
|
||
```commandline | ||
flow component hetero-model-merge --help | ||
``` | ||
``` | ||
Usage: flow component hetero-model-merge [OPTIONS] | ||
- DESCRIPTION: | ||
Merge a hetero model. | ||
- USAGE: | ||
flow component hetero-model-merge --model-id guest-9999#host-9998#model --model-version 202208241838502253290 --guest-party-id 9999 --host-party-ids 9998,9997 --component-name hetero_secure_boost_0 --model-type secureboost --output-format pmml --target-name y --no-host-rename --no-include-guest-coef --output-path model.xml | ||
Options: | ||
--model-id TEXT Model id. [required] | ||
--model-version TEXT Model version. [required] | ||
-gid, --guest-party-id TEXT A valid party id. [required] | ||
-hids, --host-party-ids TEXT Multiple party ids, use a comma to separate | ||
each one. [required] | ||
-cpn, --component-name TEXT A valid component name. [required] | ||
--model-type TEXT [required] | ||
--output-format TEXT [required] | ||
--target-name TEXT | ||
--host-rename / --no-host-rename | ||
--include-guest-coef / --no-include-guest-coef | ||
-o, --output-path PATH User specifies output directory path. | ||
[required] | ||
-h, --help Show this message and exit. | ||
``` | ||
|
||
|
Oops, something went wrong.