FATE-Flow is an end-to-end pipeline platform for federated learning. A pipeline is a sequence of components designed for highly flexible, high-performance federated learning tasks, including data processing, modeling, training, verification, publishing, and serving inference.
- A DAG defines the pipeline.
- The DAG is described using FATE-DSL in JSON format.
- FATE ships with a large number of default federated learning components, such as Hetero LR, Homo LR, and Secure Boosting Tree.
- Developers can implement custom components using the Basic-API and build their own pipelines through the DSL.
- Federated modeling task life-cycle manager: start/stop, status synchronization, and so on.
- Powerful federated scheduler that supports multiple scheduling strategies for DAG jobs and component tasks.
- Real-time tracking of data, parameters, models, and metrics during the experiment.
- Federated model manager: model binding, versioning, and deployment tools.
- HTTP API and command-line interface.
- Data and interface support for modeling visualization on FATE-Board.
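To make the DAG idea concrete, here is a minimal Python sketch of how a valid execution order can be derived from component dependencies. The component names and the dependency mapping are hypothetical, and the ordering uses the standard library's graphlib (Python 3.9+) rather than FATE-Flow's own scheduler, whose strategies may differ.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a component, each value is the set of
# components whose output it consumes. The names are illustrative only.
dependencies = {
    "dataio_0": set(),
    "intersection_0": {"dataio_0"},
    "hetero_lr_0": {"intersection_0"},
    "evaluation_0": {"hetero_lr_0"},
}


def execution_order(deps):
    """Return one valid run order; raises graphlib.CycleError if deps is not a DAG."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because every component here depends on exactly one predecessor, there is only one valid order. A DAG with parallel branches would allow several.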
Only one step is required to configure a component for a pipeline:
- Define the module of this component.
- Define the input, which includes data, model, or isometric_model (used only for FeatureSelection).
- Define the output, which includes data and model.
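As an illustration of the three parts listed above, the following Python sketch builds one component entry and checks that it defines a module, an input, and an output. The top-level "components" key, the component name dataio_0, the module name DataIO, and the data and model labels are assumptions made for the example. Consult the FATE-DSL documentation for the authoritative schema.

```python
import json

# Assumed example: only the module/input/output structure comes from the
# text above; every name and label below is illustrative.
dsl = {
    "components": {
        "dataio_0": {
            "module": "DataIO",
            "input": {"data": {"data": ["args.train_data"]}},
            "output": {"data": ["train"], "model": ["dataio"]},
        }
    }
}

REQUIRED_KEYS = ("module", "input", "output")


def missing_keys(dsl_dict):
    """Map each component name to the required keys it lacks."""
    problems = {}
    for name, component in dsl_dict.get("components", {}).items():
        missing = [key for key in REQUIRED_KEYS if key not in component]
        if missing:
            problems[name] = missing
    return problems


print(json.dumps(dsl, indent=4))
print(missing_keys(dsl))
```

A check like this only catches structural omissions before submission. It does not verify that the module exists or that the referenced data is available.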
FATE-Flow is deployed in $PYTHONPATH/fate_flow/. It depends on two configuration files: $PYTHONPATH/arch/conf/server.conf and $PYTHONPATH/fate_flow/settings.py.
server.conf: Configures the addresses of all FATE services. Each FATE-Flow deployment mode requires a different set of FATE services; for details, refer to the specific deployment mode below.
settings.py: Contains the key configuration items.
service.sh: Server start/stop/restart script.
You only need to start the FATE-Flow service to run the federated learning modeling experiment.
You need to deploy three services:
- MySQL
- FATE-Flow
- FATE-Board
FATE provides a standalone Docker version for trying it out. Please refer to the Docker deployment guide at docker-deploy.
FATE also provides a distributed runtime architecture for big data scenarios. Migrating from standalone to cluster requires only a configuration change; no algorithm change is needed. To deploy FATE on a cluster, please refer to the cluster deployment guide at cluster-deploy.
FATE-Flow provides a REST API and a command-line interface. Let's use the client to run an example federated learning pipeline job (standalone).
Upload Data (guest/host):
    python fate_flow_client.py -f upload -c examples/upload_guest.json
    python fate_flow_client.py -f upload -c examples/upload_host.json
Note: The USE_LOCAL_DATA configuration item in FATE-Flow Server controls whether an upload uses the data on the FATE-Flow client machine; the default is True (use it). If USE_LOCAL_DATA is set to True and you still want to use the data on the machine where FATE-Flow Server is located, add the "module" parameter to the upload configuration with the value 0 (the default is 1).
Note: In a cluster deployment, when uploading data to the same table again, you must pass the drop parameter (0 represents overwriting upload, 1 represents deleting the previous data and re-uploading):
    python fate_flow_client.py -f upload -c examples/upload_guest.json -drop 0
Submit Job:
    python fate_flow_client.py -f submit_job -d examples/test_hetero_lr_job_dsl.json -c examples/test_hetero_lr_job_conf.json
Command response example:
{
    "data": {
        "board_url": "http://localhost:8080/index.html#/dashboard?job_id=2019121910313566330118&role=guest&party_id=9999",
        "job_dsl_path": "xxx/jobs/2019121910313566330118/job_dsl.json",
        "job_runtime_conf_path": "xxx/jobs/2019121910313566330118/job_runtime_conf.json",
        "logs_directory": "xxx/logs/2019121910313566330118",
        "model_info": {
            "model_id": "arbiter-10000#guest-9999#host-10000#model",
            "model_version": "2019121910313566330118"
        }
    },
    "jobId": "2019121910313566330118",
    "retcode": 0,
    "retmsg": "success"
}
Some of the following operations use information from this response.
Query Job:
    python fate_flow_client.py -f query_job -r guest -p 10000 -j $job_id
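Since later operations reuse fields from the submit response, a small helper can extract them and assemble the follow-up command. The Python sketch below assumes the response has been captured as JSON text shaped like the example above. The helper names are hypothetical, and only the flags already shown in this guide are used.

```python
import json
import shlex

# Trimmed copy of the response example from this guide.
response_text = """
{
    "data": {
        "model_info": {
            "model_id": "arbiter-10000#guest-9999#host-10000#model",
            "model_version": "2019121910313566330118"
        }
    },
    "jobId": "2019121910313566330118",
    "retcode": 0,
    "retmsg": "success"
}
"""


def parse_submit_response(text):
    """Return (job_id, model_info); raise if the server reported an error."""
    response = json.loads(text)
    if response["retcode"] != 0:
        raise RuntimeError(f"submit_job failed: {response['retmsg']}")
    return response["jobId"], response["data"]["model_info"]


def query_job_command(job_id, role, party_id):
    """Build the query_job command line with the flags used in this guide."""
    args = ["python", "fate_flow_client.py", "-f", "query_job",
            "-r", role, "-p", str(party_id), "-j", job_id]
    return shlex.join(args)


job_id, model_info = parse_submit_response(response_text)
print(query_job_command(job_id, "guest", 10000))
```

The returned model_info carries the model_id and model_version, which are the values the publish steps need.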
You can find many more useful commands in the `CLI <./doc/fate_flow_cli.rst>`__ documentation.
For more federated learning pipeline job examples, please refer to `federatedml-1.x-examples <./../examples/federatedml-1.x-examples>`__ and its `README <./../examples/federatedml-1.x-examples/README.rst>`__.
Publish the model to FATE-Serving, then use FATE-Serving's gRPC API for inference.
Modify service configuration:
Modify the IP and port of FATE-Serving in arch/conf/server_conf.json (note that each party must set the actual deployment address of its own FATE-Serving). The content is "servings": ["ip:port"]. Restart FATE-Flow after the modification. The server_conf.json format is as follows:
{
    "servers": {
        "servings": [
            "127.0.0.1:8000"
        ]
    }
}
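The configuration change described above can also be scripted. The following Python sketch rewrites the servings list in a file that follows the server_conf.json format shown here. The helper name is hypothetical, the address 192.0.2.10:8000 is a placeholder, and the demo runs against a temporary file so that no real deployment file is touched.

```python
import json
import os
import tempfile


def set_serving_addresses(conf_path, addresses):
    """Rewrite servers.servings in a server_conf.json-style file."""
    with open(conf_path, encoding="utf-8") as fh:
        conf = json.load(fh)
    # Keep every other key in the file; replace only the servings list.
    conf.setdefault("servers", {})["servings"] = list(addresses)
    with open(conf_path, "w", encoding="utf-8") as fh:
        json.dump(conf, fh, indent=4)
    return conf


# Demo on a temporary copy of the format shown above.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "server_conf.json")
    with open(demo_path, "w", encoding="utf-8") as fh:
        json.dump({"servers": {"servings": ["127.0.0.1:8000"]}}, fh)
    updated = set_serving_addresses(demo_path, ["192.0.2.10:8000"])
    print(updated["servers"]["servings"])
```

FATE-Flow still has to be restarted afterwards for the new address to take effect.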
Publish Model:
    python fate_flow_client.py -f load -c examples/publish_load_model.json
    Please replace the corresponding configuration in
Publish Model Online Default:
    python fate_flow_client.py -f bind -c examples/bind_model_service.json
    Please replace the corresponding configuration in
FATE-Flow Server log: $PYTHONPATH/logs/fate_flow/
Job log: $PYTHONPATH/logs/$job_id/
What is the role of FATE Flow in FATE?
    FATE Flow is a scheduling system that schedules the execution of algorithm components based on the DSL of the job submitted by the user.
ModuleNotFoundError: No module named "arch"
    Set PYTHONPATH to the parent directory of fate_flow.
Why does the task show success when it is submitted, but fail on the dashboard page?
What do the guest, host, arbiter, and local roles mean in FATE, and what do they do?
Error about "cannot find xxxx" when killing a waiting job
    FATE-Flow currently supports kill only on the job initiator; running kill elsewhere reports "cannot find xxx".
What does uploading data do?
    The data is uploaded to eggroll and stored in DTable format, which subsequent algorithms can process.
How can I download the data generated in the middle of the algorithm?
If the same file is uploaded twice, will FATE delete the first data and upload it again?
    Data will be overwritten if the keys are the same in the same table.
Why does a job fail without any error shown on the board?
    The logs in these places will not be displayed on the board:
What is the difference between the load and bind commands?
    Load can be understood as releasing a model; bind sets the default model version.