forked from kubeedge/sedna
build high frequency sedna example with modelbox
Signed-off-by: Ymh13383894400 <[email protected]>
1 parent 306080d, commit 104d80e
Showing 2 changed files with 286 additions and 0 deletions.
94 changes: 94 additions & 0 deletions
examples/incremental_learning_modelbox/mnist/Build_image.md
@@ -0,0 +1,94 @@
# Build image

This step is the key part of the workflow: each project has its own requirements, so the image usually needs to be rebuilt. For an introduction to ModelBox and its built-in functions, refer to the ModelBox documentation.

## Container image download

Pull the image that matches your environment. For example, to download the latest Ubuntu development image with TensorFlow 2.6.0 and CUDA 11.2, run:

```shell
docker pull modelbox/modelbox-develop-tensorflow_2.6.0-cuda_11.2-ubuntu-x86_64:latest
```

The ModelBox image repository is at https://hub.docker.com/u/modelbox

## One-click startup script

```shell
#!/bin/bash

# ssh map port [modify]
SSH_MAP_PORT=50022

# editor map port [modify]
EDITOR_MAP_PORT=1104

# http server port [modify]
HTTP_SERVER_PORT=8080

# container name [modify]
CONTAINER_NAME="modelbox_instance_$(date +%s)"

# image name
IMAGE_NAME="modelbox/modelbox-develop-tensorflow_2.6.0-cuda_11.2-ubuntu-x86_64"

HTTP_DOCKER_PORT_COMMAND="-p $HTTP_SERVER_PORT:$HTTP_SERVER_PORT"

# $HTTP_DOCKER_PORT_COMMAND is left unquoted on purpose so it splits into two arguments
docker run -itd --gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
    --tmpfs /tmp --tmpfs /run -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
    --name "$CONTAINER_NAME" -v /home:/home \
    -p "$SSH_MAP_PORT":22 -p "$EDITOR_MAP_PORT":1104 $HTTP_DOCKER_PORT_COMMAND \
    "$IMAGE_NAME"
```

**Notes:**

- Create the script with `vim start_docker.sh`, press `i` to enter insert mode, paste the code above, adjust it as needed, then save and quit with `:wq`.
- In the startup script, check that the image version being launched matches the version you need.
- To debug with `gdb` inside the container, add the `--privileged` parameter when starting the container.
- On a machine without a GPU, delete the `--gpus`-related parameters. In that case only CPU-related functional units can be used.
- If a port is not occupied but is still unreachable after the container starts, check the firewall settings.

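The GPU note above can be folded into the startup script so that the same file works on hosts with and without a GPU. The sketch below is an illustration only: the `gpu_flags` helper and the `HAS_GPU` override are hypothetical names, and it assumes that finding `nvidia-smi` on the host is a good enough signal that a GPU is usable.

```shell
#!/bin/sh

# Hypothetical helper: prints "--gpus all" when a GPU appears to be available,
# and prints nothing otherwise.
gpu_flags() {
    # HAS_GPU is an assumed override: "1" forces the flag, "0" suppresses it.
    has_gpu="${HAS_GPU:-}"
    if [ -z "$has_gpu" ]; then
        if command -v nvidia-smi >/dev/null 2>&1; then
            has_gpu=1
        else
            has_gpu=0
        fi
    fi
    if [ "$has_gpu" = "1" ]; then
        printf '%s' "--gpus all"
    fi
}

# Preview the command instead of running it; drop the echo to launch for real.
echo "docker run -itd $(gpu_flags) --name my_container my_image"
```

In the real script, `$(gpu_flags)` would replace the hard-coded `--gpus all` and must stay unquoted so that it expands to two arguments or to none.
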
## Use containers to fulfill requirements

```shell
docker exec -it [container id] bash
# Develop and run your project inside the container.
```

## Build image

1. docker commit image

Save the project state. The most convenient way is to run `docker commit` directly on the container:

```shell
docker commit [container-ID] [image-name]
```

2. build image

Use the committed image as the base of a new Dockerfile:

```dockerfile
# load the base image created by docker commit
FROM [image-name]

# configure the environment variable
ENV PYTHONPATH="/root"

# set the working directory
WORKDIR /root

ENTRYPOINT ["python"]
```

Save the content as `Dockerfile`, then build it under a new image name:

```shell
docker build -t [new-image-name] .
```

The result is an image that meets the needs of the project.
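
The Dockerfile step can also be scripted. The sketch below writes an equivalent Dockerfile into a temporary build directory. The `write_dockerfile` helper and the image names `my-mnist-snapshot:v1` and `my-mnist:v1` are placeholders chosen for illustration, not names from the original project.

```shell
#!/bin/sh

# Hypothetical helper: writes the Dockerfile from this section into a build
# directory, using the image produced by `docker commit` as the base.
write_dockerfile() {
    base_image="$1"   # image name passed to `docker commit`
    out_dir="$2"      # build context directory
    mkdir -p "$out_dir"
    cat > "$out_dir/Dockerfile" <<EOF
# load the base image created by docker commit
FROM $base_image

# configure the environment variable
ENV PYTHONPATH="/root"

# set the working directory
WORKDIR /root

ENTRYPOINT ["python"]
EOF
}

# Placeholder image name and a throwaway build directory (assumptions).
build_dir="$(mktemp -d)"
write_dockerfile "my-mnist-snapshot:v1" "$build_dir"
cat "$build_dir/Dockerfile"

# On a host with Docker installed, the build would then be:
#   docker build -t my-mnist:v1 "$build_dir"
```

Generating the file from a function keeps the base image name in one place, which helps when the committed image is re-created often.
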
@@ -0,0 +1,192 @@
# Using Incremental Learning Job In MNIST

This document describes how to run the inference part of an incremental learning job on MNIST. With an incremental learning job, the application can automatically retrain, evaluate, and update models based on the data generated at the edge.

## MNIST Experiment

### Prepare Model

```
Link: https://pan.baidu.com/s/1Gi5BJ_NQzqj66R8N5OXPzA
Extraction code: OSPP
```

### Prepare dataset

```
Link: https://pan.baidu.com/s/1Gi5BJ_NQzqj66R8N5OXPzA
Extraction code: OSPP
```

### Prepare Image

This example uses the image:

```
ymh13383894400/mnist-new:v1
```

This image is generated by a build script and is used to create the training, evaluation, and inference workers.

### Project creation and running

Create an MNIST project with the following layout:

```
├─flowunit:              # Flowunit directory
│  ├─mnist_preprocess:   # Preprocessing functional unit
│  ├─mnist_infer:        # TensorFlow inference functional unit
│  └─mnist_response:     # Functional unit that constructs HTTP responses
└─graph:                 # Flowchart directory
   ├─mnist.toml:         # Inference flowchart
   └─test_mnist.py       # Inference Python file
```

Create the job. Set the node names first: `WORKER_NODE` runs the training and evaluation workers, and `INFER_NODE` runs the inference worker.

```shell
WORKER_NODE="edge-node1"
INFER_NODE="edge-node2"
```

- Create Dataset

```yaml
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
EOF
```

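Because the heredoc delimiter `EOF` is unquoted, the shell expands `$WORKER_NODE` before `kubectl` reads the manifest. The following sketch, built around a hypothetical `render_dataset` helper, prints the expanded manifest so it can be inspected before it is submitted.

```shell
#!/bin/sh

WORKER_NODE="edge-node1"

# Hypothetical helper: prints the Dataset manifest exactly as kubectl would
# receive it, with $WORKER_NODE already substituted by the shell.
render_dataset() {
    cat <<EOF
apiVersion: sedna.io/v1alpha1
kind: Dataset
metadata:
  name: incremental-dataset
spec:
  url: "/data/train_data.txt"
  format: "txt"
  nodeName: $WORKER_NODE
EOF
}

# Preview the rendered manifest.
render_dataset

# To submit it on a cluster with Sedna installed:
#   render_dataset | kubectl create -f -
```

If `WORKER_NODE` is unset, the rendered `nodeName` field is empty, which is easy to spot in the preview.
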
- Create the initial model, which simulates the starting model in the incremental learning scenario.

```yaml
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: initial-model
spec:
  url : "/models/base_model"
  format: "ckpt"
EOF
```

- Create Deploy Model

```yaml
kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: Model
metadata:
  name: deploy-model
spec:
  url : "/models/deploy_model/saved_model.pb"
  format: "pb"
EOF
```

- Start The Incremental Learning Job

The inference worker pod runs from the ModelBox-based image.

```yaml
IMAGE=ymh13383894400/mnist-new:v1

kubectl create -f - <<EOF
apiVersion: sedna.io/v1alpha1
kind: IncrementalLearningJob
metadata:
  name: mnist-demo
spec:
  initialModel:
    name: "initial-model"
  dataset:
    name: "incremental-dataset"
    trainProb: 0.8
  trainSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: train-worker
            imagePullPolicy: IfNotPresent
            args: ["train.py"]
    trigger:
      checkPeriodSeconds: 60
      timer:
        start: 02:00
        end: 20:00
      condition:
        operator: ">"
        threshold: 500
        metric: num_of_samples
  evalSpec:
    template:
      spec:
        nodeName: $WORKER_NODE
        containers:
          - image: $IMAGE
            name: eval-worker
            imagePullPolicy: IfNotPresent
            args: ["eval.py"]
  deploySpec:
    model:
      name: "deploy-model"
      hotUpdateEnabled: true
      pollPeriodSeconds: 60
    trigger:
      condition:
        operator: ">"
        threshold: 0.1
        metric: precision_delta
    hardExampleMining:
      name: "IBT"
      parameters:
        - key: "threshold_img"
          value: "0.9"
        - key: "threshold_box"
          value: "0.9"
    template:
      spec:
        nodeName: $INFER_NODE
        containers:
          - image: $IMAGE
            name: infer-worker
            imagePullPolicy: IfNotPresent
            args: ["test_mnist.py"]
            volumeMounts:
              - name: localvideo
                mountPath: /video/
              - name: hedir
                mountPath: /he_saved_url
            resources: # user defined resources
              limits:
                memory: 2Gi
        volumes: # user defined volumes
          - name: localvideo
            hostPath:
              path: /incremental_learning/video/
              type: DirectoryOrCreate
          - name: hedir
            hostPath:
              path: /incremental_learning/he/
              type: DirectoryOrCreate
  outputDir: "/output"
EOF
```

### Check Incremental Learning Job

Query the job status:

```shell
kubectl get incrementallearningjob mnist-demo
```
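
Kubernetes generally requires object names to be lowercase RFC 1123 subdomain names, so it can be worth checking a job name before creating or querying it. The following is a minimal sketch. The helper name `is_valid_k8s_name` is hypothetical, and the regular expression is an approximation of that naming rule, not code taken from Kubernetes or Sedna.

```shell
#!/bin/sh

# Hypothetical helper: returns 0 when the name looks like a lowercase
# RFC 1123 subdomain of at most 253 characters, non-zero otherwise.
is_valid_k8s_name() {
    name="$1"
    [ "${#name}" -le 253 ] || return 1
    # LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
    printf '%s\n' "$name" |
        LC_ALL=C grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}

# Compare a lowercase name with a mixed-case one.
for candidate in "mnist-demo" "Mnist-demo"; do
    if is_valid_k8s_name "$candidate"; then
        echo "$candidate: looks valid"
    else
        echo "$candidate: would likely be rejected"
    fi
done
```

The same name has to be used in `metadata.name` and in the `kubectl get` query, otherwise the query reports that the resource is not found.
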