Skip to content

ACES scheduler subcomponent in charge of executing the workload actions

License

Notifications You must be signed in to change notification settings

ACES-EU/workload-actions-manager

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Workload Actions Manager (WAM)

GPLv3 License Lint & Test

architecture

The Workload Actions Manager (WAM) is a specialized service integrated into the Kubernetes control plane, designed to manage and orchestrate workload actions for Smart AIOps entities. These entities, which include both centralized AIOps reschedulers and decentralized agents, utilize the WAM API to initiate specific actions. WAM oversees the coordination, task assignment, and interaction with various components required to perform workload actions such as creating or removing pods. Due to its distributed architecture, WAM emphasizes parallel and asynchronous execution of actions. Entities are notified of action completions via a separate endpoint, indicating whether the action was successful or not.

To ensure high availability and fault tolerance, WAM is deployed with multiple replicas, all accessible through a single Kubernetes Service with a known domain and port. The system leverages a fault-tolerant persistent message queue to store scheduling recommendations. AIOps entities interact with WAM by sending action requests through the API, with WAM then executing these actions efficiently while maintaining fault tolerance.

The design of the WAM component architecture aims to minimize custom logic by delegating tasks to existing Kubernetes components like the ReplicaSet controller to accomplish the required workload actions. For enhanced flexibility and customization, the Kubernetes scheduler is extended with custom scheduler plugins, allowing for tailored scheduling processes to meet unique requirements.

WAM implements four primary actions—create, delete, move, and swap—which are generally sufficient for most directives from AIOps entities or to implement higher-level actions. While these foundational actions cover most use cases, some scenarios may necessitate additional logic or more complex actions. The abstraction level required for actions depends on the output from MLOps entities and the target orchestration technology, necessitating a component that bridges different abstraction levels. A possible solution is a scheduler-specific action mapping component that converts scheduler-specific concepts into foundational actions. This component can implement high-level actions using low-level actions, such as moving an entire application from one edge environment to another, or it may include advanced resource characterization capabilities, like placing workloads on nodes with fewer than three pods and average RAM utilization below 50%.

Requirements

Create a Cluster

grep k3d-registry.localhost /etc/hosts || echo "127.0.0.1 k3d-registry.localhost" | sudo tee -a /etc/hosts
k3d cluster create --config deploy/k3d.yaml --api-port localhost:5443

Build

just build_and_push

# Possible issue & fix:
# chmod a+x ./wam-scheduler/vendor/k8s.io/code-generator/generate-internal-groups.sh

Deploy

helm install --namespace default wam-redis deploy/redis
sleep 30 # wait for Redis to start
helm install --namespace kube-system wam-scheduler deploy/wam-scheduler
helm install --namespace default wam deploy/wam
helm install --namespace default test-a deploy/test-a
helm install --namespace default test-b deploy/test-b
kubectl port-forward <wam_pod_name> 3030:3030

Examples

# create a replica of A on node 7
curl -X POST -H "Content-Type: application/json" \
  -d '{"method":"action.Create","params":[{"workload": {"namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment", "name": "test-a"}, "node": {"name": "k3d-aces-agent-7"}}], "id":"1"}' \
  http://localhost:3030/rpc
  
# create a replica of A on node 4
curl -X POST -H "Content-Type: application/json" \
  -d '{"method":"action.Create","params":[{"workload": {"namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment", "name": "test-a"}, "node": {"name": "k3d-aces-agent-4"}}], "id":"1"}' \
  http://localhost:3030/rpc

# delete a replica of A on node 7
export pod_to_delete=$(kubectl get pods -l 'app.kubernetes.io/name=test-a' -o wide | grep 'k3d-aces-agent-7' | awk '{print $1}' | head -n 1)
echo "$pod_to_delete"
curl -X POST -H "Content-Type: application/json" \
  -d "{\"method\":\"action.Delete\",\"params\":[{\"pod\": {\"namespace\": \"default\", \"name\": \"$pod_to_delete\"}}], \"id\":\"1\"}" \
  http://localhost:3030/rpc

# move a replica of A from node 4 to node 7
export pod_to_move=$(kubectl get pods -l 'app.kubernetes.io/name=test-a' -o wide | grep 'k3d-aces-agent-4' | awk '{print $1}' | head -n 1)
echo "$pod_to_move"
curl -X POST -H "Content-Type: application/json" \
  -d "{\"method\":\"action.Move\",\"params\":[{\"pod\": {\"namespace\": \"default\", \"name\": \"$pod_to_move\"}, \"node\": {\"name\": \"k3d-aces-agent-7\"}}], \"id\":\"1\"}" \
  http://localhost:3030/rpc
  
# create a replica of B on node 4
curl -X POST -H "Content-Type: application/json" \
  -d '{"method":"action.Create","params":[{"workload": {"namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment", "name": "test-b"}, "node": {"name": "k3d-aces-agent-4"}}], "id":"1"}' \
  http://localhost:3030/rpc

# create another replica of B on node 4
curl -X POST -H "Content-Type: application/json" \
  -d '{"method":"action.Create","params":[{"workload": {"namespace": "default", "apiVersion": "apps/v1", "kind": "Deployment", "name": "test-b"}, "node": {"name": "k3d-aces-agent-4"}}], "id":"1"}' \
  http://localhost:3030/rpc

# swap replica of A (node 7) with replicas of B (node 4)
export pod_to_swap_A=$(kubectl get pods -l 'app.kubernetes.io/name=test-a' -o wide | grep 'k3d-aces-agent-7' | awk '{print $1}' | head -n 1)
export pod_to_swap_B_1=$(kubectl get pods -l 'app.kubernetes.io/name=test-b' -o wide | grep 'k3d-aces-agent-4' | awk '{print $1}' | sort | head -n 1)
export pod_to_swap_B_2=$(kubectl get pods -l 'app.kubernetes.io/name=test-b' -o wide | grep 'k3d-aces-agent-4' | awk '{print $1}' | sort | tail -n 1)
echo "$pod_to_swap_A"
echo "$pod_to_swap_B_1"
echo "$pod_to_swap_B_2"
curl -X POST -H "Content-Type: application/json" \
  -d "{\"method\":\"action.Swap\",\"params\":[{\"x\": {\"namespace\": \"default\", \"name\": \"$pod_to_swap_A\"}, \"y\": [{\"namespace\": \"default\", \"name\": \"$pod_to_swap_B_1\"}, {\"namespace\": \"default\", \"name\": \"$pod_to_swap_B_2\"}]}], \"id\": \"1\"}" \
  http://localhost:3030/rpc

Clean up

k3d cluster stop aces && k3d cluster delete aces

Resources

License

This project is licensed under the terms of the GNU General Public License v3.0. See the LICENSE file for details.

© 2024 Faculty of Computer and Information Science, University of Ljubljana

About

ACES scheduler subcomponent in charge of executing the workload actions

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published