[docs][KubeRay] Update KubeRay doc for release v0.5.0 (ray-project#34178)

index.md: No code here to test and verify.

 getting-started.ipynb: Test manually.

 user-guides.md: No code here to test and verify

 k8s-cluster-setup.md: No code here to test and verify
 config.md: No code here to test and verify
 configuring-autoscaling.md: Test manually.
 logging.md: Test manually.
 gpu.rst: I did not verify code snippets, but GPU usage will be verified in gpu-training-example.md.
 experimental.md: No code here to test and verify
 static-ray-cluster-without-kuberay.md: Skipped. This document is unrelated to KubeRay.
 examples.md

 ml-example.md: (Will update in [docs][KubeRay] Provide some GKE instructions in KubeRay example ray-project#33339)
 gpu-training-example.md (Will update in [docs][KubeRay] Provide some GKE instructions in KubeRay example ray-project#33339)
 references.md

Ray Serve
 kubernetes.md: Test manually.

 fault-tolerance.md: I did not test all of Serve's recovery procedures; I only verified that the RayService can be created as expected.

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm install kuberay-operator kuberay/kuberay-operator --version 0.5.0
# path: doc/
kubectl apply -f source/serve/doc_code/fault_tolerance/k8s_config.yaml

# port forward
kubectl port-forward service/rayservice-sample-serve-svc 8000

# Test the serve deployment
curl localhost:8000

# Delete a worker Pod
kubectl delete pod ${WORKER_POD}

# Test the serve deployment again
curl localhost:8000
 run_gcs_ft_on_k8s.py
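
The fault_tolerance check above leaves `${WORKER_POD}` undefined. The following is a minimal sketch of one way to fill it in and to retry the request while the Pod is replaced. It assumes KubeRay labels worker Pods with `ray.io/node-type=worker`; the helper names `pick_worker_pod` and `wait_for_serve` and the retry counts are hypothetical and not part of this commit.

```shell
# Sketch only: the label selector, helper names, and retry counts are assumptions.

# Print the name of the first Ray worker Pod in the current namespace.
# `-o name` prints "pod/<name>", so the "pod/" prefix is stripped to match
# the `kubectl delete pod ${WORKER_POD}` form used above.
pick_worker_pod() {
  kubectl get pods -l ray.io/node-type=worker -o name | head -n 1 | sed 's|^pod/||'
}

# wait_for_serve URL [ATTEMPTS]
# Retry the request once per second until it succeeds or attempts run out.
# The body runs in a subshell so its variables do not leak into the caller.
wait_for_serve() (
  url="$1"
  attempts="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --silent --fail "$url" > /dev/null; then
      exit 0
    fi
    sleep 1
    i=$((i + 1))
  done
  exit 1
)

# Usage (requires the port-forward above to be running in another terminal):
#   WORKER_POD="$(pick_worker_pod)"
#   kubectl delete pod "${WORKER_POD}"
#   wait_for_serve localhost:8000 30
```

The retry loop is there because a single `curl` issued right after the deletion may race with the replacement Pod coming up.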
kevin85421 authored Apr 10, 2023
1 parent 10be570 commit fe96939
Showing 14 changed files with 145 additions and 266 deletions.
6 changes: 2 additions & 4 deletions doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml
@@ -23,17 +23,15 @@ metadata:
controller-tools.k8s.io: "1.0"
name: raycluster-complete-logs
spec:
-rayVersion: '2.0.0'
+rayVersion: '2.3.0'
headGroupSpec:
serviceType: ClusterIP
rayStartParams:
dashboard-host: '0.0.0.0'
block: 'true'
template:
spec:
containers:
- name: ray-head
-image: rayproject/ray:2.0.0
+image: rayproject/ray:2.3.0
lifecycle:
preStop:
exec:
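
The hunk above bumps `rayVersion` and the `rayproject/ray` image tag together. The following is a minimal sketch of a check that the two still agree in a config file. It assumes plain `rayproject/ray:<tag>` images with no tag suffix; the function name `check_ray_versions` is hypothetical, and the text matching is a rough heuristic rather than a YAML parser.

```shell
# Sketch only: check_ray_versions is a hypothetical helper, not part of this commit.

# check_ray_versions FILE
# Succeeds only if every rayproject/ray image tag in FILE equals the rayVersion field.
check_ray_versions() (
  file="$1"
  # Take the first rayVersion value and strip quotes and spaces: '2.3.0' -> 2.3.0
  ray_version="$(sed -n 's/^[[:space:]]*rayVersion:[[:space:]]*//p' "$file" | head -n 1 | tr -d "'\" ")"
  if [ -z "$ray_version" ]; then
    echo "no rayVersion found in $file" >&2
    exit 1
  fi
  # Collect every distinct tag used with the rayproject/ray image.
  tags="$(sed -n 's|.*image:[[:space:]]*rayproject/ray:\([^[:space:]]*\).*|\1|p' "$file" | sort -u)"
  for tag in $tags; do
    if [ "$tag" != "$ray_version" ]; then
      echo "image tag $tag does not match rayVersion $ray_version" >&2
      exit 1
    fi
  done
)

# Usage:
#   check_ray_versions doc/source/cluster/kubernetes/configs/ray-cluster.log.yaml
```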
10 changes: 6 additions & 4 deletions doc/source/cluster/kubernetes/examples/gpu-training-example.md
@@ -40,8 +40,9 @@ kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container
# (Method 2) "gcloud container clusters get-credentials <your-cluster-name> --region <your-region> --project <your-project>"
# (Method 3) "kubectl config use-context ..."

-# Create the KubeRay operator
-kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.4.0&timeout=90s"
+# Install both CRDs and KubeRay operator v0.5.0.
+helm repo add kuberay https://ray-project.github.io/kuberay-helm/
+helm install kuberay-operator kuberay/kuberay-operator --version 0.5.0

# Create a Ray cluster
kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/master/doc/source/cluster/kubernetes/configs/ray-cluster.gpu.yaml
@@ -114,7 +115,8 @@ It is optional.
```shell
# Step 2: Deploy a Ray cluster on Kubernetes with the KubeRay operator.
# Create the KubeRay operator
-kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.4.0&timeout=90s"
+helm repo add kuberay https://ray-project.github.io/kuberay-helm/
+helm install kuberay-operator kuberay/kuberay-operator --version 0.5.0

# Create a Ray cluster
kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/master/doc/source/cluster/kubernetes/configs/ray-cluster.gpu.yaml
@@ -177,7 +179,7 @@ Delete your Ray cluster and KubeRay with the following commands:
kubectl delete raycluster raycluster

# Please make sure the ray cluster has already been removed before delete the operator.
-kubectl delete -k "http://github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.4.0&timeout=90s"
+helm uninstall kuberay-operator
```
If you're on a public cloud, don't forget to clean up the underlying
node group and/or Kubernetes cluster.
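
The cleanup hunk above asks that the Ray cluster be gone before the operator is removed. The following is a minimal sketch of one way to enforce that order by polling `kubectl get` before running `helm uninstall`. The helper name `wait_for_raycluster_gone`, the attempt count, and the 2-second interval are hypothetical.

```shell
# Sketch only: wait_for_raycluster_gone is a hypothetical helper, not part of this commit.

# wait_for_raycluster_gone NAME [ATTEMPTS]
# Poll until `kubectl get raycluster NAME` fails, which is treated as "deleted".
# Caveat: `kubectl get` also fails when the API server is unreachable, so this
# check assumes a working connection to the cluster.
wait_for_raycluster_gone() (
  name="$1"
  attempts="${2:-60}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if ! kubectl get raycluster "$name" > /dev/null 2>&1; then
      exit 0
    fi
    sleep 2
    i=$((i + 1))
  done
  exit 1
)

# Usage:
#   kubectl delete raycluster raycluster
#   wait_for_raycluster_gone raycluster && helm uninstall kuberay-operator
```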