This project is a small example of the following topics:
- Go web "hello world"
- containerization (Docker)
- orchestration (Kubernetes)
- continuous integration (GitHub Actions)
- infrastructure as code (Terraform)
- cloud services (AWS)
- Kubernetes chaos testing (LitmusChaos)
This project deploys a simple webpage built into a Docker image and replicated 5 times in an AWS Kubernetes (EKS) cluster. The cluster and all related resources are AWS based and created automatically from a Terraform configuration. All of these actions are triggered by the GitHub Actions CI/CD tool.
A simple Go webserver that prints the server's IP address to verify load balancing across the infrastructure, along with a fun Pong game.
- Run the app with
$ go run src/*.go
- Access the webserver in your browser at http://localhost:8080
Note: set a different port like
$ PORT=8999 go run src/*.go
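For reference, here is a minimal sketch of what such a server can look like. The handler body and the trick of dialing out to discover the host IP are illustrative assumptions, not necessarily the exact code in src/:

package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"os"
)

// serverIP opens a throwaway UDP "connection" to learn which local address
// would be used for outbound traffic; no packets are actually sent.
func serverIP() string {
	conn, err := net.Dial("udp", "8.8.8.8:80")
	if err != nil {
		return "unknown"
	}
	defer conn.Close()
	return conn.LocalAddr().(*net.UDPAddr).IP.String()
}

func main() {
	// Same convention as above: PORT overrides the default 8080.
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Each replica reports its own IP, which makes the load
		// balancer's distribution visible when refreshing the page.
		fmt.Fprintf(w, "Hello from %s\n", serverIP())
	})

	log.Printf("Server listening on port %s", port)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}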
Build an image for Docker Hub, with a lightweight Alpine Linux base.
- Build the image with
$ cd src/
$ docker build -t hackerman/hello-pong:v1.0 .
$ # docker login
$ docker push hackerman/hello-pong:v1.0
- Run the image exposing container port 8080 on host port 8082, like
$ docker run -it -p 8082:8080 hackerman/hello-pong:v1.0
2020/08/16 18:25:06 Server listening on port 8080
then go ahead and browse to http://localhost:8082
Terraform was used to create the following infrastructure in AWS:
vpc: hello-pong-vpc
├── eks: hello-pong-eks-xxxx
│   ├── ec2: t2.micro
│   ├── ec2: t2.micro
│   └── ec2: t2.small
├── elb
├── subnets
├── security groups
└── s3: hello-pong-state-bucket
Glossary:
- vpc: virtual private cloud
- eks: elastic kubernetes service
- ec2: elastic compute cloud (k8s nodes)
- elb: elastic load balancer (through worker nodes)
- s3: simple storage service
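To double-check what Terraform created, the AWS CLI can list the main pieces. These commands are a hedged sketch (region and names taken from this README; the load balancer may be classic ELB or ALB/NLB depending on the service type):

$ aws eks list-clusters --region us-east-2
$ aws ec2 describe-instances --region us-east-2 \
    --filters "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].InstanceType"
$ aws elb describe-load-balancers --region us-east-2   # use elbv2 instead if the service created an ALB/NLB
$ aws s3 ls | grep hello-pong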
Elastic Kubernetes Service provides the managed master node for a k8s cluster, joining the other EC2 nodes to run deployment pods as required.
- Create an IAM role, also attaching S3 permissions (AmazonS3FullAccess)
- Create a new cluster with AWS EKS
The cluster includes the following (a few verification commands are sketched after this list):
- Deployment: creates 5 replicas of the webserver pod with the "hello-pong" label
- Load balancer: distributes requests among the 5 created pods
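Once kubectl points at the cluster, the deployment and the load-balanced service can be checked with commands like these (the app= label key is an assumption; adjust to the actual manifest):

$ kubectl get deployments -o wide              # expect 5/5 replicas ready
$ kubectl get pods -l app=hello-pong -o wide   # one pod IP per replica, spread across nodes
$ kubectl get service -o wide                  # EXTERNAL-IP shows the ELB hostname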
Create an S3 bucket. This is a manual step to avoid destruction of this resource on $ terraform destroy
$ aws s3 mb s3://hello-pong-state-bucket --region us-east-2
$ aws s3api put-bucket-versioning --bucket hello-pong-state-bucket --versioning-configuration Status=Enabled
Then add the following section into a .tf file
terraform {
...
backend "s3" {
# bucket = "hello-pong-state-bucket" # managed by tf init parameter
# key = "eks/terraform.tfstate" # managed by tf init parameter
region = "us-east-2"
}
}
This enables S3 as the Terraform backend, where the infrastructure state will be saved (terraform.tfstate). State persistence enables destroying the infrastructure from a separate CI task. The first time the deployment pipeline runs, it creates the file; every other time, it updates its content. It is also read by the terraform destroy action.
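To confirm the state really lives in the bucket (and that versioning is doing its job), it can be inspected directly:

$ aws s3 ls s3://hello-pong-state-bucket/eks/                                    # terraform.tfstate appears after the first apply
$ aws s3api list-object-versions --bucket hello-pong-state-bucket --prefix eks/  # previous state versions kept by the bucket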
LitmusChaos will be used for this test.
With the created cluster, run the following commands:
$ aws eks update-kubeconfig --name hello-pong-eks-30eCLOFM # Config kubectl credentials
$ kubectl get pods # Verify pods were correctly started
$ kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.7.0.yaml # Install Litmus
$ kubectl apply -f https://hub.litmuschaos.io/api/chaos/1.7.0?file=charts/generic/experiments.yaml # Get generic tests
$ kubectl get all -n litmus # Verify required resources were created
NAME READY STATUS RESTARTS AGE
pod/chaos-operator-ce-66866b6469-bdfzp 1/1 Running 0 7m34s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/chaos-operator-metrics ClusterIP 172.20.10.124 <none> 8383/TCP 7m27s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/chaos-operator-ce 1/1 1 1 7m34s
NAME DESIRED CURRENT READY AGE
replicaset.apps/chaos-operator-ce-66866b6469 1 1 1 7m34s
Unfinished: the results were not as expected, so this was dropped from the scope of this delivery.
This job compiles new code, builds the Docker image, and pushes it to Docker Hub if the event happened on a tagged commit (enabling image versioning: v1.1, 1.12).
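In practice, publishing a new image version is then just a matter of pushing a tag (assuming the workflow is triggered on tag pushes and derives the image tag from the git tag):

$ git tag v1.1
$ git push origin v1.1    # triggers the build and push of hackerman/hello-pong:v1.1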
This job provisions all the AWS infra required to run the EKS cluster and updates the pods with new versions of the software. The following actions are performed, roughly (a shell equivalent is sketched after this list):
- Checkout: get the code for the webserver, deployments, and infra updates
- Configure AWS environment: set up the aws-cli for the pipeline interactions
- Setup Terraform: set up the terraform command
- Terraform fmt, init, validate, plan: verify the Terraform configuration is set and sound
- Terraform Apply: provision the AWS infra with Terraform
- Configure kubectl: get the created cluster's credentials and configure kubectl for cluster management
- K8s Deployment: create or update k8s pods as required
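Roughly, the Terraform and kubectl steps boil down to commands like the following; the k8s/ manifest path is an assumption, and in the real pipeline these run as separate GitHub Actions steps:

$ cd tf/
$ terraform fmt -check
$ terraform init --backend-config="bucket=hello-pong-state-bucket" --backend-config="key=eks/terraform.tfstate"
$ terraform validate
$ terraform plan
$ terraform apply -auto-approve
$ aws eks update-kubeconfig --name "$(terraform output cluster_name)" --kubeconfig .kube/config   # works once the wrapper is disabled, see below
$ kubectl --kubeconfig .kube/config apply -f k8s/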
The following are issues that took a lot of time to solve:
This is more of a hack than an issue, and it would be good to fix. For some reason, the bucket name and state file key were not read from the Terraform configuration (commented out in ./tf/versions.tf:6). Solved by using the "--backend-config=" flag in the deploy pipeline.
After the first k8s deployment, the website was live under the ELB public IP. However, the following error appeared in 5 of the 6 pods that were expected to run:
0/3 nodes are available: 3 Insufficient cpu.
This first configuration included three t2.micro EC2 nodes. According to kubernetes-metrics-scraper, those nodes were very low on CPU consumption, which caused confusion: how can this be a CPU issue if CPU consumption seems to be very low? The catch is that the scheduler looks at CPU requests, not actual usage, so a node can be "full" of reservations while sitting nearly idle. Each of the three nodes was running 4 pods, and most of those were standard pods (I figure) required to join the node into the cluster, plus the metrics service.
This post was the only reference found for this problem. After that, I played around with node sizes, settling on a good fit of two t2.micro instances and one t2.small to run the 5 pods set in the deployment.
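A quick way to see this on the cluster is to compare each node's allocatable CPU with the CPU requests already reserved on it (the grep just trims the output; kubectl top needs a metrics source, which is an assumption here):

$ kubectl describe nodes | grep -A 8 "Allocated resources"   # requests vs. allocatable per node
$ kubectl top nodes                                          # actual usage, typically far below the requests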
The tf/ folder was previously called terraform/, to group every Terraform configuration file (".tf"). Once the project was pushed and the GitHub Actions started to run, the "Setup Terraform" action would download Terraform. When "$ terraform init" was run, the process printed:
"Error: No configuration files"
The first attempt at a fix was passing the directory as an argument ($ terraform init terraform/), which failed. The next attempt was setting the working-directory with the defaults property of GitHub Actions jobs. This still did not solve the problem.
After renaming the folder to tf/, the issue disappeared. It seems the GitHub Action somehow collides with the terraform/ folder name. Unluckily, I couldn't find a source to confirm this, but that was the observed behaviour.
After creating all the infrastructure, the GitHub Actions pipeline requires two Terraform outputs to configure kubectl and deploy to k8s:
- the cluster's name
- the cluster's kubeconfig data (credentials)
The Terraform CLI behaviour when reading an output is the following:
$ terraform output cluster_name
hello-pong-eks-xxxx
But when running the same command inside GitHub Actions, the output received would be similar to:
$ terraform output cluster_name
[command]/home/runner/work/_temp/a15b4c47-57cb-45cb-8187-bb37cde344e3/terraform-bin output kubectl_config
hello-pong-eks-xxxx
::debug::stdout: hello-pong-eks-xxxx
::debug::stderr:
::debug::exitcode: 0
After some hours of debugging, I learnt that the hashicorp/setup-terraform GitHub Action states:
terraform_wrapper - (optional) Whether or not to install a wrapper to wrap subsequent calls of the terraform binary and expose its STDOUT, STDERR, and exit code as outputs named stdout, stderr, and exitcode respectively. Defaults to true.
Searching for this unexpected output brought no useful results on the internet, which made it a tough issue to debug. After some more digging I found out that the "::debug::stdout:" format is a GitHub Actions standard, or at least an under-the-hood debugging tool. At first, I wasn't even able to pin the issue on Terraform, on GitHub Actions, or on the base64 shell tool (which steebchen/kubectl required).
The final solution was to simply disable the wrapper in the pipeline:
steps:
- uses: hashicorp/setup-terraform@v1
with:
terraform_wrapper: false
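With the wrapper disabled, terraform output prints just the plain value again, so the pipeline can consume it directly; for example (the exact step wiring is an assumption):

$ CLUSTER_NAME=$(terraform output cluster_name)
$ aws eks update-kubeconfig --name "$CLUSTER_NAME" --kubeconfig .kube/config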
Once terraform, the aws-cli, and kubectl were all set up, it was time to deploy to the k8s cluster. kubectl printed out a cryptic message:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
This was a very straightforward and quick issue to solve thanks to this answer, which stated:
you need to specify kubeconfig for kubectl like this.
kubectl --kubeconfig .kube/config get nodes
- Chaos testing
- Load testing
- Autoscaling
- Create IAM role with Terraform
- Create and destroy (somehow) S3 bucket with Terraform
- Pong game: https://gist.github.com/straker/81b59eecf70da93af396f963596dfdc5
- Host info and k8s https://github.com/christianhxc/intro-to-kubernetes
- Docker image build and push: https://github.com/marketplace/actions/build-and-push-docker-images
- K8s Chaos Github Action: https://github.com/marketplace/actions/kubernetes-chaos
- Terraform Github Action: https://github.com/marketplace/actions/hashicorp-setup-terraform
- Terraform EKS Cluster: https://github.com/hashicorp/learn-terraform-provision-eks-cluster
- Low CPU on workers issue: https://managedkube.com/kubernetes/k8sbot/troubleshooting/pending/pod/2019/02/22/pending-pod.html
- Terraform init bucket config: https://github.com/ArunaLakmal/Terraform-Backend
- Github action kubectl server not started: https://stackoverflow.com/a/51122584
- And lots of other lost links