mtonxbjss/terraform-aws-autoscaling-github-runners

AWS GitHub Actions Private Runner Terraform Module

Terraform modules for creating autoscaling groups of private GitHub Actions Runners in your AWS account.

Overview

This repo contains three different Terraform modules, each themed around launching autoscaling groups of GitHub Actions Runners in your own AWS account.

modules/imagebuilder-terraform-container

Deploys an AWS ImageBuilder pipeline and associated resources to create a Docker container image that is able to run Terraform commands. Containers started from this image will be used to execute GitHub Actions jobs in your workflows.

You don't have to use this module if you already have your own container image that is able to run Terraform; this is just a barebones container that is minimally functional.

modules/imagebuilder-github-runner-ami

Deploys an AWS ImageBuilder pipeline and associated resources to create an EC2 AMI that features all of the prerequisites for registering a GitHub Actions runner.

This includes:

  • Base Ubuntu 20.04 (Focal) image
  • Baseline packages required from the OS package repository, e.g. curl, git, cron, etc.
  • SSM Agent
  • CloudWatch Logs Agent
  • The GitHub Actions Runner agent binary, either downloaded from GitHub's own releases page, or from your own S3 cache
  • Pre-cached container images (optional)

Autoscaling EC2 instances will be launched with AMIs built from this pipeline.
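
As a rough sketch of how a submodule in this repo might be referenced, Terraform's `//` subdirectory syntax can point a git source at the `modules/imagebuilder-github-runner-ami` folder. The module label and the `ref` value mirror the examples elsewhere in this README. The submodule's own inputs are documented separately and are deliberately left out, so this block is incomplete as written.

```hcl
# Sketch only: references the AMI submodule by its subdirectory path.
# Required inputs are not shown; consult the submodule docs before applying.
module "imagebuilder_github_runner_ami" {
  source = "git::https://github.com/mtonxbjss/terraform-aws-autoscaling-github-runners//modules/imagebuilder-github-runner-ami?ref=v1.0.0"

  # ... submodule inputs go here ...
}
```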

[root module] autoscaling-github-runners

Deploys an autoscaling group of EC2 instances that each register as a self-hosted runner with one or more GitHub projects of your choice. The autoscaling is performed based on a custom CloudWatch metric generated by a Lambda that polls your GitHub project to determine how many jobs are running/pending.

The module also includes:

| Feature | Purpose |
| --- | --- |
| EC2 Launch Template | that will deploy instances into the VPC of your choosing |
| EC2 Autoscaling Group | with scaling rules controlled by a schedule and/or an accurate custom metric (both optional) |
| EC2 IAM Role & Instance Profile | to allow access to the AWS APIs required to deploy your application (you can customise this). Also allows SSM Session Manager access for when your runners are in Private Subnets |
| EC2 Security Group | to protect instances launched by the ASG and provide minimal outbound traffic |
| CloudWatch Log Groups | for key OS logs with configuration to stream those logs to CloudWatch from each EC2 instance in the ASG, using the CloudWatch Logs Agent |
| CloudWatch Metric Filter & Alarms | to notify about failed instances |
| CloudWatch Dashboard | to monitor the autoscaling activity and key EC2 metrics (CPU, Memory, IO) |
| Secrets Manager Secret | with a placeholder value, that should be overwritten with a valid Personal Access Token (PAT) for the GitHub project you wish to register with |
| Bootstrapping BASH code | to self-register the instance with a GitHub server, organisation and project of your choosing |
| Multi-Registration | gives you the ability to register a single EC2 instance as multiple independent GitHub Runner agents, so you can run parallel jobs on a single instance. This is good for saving on infrastructure costs |
| Lambda Function | that polls the GitHub API to determine how busy your project is (i.e. how many pending/running jobs) and writes custom CloudWatch Metrics based on the response |
| CloudWatch Event Rule | to run the Lambda function every minute during the working day (as defined by you) |
| Scale In Protection | implemented as a systemd service that calls the AWS API; this prevents a runner from being scaled-in whilst it is busy running a job |
| Automatic De-Registration | when your ASG scales in and instances are destroyed, those destroyed instances automatically de-register themselves from your GitHub project so they will receive no further jobs |
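
To illustrate how instance count and multi-registration might combine, the fragment below computes a rough ceiling on parallel jobs. It assumes `ec2_maximum_concurrent_github_jobs` is the input that controls multi-registration, which the inputs table suggests but does not state outright. The local names and values are hypothetical.

```hcl
locals {
  # Hypothetical sizing values for illustration.
  max_instances     = 5
  jobs_per_instance = 3

  # Rough upper bound on jobs running in parallel across the group (5 * 3 = 15).
  max_parallel_jobs = local.max_instances * local.jobs_per_instance
}

module "autoscaling_github_runners" {
  # source and the other required inputs are omitted for brevity

  ec2_autoscaling_maximum_instances  = local.max_instances
  ec2_maximum_concurrent_github_jobs = local.jobs_per_instance
}
```

Fewer, larger instances with more agents each may cost less, at the price of jobs sharing CPU, memory and disk on the same host.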

How to use these modules

Deploying the runners

  1. Start with the imagebuilder-terraform-container submodule, but only if you don't already have a suitable container for running jobs. Create a file in your Terraform project that calls this module with appropriate parameters as outlined below. The result will be an ImageBuilder pipeline that creates a Docker container image for running Terraform commands.

  2. The pipeline is set to run on a schedule by default, but you should immediately run the pipeline to generate a baseline container for later use. This will take around 20 minutes to complete.

  3. Now onto the imagebuilder-github-runner-ami module. You will always need this one. Create a file in your Terraform project that calls this module with appropriate parameters as outlined below. The result will be an ImageBuilder pipeline that creates an EC2 AMI with all of the prerequisites for running the GitHub Actions Runner agent.

  4. The pipeline is set to run on a schedule by default, but you should immediately run the pipeline to generate an AMI for later use. This will take around 45 minutes to complete.

  5. Finally onto the autoscaling-github-runners module. This is the module that creates the actual runners, so you will always need this one too. Create a file in your Terraform project that calls this module with appropriate parameters as outlined below.

  6. Once fully deployed the ASG will begin deploying EC2 instances (as per your min/max/desired requirements) immediately; however, these instances will not connect to GitHub successfully because they don't have a valid access token yet. Navigate to Secrets Manager and find the placeholder secret that has been created for you. Its name will end in the word pat (also check the description field to be sure). Overwrite the value of this secret with a valid Personal Access Token from the GitHub project you want to register with.

  7. Terminate all running EC2 instances and let them be replaced by the ASG. This time they should successfully register with your GitHub project and be visible on the page Settings -> Actions -> Runners.

  8. In your GitHub Actions Workflow YAML, ensure your jobs feature a runs-on section that includes the tag self-hosted and the name of the tag you gave your runners in the ec2_github_runner_tag_list input variable. This will ensure that your workflow jobs will be deployed to your private self-hosted runners inside your own AWS account.

    runs-on:
      - self-hosted
      - example
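
One way to locate the placeholder secret and the Auto Scaling group mentioned in steps 6 and 7 is to surface the module's outputs in your own configuration. This sketch assumes the module was declared as `module.autoscaling_github_runners`, as in the full worked example; the names of the two outputs declared here are hypothetical.

```hcl
# Expose the secret that must be overwritten with a valid GitHub PAT (step 6).
output "runner_pat_secret_arn" {
  description = "ARN of the placeholder secret to overwrite with a GitHub PAT"
  value       = module.autoscaling_github_runners.github_pat_secret_arn
}

# Expose the group whose instances should be terminated and replaced (step 7).
output "runner_asg_name" {
  description = "Name of the Auto Scaling group that launches the runners"
  value       = module.autoscaling_github_runners.auto_scaling_group_name
}
```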

Refining the runner permissions

By default the runners will only have basic permissions to write their own logs and metrics, pull containers from ECR, and read/write files to a CI/CD artifacts bucket defined by you.

If you want to give your runners greater permissions (and you probably do, because they need to have permission to do things like deploy an application) then you can supply additional IAM Policies in the input variable ec2_iam_role_extra_policy_attachments, which is a list of ARNs.

Alternatively, you may prefer to keep the runner permissions light and allow Terraform to assume roles that have sufficient permission to deploy things. In this case simply supply the ARNs of all roles that Terraform should be allowed to assume in the input variable ec2_terraform_deployment_roles. Your runner IAM Role will be granted permission to assume these roles on demand.
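
The two approaches can be sketched side by side. The policy contents, bucket name, account ID and role name below are made up for illustration; only the two module input names come from this README.

```hcl
# Hypothetical policy that lets runners read and write one S3 bucket.
data "aws_iam_policy_document" "deploy_application" {
  statement {
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["arn:aws:s3:::example-app-bucket/*"]
  }
}

resource "aws_iam_policy" "deploy_application" {
  name   = "example-deploy-application"
  policy = data.aws_iam_policy_document.deploy_application.json
}

module "autoscaling_github_runners" {
  # source and the other required inputs are omitted for brevity

  # Option 1: attach policies directly to the runner role.
  ec2_iam_role_extra_policy_attachments = [aws_iam_policy.deploy_application.arn]

  # Option 2: let the runner assume separate deployment roles instead.
  ec2_terraform_deployment_roles = ["arn:aws:iam::111122223333:role/example-app-deployer"]
}
```

In practice you would likely pick one option per environment. Option 2 keeps the runner role small, but each deployment role presumably also needs a trust policy that allows the runner role to assume it.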


autoscaling-github-runners Module

Pre-Requisites

Before deploying this module you must:

  • Have a VPC with at least one subnet. The Subnets can be private or public, but they must have access to the Internet via IGW or NAT
  • Have deployed the imagebuilder-github-runner-ami submodule, so that you can supply the image ARN as an input to this module
  • Have a GitHub project, and have generated a Personal Access Token with these permissions as a minimum: Read Access to Metadata, Read and Write Access to Administration
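
If you do not yet have a VPC, the following is a minimal sketch of one that would satisfy the first prerequisite. It assumes the community `terraform-aws-modules/vpc/aws` module, which this README does not mandate; the name, CIDR ranges and availability zones are illustrative.

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "example-runners"
  cidr = "10.0.0.0/16"

  azs             = ["eu-west-2a", "eu-west-2b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  # Private subnets reach the Internet through a single shared NAT gateway.
  enable_nat_gateway = true
  single_nat_gateway = true
}
```

With this in place, `module.vpc.vpc_id` and `module.vpc.private_subnets` can be passed to `ec2_vpc_id` and `ec2_subnet_ids` as in the examples that follow.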

Simplest Possible Example

This is a single GitHub Runner that does not autoscale and does not shut down overnight.

module "autoscaling_github_runners_simple" {
  source = "git::https://github.com/mtonxbjss/terraform-aws-autoscaling-github-runners?ref=v1.0.0"

  ec2_imagebuilder_image_arn                       = module.imagebuilder_github_runner_ami.imagebuilder_image_arn_xxx
  ec2_subnet_ids                                   = module.vpc.private_subnets
  ec2_vpc_id                                       = module.vpc.vpc_id
  iam_roles_with_admin_access_to_created_resources = [local.identifiers.account_admin_role_arn]
  github_organization_url                          = "https://github.com/myorg"
  github_repository_names                          = ["my-repo-1"]
  region                                           = var.region
  runner_account_id                                = var.aws_account_id
  state_bucket_name                                = data.terraform_remote_state.bootstrap.outputs.state_bucket_name
  state_bucket_key_arn                             = data.terraform_remote_state.bootstrap.outputs.state_bucket_key
  state_lock_table_arn                             = data.terraform_remote_state.bootstrap.outputs.state_lock_table_arn
  unique_prefix                                    = "${local.prefix}-min"
}

More Complex Example

This is an autoscaling group of GitHub Runners using multiple runner tags and spot instances, and logging in to an ECR registry on boot.

module "autoscaling_github_runners_mid" {
  source = "git::https://github.com/mtonxbjss/terraform-aws-autoscaling-github-runners?ref=v1.0.0"

  ec2_autoscaling_maximum_instances                = 5
  ec2_dynamic_scaling_enabled                      = true
  ec2_github_runner_tag_list                       = "primary,miscellaneous"
  ec2_imagebuilder_image_arn                       = module.imagebuilder_github_runner_ami.imagebuilder_image_arn_xxx
  ec2_spot_instances_max_price                     = 0.5
  ec2_spot_instances_preferred                     = true
  ec2_subnet_ids                                   = module.vpc.private_subnets
  ec2_vpc_id                                       = module.vpc.vpc_id
  iam_roles_with_admin_access_to_created_resources = [local.identifiers.account_admin_role_arn]
  github_job_image_ecr_account                     = var.aws_account_id
  github_organization_url                          = "https://github.com/myorg"
  github_repository_names                          = ["my-repo-1"]
  region                                           = var.region
  runner_account_id                                = var.aws_account_id
  state_bucket_name                                = data.terraform_remote_state.bootstrap.outputs.state_bucket_name
  state_bucket_key_arn                             = data.terraform_remote_state.bootstrap.outputs.state_bucket_key
  state_lock_table_arn                             = data.terraform_remote_state.bootstrap.outputs.state_lock_table_arn
  unique_prefix                                    = "${local.prefix}-mid"
}

Full Worked Example With Most Parameters Expressed

This is an autoscaling group of GitHub Runners that scales down to a single pilot-light runner overnight.

module "autoscaling_github_runners" {
  source = "git::https://github.com/mtonxbjss/terraform-aws-autoscaling-github-runners?ref=v1.0.0"

  cicd_artifacts_bucket_name        = aws_s3_bucket.cicd.bucket
  cicd_artifacts_bucket_key_arn     = aws_kms_key.cicd.arn
  ec2_autoscaling_desired_instances = 1
  ec2_autoscaling_maximum_instances = 10
  ec2_autoscaling_minimum_instances = 1
  ec2_dynamic_scaling_enabled       = true
  ec2_github_runner_name            = "general"
  ec2_github_runner_tag_list        = "primary,miscellaneous"

  ec2_iam_role_extra_policy_attachments = [
    aws_iam_policy.deploy_application.arn,
  ]

  ec2_imagebuilder_image_arn          = module.imagebuilder_github_runner_ami.imagebuilder_image_arn_xxx
  ec2_instance_type                   = "t3a.large"
  ec2_maximum_concurrent_github_jobs  = 3
  ec2_nightly_shutdown_enabled        = true
  ec2_nightly_shutdown_scale_in_time  = "0 19 * * *"
  ec2_nightly_shutdown_scale_out_time = "0 8 * * MON-FRI"
  ec2_root_volume_size                = 100
  ec2_runner_role_tag                 = "Demo GitHub Actions jobs"
  ec2_spot_instances_max_price        = 0.5
  ec2_spot_instances_preferred        = true
  ec2_subnet_ids                      = module.vpc.private_subnets

  ec2_terraform_deployment_roles = [
    local.identifiers.app_deployer_role_arn,
  ]

  ec2_vpc_id                   = module.vpc.vpc_id
  github_job_image_ecr_account = var.aws_account_id
  github_organization_url      = "https://github.com/myorg"
  github_repository_names      = ["my-repo-1"]

  iam_roles_with_admin_access_to_created_resources = [
    local.identifiers.app_deployer_role_arn,
    local.identifiers.account_admin_role_arn,
  ]

  iam_roles_with_read_access_to_created_resources = [
    local.identifiers.developer_role_arn,
  ]

  permission_boundary_arn = aws_iam_policy.permissions_boundary.arn
  region                  = var.region
  runner_account_id       = var.aws_account_id
  state_bucket_name       = data.terraform_remote_state.bootstrap.outputs.state_bucket_name
  state_bucket_key_arn    = data.terraform_remote_state.bootstrap.outputs.state_bucket_key
  state_lock_table_arn    = data.terraform_remote_state.bootstrap.outputs.state_lock_table_arn
  unique_prefix           = "${local.prefix}-max"
}

Inputs

| Name | Description | Type | Default | Required |
| --- | --- | --- | --- | --- |
| cicd_artifacts_bucket_key_arn | Encryption key ARN for the bucket that stores all CICD artifacts for the pipeline(s). The runner will be granted permission to encrypt/decrypt using this key | string | "" | no |
| cicd_artifacts_bucket_name | Bucket that stores all CICD artifacts for the pipeline(s). The runner will be granted permission to read/write contents of this bucket | string | "" | no |
| cloudwatch_metric_cloud_init_failure_name | The name to give the metric that tracks Cloud Init failures on GitHub Runner EC2 instances. Defaults to CloudInitFailureCount | string | "CloudInitFailureCount" | no |
| cloudwatch_metric_github_runner_failure_name | The name to give the metric that tracks GitHub connectivity failures on GitHub Runner EC2 instances. Defaults to GithubRunnerFailureCount | string | "GithubRunnerFailureCount" | no |
| ec2_ami | EC2 AMI to use for GitHub Runners. Only has any effect if you do not pass the ec2_imagebuilder_image_arn parameter | string | "" | no |
| ec2_associate_public_ip_address | Whether all runner instances should have public IP addresses attached (required only if you're deploying into a public subnet) | bool | false | no |
| ec2_autoscaling_desired_instances | Desired number of instances in the autoscaling group of GitHub Runners | number | 1 | no |
| ec2_autoscaling_maximum_instances | Maximum number of instances in the autoscaling group of GitHub Runners | number | 1 | no |
| ec2_autoscaling_minimum_instances | Minimum number of instances in the autoscaling group of GitHub Runners | number | 0 | no |
| ec2_dynamic_scaling_enabled | Controls whether GitHub Runners dynamically scale up/down depending on how busy the server is | bool | false | no |
| ec2_dynamic_scaling_metric_collection_cron_expression | Cron expression that dictates how often to run the Lambda function that gathers GitHub Runner utilisation metrics. Default is every minute between 0700-1959 Monday-Friday UTC (0800-2059 during BST) | string | "0/1 07-19 ? * MON-FRI *" | no |
| ec2_extra_security_groups | List of additional security group IDs to append to the EC2 instances for running GitHub jobs. Defaults to an empty list. All runners will be allowed unrestricted egress traffic on ports 80, 443 and ICMP as standard | list(string) | [] | no |
| ec2_github_runner_name | Name by which the GitHub server will know this runner | string | "default" | no |
| ec2_github_runner_tag_list | Comma-delimited list of tags that can be used to target this runner | string | "default" | no |
| ec2_iam_role_extra_policy_attachments | List of policy ARNs to append to the runner's EC2 Instance Profile. Use this to give your runner permission to deploy things in your accounts | list(string) | [] | no |
| ec2_imagebuilder_image_arn | ARN of the AWS ImageBuilder image that results from the GitHub AMI creation pipeline. If ec2_ami is also supplied, this parameter takes precedence | string | "" | no |
| ec2_instance_type | Instance type for the temporary EC2 instance that will be created in order to generate the AMI. Defaults to t3a.large | string | "t3a.large" | no |
| ec2_maximum_concurrent_github_jobs | How many concurrent jobs the GitHub Runner can run | number | 2 | no |
| ec2_nightly_shutdown_enabled | Scale the runners in and out on a nightly basis | bool | false | no |
| ec2_nightly_shutdown_scale_in_time | Cron expression for the nightly scale-in time | string | "0 20 * * *" | no |
| ec2_nightly_shutdown_scale_out_time | Cron expression for the scale-out time | string | "0 6 * * 1-5" | no |
| ec2_root_volume_size | Size of root volume for the EC2 instances for running GitHub jobs. Defaults to 100GiB | number | 100 | no |
| ec2_runner_role_tag | Adds a new EC2 tag named Role to each runner with this value, to indicate the functional role performed by a particular group of runners | string | "general" | no |
| ec2_spot_instances_max_price | Specifies the maximum spot price to pay for GitHub Runners. Only applies if ec2_spot_instances_preferred is true | string | "0.5" | no |
| ec2_spot_instances_preferred | Set to true in order to run GitHub Runners as spot instances. Defaults to false | bool | false | no |
| ec2_subnet_ids | List of IDs of the subnets used to host the EC2 instances for running GitHub jobs | list(string) | n/a | yes |
| ec2_terraform_deployment_roles | List of deployment role ARNs that can be assumed by the runner in order to execute Terraform commands. The runner will be granted permission to assume these roles. The roles can be in different AWS accounts. This is an alternative to giving the runner permissions directly via policy attachments | list(string) | [] | no |
| ec2_vpc_id | ID of the VPC used to host the EC2 instances for running GitHub jobs | string | n/a | yes |
| github_job_image_ecr_account | Account ID containing the ECR Docker Registry that hosts the images used for GitHub Actions jobs. Used so that the runner can proactively log into that registry. Default is empty (i.e. no Docker images required) | string | "" | no |
| github_organization_url | The full https URL of the GitHub Organization or Owner, not including project name | string | n/a | yes |
| github_repository_names | A list of names of GitHub Repositories to which these runners should register. They must all be in the same organization or domain | list(string) | n/a | yes |
| iam_roles_with_admin_access_to_created_resources | List of IAM Role ARNs that should have admin access to any resources created in this module that have resource policies | list(string) | n/a | yes |
| iam_roles_with_read_access_to_created_resources | List of IAM Role ARNs that should have read access to any resources created in this module that have resource policies | list(string) | [] | no |
| kms_deletion_window_in_days | The number of days to retain a KMS key scheduled for deletion. Defaults to 7 | number | 7 | no |
| permission_boundary_arn | ARN of the IAM Policy to use as a permission boundary for the EC2 IAM Role created by this module. Defaults to empty (i.e. no permission boundary required) | string | "" | no |
| region | The AWS region in which to create resources | string | n/a | yes |
| resource_tags | Map of tags to be applied to all resources. Don't include provider tags in here or it will cause continual re-plans of tagged resources | map(string) | {} | no |
| runner_account_id | The AWS account ID that should host the GitHub Runners | string | n/a | yes |
| state_bucket_key_arn | Encryption key ARN for the bucket that stores all Terraform State for the pipeline(s). The runner will be granted permission to encrypt/decrypt using this key | string | "" | no |
| state_bucket_name | Bucket that stores all Terraform State for the pipeline(s). The runner will be granted permission to read/write contents of this bucket | string | "" | no |
| state_lock_table_arn | DynamoDB Table that stores all Terraform State Locks for the pipeline(s). The runner will be granted permission to read/write this table | string | "" | no |
| unique_prefix | This unique prefix will be prepended to all resource names to ensure no clashes with other resources in the same account | string | n/a | yes |
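
Judging by their defaults, the schedule-related inputs appear to use two different cron dialects: the nightly shutdown times are five-field expressions, while the metric collection expression has six fields in the style of EventBridge schedules (minutes, hours, day-of-month, month, day-of-week, year). The sketch below sets all three. The values are illustrative, and treating the times as UTC is an assumption based on the note in the table.

```hcl
module "autoscaling_github_runners" {
  # source and the other required inputs are omitted for brevity

  ec2_dynamic_scaling_enabled = true

  # Six-field expression: every minute from 07:00 to 18:59, Monday to Friday.
  ec2_dynamic_scaling_metric_collection_cron_expression = "0/1 07-18 ? * MON-FRI *"

  ec2_nightly_shutdown_enabled = true

  # Five-field expressions: scale in at 19:00 daily, scale out at 07:00 on weekdays.
  ec2_nightly_shutdown_scale_in_time  = "0 19 * * *"
  ec2_nightly_shutdown_scale_out_time = "0 7 * * MON-FRI"
}
```

Keeping the metric collection window aligned with the scale-out and scale-in times avoids polling GitHub while the group is scaled down for the night.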

Outputs

| Name | Description |
| --- | --- |
| auto_scaling_group_arn | n/a |
| auto_scaling_group_name | n/a |
| github_pat_secret_arn | n/a |
| instance_profile_arn | n/a |
| instance_profile_name | n/a |
| instance_role_arn | n/a |
| instance_role_id | n/a |
| instance_role_name | n/a |
| launch_template_arn | n/a |
| launch_template_id | n/a |
| security_group_arn | n/a |
| security_group_id | n/a |
| security_group_name | n/a |
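
The outputs carry no descriptions, so the following sketch infers their meaning from their names. It assumes a module declared as `module.autoscaling_github_runners` and a hypothetical `aws_security_group.internal_service` defined elsewhere in your configuration.

```hcl
# Attach an AWS managed policy to the runner role after the module is created.
resource "aws_iam_role_policy_attachment" "runner_read_only" {
  role       = module.autoscaling_github_runners.instance_role_name
  policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
}

# Allow the runners to reach a hypothetical internal service over HTTPS.
resource "aws_security_group_rule" "runners_to_internal_service" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.internal_service.id
  source_security_group_id = module.autoscaling_github_runners.security_group_id
}
```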

Providers

| Name | Version |
| --- | --- |
| archive | n/a |
| aws | >=4.31.0 |
| cloudinit | >=2.2.0 |
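
A calling configuration could pin matching provider versions along these lines. The `hashicorp/*` source addresses are the usual registry locations for these providers and are an assumption here, since this README lists only names and versions.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.31.0"
    }
    cloudinit = {
      source  = "hashicorp/cloudinit"
      version = ">= 2.2.0"
    }
    # No version constraint is listed for the archive provider.
    archive = {
      source = "hashicorp/archive"
    }
  }
}
```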

SubModule Docs
