AWS Lambda Power Tuning is an open-source tool that can help you visualize and fine-tune the memory/power configuration of Lambda functions. It runs in your own AWS account - powered by AWS Step Functions - and it supports three optimization strategies: cost, speed, and balanced.

andrew-vqn/aws-lambda-power-tuning

 
 

AWS Lambda Power Tuning

AWS Lambda Power Tuning is a state machine powered by AWS Step Functions that helps you optimize your Lambda functions for cost and/or performance in a data-driven way.

The state machine is designed to be easy to deploy and fast to execute. It is also language agnostic, so you can optimize any Lambda function in your account.

You provide a Lambda function ARN as input, and the state machine invokes that function with multiple power configurations (from 128MB to 10GB, you decide which values). It then analyzes all the execution logs and suggests the best power configuration to minimize cost and/or maximize performance.

Please note that the input function will be executed in your AWS account - performing real HTTP requests, SDK calls, cold starts, etc. The state machine also supports cross-region invocations and you can enable parallel execution to generate results in just a few seconds.

What does the state machine look like?

It's pretty simple, and you can visually inspect each step in the AWS Management Console.

[Image: state machine diagram]

What results can I expect from Lambda Power Tuning?

The state machine will generate a visualization of average cost and speed for each power configuration.

For example, this is what the results look like for two CPU-intensive functions, which become cheaper AND faster with more power:

[Chart: visualization 1]

How to interpret the chart above: execution time goes from 35s with 128MB to less than 3s with 1.5GB, while being 14% cheaper to run.

[Chart: visualization 2]

How to interpret the chart above: execution time goes from 2.4s with 128MB to 300ms with 1GB, for the very same average cost.
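The two interpretations above can be checked with a little arithmetic. The sketch below assumes that Lambda compute cost is proportional to configured memory multiplied by billed duration, and it ignores per-request charges and the free tier. The function names are illustrative and are not part of the tool.

```python
# Rough cost comparison, assuming compute cost is proportional to
# memory (MB) x duration (s). Request charges and free tier are ignored.

def relative_cost(memory_mb, duration_s):
    """Return memory-time in MB-seconds, used here as a proxy for compute cost."""
    return memory_mb * duration_s


def break_even_duration(base_memory_mb, base_duration_s, new_memory_mb):
    """Duration at new_memory_mb that would cost the same as the baseline."""
    return relative_cost(base_memory_mb, base_duration_s) / new_memory_mb


# Second chart: 2.4s with 128MB versus 300ms with 1GB (1024MB).
baseline = relative_cost(128, 2.4)
tuned = relative_cost(1024, 0.3)
print(f"cost ratio (1GB vs 128MB): {tuned / baseline:.2f}")

# First chart: 35s with 128MB, compared against 1.5GB (1536MB).
print(f"break-even duration at 1536MB: {break_even_duration(128, 35, 1536):.2f}s")
```

Under this assumption the second function costs the same at both settings, and the first function becomes cheaper at 1.5GB as soon as it runs faster than the break-even duration of roughly 2.9 seconds.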

How to deploy the state machine

There are five options for deploying the tool using Infrastructure as Code (IaC):

  1. Using the AWS Serverless Application Repository (SAR), which is the easiest way
  2. Using the AWS SAM CLI
  3. Using the AWS CDK
  4. Using Terraform by HashiCorp and SAR
  5. Using native Terraform

Read more about the deployment options here.

State machine configuration (at deployment time)

The CloudFormation template (used for options 1 to 4) accepts the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| PowerValues | list of numbers | [128,256,512,1024,1536,3008] | The power values (in MB) used as the default when no powerValues input parameter is provided at execution time |
| visualizationURL | string | lambda-power-tuning.show | The base URL for the visualization tool; you can bring your own visualization tool |
| totalExecutionTimeout | number | 300 | The timeout in seconds applied to all functions of the state machine |
| lambdaResource | string | * | The Resource used in IAM policies; it's * by default, but you could restrict it to a prefix or a specific function ARN |
| permissionsBoundary | string | | The ARN of a permissions boundary (policy), applied to all functions of the state machine |
| payloadS3Bucket | string | | The S3 bucket name used for large payloads (>256KB); if provided, it's added to a custom managed IAM policy that grants read-only permission to the S3 bucket; more details in the S3 payloads section |
| payloadS3Key | string | * | The S3 object key used for large payloads (>256KB); the default value grants access to all S3 objects in the bucket specified with payloadS3Bucket; more details in the S3 payloads section |
| layerSdkName | string | | The name of the SDK layer, in case you need to customize it (optional) |
| logGroupRetentionInDays | number | 7 | The number of days to retain log events in the Lambda log groups. Before this parameter existed, log events were retained indefinitely |
| securityGroupIds | list of SecurityGroup IDs | | List of Security Groups to use in every Lambda function's VPC Configuration (optional); please note that your VPC should be configured to allow public internet access (via NAT Gateway) or include VPC Endpoints to the Lambda service |
| subnetIds | list of Subnet IDs | | List of Subnets to use in every Lambda function's VPC Configuration (optional); please note that your VPC should be configured to allow public internet access (via NAT Gateway) or include VPC Endpoints to the Lambda service |
| stateMachineNamePrefix | string | powerTuningStateMachine | Allows you to customize the name of the state machine. Maximum 43 characters, only alphanumeric (plus - and _). The last portion of the AWS::StackId will be appended to this value, so the full name will look like powerTuningStateMachine-89549da0-a4f9-11ee-844d-12a2895ed91f. Note: StateMachineName has a maximum of 80 characters and 36+1 from the StackId are appended, allowing 43 for a custom prefix |
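The length limit for stateMachineNamePrefix follows from simple arithmetic (80 - 37 = 43). The snippet below is a minimal sketch of a local pre-deployment check based only on the constraints stated above; the helper name is_valid_prefix is hypothetical and is not part of the tool.

```python
import re

MAX_STATE_MACHINE_NAME = 80   # StateMachineName limit quoted above
STACK_ID_SUFFIX = 36 + 1      # StackId portion plus the "-" separator
MAX_PREFIX = MAX_STATE_MACHINE_NAME - STACK_ID_SUFFIX

# Only alphanumeric characters plus "-" and "_" are allowed.
PREFIX_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_prefix(prefix):
    """Check a candidate prefix against the documented length and charset rules."""
    return len(prefix) <= MAX_PREFIX and PREFIX_PATTERN.fullmatch(prefix) is not None


# The example name from the table: default prefix plus appended StackId portion.
full_name = "powerTuningStateMachine-89549da0-a4f9-11ee-844d-12a2895ed91f"

print(MAX_PREFIX)
print(is_valid_prefix("powerTuningStateMachine"))
print(len(full_name) <= MAX_STATE_MACHINE_NAME)
```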

Please note that the total execution time should stay below 300 seconds (5 min), which is the default timeout. You can estimate the required timeout from the average duration of your function. For example, if your function's average execution time is 5 seconds and you haven't enabled parallelInvocation, you should set totalExecutionTimeout to at least num * 5: 50 seconds if num=10, 500 seconds if num=100, and so on. If you have enabled parallelInvocation, you usually don't need to tune totalExecutionTimeout unless your average execution time is above 5 min. If you have set a sleep between invocations (sleepBetweenRunsMs), include it in your timeout calculation.
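The estimate described above can be sketched as a small helper. This is only an illustration of the rule of thumb: the function name and its parameters are hypothetical, and it assumes that sequential invocations run back to back with the optional sleep added after each one.

```python
import math


def estimate_timeout_seconds(num, avg_duration_s, parallel=False,
                             sleep_between_runs_ms=0):
    """Lower bound for totalExecutionTimeout, following the rule of thumb above."""
    if parallel:
        # Invocations overlap, so one average duration is the rough floor.
        # The sleep setting has no effect in parallel mode.
        total = avg_duration_s
    else:
        # Sequential runs: each invocation plus the optional sleep after it.
        total = num * (avg_duration_s + sleep_between_runs_ms / 1000)
    return math.ceil(total)


print(estimate_timeout_seconds(10, 5))
print(estimate_timeout_seconds(100, 5))
print(estimate_timeout_seconds(10, 5, sleep_between_runs_ms=1000))
```

In practice you would probably add some headroom on top of the returned value, since it is based on an average and individual invocations (cold starts in particular) can take longer.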

How to execute the state machine

You can execute the state machine manually or programmatically; see the documentation here.

State machine input (at execution time)

Each execution of the state machine requires an input in which you can define the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| lambdaARN (required) | string | | Unique identifier of the Lambda function you want to optimize |
| num (required) | integer | | The number of invocations for each power configuration (minimum 5, recommended: between 10 and 100) |
| powerValues | string or list of integers | | The list of power values to be tested; if not provided, the default values configured at deploy-time are used; you can provide any power values between 128MB and 10,240MB (⚠️ New AWS accounts have reduced concurrency and memory quotas (3008MB max)) |
| payload | string, object, or list | | The static payload that will be used for every invocation (object or string); when using a list, a weighted payload is expected in the shape of [{"payload": {...}, "weight": X }, {"payload": {...}, "weight": Y }, {"payload": {...}, "weight": Z }], where the weights X, Y, and Z are treated as relative weights (not percentages); more details in the Weighted Payloads section |
| payloadS3 | string | | A reference to Amazon S3 for large payloads (>256KB), formatted as s3://bucket/key; it requires read-only IAM permissions, see the payloadS3Bucket and payloadS3Key deployment parameters above and find more details in the S3 payloads section |
| parallelInvocation | boolean | false | If true, all the invocations will be executed in parallel (note: depending on the value of num, you may experience throttling when setting parallelInvocation to true) |
| strategy | string | "cost" | It can be "cost", "speed", or "balanced"; with "cost" the state machine suggests the cheapest option (disregarding its performance), while with "speed" it suggests the fastest option (disregarding its cost). With "balanced" the state machine chooses a compromise between "cost" and "speed" according to the parameter balancedWeight |
| balancedWeight | number | 0.5 | Parameter that expresses the trade-off between cost and time. The value is between 0 and 1; 0.0 is equivalent to the "speed" strategy, 1.0 is equivalent to the "cost" strategy |
| autoOptimize | boolean | false | If true, the state machine will apply the optimal configuration at the end of its execution |
| autoOptimizeAlias | string | | If provided, and only if autoOptimize is true, the state machine will create or update this alias with the new optimal power value |
| dryRun | boolean | false | If true, the state machine will execute the input function only once and it will disable every functionality related to logs analysis, auto-tuning, and visualization; the dry-run mode is intended for testing purposes, for example to verify that IAM permissions are set up correctly |
| preProcessorARN | string | | It must be the ARN of a Lambda function; if provided, the function will be invoked before every invocation of lambdaARN; more details in the Pre/Post-processing functions section |
| postProcessorARN | string | | It must be the ARN of a Lambda function; if provided, the function will be invoked after every invocation of lambdaARN; more details in the Pre/Post-processing functions section |
| discardTopBottom | number | 0.2 | By default, the state machine discards the top and bottom 20% of results (the fastest and slowest "outliers") to filter out the effects of cold starts, which would bias the overall averages. You can customize this parameter by providing a value between 0 and 0.4, with 0 meaning no results are discarded and 0.4 meaning that 40% of the top and 40% of the bottom results are discarded (i.e. only 20% of the results are considered) |
| sleepBetweenRunsMs | integer | 0 | If provided, the time in milliseconds that the tuner function waits after invoking your function, before carrying out the post-processing step (if one is provided). This can be used if you have aggressive downstream rate limits you need to respect. By default, the function does not sleep between invocations. Setting this value has no effect when running the invocations in parallel |
| disablePayloadLogs | boolean | false | If provided and set to a truthy value, suppresses the payload from error messages and logs. If preProcessorARN is provided, this also suppresses the output payload of the pre-processor |
| includeOutputResults | boolean | false | If provided and set to true, the average cost and average duration for every power value configuration will be included in the state machine output |
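To make discardTopBottom, strategy, and balancedWeight more concrete, here is a minimal sketch of how such logic could work. It is not the tool's actual implementation: the rounding rule for discarded results and the normalization used for the "balanced" score are assumptions chosen to match the descriptions above (a weight of 0.0 behaves like "speed", 1.0 like "cost"), and all names and sample numbers are illustrative.

```python
def trimmed_average(durations, discard_top_bottom=0.2):
    """Average after dropping the given fraction from each end of the sorted values."""
    if not 0 <= discard_top_bottom <= 0.4:
        raise ValueError("discardTopBottom must be between 0 and 0.4")
    ordered = sorted(durations)
    # Assumption: the number of dropped results per side is rounded down.
    drop = int(len(ordered) * discard_top_bottom)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


def pick_power(stats, strategy="cost", balanced_weight=0.5):
    """Pick a power value from dicts with 'power', 'cost', and 'duration' keys."""
    if strategy == "cost":
        score = lambda s: s["cost"]
    elif strategy == "speed":
        score = lambda s: s["duration"]
    elif strategy == "balanced":
        # Assumption: normalize both metrics by their maximum, then blend them.
        max_cost = max(s["cost"] for s in stats)
        max_duration = max(s["duration"] for s in stats)
        score = lambda s: (balanced_weight * s["cost"] / max_cost
                           + (1 - balanced_weight) * s["duration"] / max_duration)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return min(stats, key=score)["power"]


# Made-up measurements: one very slow run (cold start) and one very fast run.
durations_ms = [100, 10, 11, 12, 13, 14, 15, 16, 17, 1]
print(trimmed_average(durations_ms, 0.2))

# Made-up per-configuration averages.
stats = [
    {"power": 128, "cost": 1.0, "duration": 1000},
    {"power": 512, "cost": 1.2, "duration": 300},
    {"power": 1024, "cost": 2.0, "duration": 200},
]
for strategy in ("cost", "speed", "balanced"):
    print(strategy, pick_power(stats, strategy))
```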

Here's a typical execution input with basic parameters:

{
    "lambdaARN": "your-lambda-function-arn",
    "powerValues": [128, 256, 512, 1024],
    "num": 50,
    "payload": {}
}
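If you generate the execution input programmatically, a small builder can catch obvious mistakes before the state machine starts. The sketch below checks only a few of the constraints listed in the table above; the function name is hypothetical, the ARN is a placeholder, and the tool performs its own validation regardless.

```python
import json

ALLOWED_STRATEGIES = {"cost", "speed", "balanced"}
MIN_POWER_MB, MAX_POWER_MB = 128, 10240


def build_execution_input(lambda_arn, num, power_values=None, payload=None,
                          strategy="cost", parallel_invocation=False):
    """Validate a few documented constraints and return the input as a dict."""
    if not lambda_arn:
        raise ValueError("lambdaARN is required")
    if num < 5:
        raise ValueError("num must be at least 5")
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")

    execution_input = {
        "lambdaARN": lambda_arn,
        "num": num,
        "strategy": strategy,
        "parallelInvocation": parallel_invocation,
    }
    if power_values is not None:
        for value in power_values:
            if not MIN_POWER_MB <= value <= MAX_POWER_MB:
                raise ValueError(f"power value out of range: {value}")
        execution_input["powerValues"] = list(power_values)
    if payload is not None:
        execution_input["payload"] = payload
    return execution_input


example = build_execution_input(
    "arn:aws:lambda:us-east-1:123456789012:function:my-function",  # placeholder ARN
    num=50,
    power_values=[128, 256, 512, 1024],
    payload={},
)
print(json.dumps(example, indent=4))
```

The resulting JSON string is what you would supply as the execution input, whether you start the execution from the console or through the Step Functions API.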

State Machine Output

The state machine will return the following output:

{
  "results": {
    "power": "128",
    "cost": 0.0000002083,
    "duration": 2.9066666666666667,
    "stateMachine": {
      "executionCost": 0.00045,
      "lambdaCost": 0.0005252,
      "visualization": "https://lambda-power-tuning.show/#<encoded_data>"
    },
    "stats": [{ "averagePrice": 0.0000002083, "averageDuration": 2.9066666666666667, "value": 128}, ... ]
  }
}

More details on each value:

  • results.power: the optimal power configuration (RAM)
  • results.cost: the corresponding average cost (per invocation)
  • results.duration: the corresponding average duration (per invocation)
  • results.stateMachine.executionCost: the AWS Step Functions cost corresponding to this state machine execution (fixed value for "worst" case)
  • results.stateMachine.lambdaCost: the AWS Lambda cost corresponding to this state machine execution (depending on num and average execution time)
  • results.stateMachine.visualization: if you visit this autogenerated URL, you will be able to visualize and inspect average statistics about cost and performance; important note: average statistics are NOT shared with the server since all the data is encoded in the URL hash (example), which is available only client-side
  • results.stats: the average duration and cost for every tested power value configuration (only included if includeOutputResults is set to a truthy value)
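As a minimal sketch of how the output might be consumed, the snippet below parses a document shaped like the example above and derives the total cost of the tuning run. The summarize helper is hypothetical; the sample values are copied from the example output, with the elided stats entries left out.

```python
import json

# Sample output shaped like the example above (stats truncated to one entry).
output_json = """
{
  "results": {
    "power": "128",
    "cost": 0.0000002083,
    "duration": 2.9066666666666667,
    "stateMachine": {
      "executionCost": 0.00045,
      "lambdaCost": 0.0005252,
      "visualization": "https://lambda-power-tuning.show/#<encoded_data>"
    },
    "stats": [
      {"averagePrice": 0.0000002083, "averageDuration": 2.9066666666666667, "value": 128}
    ]
  }
}
"""


def summarize(output):
    """Extract the suggested power value and the total cost of the tuning run."""
    results = output["results"]
    state_machine = results["stateMachine"]
    return {
        "power_mb": int(results["power"]),  # power is returned as a string
        "avg_cost": results["cost"],
        "avg_duration": results["duration"],
        # Step Functions cost plus Lambda cost of this tuning execution.
        "tuning_cost": state_machine["executionCost"] + state_machine["lambdaCost"],
        # stats is present only when includeOutputResults is enabled.
        "tested_values": [s["value"] for s in results.get("stats", [])],
    }


summary = summarize(json.loads(output_json))
print(summary)
```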

Data visualization

You can visually inspect the tuning results to identify the optimal tradeoff between cost and performance.

[Image: visualization of the tuning results]

The data visualization tool has been built by the community: it's a static website deployed via AWS Amplify Console and it's free to use. If you don't want to use the visualization tool, you can simply ignore the visualization URL provided in the execution output. No data is ever shared or stored by this tool.
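The reason no data reaches the visualization server is that the results live in the URL fragment (the part after #), which browsers keep client-side and do not include in the HTTP request. A small sketch with Python's standard urllib.parse shows which parts of such a URL would be requested; the request_target helper is illustrative only, and the URL uses the same placeholder as the example output.

```python
from urllib.parse import urlsplit

# Placeholder URL in the same shape as the visualization link in the output.
url = "https://lambda-power-tuning.show/#<encoded_data>"


def request_target(link):
    """Return the path (and query, if any) that a browser would request."""
    parts = urlsplit(link)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


parts = urlsplit(url)
print("host:", parts.netloc)
print("requested from server:", request_target(url))
print("kept client-side (fragment):", parts.fragment)
```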

Website repository: matteo-ronchetti/aws-lambda-power-tuning-ui

Optionally, you could deploy your own custom visualization tool and configure the CloudFormation Parameter named visualizationURL with your own URL.

Additional features, considerations, and internals

Here you can find out more about some advanced features of this project, its internals, and some considerations about security and execution cost.

Contributing

Feature requests and pull requests are more than welcome!

How to get started with local development?

For this repository, install dev dependencies with npm install. You can run tests with npm test, linting with npm run lint, and coverage with npm run coverage. Unit tests will run automatically on every commit and PR.
