This application ingests information into a database through an API endpoint URL. Data is validated in the backend before it is written to a cloud database. The same endpoint can also be used to list the items in the database.
This is a simple application that exposes an AWS Lambda Function as a public URL. The URL is used to ingest data into an AWS DynamoDB Table and to list items in the same table. Data is received via query string parameters from the URL and validated using an AWS Step Functions state machine. If the data is in the correct format it is ingested; otherwise response code 400 is returned as the PUT response via the AWS Lambda Function. Data processing is done by a Python-based AWS Lambda Function.
The application traffic is monitored using an AWS CloudWatch Dashboard, which tracks the AWS Lambda Function invocations, durations, error/success rates and AWS Account billing as four independent widgets.
The application is load-tested by simulating traffic using AWS ECS Fargate, which runs 2 AWS ECS Tasks (this adds to the AWS bill for as long as the tasks run, so check with your organization or sandbox provider) on a custom Docker image. Each task hits the URL endpoint once every second using a simple shell script. The traffic can be monitored safely through the AWS CloudWatch Dashboard.
- Running The Application
- Github Actions
- Deploying from Local System
- AWS Architecture
- Data Ingestion App
- Traffic Simulation
- Traffic Monitor Dashboard
- Infrastructure-as-Code
- Future Plans/Improvements
- Java: java -version
- Maven: mvn --version
- Node Package Manager: npm --version
- aws-cdk: npm info aws-cdk version
- aws-cli: aws --version
aws-cli should be configured correctly with an access key and secret access key, or the correct keys should be supplied to aws-actions/configure-aws-credentials@master.
The privileges of the user are critical for deploying the application to AWS. The AWS IAM User should have appropriate privileges for AWS CloudFormation. There should be an appropriate AWS IAM Role for deployment using CFT.
For this project, an admin AWS IAM User is created, and AdministratorAccess, an AWS IAM Policy managed by AWS, is attached to it.
Make sure AWS_ACCESS_KEY and AWS_SECRET_KEY are created/updated with valid credentials.
Go to Actions >> .github/workflows/aws-deploy.yml >> Run Workflow
cd data_ingestion_infra/
# Needs to be run initially. Subsequent executions have no further effect.
cdk bootstrap
# Deploy Application
cdk deploy DataIngestionInfraStack --require-approval never
# Get Lambda API URL
aws cloudformation describe-stacks \
--stack-name DataIngestionInfraStack \
--query "Stacks[?StackName=='DataIngestionInfraStack'][].Outputs[?OutputKey=='FunctionURLAPI'].OutputValue" \
--no-paginate --output text
# Deploy Traffic
cdk deploy LoadTesterInfraStack --require-approval never
# Deploy Monitor Dashboard
cdk deploy MonitorLoadStack --require-approval never
The application is designed to ingest user information using query string parameters. If the query string parameters are in the correct format, they are ingested into an AWS DynamoDB Table. The accepted query string parameters and their data formats are listed below.
- user_name: No pattern check.
- email: Must be of the form (alphabet)+(alphabet|digit)*\@(alphabet|digit)+\.com
- pincode: Must be numeric with length between 4 and 6 (both inclusive)
Example : https://sd373c3bj3zedsxsjuszrfgjoy0gzceu.lambda-url.us-east-1.on.aws/?user_name=Rishabh&[email protected]&pincode=3055
To list data points in table:
https://sd373c3bj3zedsxsjuszrfgjoy0gzceu.lambda-url.us-east-1.on.aws/
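The format rules above can be sketched in Python as follows. This is only an illustration, not the project's actual validation logic, which runs in the state machine. The function names `validate_params` and `status_code` are hypothetical, treating a missing parameter as invalid is an assumption, and so is the 200 success code (the text above only specifies 400 for bad input).

```python
import re

# Patterns transcribed from the rules listed above (assumed to be ASCII-only).
EMAIL_RE = re.compile(r"[A-Za-z]+[A-Za-z0-9]*@[A-Za-z0-9]+\.com")
PINCODE_RE = re.compile(r"[0-9]{4,6}")


def validate_params(params):
    """Return the names of the parameters that fail their format check.

    user_name has no pattern check, so it is not inspected here.
    A missing email or pincode is treated as invalid (an assumption).
    """
    failed = []
    if not EMAIL_RE.fullmatch(params.get("email", "")):
        failed.append("email")
    if not PINCODE_RE.fullmatch(params.get("pincode", "")):
        failed.append("pincode")
    return failed


def status_code(params):
    """Map the validation result to an HTTP status (200 is an assumed success code)."""
    return 400 if validate_params(params) else 200
```

For example, `status_code({"user_name": "Rishabh", "email": "abc1@example.com", "pincode": "3055"})` passes both checks, while a pincode of `"123"` fails because it is shorter than four digits.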
One AWS ECS Cluster is spun up to run an AWS ECS Fargate service. Two tasks run concurrently, each hitting https://$FUNC_URL/?user_name=user_name=Rishabh&[email protected]&pincode=3055 once every second. The tasks run on a custom Alpine Docker image.
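The behaviour of a single task can be sketched in Python as follows. The project itself uses a shell script for this, so the code below is only an equivalent illustration. The names `ping` and `http_get` are hypothetical, and the request count is bounded here so the sketch terminates, whereas the real tasks keep running.

```python
import time
import urllib.error
import urllib.request


def http_get(url):
    """Issue one GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (for example 400 on invalid input) arrive here.
        return err.code


def ping(url, count, interval=1.0, fetch=http_get, sleep=time.sleep):
    """Request `url` `count` times, pausing `interval` seconds between requests."""
    statuses = []
    for i in range(count):
        statuses.append(fetch(url))
        if i < count - 1:
            sleep(interval)
    return statuses
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access or real delays.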
This is an AWS CloudWatch Dashboard meant to track load on the application.
Note: The dips in invocations, durations and error/success rates are due to the completion of load testing and the resulting drop in traffic on the application.
The entire application has three AWS CloudFormation stacks: DataIngestionInfraStack, LoadTesterInfraStack, and MonitorLoadStack. All three CloudFormation stacks are deployed to AWS using the AWS Cloud Development Kit.
This cdk application is Java based. Make sure node, aws-cdk, aws-cli, java, and mvn are installed on the software release machine.
- Add a new GitHub Action that waits for a specific amount of time after execution of aws-deploy and then tears down the LoadTesterInfraStack automatically. After all, the load testing is done only for a period of time. Currently, this teardown process is manual.
- If only one of email or pincode is accepted, then send the input to the admin asking for approval of the data point. This involves a callback state in Step Functions. The state machine waits until the callback is completed. The wait for the callback can be anywhere up to 1 year for standard workflows.
- A good first issue would be to update pinger.sh to input random but valid data points. The data points can be generated in the same shell script using regex. Please feel free to contact me about more details on this.
- Alternatively, an AWS Kinesis Data Stream can be set up to put the AWS DynamoDB Table under load testing. New code for the IaC of the Kinesis data stream should be added in LoadTesterInfraStack.java.
- Another good first issue would be to add an HTML template to render the list of entries as single cards on a webpage. The HTML template would be used by triggerWorkflow.
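As a starting point for the random-data idea above, here is a minimal Python sketch of generating data points that satisfy the documented email and pincode formats. The proposal targets pinger.sh itself, so this only illustrates the generation logic. The names `random_data_point` and `query_string` and the chosen length ranges are assumptions made for the example.

```python
import random
import string
from urllib.parse import urlencode

LETTERS = string.ascii_letters
ALNUM = string.ascii_letters + string.digits


def _pick(rng, alphabet, low, high):
    """Return a random string over `alphabet` with length in [low, high]."""
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(low, high)))


def random_data_point(rng=random):
    """Build one data point that satisfies the documented formats.

    The length ranges for names and email parts are arbitrary choices.
    """
    # Email: one or more letters, then letters/digits, then @domain.com
    local = _pick(rng, LETTERS, 1, 5) + _pick(rng, ALNUM, 0, 5)
    domain = _pick(rng, ALNUM, 1, 8)
    return {
        "user_name": _pick(rng, LETTERS, 3, 10),
        "email": f"{local}@{domain}.com",
        "pincode": _pick(rng, string.digits, 4, 6),  # 4 to 6 digits
    }


def query_string(point):
    """Encode a data point as URL query string parameters."""
    return urlencode(point)
```

Passing a seeded `random.Random` instance as `rng` makes the generated traffic reproducible between runs.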