Commit

Athena moved to monitoring and logging.
Ernyoke committed Apr 13, 2024
1 parent facbffa commit bf0fc94
Showing 3 changed files with 22 additions and 9 deletions.
9 changes: 0 additions & 9 deletions 03-resilient-cloud-solutions/s3.md
@@ -523,12 +523,3 @@
- It is a Virtual Tape Library (VTL) backed by S3 and Glacier
- Back up data using existing tape-based processes (and iSCSI interface)
- Works with leading backup software vendors

## Athena

- Serverless service used to perform analytics directly against S3 files
- Uses SQL language to query the files
- Provides JDBC/ODBC driver
- We are charged per query and amount of data scanned
- Supports CSV, JSON, ORC, Avro and Parquet. In the backend is using Presto'
- Use cases: business intelligence, analytics, reporting, analyze and query VPC Flow Logs, ELB Logs and CloudTrail logs, etc.
21 changes: 21 additions & 0 deletions 04-monitoring-and-logging/athena.md
@@ -0,0 +1,21 @@
# Amazon Athena

- It is a serverless query service that can be used to analyze data stored in S3 buckets
- Uses standard SQL to query files (built on Presto)
- Supports CSV, JSON, ORC, Avro and Parquet
- Pricing: $5 per TB of data scanned
- Commonly used with Amazon QuickSight for reporting/dashboards
- Use cases: business intelligence, analytics, reporting, analyzing VPC Flow Logs, ELB logs, CloudTrail logs, etc.
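
The pricing bullet above can be turned into a quick back-of-the-envelope estimate. The following is a minimal sketch, assuming the $5 per TB rate from the note and a decimal terabyte (10^12 bytes); the function name `estimate_query_cost` is hypothetical, and real bills may differ by Region and by any rounding rules that apply.

```python
PRICE_PER_TB_USD = 5.0   # rate taken from the note above; may vary by Region
BYTES_PER_TB = 10**12    # assumption: decimal terabyte


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimate the cost in USD of a query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


# Scanning a full 2 TB dataset versus only 200 GB of it
full_scan_cost = estimate_query_cost(2 * 10**12)
reduced_scan_cost = estimate_query_cost(200 * 10**9)
print(f"full scan: ${full_scan_cost:.2f}, reduced scan: ${reduced_scan_cost:.2f}")
```

Because cost scales linearly with bytes scanned, anything that reduces the scanned volume reduces the bill proportionally.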

## Performance Improvements

- Use columnar data formats for cost savings (less data is scanned). Examples: Parquet or ORC. We can use Glue to convert data to these formats
- Compress data for smaller retrievals (bzip2, gzip, lz4, snappy, zlib, zstd)
- Partition datasets in S3 so that queries can filter on virtual (partition) columns and skip data they do not need
- Use larger files (>128 MB) to minimize overhead
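
To illustrate the partitioning bullet, here is a minimal sketch of a Hive-style `key=value` S3 layout and a query that filters on the resulting partition columns. The bucket name, table name, helper names, and the choice of `year`/`month`/`day` string partitions are all assumptions made for illustration, not something the notes prescribe.

```python
from datetime import date


def partition_prefix(base: str, day: date) -> str:
    """Build a Hive-style partitioned S3 prefix for one day of data."""
    return (
        f"{base.rstrip('/')}/"
        f"year={day.year}/month={day.month:02d}/day={day.day:02d}/"
    )


def partition_filter_query(table: str, day: date) -> str:
    """Build a SQL query that restricts the scan to a single day's partition."""
    return (
        f"SELECT count(*) FROM {table} "
        f"WHERE year = '{day.year}' "
        f"AND month = '{day.month:02d}' "
        f"AND day = '{day.day:02d}'"
    )


# Hypothetical bucket and table names
day = date(2024, 4, 13)
print(partition_prefix("s3://example-bucket/elb-logs", day))
print(partition_filter_query("elb_logs", day))
```

With a layout like this, a query that filters on the partition columns only needs to read objects under the matching prefix instead of the whole dataset.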

## Federated Queries

- Allows us to run SQL queries across data stored in relational, non-relational, object and custom data sources (in AWS or on-premises)
- This is accomplished with Data Source Connectors that run on AWS Lambda to execute Federated Queries
- Results will be stored in S3 buckets
1 change: 1 addition & 0 deletions README.md
@@ -48,6 +48,7 @@
- [CloudWatch](04-monitoring-and-logging/cloudwatch.md)
- [Amazon Lookout for Metrics](04-monitoring-and-logging/lookout.md)
- [Amazon OpenSearch](04-monitoring-and-logging/opensearch.md)
- [Amazon Athena](04-monitoring-and-logging/athena.md)
5. Incident and Event Response
- [Health](05-incident-and-event-response/health.md)
- [CloudWatch Events](05-incident-and-event-response/eventbridge.md)