Useful SQL scripts, split from DevOps Bash tools, for which this is now a submodule.
Hari Sekhon
Cloud & Big Data Contractor, United Kingdom
aws_athena_cloudtrail_ddl.sql
- AWS Athena DDL to setup up integration to query CloudTrail logs from Athenabigquery_*.sql
- Google BigQuery scripts:bigquery_billing_*.sql
- billing queries for GCP usage eg. highest cost services, most used GCP products, recent charges etc.bigquery_info_*.sql
- information schema queries for datasets, tables, columns, partitioning, clustering etc.
mysql_*.sql
:postgres_*.sql
:- PostgreSQL queries for DBA investigating + performance tuning
- postgres_info.sql - big summary overview, recommend you start here
- tested on PostgreSQL 8.4, 9.x, 10.x, 11.x, 12.x, 13.x
bigquery_*.sql
- Google BigQuery scripts:bigquery_billing_*.sql
- billing queries for GCP usage eg. highest cost services, most used GCP products, recent charges etc.bigquery_info_*.sql
- information schema queries for datasets, tables, columns, partitioning, clustering etc.- analytics/
bigquery_*.sql
- ecommerce queries and BigQuery ML machine learning classification logistic regression models and purchasing predictions - for more BigQuery examples, see Data Engineering demos
You can quickly test the PostgreSQL / MySQL scripts using postgres.sh
/ mysqld.sh
/ mariadb.sh
in the DevOps Bash tools repo, which boots a docker container and drops straight in to a mysql
/ psql
shell with this directory mounted at /sql
and used as $PWD
for fast easy sourcing eg.
postgres:
\i /sql/postgres_query_times.sql
\i postgres_query_times.sql
mysql:
source /sql/mysql_sessions.sql
\. mysql_sessions.sql
- .psqlrc - advanced PostgreSQL psql client config
- psql.sh - quickly connect to PostgreSQL with command line switches inferred from environment variables
- mysql.sh - quickly connect to MySQL / MariaDB with command line switches inferred from environment variables
- postgres.sh - one-touch PostgreSQL, boots docker container and drops you in to
psql
shell. Version can be given as an argument - mysqld.sh / mariadb.sh - one-touch MySQL / MariaDB, boots docker container and drops you in to
mysql
shell. Version can be given as an argument - postgres_foreach_table.sh / mysql_foreach_table.sh - execute templated SQL queries/statements against all or a subset of tables
- postgres_tables_row_counts.sh / mysql_tables_row_counts.sh - get row counts for all or a subset of tables
- sqlcase.pl - autocases your SQL code
- I use this a lot and call it via hotkey configured in my .vimrc
- there are
*case.pl
specializations for most of the major RDBMS and distributed SQL systems, even several NoSQL systems, using each one's language specific keywords
- Hive & Impala SQL:
- beeline.sh - quickly connect to Hive, auto-determines HiveServer2 address, Kerberos & SSL options, ZooKeeper quorum
- impala_shell.sh - quickly connect to Impala, auto-determines a Hadoop worker node address and Kerberos options (can use an environment variable for a Load Balancer setup)
- hive_foreach_table.sh / impala_foreach_table.sh - execute templated SQL queries/statements against all or a subset of tables
- hive_tables_row_counts.sh / impala_tables_row_counts.sh - get row counts for all or a subset of tables
- hive_tables_column_counts.sh / impala_tables_column_counts.sh - get the column counts for big tables in Hive / Impala
- hive_tables_metadata.sh / impala_tables_metadata.sh / hive_tables_locations.sh / impala_tables_locations.sh - get Hive / Impala metadata for all or a subset of tables, eg. Location to determine where the external tables data is being stored (HDFS / S3 paths)
-
DevOps Bash Tools - 550+ DevOps Bash Scripts, Advanced
.bashrc
,.vimrc
,.screenrc
,.tmux.conf
,.gitconfig
, CI configs & Utility Code Library - AWS, GCP, Kubernetes, Docker, Kafka, Hadoop, SQL, BigQuery, Hive, Impala, PostgreSQL, MySQL, LDAP, DockerHub, Jenkins, Spotify API & MP3 tools, Git tricks, GitHub API, GitLab API, BitBucket API, Code & build linting, package management for Linux / Mac / Python / Perl / Ruby / NodeJS / Golang, and lots more random goodies -
Templates - dozens of Code & Config templates - AWS, GCP, Docker, Jenkins, Terraform, Vagrant, Puppet, Python, Bash, Go, Perl, Java, Scala, Groovy, Maven, SBT, Gradle, Make, GitHub Actions Workflows, CircleCI, Jenkinsfile, Makefile, Dockerfile, docker-compose.yml, M4 etc.
-
Kubernetes configs - Kubernetes YAML configs - Best Practices, Tips & Tricks are baked right into the templates for future deployments
-
DevOps Python Tools - 80+ DevOps CLI tools for AWS, GCP, Hadoop, HBase, Spark, Log Anonymizer, Ambari Blueprints, AWS CloudFormation, Linux, Docker, Spark Data Converters & Validators (Avro / Parquet / JSON / CSV / INI / XML / YAML), Elasticsearch, Solr, Travis CI, Pig, IPython
-
The Advanced Nagios Plugins Collection - 450+ programs for Nagios monitoring your Hadoop & NoSQL clusters. Covers every Hadoop vendor's management API and every major NoSQL technology (HBase, Cassandra, MongoDB, Elasticsearch, Solr, Riak, Redis etc.) as well as message queues (Kafka, RabbitMQ), continuous integration (Jenkins, Travis CI) and traditional infrastructure (SSL, Whois, DNS, Linux)
-
DevOps Perl Tools - 25+ DevOps CLI tools for Hadoop, HDFS, Hive, Solr/SolrCloud CLI, Log Anonymizer, Nginx stats & HTTP(S) URL watchers for load balanced web farms, Dockerfiles & SQL ReCaser (MySQL, PostgreSQL, AWS Redshift, Snowflake, Apache Drill, Hive, Impala, Cassandra CQL, Microsoft SQL Server, Oracle, Couchbase N1QL, Dockerfiles, Pig Latin, Neo4j, InfluxDB), Ambari FreeIPA Kerberos, Datameer, Linux...
-
HAProxy Configs - 80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Elasticsearch, SolrCloud, HBase, Cloudera, Hortonworks, MapR, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, ZooKeeper, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, SSH, RabbitMQ, Redis, Riak, Rancher etc.
-
Dockerfiles - 50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Mesos, Consul, Riak, OpenTSDB, Jython, Advanced Nagios Plugins & DevOps Tools repos on Alpine, CentOS, Debian, Fedora, Ubuntu, Superset, H2O, Serf, Alluxio / Tachyon, FakeS3