This cookiecutter generates a high-availability Airflow application (webserver, scheduler, workers). The generated application uses Celery, Redis, and PostgreSQL, and runs in a highly scalable Docker/Kubernetes environment. It also uses Slack for real-time notifications.
Table of Contents generated with DocToc
- Features
- Requirements
- Installation
- Running the test coverages locally
- Quickstart
- Instructions to run Airflow environment
- License
In order to experience the features below, please build a project from this cookiecutter template and
check the Readme file of the generated project here.
To generate your project, simply run cookiecutter https://github.com/condemane/cookiecutter-airflow-ha.git
Apache Airflow for workflow scheduling and monitoring
DAG Programming showcasing how to build DAGs in Python
Meta Programming to write code that dynamically generates programs
High Availability : Multiple scalable micro-service containers (Web server, schedulers, workers)
REST API : Interacting with Airflow via REST
Cookiecutter to build DRY code
Dependency management with pipenv
Testing with pytest and hypothesis
Flake8 for code style guide enforcement
Travis-CI for continuous integration testing
Code Climate for automated code review
DocToc to generate tables of contents for markdown files
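To make the Meta Programming item above more concrete, here is a minimal sketch of the general idea: code that builds other callables from a declarative definition. It deliberately does not import Airflow. The names `PIPELINES`, `make_step`, and `build_pipelines` are hypothetical and are not taken from the generated project; in a real DAG file the generated callables would presumably be wrapped in Airflow operators.

```python
# Hypothetical pipeline definitions; in a real project these might come
# from a YAML file or a database rather than a literal dict.
PIPELINES = {
    "daily_sales": ["extract", "transform", "load"],
    "weekly_report": ["extract", "report"],
}


def make_step(pipeline_name, step_name):
    """Return a callable for one step (a stand-in for an Airflow task callable)."""
    def run():
        return f"{pipeline_name}.{step_name}"

    # Give each generated function a distinct, readable name.
    run.__name__ = f"{pipeline_name}_{step_name}"
    return run


def build_pipelines(definitions):
    """Generate one ordered list of step callables per pipeline definition."""
    return {
        name: [make_step(name, step) for step in steps]
        for name, steps in definitions.items()
    }


generated = build_pipelines(PIPELINES)
results = [step() for step in generated["daily_sales"]]
print(results)
```

Adding a new pipeline then only requires a new entry in the definitions, not new hand-written functions.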
Install cookiecutter from the command line: pip install cookiecutter
Install pipenv: brew install pipenv (if you are running on a Mac)
cookiecutter https://github.com/condemane/cookiecutter-airflow-ha.git
Video Demo: https://youtu.be/2YqSdeWqnIg
This also runs flake8 as part of the tests
pipenv install
pipenv run pytest -v --flake8 --cov-report html --cov=./
To install extra packages, run pipenv install
I am also using hypothesis for effective testing. Here is a very quick TL;DR to get started with hypothesis
from hypothesis import given, settings
from hypothesis.strategies import integers, lists

@settings(max_examples=150)
@given(lists(integers(min_value=3, max_value=10)), integers(min_value=3, max_value=10))
def test_this_thoroughly(test_list, r):
    # Hypothesis generates test_list and r; print them to inspect the inputs
    print('input list', test_list)
    print('input k:', r)
I strongly recommend also reading the documentation here.
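The snippet above only prints the generated inputs. As a rough sketch of what a property check could look like with the same strategies, the test below asserts something about every generated case. The property itself (filtering by a threshold) is a made-up illustration, not a test from this project, and it assumes hypothesis is installed in the pipenv environment.

```python
from hypothesis import given, settings
from hypothesis.strategies import integers, lists


@settings(max_examples=150)
@given(lists(integers(min_value=3, max_value=10)), integers(min_value=3, max_value=10))
def test_filter_keeps_only_large_values(test_list, r):
    # Hypothetical property: keeping values >= r never yields a value below r
    # and never produces more elements than the input list had.
    kept = [x for x in test_list if x >= r]
    assert all(x >= r for x in kept)
    assert len(kept) <= len(test_list)
```

pytest collects this like any other test function; calling `test_filter_keeps_only_large_values()` directly also runs the generated examples.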
Install Cookiecutter (1.4.0 or higher)::

    pip install -U cookiecutter

Generate an Airflow high availability project::

    cookiecutter https://github.com/condemane/cookiecutter-airflow-ha.git
Then:
- consult the Readme file in the generated project here
Step-by-step instructions to run Airflow and test DAGs, meta programming, etc. are located here
This project is licensed under the terms of the BSD License