docs: more updates
Greg Clunies committed Nov 4, 2021
1 parent 795e800 commit 6176770
Showing 4 changed files with 21 additions and 17 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -8,7 +8,7 @@ Workflows are listed below by contributing team with a brief description. To lea
- Start here if you are new to workflows!

- [Surfline](./menu_of_workflows/surfline)
- Lints only the _changed_ models in `/models`
- Lints any added or modified models in `/models`
- Uses `conda` to set up a virtual environment and manage `python`, `dbt`, and `sqlfluff` dependencies.
- Uses `templater = dbt` - this requires a dummy `profiles.yml` and a connection to your data warehouse from the workflow.
- Handles connecting to VPN if your data warehouse requires it (optional).
13 changes: 9 additions & 4 deletions menu_of_workflows/surfline/README.md
@@ -5,14 +5,19 @@ Developed by Greg Clunies @ [Surfline](https://www.surfline.com/).
To add this to your repo, copy the contents of [`sqlfluff_lint_dbt_models.yml`](./sqlfluff_lint_dbt_models.yml) in this folder into a file named `.github/workflows/sqlfluff_lint_dbt_models.yml`.

## This GitHub Workflow
1. Lints all of the SQL files in the `models/` folder of your dbt project.
1. Uses `templater = dbt` when running `sqlfluff`. See the `.sqlfluff` we use [here](./.sqlfluff).
1. Uses [`conda`](https://docs.conda.io/projects/conda/en/latest/index.html) to manage a virtual environment that specifies the versions of `dbt`, `sqlfluff`, and any other dependencies. An example [`environment.yml`](./environment.yml) can be found in this folder. You can modify this workflow to handle your dependencies as you wish (e.g., `pip`, `virtualenv`, `poetry`, etc.).
1. Handles VPN connection to warehouse (if required by warehouse) to allow the dbt compiler (used by SQLFluff) to query the warehouse - this is used by `templater = dbt` to handle dbt macros like `dbt_utils.star()`. If your warehouse doesn't require connection via VPN, you can delete the `Install OpenVPN` and `Connect to VPN` steps from the workflow.
- Lints any added or modified models in `/models`
- Uses [`conda`](https://docs.conda.io/en/latest/miniconda.html) to set up a virtual environment and manage `python`, `dbt`, and `sqlfluff` dependencies. An example [`environment.yml`](./environment.yml) can be found in this folder. You can modify this workflow to handle your dependencies as you wish (e.g., `pip`, `virtualenv`, `poetry`, etc.).
- Uses `templater = dbt` - this requires a dummy `profiles.yml` and a connection to your data warehouse from the workflow.
- Handles connecting to warehouse via VPN if your data warehouse requires it (optional). If your warehouse doesn't require connection via VPN, you can delete the `Install OpenVPN` and `Connect to VPN` steps from the workflow.



__NOTE:__ This workflow has been tested on Redshift. Config for Snowflake is included in this repo, but has not been tested.

## Setup
### `.sqlfluff`
We use the `.sqlfluff` found [here](./.sqlfluff).

### `templater = dbt` & dummy `profiles.yml`

When `sqlfluff` uses `templater = dbt`, it is *actually using the dbt compiler* to compile your SQL before `sqlfluff` lints it. When the dbt compiler, uh ... compiles... the SQL in your models containing macros like `dbt_utils.star()`, it needs to *connect and query the warehouse* to get information about the table and columns referenced in the macro.
9 changes: 3 additions & 6 deletions menu_of_workflows/surfline/environment.yml
@@ -3,11 +3,8 @@ channels:
- conda-forge
- defaults
dependencies:
- agate=1.6.1
- pip
- python=3.8
- python=3.9
- pip:
- dbt==0.19.1
- sqlfluff==0.5.2
# Alternatively, to use latest SQLFluff updates from master. USE WITH CAUTION
# - "git+https://github.com/sqlfluff/sqlfluff.git@master#egg=sqlfluff"
- dbt==0.21.0
- sqlfluff==0.6.8
14 changes: 8 additions & 6 deletions menu_of_workflows/surfline/sqlfluff_lint_dbt_models.yml
@@ -64,20 +64,22 @@ jobs:
with:
output: ' '

- name: Get changed .sql files in /models and /analysis to lint
- name: Get new and changed .sql files in /models to lint
id: get_files_to_lint
shell: bash -l {0}
# Full credit for this step to Teghan Nightengale!
run: |
# Set the command in the $() brackets as an output to use in later steps
echo "::set-output name=lintees::$(
# Issue where grep regular expressions don't work as expected on the
# GitHub Actions shell, check models/ and analysis/ folders separately
# GitHub Actions shell, check dbt/models/ folder
echo \
$(echo ${{ steps.get_file_changes.outputs.files }} |
$(echo ${{ steps.get_file_changes.outputs.files_modified }} |
tr -s ' ' '\n' |
grep -E '^dbt/models.*[.]sql$' |
tr -s '\n' ' ') \
$(echo ${{ steps.get_file_changes.outputs.files_added }} |
tr -s ' ' '\n' |
grep -E '^models.*[.]sql$' |
grep -E '^dbt/models.*[.]sql$' |
tr -s '\n' ' ')
)"
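
The filtering performed by the step above can be tried locally with a minimal sketch like the following. The sample values of `files_modified` and `files_added` and the `filter_models` helper are hypothetical stand-ins for the `get_file_changes` step outputs; they are assumptions for illustration and are not part of the workflow itself.

```shell
#!/usr/bin/env bash
# Hypothetical sample inputs standing in for the space-separated
# files_modified and files_added outputs of the get_file_changes step.
files_modified="dbt/models/staging/stg_orders.sql dbt/models/schema.yml README.md"
files_added="dbt/models/marts/fct_orders.sql dbt/analysis/adhoc.sql"

# Hypothetical helper: put one path per line, keep only .sql files under
# dbt/models, then join the survivors back into a space-separated list.
filter_models() {
  echo "$1" |
    tr -s ' ' '\n' |
    grep -E '^dbt/models.*[.]sql$' |
    tr -s '\n' ' '
}

# The unquoted command substitutions are intentional: word splitting
# drops the trailing space left by the final tr in each pipeline.
lintees="$(echo $(filter_models "$files_modified") $(filter_models "$files_added"))"
echo "$lintees"
```

With these sample inputs only the two model files survive the filter; the YAML file, the README, and the file under `dbt/analysis/` are dropped, which matches the narrower scope described in the updated step name.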
