forked from airbytehq/airbyte
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
🎉 Incremental Normalization (airbytehq#7162)
- Loading branch information
1 parent
9c78351
commit 5fc50df
Showing
62 changed files
with
1,833 additions
and
346 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
7 changes: 7 additions & 0 deletions
7
...bases/base-normalization/dbt-project-template/macros/cross_db_utils/current_timestamp.sql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
{% macro mysql__current_timestamp() %} | ||
CURRENT_TIMESTAMP | ||
{% endmacro %} | ||
|
||
{% macro oracle__current_timestamp() %} | ||
CURRENT_TIMESTAMP | ||
{% endmacro %} |
8 changes: 0 additions & 8 deletions
8
...tions/bases/base-normalization/dbt-project-template/macros/cross_db_utils/drop_schema.sql
This file was deleted.
Oops, something went wrong.
36 changes: 36 additions & 0 deletions
36
airbyte-integrations/bases/base-normalization/dbt-project-template/macros/incremental.sql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
{# | ||
These macros control how incremental models are updated in Airbyte's normalization step | ||
- get_max_normalized_cursor retrieve the value of the last normalized data | ||
- incremental_clause controls the predicate to filter on new data to process incrementally | ||
#} | ||
{% macro incremental_clause(col_emitted_at) -%} | ||
{{ adapter.dispatch('incremental_clause')(col_emitted_at) }} | ||
{%- endmacro %} | ||
{%- macro default__incremental_clause(col_emitted_at) -%} | ||
{% if is_incremental() %} | ||
and {{ col_emitted_at }} >= (select max({{ col_emitted_at }}) from {{ this }}) | ||
{% endif %} | ||
{%- endmacro -%} | ||
{# -- see https://on-systems.tech/113-beware-dbt-incremental-updates-against-snowflake-external-tables/ #} | ||
{%- macro snowflake__incremental_clause(col_emitted_at) -%} | ||
{% if is_incremental() %} | ||
and {{ col_emitted_at }} >= cast('{{ get_max_normalized_cursor(col_emitted_at) }}' as {{ type_timestamp_with_timezone() }}) | ||
{% endif %} | ||
{%- endmacro -%} | ||
{% macro get_max_normalized_cursor(col_emitted_at) %} | ||
{% if execute and is_incremental() %} | ||
{% if env_var('INCREMENTAL_CURSOR', 'UNSET') == 'UNSET' %} | ||
{% set query %} | ||
select coalesce(max({{ col_emitted_at }}), cast('1970-01-01 00:00:00' as {{ type_timestamp_with_timezone() }})) from {{ this }} | ||
{% endset %} | ||
{% set max_cursor = run_query(query).columns[0][0] %} | ||
{% do return(max_cursor) %} | ||
{% else %} | ||
{% do return(env_var('INCREMENTAL_CURSOR')) %} | ||
{% endif %} | ||
{% endif %} | ||
{% endmacro %} |
51 changes: 51 additions & 0 deletions
51
...integrations/bases/base-normalization/dbt-project-template/macros/should_full_refresh.sql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
{# | ||
This overrides the behavior of the macro `should_full_refresh` so full refresh are triggered if: | ||
- the dbt cli is run with --full-refresh flag or the model is configured explicitly to full_refresh | ||
- the column _airbyte_ab_id does not exists in the normalized tables and make sure it is well populated. | ||
#} | ||
|
||
{%- macro need_full_refresh(col_ab_id, target_table=this) -%} | ||
{%- if not execute -%} | ||
{{ return(false) }} | ||
{%- endif -%} | ||
{%- set found_column = [] %} | ||
{%- set cols = adapter.get_columns_in_relation(target_table) -%} | ||
{%- for col in cols -%} | ||
{%- if col.column == col_ab_id -%} | ||
{% do found_column.append(col.column) %} | ||
{%- endif -%} | ||
{%- endfor -%} | ||
{%- if found_column -%} | ||
{{ return(false) }} | ||
{%- else -%} | ||
{{ dbt_utils.log_info(target_table ~ "." ~ col_ab_id ~ " does not exist. The table needs to be rebuilt in full_refresh") }} | ||
{{ return(true) }} | ||
{%- endif -%} | ||
{%- endmacro -%} | ||
|
||
{%- macro should_full_refresh() -%} | ||
{% set config_full_refresh = config.get('full_refresh') %} | ||
{%- if config_full_refresh is none -%} | ||
{% set config_full_refresh = flags.FULL_REFRESH %} | ||
{%- endif -%} | ||
{%- if not config_full_refresh -%} | ||
{% set config_full_refresh = need_full_refresh(get_col_ab_id(), this) %} | ||
{%- endif -%} | ||
{% do return(config_full_refresh) %} | ||
{%- endmacro -%} | ||
|
||
{%- macro get_col_ab_id() -%} | ||
{{ adapter.dispatch('get_col_ab_id')() }} | ||
{%- endmacro -%} | ||
|
||
{%- macro default__get_col_ab_id() -%} | ||
_airbyte_ab_id | ||
{%- endmacro -%} | ||
|
||
{%- macro oracle__get_col_ab_id() -%} | ||
"_AIRBYTE_AB_ID" | ||
{%- endmacro -%} | ||
|
||
{%- macro snowflake__get_col_ab_id() -%} | ||
_AIRBYTE_AB_ID | ||
{%- endmacro -%} |
46 changes: 46 additions & 0 deletions
46
airbyte-integrations/bases/base-normalization/dbt-project-template/macros/star_intersect.sql
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
{# | ||
Similar to the star macro here: https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/star.sql | ||
|
||
This star_intersect macro takes an additional 'intersect' relation as argument. | ||
Its behavior is to select columns from both 'intersect' and 'from' relations with the following rules: | ||
- if the columns are existing in both 'from' and the 'intersect' relations, then the column from 'intersect' is used | ||
- if it's not in the both relation, then only the column in the 'from' relation is used | ||
#} | ||
{% macro star_intersect(from, intersect, from_alias=False, intersect_alias=False, except=[]) -%} | ||
{%- do dbt_utils._is_relation(from, 'star_intersect') -%} | ||
{%- do dbt_utils._is_ephemeral(from, 'star_intersect') -%} | ||
{%- do dbt_utils._is_relation(intersect, 'star_intersect') -%} | ||
{%- do dbt_utils._is_ephemeral(intersect, 'star_intersect') -%} | ||
{#-- Prevent querying of db in parsing mode. This works because this macro does not create any new refs. #} | ||
{%- if not execute -%} | ||
{{ return('') }} | ||
{% endif %} | ||
{%- set include_cols = [] %} | ||
{%- set cols = adapter.get_columns_in_relation(from) -%} | ||
{%- set except = except | map("lower") | list %} | ||
{%- for col in cols -%} | ||
{%- if col.column|lower not in except -%} | ||
{% do include_cols.append(col.column) %} | ||
{%- endif %} | ||
{%- endfor %} | ||
{%- set include_intersect_cols = [] %} | ||
{%- set intersect_cols = adapter.get_columns_in_relation(intersect) -%} | ||
{%- for col in intersect_cols -%} | ||
{%- if col.column|lower not in except -%} | ||
{% do include_intersect_cols.append(col.column) %} | ||
{%- endif %} | ||
{%- endfor %} | ||
{%- for col in include_cols %} | ||
{%- if col in include_intersect_cols -%} | ||
{%- if intersect_alias %}{{ intersect_alias }}.{% else %}{%- endif -%}{{ adapter.quote(col)|trim }} | ||
{%- if not loop.last %},{{ '\n ' }}{% endif %} | ||
{%- else %} | ||
{%- if from_alias %}{{ from_alias }}.{% else %}{{ from }}.{%- endif -%}{{ adapter.quote(col)|trim }} as {{ adapter.quote(col)|trim }} | ||
{%- if not loop.last %},{{ '\n ' }}{% endif %} | ||
{%- endif %} | ||
{%- endfor -%} | ||
{%- endmacro %} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.