Skip to content

Commit

Permalink
Added TriggerBucket variables
Browse files Browse the repository at this point in the history
  • Loading branch information
adranwit committed Aug 31, 2020
1 parent 68e1c31 commit 51d5a89
Show file tree
Hide file tree
Showing 34 changed files with 315 additions and 29 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
## August 31 2020 2.6.0
* Extended export request with UseAvroLogicalTypes (default true for avro format export)
* Added $TriggerBucket variable

## August 25 2020 2.5.0
* Upgraded cloud function to g113 runtime
* Upgraded CLI to go1.15 SDK
Expand Down
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ Dest:
ProjectID: myProject
Info:
LeadEngineer: awitas
URL: mem://localhost/BqTail/config/rule/rule.yaml
URL: mem://localhost/BqTail/config/rule/performance.yaml
Workflow: rule
OnSuccess:
- Action: delete
Expand All @@ -104,7 +104,7 @@ When:
You can save it as rule.yaml to [extend/customize](https://github.com/viant/bqtail/tree/master/tail#data-ingestion-rules) the rule, then you can ingest data with updated rule:
```yaml
bqtail -s=gs://myBuckey/folder/mydatafile.csv -r=rule.yaml
bqtail -s=gs://myBuckey/folder/mydatafile.csv -r=performance.yaml
```


Expand Down
2 changes: 1 addition & 1 deletion Version
Original file line number Diff line number Diff line change
@@ -1 +1 @@
2.5.0
2.6.0
4 changes: 2 additions & 2 deletions cmd/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ Dest:
ProjectID: myProject
Info:
LeadEngineer: awitas
URL: mem://localhost/BqTail/config/rule/rule.yaml
URL: mem://localhost/BqTail/config/rule/performance.yaml
Workflow: rule
OnSuccess:
- Action: delete
Expand All @@ -96,7 +96,7 @@ When:
You can save it as rule.yaml to extend/customize the rule, then you can ingest data with updated rule:
```yaml
bqtail -s=gs://myBuckey/folder/mydatafile.csv -r=rule.yaml
bqtail -s=gs://myBuckey/folder/mydatafile.csv -r=performance.yaml
```


Expand Down
2 changes: 1 addition & 1 deletion cmd/build.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ func (s *service) Build(ctx context.Context, request *build.Request) error {
}
request.Init(s.config)
if request.RuleURL == "" {
request.RuleURL = url.Join(ruleBaseURL, "rule.yaml")
request.RuleURL = url.Join(ruleBaseURL, "performance.yaml")
}
rule := &config.Rule{
Async: true,
Expand Down
2 changes: 1 addition & 1 deletion deployment/test/sync/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ pipeline:
action: storage:copy
expand: true
source:
URL: rule.yaml
URL: performance.yaml
dest:
credentials: $gcpCredentials
URL: gs://${configBucket}/BqTail/Rules/deployment_sync_test.yaml
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/030_pubsub_push/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ pipeline:
'@indexBy@': Attributes.CaseNo
'${parentIndex}':
Transformed:
RuleURL: '/BqTail/Rules/case_${parent.index}/rule.yaml/'
RuleURL: '/BqTail/Rules/case_${parent.index}/performance.yaml/'
URLs: '/dummy.json/'

info:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/034_cli_single/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger/dummy.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger/dummy.json'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/035_cli_batch/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ pipeline:
errors: []
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger'
- ${bqtailCmd} -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/037_cli_dml_copy/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -r='${parent.path}/rule/rule.yaml' -s='gs://${triggerBucket}/data/case${parent.index}/'
- ${bqtailCmd} -r='${parent.path}/rule/performance.yaml' -s='gs://${triggerBucket}/data/case${parent.index}/'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/038_cli_json_field_addition/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger'
- ${bqtailCmd} -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/039_cli_stress_test/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -r='${parent.path}/rule/rule.yaml' -s='gs://${triggerBucket}/data/case039'
- ${bqtailCmd} -r='${parent.path}/rule/performance.yaml' -s='gs://${triggerBucket}/data/case039'



Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/040_cli_regexp/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger/'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger/'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/041_cli_transform_except/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger/dummy.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger/dummy.json'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/042_cli_transient_schema/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger/dummy.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger/dummy.json'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/043_split_dml/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ pipeline:
checkError: true
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger/dummy.json'
- ${bqtailCmd} -l=info -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger/dummy.json'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/044_load_journal/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ pipeline:
errors: []
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger'
- ${bqtailCmd} -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger'

validate:
check-db:
Expand Down
2 changes: 1 addition & 1 deletion e2e/regression/cases/045_override/test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ pipeline:
errors: []
commands:
- export GOOGLE_APPLICATION_CREDENTIALS='${env.HOME}/.secret/${gcpCredentials}.json'
- ${bqtailCmd} -r='${parent.path}/rule/rule.yaml' -s='${parent.path}/data/trigger' -l=debug
- ${bqtailCmd} -r='${parent.path}/rule/performance.yaml' -s='${parent.path}/data/trigger' -l=debug

validate:
check-db:
Expand Down
51 changes: 51 additions & 0 deletions e2e/regression/cases/046_multi_step/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
### Data aggregation with side input with yaml rule

### Scenario:

This scenario test data aggregation with side input.

It uses the following rule:

[@rule.json](rule/performance.yaml)
```yaml
- When:
Prefix: "/data/case022"
Suffix: ".json"
Async: true
Batch:
Window:
DurationInSec: 10
Dest:
Table: bqtail.transactions
Transient:
Dataset: temp
tAlias: t
Transform:
charge: (CASE WHEN type_id = 1 THEN t.payment + f.value WHEN type_id = 2 THEN t.payment * (1 + f.value) END)
SideInputs:
- Table: bqtail.fees
Alias: f
'On': t.fee_id = f.id
OnSuccess:
- Action: query
Request:
SQL: SELECT
DATE(timestamp) AS date,
sku_id,
supply_entity_id,
MAX($EventID) AS batch_id,
SUM( payment) payment,
SUM((CASE WHEN type_id = 1 THEN t.payment + f.value WHEN type_id = 2 THEN t.payment * (1 + f.value) END)) charge,
SUM(COALESCE(qty, 1.0)) AS qty
FROM $TempTable t
LEFT JOIN bqtail.fees f ON f.id = t.fee_id
GROUP BY 1, 2, 3
Dest: bqtail.supply_performance
Append: true
OnSuccess:
- Action: delete
```
Note that storage trigger $EventID is used as batch id.
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
[
{"@indexBy@": ["sku_id", "supply_entity_id"]},
{"date":"$FormatTime('2daysAgo','yyyy-MM-dd')","sku_id":"86622","supply_entity_id":"22012","payment":102.7,"qty":1,"charge":128.375},
{"date":"$FormatTime('2daysAgo','yyyy-MM-dd')","sku_id":"50623","supply_entity_id":"22010","payment":30.4,"qty":2,"charge":32.4},
{"date":"$FormatTime('1daysAgo','yyyy-MM-dd')","sku_id":"86621","supply_entity_id":"22010","payment":56.31,"qty":1,"charge":70.3875},
{"date":"$FormatTime('2daysAgo','yyyy-MM-dd')","sku_id":"10023","supply_entity_id":"22012","payment":64.4,"qty":2,"charge":66.4}
]
5 changes: 5 additions & 0 deletions e2e/regression/cases/046_multi_step/bqtail/prepare/fees.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[
{},
{"id": 1, "type_id": 1, "value": 1},
{"id": 2, "type_id": 2, "value": 0.25}
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[
{}
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[
{}
]
34 changes: 34 additions & 0 deletions e2e/regression/cases/046_multi_step/bqtail/schema.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@


CREATE OR REPLACE TABLE transactions_v${parentIndex} (
id STRING,
timestamp TIMESTAMP,
batch_id INT64,
event_id INT64,
sku_id INT64,
supply_entity_id INT64,
payment FLOAT64,
demany_entity_id INT64,
charge FLOAT64,
qty FLOAT64,
fee_id INT64
);


CREATE OR REPLACE TABLE fees (
id INT64,
type_id INT64,
value FLOAT64
);


CREATE OR REPLACE TABLE supply_performance_v${parentIndex} (
date DATE,
batch_id INT64,
sku_id INT64,
supply_entity_id INT64,
payment FLOAT64,
qty FLOAT64,
charge FLOAT64
) PARTITION BY date CLUSTER BY supply_entity_id, sku_id;

Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
{ "id":"A8B0492A-330D-48A2-A18F-78BF42EECFF", "timestamp": "$FormatTime('2daysAgo','yyyy-MM-dd hh:mm:ss')", "event_id": 101, "sku_id": 10023, "supply_entity_id": 22012, "payment": 32.2, "demany_entity_id":332421, "fee_id": 1}
{ "id":"021C1141-1342-4243-BFF8-3DCD0F0732A8", "timestamp": "$FormatTime('2daysAgo','yyyy-MM-dd hh:mm:ss')", "event_id": 102, "sku_id": 50623, "supply_entity_id": 22010, "payment": 15.2, "demany_entity_id":332421, "fee_id": 1}
{ "id":"46CAC8F6-6F15-4A94-B0B1-D97C18A05F2F", "timestamp": "$FormatTime('2daysAgo','yyyy-MM-dd hh:mm:ss')", "event_id": 101, "sku_id": 10023, "supply_entity_id": 22012, "payment": 32.2, "demany_entity_id":632438, "fee_id": 1 }
{ "id":"6D50CC7E-1944-489A-AFE4-CD8FEE58A33C", "timestamp": "$FormatTime('2daysAgo','yyyy-MM-dd hh:mm:ss')", "event_id": 102, "sku_id": 50623, "supply_entity_id": 22010, "payment": 15.2, "demany_entity_id":632438, "fee_id": 1 }
{ "id":"337059BF-7494-4697-8CC3-46F960D77834", "timestamp": "$FormatTime('2daysAgo','yyyy-MM-dd hh:mm:ss')", "event_id": 101, "sku_id": 86622, "supply_entity_id": 22012, "payment":102.7, "demany_entity_id":332421, "fee_id": 2 }
{ "id":"C9260B5E-DEED-4934-880E-A81077C5A6BD", "timestamp": "$FormatTime('1daysAgo','yyyy-MM-dd hh:mm:ss')", "event_id": 102, "sku_id": 86621, "supply_entity_id": 22010, "payment": 56.31, "demany_entity_id":632438, "fee_id": 2}
1 change: 1 addition & 0 deletions e2e/regression/cases/046_multi_step/info.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
data aggregation with side input
11 changes: 11 additions & 0 deletions e2e/regression/cases/046_multi_step/rule/performance.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
When:
Prefix: "/data/supply_performance"
Suffix: ".avro"
Async: true
Batch:
Window:
DurationInSec: 10
Dest:
Table: bqtail.supply_performance_v${parentIndex}
OnSuccess:
- Action: delete
46 changes: 46 additions & 0 deletions e2e/regression/cases/046_multi_step/rule/transaction.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
When:
Prefix: "/data/case${parentIndex}"
Suffix: ".json"
Async: true
Batch:
Window:
DurationInSec: 10
Dest:
Table: bqtail.transactions_v${parentIndex}
Transient:
Dataset: temp
Alias: t
Transform:
charge: (CASE WHEN type_id = 1 THEN t.payment + f.value WHEN type_id = 2 THEN t.payment * (1 + f.value) END)
SideInputs:
- Table: bqtail.fees
Alias: f
'On': t.fee_id = f.id
OnSuccess:
- Action: query
Request:
SQL: SELECT
DATE(timestamp) AS date,
sku_id,
supply_entity_id,
MAX($EventID) AS batch_id,
SUM( payment) payment,
SUM((CASE WHEN type_id = 1 THEN t.payment + f.value WHEN type_id = 2 THEN t.payment * (1 + f.value) END)) charge,
SUM(COALESCE(qty, 1.0)) AS qty
FROM $TempTable t
LEFT JOIN bqtail.fees f ON f.id = t.fee_id
GROUP BY 1, 2, 3
Dest: ${ProjectID}.temp.supply_performance_v${EventID}
Append: false
OnSuccess:
- Action: export
Request:
Source: ${ProjectID}.temp.supply_performance_v${EventID}
DestURL: gs://${TriggerBucket}/data/supply_performance/transient-*.avro
Format: AVRO
UseAvroLogicalTypes: true
OnSuccess:
- Action: delete
- Action: drop
Request:
Table: ${ProjectID}.temp.supply_performance_v${EventID}
Loading

0 comments on commit 51d5a89

Please sign in to comment.