# BqTail - command line loader

Standalone Google Storage based BigQuery loader.

## Introduction

The BqTail command line loader manages the ingestion process as a standalone process driven by data ingestion rules. For each source data file, an event is triggered to the local BqTail process. For any source whose URL is not Google Storage (gs://), the tail process copies the data file to a Google Storage bucket before triggering local events. The BqTail client supports all features of serverless BqTail, except that it always runs in sync mode; on top of that, it also supports a constant data streaming option.

When ingesting data, the bqtail process keeps a history of all successfully processed files to avoid processing the same file more than once. By default, only streaming mode persists the history file to file:///${env.HOME}/.bqtail; otherwise an in-memory filesystem is used.
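As an illustration, a minimal data ingestion rule might look like the sketch below. The field names follow the serverless BqTail rule format, and all values are placeholders:

```yaml
# Minimal ingestion rule sketch (placeholder values).
When:
  Prefix: "/data/mytopic"   # match source data files by path prefix
  Suffix: ".json"           # ...and by file suffix
Dest:
  Table: mydataset.mytable  # destination BigQuery table
Batch:
  Window:
    DurationInSec: 120      # optional batching window
```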

## Installation

### OSX

```bash
wget https://github.com/viant/bqtail/releases/download/v2.0.2/bqtail_osx_2.0.2.tar.gz
tar -xvzf bqtail_osx_2.0.2.tar.gz
cp bqtail /usr/local/bin/
```

### Linux

```bash
wget https://github.com/viant/bqtail/releases/download/v2.0.2/bqtail_linux_2.0.2.tar.gz
tar -xvzf bqtail_linux_2.0.2.tar.gz
cp bqtail /usr/local/bin/
```

## Usage

### Data ingestion rule validation

To validate a rule, use the -V option.

```bash
bqtail -r='myRuleURL' -V -p=myProject
bqtail -s=mydatafile -d='myProject:mydataset.mytable' -V
bqtail -r=gs://MY_CONFIG_BUCKET/BqTail/Rules/sys/bqjob.yaml -V
```

### Local data file ingestion

```bash
bqtail -s=mydatafile -d='myProject:mydataset.mytable'
```

### Local data ingestion with data ingestion rule

```bash
bqtail -s=mydatafile -r='myRuleURL'
```

### Local data files ingestion

```bash
bqtail -s=mylocaldatafolder -d='myProject:mydataset.mytable'
```

### Local data files ingestion in batch with 120 sec window

```bash
bqtail -s=mylocaldatafolder -d='myProject:mydataset.mytable' -w=120
```

### Local data files streaming ingestion with rule

```bash
bqtail -s=mylocaldatafolder -r='myRuleURL' -X
```

### Local data files ingestion in batch with 120 sec window with processed file tracking

```bash
bqtail -s=mylocaldatafolder -d='myProject:mydataset.mytable' -w=120 -h=~/.bqtail
```
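Since streaming mode already persists history to file:///${env.HOME}/.bqtail by default (see Introduction), the -h flag should also let streaming ingestion track processed files in a custom location. The combination below is an assumption based on the flags shown above, not a documented invocation:

```bash
# Assumed flag combination: streaming (-X) with a custom history location (-h).
bqtail -s=mylocaldatafolder -r='myRuleURL' -X -h=/tmp/bqtail-history
```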

## Authentication

The BqTail client can use one of the following auth methods:

1. With the BqTail BigQuery OAuth client (the default)

   No env setting is needed.

2. With Google Service Account Secrets

   ```bash
   export GOOGLE_APPLICATION_CREDENTIALS=myGoogle.secret
   ```

3. With gsutil authentication

   ```bash
   gcloud config set project my-project
   gcloud auth login
   export GCLOUD_AUTH=true
   ```

4. With a custom BigQuery OAuth client, via the -c switch

   ```bash
   bqtail -c=pathTo/custom.json
   ```

   where @pathTo/custom.json:

   ```json
   {
     "Id": "xxxx.apps.googleusercontent.com",
     "Secret": "xxxxxx"
   }
   ```
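The custom client is expected to combine with the ingestion flags shown above; this pairing is an assumption, as the README only shows -c on its own:

```bash
# Assumed flag combination: custom OAuth client with local file ingestion.
bqtail -c=pathTo/custom.json -s=mydatafile -d='myProject:mydataset.mytable'
```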

## Help

```bash
bqtail -h
```