Python code that calls the Chronicle Data Export API to initiate data exports to Google Cloud Storage.
Invoking `export_http()` without any parameters triggers an export for:
- the day 270 days in the past
- split into 4 export jobs (hours 0-6, 6-12, 12-18, 18-24) to avoid hitting the 10 TB per request limit
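Once deployed, such a default run can be triggered manually with an authenticated HTTP request. This is only a sketch: the URL is a placeholder (use the one reported by `gcloud functions describe`), and the caller must be permitted by the function's ingress settings and hold an invoker role.

```sh
# Trigger the default export (no parameters).
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  "https://REGION-PROJECT_ID.cloudfunctions.net/chronicle_export"
```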
Invoking `export_http()` with the HTTP request parameters `start_time` and `end_time` runs only a single export for the specified timeframe.
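A sketch of such a parameterized run, assuming `start_time` and `end_time` are accepted as RFC 3339 timestamps in the query string (check the function code for the exact parameter format it expects):

```sh
# Export a single, explicitly bounded timeframe.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  "https://REGION-PROJECT_ID.cloudfunctions.net/chronicle_export?start_time=2024-01-01T00:00:00Z&end_time=2024-01-02T00:00:00Z"
```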
As the Chronicle Data Export API is asynchronous, all `export_http()` operations log the job status to a BigQuery table. Running `status_update_http()` fetches all non-completed jobs from the BigQuery table, queries their status via the Chronicle Data Export API, and updates the BigQuery table if the status has changed.
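The tracking table can also be inspected directly, e.g. to see which jobs are still pending before the next status update. A minimal sketch, assuming `BQ_METADATA_TABLE` is set to `projectid.dataset.tablename`:

```sh
# List the most recent status entries recorded for export jobs.
bq query --use_legacy_sql=false \
  'SELECT id, status, event_time
   FROM `projectid.dataset.tablename`
   ORDER BY event_time DESC
   LIMIT 50'
```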
It is recommended to schedule both Cloud Functions daily via Cloud Scheduler. `status_update_http()` should run a few hours after `export_http()` so that the exports for that day have finished.
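A possible Cloud Scheduler setup; the job names, schedule times, region, and URLs are placeholders, and the scheduler service account is assumed to be allowed to invoke the functions:

```sh
# Run the export daily at 02:00 and the status update a few hours later.
gcloud scheduler jobs create http chronicle-export-daily \
  --location=REGION \
  --schedule="0 2 * * *" \
  --http-method=POST \
  --uri="https://REGION-PROJECT_ID.cloudfunctions.net/chronicle_export" \
  --oidc-service-account-email=SCHEDULER_SA@PROJECT_ID.iam.gserviceaccount.com

gcloud scheduler jobs create http chronicle-export-status-daily \
  --location=REGION \
  --schedule="0 8 * * *" \
  --http-method=POST \
  --uri="https://REGION-PROJECT_ID.cloudfunctions.net/chronicle_export_status" \
  --oidc-service-account-email=SCHEDULER_SA@PROJECT_ID.iam.gserviceaccount.com
```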
It is recommended to deploy `export_http()` and `status_update_http()` as two independent Cloud Function instances with the entry points:
- `export_http`
- `status_update_http`

Use the same source directory for both and only set a different entry point per Cloud Function.
The Cloud Function will require the following environment variables to be set:
- `EXPORT_GCS_BUCKET` - name of the GCS bucket the data is exported to
- `SA_FILE` - path to the service account key file that has permissions to access the Chronicle Data Export API
- `BQ_METADATA_TABLE` - BigQuery table name in the `projectid.dataset.tablename` format used to store the job status
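For local testing, the same variables can be exported in the shell and the function started with the Functions Framework. This is a sketch, assuming the standard Python Functions Framework and placeholder values:

```sh
export EXPORT_GCS_BUCKET=my-bucket
export SA_FILE=/path/to/sa-key.json
export BQ_METADATA_TABLE=projectid.dataset.tablename

# Requires: pip install functions-framework
functions-framework --target=export_http --port=8080
```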
Ensure that the service account used to run the Cloud Function has the `roles/bigquery.jobUser` role on the BigQuery dataset/table.
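A minimal sketch of the corresponding grants, assuming the function runs as the placeholder service account RUNTIME_SA. Note that `roles/bigquery.jobUser` is granted at project level, and write access to the status table itself (e.g. `roles/bigquery.dataEditor`) is assumed to be needed in addition:

```sh
# Allow the runtime service account to run BigQuery jobs in the project ...
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:RUNTIME_SA@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"

# ... and to write to the status table.
bq add-iam-policy-binding \
  --member="serviceAccount:RUNTIME_SA@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor" \
  "projectid:dataset.tablename"
```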
You can deploy it using `gcloud`:
```sh
gcloud functions deploy chronicle_export \
  --entry-point export_http \
  --trigger-http \
  --runtime=python312 \
  --no-allow-unauthenticated \
  --ingress-settings=internal-and-gclb \
  --set-secrets=/secrets/sa-key.json=SECRET_NAME:latest \
  --set-env-vars=EXPORT_GCS_BUCKET=my-bucket,SA_FILE=/secrets/sa-key.json,BQ_METADATA_TABLE=projectid.dataset.tablename
```
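The second instance can be deployed from the same directory in the same way; the function name below is a placeholder, and only the name and the entry point differ:

```sh
gcloud functions deploy chronicle_export_status \
  --entry-point status_update_http \
  --trigger-http \
  --runtime=python312 \
  --no-allow-unauthenticated \
  --ingress-settings=internal-and-gclb \
  --set-secrets=/secrets/sa-key.json=SECRET_NAME:latest \
  --set-env-vars=EXPORT_GCS_BUCKET=my-bucket,SA_FILE=/secrets/sa-key.json,BQ_METADATA_TABLE=projectid.dataset.tablename
```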
As the Chronicle Data Export API is asynchronous, job status is tracked in BigQuery. For that purpose a table needs to be created:
```sh
# create dataset
bq --project_id GCP_PROJECT mk chronicle_export

# create table
bq --project_id GCP_PROJECT mk -t --description "Status tracking for Chronicle Data Export jobs" \
  chronicle_export.job_status \
  id:STRING,event_time:TIMESTAMP,status:STRING,data:JSON
```
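Afterwards you can verify that the table was created with the expected schema:

```sh
bq --project_id GCP_PROJECT show --schema --format=prettyjson chronicle_export.job_status
```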