Google Data Loss Prevention Python Samples

This directory contains samples for Google Data Loss Prevention. Google Data Loss Prevention provides programmatic access to a powerful detection engine for personally identifiable information and other privacy-sensitive data in unstructured data streams.

To run the sample, you need to enable the API at: https://console.cloud.google.com/apis/library/dlp.googleapis.com

To run the sample, you need to have the following roles:

  * DLP Administrator
  * DLP API Service Agent

Setup

Authentication

These samples require authentication to be set up. Refer to the Authentication Getting Started Guide for instructions on setting up credentials for applications.
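
Before running a sample, it can help to confirm that a key file is configured. The sketch below is not part of the samples; it assumes credentials are supplied through the GOOGLE_APPLICATION_CREDENTIALS environment variable, and the helper names are hypothetical. Credentials can also come from other sources (for example a gcloud login), so an unset variable is not necessarily an error.

```python
import os


def credentials_path(environ=None):
    """Return the key-file path from the environment, or None if unset.

    Assumption: credentials are provided via GOOGLE_APPLICATION_CREDENTIALS.
    This helper only reads the variable; it does not validate the key.
    """
    env = os.environ if environ is None else environ
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    return path or None


def describe_credentials(environ=None):
    """Summarize whether a key file is configured and present on disk."""
    path = credentials_path(environ)
    if path is None:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return "key file not found: " + path
    return "using key file: " + path


if __name__ == "__main__":
    print(describe_credentials())
```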

Install Dependencies

  1. Clone python-docs-samples and change directory to the sample directory you want to use.

    $ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git

  2. Install pip and virtualenv if you do not already have them. You may want to refer to the Python Development Environment Setup Guide for Google Cloud Platform for instructions.

  3. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

    $ virtualenv env
    $ source env/bin/activate

  4. Install the dependencies needed to run the samples.

    $ pip install -r requirements.txt

Samples

Quickstart

To run this sample:

$ python quickstart.py

Inspect Content

To run this sample:

$ python inspect_content.py

usage: inspect_content.py [-h] {string,table,file,gcs,datastore,bigquery} ...

Sample app that uses the Data Loss Prevention API to inspect a string, a local
file or a file on Google Cloud Storage.

positional arguments:
  {string,table,file,gcs,datastore,bigquery}
                        Select how to submit content to the API.
    string              Inspect a string.
    table               Inspect a table.
    file                Inspect a local file.
    gcs                 Inspect files on Google Cloud Storage.
    datastore           Inspect files on Google Datastore.
    bigquery            Inspect files on Google BigQuery.

optional arguments:
  -h, --help            show this help message and exit
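
For the string subcommand, the content and the infoTypes to look for end up in a single inspection request. The following is a minimal sketch of how such a request could be assembled as a plain dict. It does not call the API, the function name is hypothetical, and the field names follow the DLP v2 message layout as an assumption; how the dict is handed to the client depends on the installed library version.

```python
def build_inspect_request(project_id, text, info_types, include_quote=True):
    """Assemble a dict shaped like a DLP v2 content-inspection request.

    Assumptions: the field names mirror the v2 API messages, and
    project_id is a placeholder supplied by the caller.
    """
    return {
        "parent": "projects/{}".format(project_id),
        "inspect_config": {
            # Each infoType is referenced by name, e.g. EMAIL_ADDRESS.
            "info_types": [{"name": name} for name in info_types],
            # Ask for the matched text to be included with each finding.
            "include_quote": include_quote,
        },
        "item": {"value": text},
    }


if __name__ == "__main__":
    request = build_inspect_request(
        "my-project",  # placeholder project ID
        "My email is test@example.com",
        ["EMAIL_ADDRESS", "PHONE_NUMBER"],
    )
    print(request)
```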

Redact Content

To run this sample:

$ python redact.py

usage: redact.py [-h] {info_types,all_text} ...

Sample app that uses the Data Loss Prevention API to redact the contents of an
image file.

positional arguments:
  {info_types,all_text}
                        Select which content should be redacted.
    info_types          Redact specific infoTypes from an image.
    all_text            Redact all text from an image. The MIME type of the
                        file is inferred via the Python standard library's
                        mimetypes module.

optional arguments:
  -h, --help            show this help message and exit
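
The help text above mentions that the MIME type is inferred with the standard library's mimetypes module. A minimal sketch of that lookup follows; the function name and the fallback value are assumptions, not taken from redact.py.

```python
import mimetypes


def guess_image_mime_type(filename, default="application/octet-stream"):
    """Guess a MIME type from the file extension, falling back to a default.

    mimetypes.guess_type looks only at the name; it does not open the file.
    """
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or default


if __name__ == "__main__":
    for name in ("scan.png", "photo.jpg", "no_extension"):
        print(name, guess_image_mime_type(name))
```

Because the guess is based on the extension alone, a misnamed file yields the wrong type, so a caller may want to let users override it.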

Metadata

To run this sample:

$ python metadata.py

usage: metadata.py [-h] [--language_code LANGUAGE_CODE] [--filter FILTER]

Sample app that queries the Data Loss Prevention API for supported categories
and info types.

optional arguments:
  -h, --help            show this help message and exit
  --language_code LANGUAGE_CODE
                        The BCP-47 language code to use, e.g. 'en-US'.
  --filter FILTER       An optional filter to only return info types supported
                        by certain parts of the API. Defaults to
                        "supported_by=INSPECT".

Jobs

To run this sample:

$ python jobs.py

usage: jobs.py [-h] {list,delete} ...

Sample app to list and delete DLP jobs using the Data Loss Prevention API.

positional arguments:
  {list,delete}  Select how to submit content to the API.
    list         List Data Loss Prevention API jobs corresponding to a given
                 filter.
    delete       Delete results of a Data Loss Prevention API job.

optional arguments:
  -h, --help     show this help message and exit

Templates

To run this sample:

$ python templates.py

usage: templates.py [-h] {create,list,delete} ...

Sample app that sets up Data Loss Prevention API inspect templates.

positional arguments:
  {create,list,delete}  Select which action to perform.
    create              Create a template.
    list                List all templates.
    delete              Delete a template.

optional arguments:
  -h, --help            show this help message and exit

Triggers

To run this sample:

$ python triggers.py

usage: triggers.py [-h] {create,list,delete} ...

Sample app that sets up Data Loss Prevention API automation triggers.

positional arguments:
  {create,list,delete}  Select which action to perform.
    create              Create a trigger.
    list                List all triggers.
    delete              Delete a trigger.

optional arguments:
  -h, --help            show this help message and exit

Risk Analysis

To run this sample:

$ python risk.py

usage: risk.py [-h] {numerical,categorical,k_anonymity,l_diversity,k_map} ...

Sample app that uses the Data Loss Prevention API to perform risk analysis.

positional arguments:
  {numerical,categorical,k_anonymity,l_diversity,k_map}
                        Select how to submit content to the API.
    numerical
    categorical
    k_anonymity         Computes the k-anonymity of a column set in a Google
                        BigQuery table.
    l_diversity         Computes the l-diversity of a column set in a Google
                        BigQuery table.
    k_map               Computes the k-map risk estimation of a column set in
                        a Google BigQuery table.

optional arguments:
  -h, --help            show this help message and exit
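
To make the k_anonymity subcommand more concrete: a dataset is k-anonymous with respect to a set of quasi-identifier columns when every combination of values in those columns is shared by at least k rows. The sketch below computes that k locally on a small in-memory table. It is an illustration of the concept only, not what risk.py does (the sample submits a job against a BigQuery table), and the rows are made-up data.

```python
from collections import Counter


def k_anonymity(rows, quasi_ids):
    """Return the size of the smallest equivalence class.

    Rows that share the same values in all quasi_ids columns form one
    equivalence class; k is the size of the smallest such class.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    classes = Counter(tuple(row[col] for col in quasi_ids) for row in rows)
    return min(classes.values())


# Made-up example rows for illustration only.
ROWS = [
    {"age": 30, "zip": "94043", "diagnosis": "flu"},
    {"age": 30, "zip": "94043", "diagnosis": "cold"},
    {"age": 41, "zip": "10001", "diagnosis": "flu"},
    {"age": 41, "zip": "10001", "diagnosis": "asthma"},
]

if __name__ == "__main__":
    print(k_anonymity(ROWS, ["age", "zip"]))
    print(k_anonymity(ROWS, ["age", "zip", "diagnosis"]))
```

Adding more quasi-identifier columns can only split equivalence classes further, so k stays the same or shrinks as columns are added.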

DeID

To run this sample:

$ python deid.py

usage: deid.py [-h] {deid_mask,deid_fpe,reid_fpe,deid_date_shift,replace_with_infotype} ...

Uses of the Data Loss Prevention API for deidentifying sensitive data.

positional arguments:
  {deid_mask,deid_fpe,reid_fpe,deid_date_shift,replace_with_infotype}
                            Select how to submit content to the API.
    deid_mask               Deidentify sensitive data in a string by masking it
                            with a character.
    deid_fpe                Deidentify sensitive data in a string using Format
                            Preserving Encryption (FPE).
    reid_fpe                Reidentify sensitive data in a string using Format
                            Preserving Encryption (FPE).
    deid_date_shift         Deidentify dates in a CSV file by pseudorandomly
                            shifting them.
    replace_with_infotype   Deidentify sensitive data in a string by replacing it with
                            the info type of the data.

optional arguments:
  -h, --help                show this help message and exit
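
As an illustration of what deid_mask configures, the sketch below builds a de-identification config for character masking as a plain dict. It does not call the API, the function name is hypothetical, and the field names follow the DLP v2 message layout as an assumption. A number_to_mask of 0 is assumed to mean that all characters of a match are masked.

```python
def build_mask_deidentify_config(masking_character="#", number_to_mask=0):
    """Return a dict shaped like a DLP v2 de-identify config for masking.

    Assumption: field names mirror the v2 API messages; verify them
    against the installed client library before use.
    """
    if len(masking_character) != 1:
        raise ValueError("masking_character must be a single character")
    return {
        "info_type_transformations": {
            "transformations": [
                {
                    "primitive_transformation": {
                        "character_mask_config": {
                            "masking_character": masking_character,
                            "number_to_mask": number_to_mask,
                        }
                    }
                }
            ]
        }
    }


if __name__ == "__main__":
    print(build_mask_deidentify_config("*", 4))
```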

The client library

This sample uses the Google Cloud Client Library for Python. You can read the documentation for more details on API usage and use GitHub to browse the source and report issues.