Project import generated by Copybara.

PiperOrigin-RevId: 231295457
WihanB · Jan 28, 2019 · d321b37
1 parent 6b2481c
commit d321b37

Showing 167 changed files with 8,694 additions and 4,210 deletions.
diff --git a/README.md b/README.md
@@ -1,6 +1,10 @@
 <!-- See: www.tensorflow.org/tfx/model_analysis/ -->
 
-# TensorFlow Model Analysis [![PyPI](https://img.shields.io/pypi/pyversions/tensorflow-model-analysis.svg?style=plastic)](https://github.com/tensorflow/model-analysis)
+# TensorFlow Model Analysis
+
+[![Python](https://img.shields.io/pypi/pyversions/tensorflow-model-analysis.svg?style=plastic)](https://github.com/tensorflow/model-analysis)
+[![PyPI](https://badge.fury.io/py/tensorflow-model-analysis.svg)](https://badge.fury.io/py/tensorflow-model-analysis)
+[![Documentation](https://img.shields.io/badge/api-reference-blue.svg)](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma)
 
 *TensorFlow Model Analysis* (TFMA) is a library for evaluating TensorFlow models.
 It allows users to evaluate their models on large amounts of data in a

diff --git a/RELEASE.md b/RELEASE.md
diff --git a/examples/chicago_taxi/chicago_taxi_client.py b/examples/chicago_taxi/chicago_taxi_client.py
@@ -15,7 +15,6 @@
 
 from __future__ import print_function
 
-
 import argparse
 import base64
 import os

diff --git a/examples/chicago_taxi/chicago_taxi_tfma.ipynb b/examples/chicago_taxi/chicago_taxi_tfma.ipynb
@@ -112,7 +112,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "tfma.view.render_plot(result, tfma.SingleSliceSpec(features=[('trip_start_hour', 0)]))"
+    "tfma.view.render_plot(result, tfma.slicer.SingleSliceSpec(features=[('trip_start_hour', 0)]))"
    ]
   },
   {
@@ -144,7 +144,7 @@
     "\n",
     "eval_results = tfma.make_eval_results([result_1, result_2, result_3], \n",
     "                                      tfma.constants.MODEL_CENTRIC_MODE)\n",
-    "tfma.view.render_time_series(eval_results, tfma.SingleSliceSpec())\n"
+    "tfma.view.render_time_series(eval_results, tfma.slicer.SingleSliceSpec())\n"
    ]
   },
   {
@@ -170,7 +170,7 @@
     "                                                 PATH_TO_RESULT_2, \n",
     "                                                 PATH_TO_RESULT_3], \n",
     "                                                tfma.constants.MODEL_CENTRIC_MODE)\n",
-    "tfma.view.render_time_series(eval_results_from_disk, tfma.SingleSliceSpec())"
+    "tfma.view.render_time_series(eval_results_from_disk, tfma.slicer.SingleSliceSpec())"
    ]
   },
   {

diff --git a/examples/chicago_taxi/chicago_taxi_tfma_local_playground.ipynb b/examples/chicago_taxi/chicago_taxi_tfma_local_playground.ipynb
@@ -538,7 +538,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### You can also compute metrics on slices of your data in TFMA. Slices can be specified using ``tfma.SingleSliceSpec``.\n",
+    "#### You can also compute metrics on slices of your data in TFMA. Slices can be specified using ``tfma.slicer.SingleSliceSpec``.\n",
     "\n",
     "Below are examples of how slices can be specified."
    ]
@@ -550,23 +550,23 @@
    "outputs": [],
    "source": [
     "# An empty slice spec means the overall slice, that is, the whole dataset.\n",
-    "OVERALL_SLICE_SPEC = tfma.SingleSliceSpec()\n",
+    "OVERALL_SLICE_SPEC = tfma.slicer.SingleSliceSpec()\n",
     "\n",
     "# Data can be sliced along a feature column\n",
     "# In this case, data is sliced along feature column trip_start_hour.\n",
-    "FEATURE_COLUMN_SLICE_SPEC = tfma.SingleSliceSpec(columns=['trip_start_hour'])\n",
+    "FEATURE_COLUMN_SLICE_SPEC = tfma.slicer.SingleSliceSpec(columns=['trip_start_hour'])\n",
     "\n",
     "# Data can be sliced by crossing feature columns\n",
     "# In this case, slices are computed for trip_start_day x trip_start_month.\n",
-    "FEATURE_COLUMN_CROSS_SPEC = tfma.SingleSliceSpec(columns=['trip_start_day', 'trip_start_month'])\n",
+    "FEATURE_COLUMN_CROSS_SPEC = tfma.slicer.SingleSliceSpec(columns=['trip_start_day', 'trip_start_month'])\n",
     "\n",
     "# Metrics can be computed for a particular feature value.\n",
     "# In this case, metrics is computed for all data where trip_start_hour is 12.\n",
-    "FEATURE_VALUE_SPEC = tfma.SingleSliceSpec(features=[('trip_start_hour', 12)])\n",
+    "FEATURE_VALUE_SPEC = tfma.slicer.SingleSliceSpec(features=[('trip_start_hour', 12)])\n",
     "\n",
     "# It is also possible to mix column cross and feature value cross.\n",
     "# In this case, data where trip_start_hour is 12 will be sliced by trip_start_day.\n",
-    "COLUMN_CROSS_VALUE_SPEC = tfma.SingleSliceSpec(columns=['trip_start_day'], features=[('trip_start_hour', 12)])\n",
+    "COLUMN_CROSS_VALUE_SPEC = tfma.slicer.SingleSliceSpec(columns=['trip_start_day'], features=[('trip_start_hour', 12)])\n",
     "\n",
     "ALL_SPECS = [\n",
     "    OVERALL_SLICE_SPEC,\n",
@@ -606,7 +606,7 @@
    "source": [
     "## Visualization: Slicing Metrics\n",
     "\n",
-    "To see the slices, either use the name of the column (by setting slicing_column) or provide a tfma.SingleSliceSpec (by setting slicing_spec). If neither is provided, the overall will be displayed.\n",
+    "To see the slices, either use the name of the column (by setting slicing_column) or provide a tfma.slicer.SingleSliceSpec (by setting slicing_spec). If neither is provided, the overall will be displayed.\n",
     "\n",
     "The default visualization is **slice overview** when the number of slices is small. It shows the value of a metric for each slice sorted by the another metric. It is also possible to set a threshold to filter out slices with smaller weights.\n",
     "\n",
@@ -683,9 +683,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Plots must be visualized for an individual slice. To specify a slice, use ``tfma.SingleSliceSpec``.\n",
+    "Plots must be visualized for an individual slice. To specify a slice, use ``tfma.slicer.SingleSliceSpec``.\n",
     "\n",
-    "In the example below, we are using ``tfma.SingleSliceSpec(features=[('trip_start_hour', 0)])`` to specify the slice where trip_start_hour is 0.\n",
+    "In the example below, we are using ``tfma.slicer.SingleSliceSpec(features=[('trip_start_hour', 0)])`` to specify the slice where trip_start_hour is 0.\n",
     "\n",
     "Plots are interactive:\n",
     "- Drag to pan\n",
@@ -701,7 +701,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "tfma.view.render_plot(tfma_vis, tfma.SingleSliceSpec(features=[('trip_start_hour', 0)]))"
+    "tfma.view.render_plot(tfma_vis, tfma.slicer.SingleSliceSpec(features=[('trip_start_hour', 0)]))"
    ]
   },
   {

diff --git a/examples/chicago_taxi/preprocess.py b/examples/chicago_taxi/preprocess.py
@@ -94,8 +94,8 @@ def preprocessing_fn(inputs):
 
     for key in taxi.VOCAB_FEATURE_KEYS:
       # Build a vocabulary for this feature.
-      outputs[
-          taxi.transformed_name(key)] = transform.compute_and_apply_vocabulary(
+      outputs[taxi.transformed_name(
+          key)] = transform.compute_and_apply_vocabulary(
               _fill_in_missing(inputs[key]),
               top_k=taxi.VOCAB_SIZE,
               num_oov_buckets=taxi.OOV_SIZE)
@@ -150,8 +150,7 @@ def preprocessing_fn(inputs):
 
         _ = (
             transform_fn
-            | ('WriteTransformFn' >>
-               tft_beam.WriteTransformFn(working_dir)))
+            | ('WriteTransformFn' >> tft_beam.WriteTransformFn(working_dir)))
       else:
         transform_fn = pipeline | tft_beam.ReadTransformFn(transform_dir)
 

diff --git a/examples/chicago_taxi/process_tfma.py b/examples/chicago_taxi/process_tfma.py
@@ -61,8 +61,8 @@ def process_tfma(eval_result_dir,
         'one of --input_csv or --big_query_table should be provided.')
 
   slice_spec = [
-      tfma.SingleSliceSpec(),
-      tfma.SingleSliceSpec(columns=['trip_start_hour'])
+      tfma.slicer.SingleSliceSpec(),
+      tfma.slicer.SingleSliceSpec(columns=['trip_start_hour'])
   ]
 
   schema = taxi.read_schema(schema_file)

diff --git a/examples/chicago_taxi/tfdv_analyze_and_validate.py b/examples/chicago_taxi/tfdv_analyze_and_validate.py
@@ -101,8 +101,9 @@ def compute_stats(input_handle,
           | 'ReadBigQuery' >> beam.io.Read(
               beam.io.BigQuerySource(query=query, use_standard_sql=True))
           | 'ConvertToTFDVInput' >> beam.Map(
-              lambda x: {key: np.asarray([x[key]])  # pylint: disable=g-long-lambda
-                         for key in x if x[key] is not None}))
+              lambda x: {  # pylint: disable=g-long-lambda
+                  key: np.asarray([x[key]]) for key in x if x[key] is not None
+              }))
 
     _ = (
         raw_data

diff --git a/examples/chicago_taxi/trainer/model.py b/examples/chicago_taxi/trainer/model.py
@@ -12,14 +12,15 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """Defines the model used to predict who will tip in the Chicago Taxi demo."""
+from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 
 import os
-import taxi
 import tensorflow as tf
 
 import tensorflow_model_analysis as tfma
+from trainer import taxi
 from tensorflow_transform.beam.tft_beam_io import transform_fn_io
 from tensorflow_transform.saved import saved_transform_io
 from tensorflow_transform.tf_metadata import metadata_io

diff --git a/examples/chicago_taxi/trainer/task.py b/examples/chicago_taxi/trainer/task.py
@@ -12,15 +12,17 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """Trainer for the chicago_taxi demo."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
 import argparse
 import os
 
-import model
-import taxi
-
 import tensorflow as tf
-
 import tensorflow_model_analysis as tfma
+from trainer import model
+from trainer import taxi
 
 SERVING_MODEL_DIR = 'serving_model_dir'
 EVAL_MODEL_DIR = 'eval_model_dir'
@@ -45,22 +47,20 @@ def train_and_maybe_evaluate(hparams):
   """
   schema = taxi.read_schema(hparams.schema_file)
 
-  train_input = lambda: model.input_fn(
+  train_input = lambda: model.input_fn(  # pylint: disable=g-long-lambda
       hparams.train_files,
       hparams.tf_transform_dir,
-      batch_size=TRAIN_BATCH_SIZE
-  )
+      batch_size=TRAIN_BATCH_SIZE)
 
-  eval_input = lambda: model.input_fn(
+  eval_input = lambda: model.input_fn(  # pylint: disable=g-long-lambda
       hparams.eval_files,
       hparams.tf_transform_dir,
-      batch_size=EVAL_BATCH_SIZE
-  )
+      batch_size=EVAL_BATCH_SIZE)
 
   train_spec = tf.estimator.TrainSpec(
       train_input, max_steps=hparams.train_steps)
 
-  serving_receiver_fn = lambda: model.example_serving_receiver_fn(
+  serving_receiver_fn = lambda: model.example_serving_receiver_fn(  # pylint: disable=g-long-lambda
       hparams.tf_transform_dir, schema)
 
   exporter = tf.estimator.FinalExporter('chicago-taxi', serving_receiver_fn)
@@ -164,8 +164,7 @@ def main():
       default=100,
       type=int)
   parser.add_argument(
-      '--schema-file',
-      help='File holding the schema for the input data')
+      '--schema-file', help='File holding the schema for the input data')
 
   args = parser.parse_args()
 

diff --git a/examples/chicago_taxi/trainer/taxi.py b/examples/chicago_taxi/trainer/taxi.py
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 """Utility and schema methods for the chicago_taxi sample."""
+from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 

diff --git a/g3doc/api_docs/python/index.md b/g3doc/api_docs/python/index.md
@@ -1,31 +1,31 @@
 # All symbols in TensorFlow Model Analysis
 
-*  <a href="./tfma.md"><code>tfma</code></a>
-*  <a href="./tfma/EvaluateAndWriteResults.md"><code>tfma.EvaluateAndWriteResults</code></a>
-*  <a href="./tfma/SingleSliceSpec.md"><code>tfma.SingleSliceSpec</code></a>
-*  <a href="./tfma/constants.md"><code>tfma.constants</code></a>
-*  <a href="./tfma/export.md"><code>tfma.export</code></a>
-*  <a href="./tfma/export/build_parsing_eval_input_receiver_fn.md"><code>tfma.export.build_parsing_eval_input_receiver_fn</code></a>
-*  <a href="./tfma/export/make_export_strategy.md"><code>tfma.export.make_export_strategy</code></a>
-*  <a href="./tfma/exporter.md"><code>tfma.exporter</code></a>
-*  <a href="./tfma/exporter/FinalExporter.md"><code>tfma.exporter.FinalExporter</code></a>
-*  <a href="./tfma/exporter/LatestExporter.md"><code>tfma.exporter.LatestExporter</code></a>
-*  <a href="./tfma/load_eval_result.md"><code>tfma.load_eval_result</code></a>
-*  <a href="./tfma/load_eval_results.md"><code>tfma.load_eval_results</code></a>
-*  <a href="./tfma/make_eval_results.md"><code>tfma.make_eval_results</code></a>
-*  <a href="./tfma/multiple_data_analysis.md"><code>tfma.multiple_data_analysis</code></a>
-*  <a href="./tfma/multiple_model_analysis.md"><code>tfma.multiple_model_analysis</code></a>
-*  <a href="./tfma/post_export_metrics.md"><code>tfma.post_export_metrics</code></a>
-*  <a href="./tfma/post_export_metrics/auc.md"><code>tfma.post_export_metrics.auc</code></a>
-*  <a href="./tfma/post_export_metrics/auc_plots.md"><code>tfma.post_export_metrics.auc_plots</code></a>
-*  <a href="./tfma/post_export_metrics/calibration_plot_and_prediction_histogram.md"><code>tfma.post_export_metrics.calibration_plot_and_prediction_histogram</code></a>
-*  <a href="./tfma/post_export_metrics/confusion_matrix_at_thresholds.md"><code>tfma.post_export_metrics.confusion_matrix_at_thresholds</code></a>
-*  <a href="./tfma/post_export_metrics/example_count.md"><code>tfma.post_export_metrics.example_count</code></a>
-*  <a href="./tfma/post_export_metrics/example_weight.md"><code>tfma.post_export_metrics.example_weight</code></a>
-*  <a href="./tfma/post_export_metrics/precision_recall_at_k.md"><code>tfma.post_export_metrics.precision_recall_at_k</code></a>
-*  <a href="./tfma/run_model_analysis.md"><code>tfma.run_model_analysis</code></a>
-*  <a href="./tfma/version.md"><code>tfma.version</code></a>
-*  <a href="./tfma/view.md"><code>tfma.view</code></a>
-*  <a href="./tfma/view/render_plot.md"><code>tfma.view.render_plot</code></a>
-*  <a href="./tfma/view/render_slicing_metrics.md"><code>tfma.view.render_slicing_metrics</code></a>
-*  <a href="./tfma/view/render_time_series.md"><code>tfma.view.render_time_series</code></a>
+*   <a href="./tfma.md"><code>tfma</code></a>
+*   <a href="./tfma/EvaluateAndWriteResults.md"><code>tfma.EvaluateAndWriteResults</code></a>
+*   <a href="./tfma/SingleSliceSpec.md"><code>tfma.slicer.SingleSliceSpec</code></a>
+*   <a href="./tfma/constants.md"><code>tfma.constants</code></a>
+*   <a href="./tfma/export.md"><code>tfma.export</code></a>
+*   <a href="./tfma/export/build_parsing_eval_input_receiver_fn.md"><code>tfma.export.build_parsing_eval_input_receiver_fn</code></a>
+*   <a href="./tfma/export/make_export_strategy.md"><code>tfma.export.make_export_strategy</code></a>
+*   <a href="./tfma/exporter.md"><code>tfma.exporter</code></a>
+*   <a href="./tfma/exporter/FinalExporter.md"><code>tfma.exporter.FinalExporter</code></a>
+*   <a href="./tfma/exporter/LatestExporter.md"><code>tfma.exporter.LatestExporter</code></a>
+*   <a href="./tfma/load_eval_result.md"><code>tfma.load_eval_result</code></a>
+*   <a href="./tfma/load_eval_results.md"><code>tfma.load_eval_results</code></a>
+*   <a href="./tfma/make_eval_results.md"><code>tfma.make_eval_results</code></a>
+*   <a href="./tfma/multiple_data_analysis.md"><code>tfma.multiple_data_analysis</code></a>
+*   <a href="./tfma/multiple_model_analysis.md"><code>tfma.multiple_model_analysis</code></a>
+*   <a href="./tfma/post_export_metrics.md"><code>tfma.post_export_metrics</code></a>
+*   <a href="./tfma/post_export_metrics/auc.md"><code>tfma.post_export_metrics.auc</code></a>
+*   <a href="./tfma/post_export_metrics/auc_plots.md"><code>tfma.post_export_metrics.auc_plots</code></a>
+*   <a href="./tfma/post_export_metrics/calibration_plot_and_prediction_histogram.md"><code>tfma.post_export_metrics.calibration_plot_and_prediction_histogram</code></a>
+*   <a href="./tfma/post_export_metrics/confusion_matrix_at_thresholds.md"><code>tfma.post_export_metrics.confusion_matrix_at_thresholds</code></a>
+*   <a href="./tfma/post_export_metrics/example_count.md"><code>tfma.post_export_metrics.example_count</code></a>
+*   <a href="./tfma/post_export_metrics/example_weight.md"><code>tfma.post_export_metrics.example_weight</code></a>
+*   <a href="./tfma/post_export_metrics/precision_recall_at_k.md"><code>tfma.post_export_metrics.precision_recall_at_k</code></a>
+*   <a href="./tfma/run_model_analysis.md"><code>tfma.run_model_analysis</code></a>
+*   <a href="./tfma/version.md"><code>tfma.version</code></a>
+*   <a href="./tfma/view.md"><code>tfma.view</code></a>
+*   <a href="./tfma/view/render_plot.md"><code>tfma.view.render_plot</code></a>
+*   <a href="./tfma/view/render_slicing_metrics.md"><code>tfma.view.render_slicing_metrics</code></a>
+*   <a href="./tfma/view/render_time_series.md"><code>tfma.view.render_time_series</code></a>
diff --git a/g3doc/api_docs/python/tfma/SingleSliceSpec.md b/g3doc/api_docs/python/tfma/SingleSliceSpec.md
@@ -1,5 +1,5 @@
 <div itemscope itemtype="http://developers.google.com/ReferenceObject">
-<meta itemprop="name" content="tfma.SingleSliceSpec" />
+<meta itemprop="name" content="tfma.slicer.SingleSliceSpec" />
 <meta itemprop="path" content="Stable" />
 <meta itemprop="property" content="__eq__"/>
 <meta itemprop="property" content="__init__"/>
@@ -9,7 +9,7 @@
 <meta itemprop="property" content="is_slice_applicable"/>
 </div>
 
-# tfma.SingleSliceSpec
+# tfma.slicer.SingleSliceSpec
 
 ## Class `SingleSliceSpec`
 

diff --git a/g3doc/api_docs/python/tfma/run_model_analysis.md b/g3doc/api_docs/python/tfma/run_model_analysis.md
@@ -33,8 +33,8 @@ Evaluate PTransform instead.
 * <b>`data_location`</b>: The location of the data files.
 * <b>`file_format`</b>: The file format of the data, can be either 'text' or
     'tfrecords' for now. By default, 'tfrecords' will be used.
-* <b>`slice_spec`</b>: A list of tfma.SingleSliceSpec. Each spec represents a way to
-    slice the data.
+* <b>`slice_spec`</b>: A list of tfma.slicer.SingleSliceSpec. Each spec 
+    represents a way to slice the data.
     Example usages:
     - tfma.SingleSiceSpec(): no slice, metrics are computed on overall data.
     - tfma.SingleSiceSpec(columns=['country']): slice based on features in
@@ -62,4 +62,4 @@ An EvalResult that can be used with the TFMA visualization functions.
 
 #### Raises:
 
-* <b>`ValueError`</b>: If the file_format is unknown to us.
+* <b>`ValueError`</b>: If the file_format is unknown to us.