review updates

mamunsyuhada · May 4, 2021 · 6080b96
1 parent 4396418
commit 6080b96

Showing 7 changed files with 440 additions and 405 deletions.
diff --git a/python/example_code/lookoutvision/README.md b/python/example_code/lookoutvision/README.md
@@ -4,7 +4,7 @@
 
 Shows how to use the AWS SDK for Python (Boto3) with Amazon Lookout for Vision to
 create a model that detects anomalies in images. Examples are used in the 
-service documentation - https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/what-is.html.
+service documentation - [Amazon Lookout for Vision Developer Guide](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/what-is.html).
 
 * Create a project to manage your model.
 * Add a training dataset (and optional test dataset) that's used to train the model.
@@ -19,7 +19,6 @@ service documentation - https://docs.aws.amazon.com/lookout-for-vision/latest/de
   Credentials Reference Guide](https://docs.aws.amazon.com/credref/latest/refdocs/creds-config-files.html).
 - Python 3.7 or later
 - Boto3 1.17.47 or later
-- argparse 3.9 or later
 
 ## Cautions
 
@@ -39,21 +38,21 @@ service documentation - https://docs.aws.amazon.com/lookout-for-vision/latest/de
 
 There are three demonstrations in this set of examples:
 
-* Creating and hosting a model.
-* Detecting anomalies in images using a model.
+* Create and host a model.
+* Detect anomalies in images using a model.
 * Find tags attached to a model.
 
 Before running these demonstrations do the following:
 - Follow the [setup instructions](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/su-set-up.html).
-- Read the [Getting started with the SDK](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/getting-started-sdk.html).
-- Create an an Amazon S3 bucket in your AWS account. You'll use the bucket to store your training images, manifest files, and training output.
+- Read [Getting started with the SDK](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/getting-started-sdk.html).
+- Create an Amazon S3 bucket in your AWS account. You'll use the bucket to store your training images, manifest files, and training output.
 - Copy your training and test images to your S3 bucket. To try this code with example images, you can use the example [circuit board dataset](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/su-prepare-example-images.html). 
 
 The folder structures for the training and test images must be as follows:
 ```
-    s3://my-bucket/<train or test>/
-        normal/
-        anomaly/
+s3://doc-example-bucket/<train or test>/
+    normal/
+    anomaly/
 ```
 `train` and `test` can be any folder path.
 Place normal images in the `normal` folder and anomalous images in the `anomaly` folder.
@@ -68,20 +67,20 @@ python train_host.py <project> <bucket> <train> <test>
 ``` 
 
 - `project` - A name for your project.
-- `bucket` - The name of the S3 bucket in which to store your manifest files and training output (The bucket must be in your AWS account and in the same AWS Region as the S3 path supplied for `train` and `test`). For example, `my-bucket`.
-- `train` - The S3 path to where your training images are stored. For example, `s3://my-bucket/circuitboard/train/`.
-- `test` - (Optional) the S3 path to where your test images are stored. For example, `s3://my-bucket/circuitboard/test/`. If you don't supply a value, 
+- `bucket` - The name of the S3 bucket in which to store your manifest files and training output (The bucket must be in your AWS account and in the same AWS Region as the S3 path supplied for `train` and `test`). For example, `doc-example-bucket`.
+- `train` - The S3 path to where your training images are stored. For example, `s3://doc-example-bucket/circuitboard/train/`.
+- `test` - (Optional) the S3 path to where your test images are stored. For example, `s3://doc-example-bucket/circuitboard/test/`. If you don't supply a value, 
 Lookout for Vision splits the training dataset to create a test dataset.
 
 After training completes, use the performance metrics to decide if the model's performance is acceptable.
-For more information, see https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/improve.html.
+For more information, see [Improving your model](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/improve.html).
 If you are satisfied with the model's performance, the code allows you to start the model. 
 After the model starts, use `inference.py` to analyze an image.
 
 **You are charged for the amount of time that your model is running and for the time taken to successfully train your model.**
 
 
-### Detecting anomalies in images using a trained model
+### Detect anomalies in images using a trained model
 
 Shows how to detect anomalies in an image by using a trained model. 
 Run this example at a command prompt with the following command.
@@ -91,7 +90,7 @@ python inference.py <project> <version> <image>
 ``` 
 - `project` - The project that contains the model that you want to use.
 - `version` - The version of the model that you want to use.
-- `image` - The image that you want to analyze. You can supply a JPEG for PNG format file. You can also supply the S3 path of an image stored in an S3 bucket.
+- `image` - The image that you want to analyze. You can supply a JPEG or PNG format file. You can also supply the S3 path of an image stored in an S3 bucket.
 
 
 ### Find tags
@@ -103,7 +102,7 @@ python find_tag.py <tag> <value>
 - tag - The key of the tag that you want to find.
 - value - The value of the tag that you want to find.
 
- For more information about tags, see https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/tagging-model.html. 
+ For more information about tags, see [Tagging models](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/tagging-model.html). 
 
 ## Example structure
 
@@ -122,7 +121,7 @@ A simple class that shows how to create and manage an Amazon Lookout for Vision
 A simple class that shows how to create and manage datasets. Also shows how to create a manifest file based on images found in an S3 bucket. Used by `train_host.py`.
 
 Manifest files are used to create training and test datasets. `train_host.py` uses `datasets.py` to create training and (optionally) test manifest files, and upload them to an S3 bucket location that you specify. 
-For more information about manifest files, see https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/create-dataset-ground-truth.html. 
+For more information about manifest files, see [Creating a dataset using an Amazon SageMaker Ground Truth manifest file](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/create-dataset-ground-truth.html). 
 
 ### hosting.py
 A simple class that shows how to start and stop an Amazon Lookout for Vision project. Also shows how to list hosted 
@@ -135,7 +134,7 @@ A simple class that shows how to analyze an image (JPEG/PNG) with a hosted Amazo
 Shows how to find a tag attached to an Amazon Lookout for Vision model.
 
 ### train_host.py
-Shows how to create and hosts a model. The code creates a project, creates a manifest file, creates
+Shows how to create and host a model. The code creates a project, creates a manifest file, creates
 a dataset using the manifest file, and trains a model. Finally, if desired, the example shows how to host the model. Used by `train_host.py`.
 
 ## Additional information

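Review note: the folder layout and manifest format described in the README can be illustrated without any AWS calls. The following is a minimal sketch, assuming the `anomaly-label` attribute names that `datasets.py` adopts in this commit. `split_s3_path` and `make_manifest_line` are hypothetical helper names that are not part of the example code, the image key is made up, and only the manifest fields visible in this diff are included (the full example may write others).

```python
import json
from datetime import datetime


def split_s3_path(s3_path):
    """Split 's3://bucket/prefix/' into (bucket, prefix). Hypothetical helper."""
    bucket, _, prefix = s3_path.replace("s3://", "", 1).partition("/")
    return bucket, prefix


def make_manifest_line(image_path, class_name, dttm):
    """Build one manifest entry for an image. Hypothetical helper.

    Only the fields shown in the diff are included here.
    """
    labels = {"normal": 0, "anomaly": 1}
    if class_name not in labels:
        # Reject unknown class names instead of silently labeling them normal.
        raise ValueError(f"Unexpected class name: {class_name}")
    return {
        "source-ref": image_path,
        "anomaly-label": labels[class_name],
        "anomaly-label-metadata": {
            "confidence": 1,
            "job-name": "labeling-job/anomaly-label",
            "class-name": class_name,
            "human-annotated": "yes",
            "creation-date": dttm,
        },
    }


# Current date and time in the format used by the diff.
dttm = datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%f")
bucket, prefix = split_s3_path("s3://doc-example-bucket/circuitboard/train/")

# The image key below is illustrative only.
line = make_manifest_line(
    f"s3://{bucket}/{prefix}anomaly/image-1.jpg", "anomaly", dttm
)
print(json.dumps(line))
```

Raising on an unknown class name is a design choice of this sketch. The code in the diff logs the problem and continues with a default label of 0.
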
diff --git a/python/example_code/lookoutvision/datasets.py b/python/example_code/lookoutvision/datasets.py
@@ -1,7 +1,7 @@
 # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 # SPDX-License-Identifier: Apache-2.0
 """
-Amazon lookout for vision dataset code examples used in the service documentation:
+Amazon Lookout for Vision dataset code examples used in the service documentation:
 https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/model-create-dataset.html
 Shows how to create and manage datasets. Also, how to create a manifest file and
 upload to an S3 bucket.
@@ -27,11 +27,12 @@ class Datasets:
     def create_dataset(lookoutvision_client, project_name, manifest_file, dataset_type):
         """
         Creates a new Amazon Lookout for Vision dataset
-        param: lookoutvision_client: The Amazon Lookout for Vision boto 3 client.
-        param: project_name: The name of the project in which you want to create a dataset.
-        param: bucket:  The bucket that contains the manifest file.
-        param: manifest_file: The path and name of the manifest file.
-        param: dataset_type: The type of the dataset (train or test).
+        :param lookoutvision_client: The Amazon Lookout for Vision Boto3 client.
+        :param project_name: The name of the project in which you want to
+         create a dataset.
+        :param bucket:  The bucket that contains the manifest file.
+        :param manifest_file: The path and name of the manifest file.
+        :param dataset_type: The type of the dataset (train or test).
         """
 
         try:
@@ -41,13 +42,9 @@ def create_dataset(lookoutvision_client, project_name, manifest_file, dataset_ty
             # Create a dataset
             logger.info("Creating %s dataset type...", dataset_type)
 
-            dataset = json.loads(
-                '{ "GroundTruthManifest": { "S3Object": { "Bucket": "'
-                + bucket
-                + '", "Key": "'
-                + key
-                + '" } } }'
-            )
+            dataset = {
+                "GroundTruthManifest": {"S3Object": {"Bucket": bucket, "Key": key}}
+            }
 
             response = lookoutvision_client.create_dataset(
                 ProjectName=project_name,
@@ -89,14 +86,9 @@ def create_dataset(lookoutvision_client, project_name, manifest_file, dataset_ty
                 finished = True
 
             if status != "CREATE_COMPLETE":
-                logger.exception(
-                    "Couldn't create dataset: %s",
-                    dataset_description["DatasetDescription"]["StatusMessage"],
-                )
-                raise Exception(
-                    "Couldn't create dataset: {}".format(
-                    dataset_description["DatasetDescription"]["StatusMessage"],
-                ))
+                message = dataset_description["DatasetDescription"]["StatusMessage"]
+                logger.exception("Couldn't create dataset: %s", message)
+                raise Exception(f"Couldn't create dataset: {message}")
 
         except ClientError as err:
             logger.exception(
@@ -105,24 +97,26 @@ def create_dataset(lookoutvision_client, project_name, manifest_file, dataset_ty
             raise
 
     @staticmethod
-    def create_manifest_file_s3(image_s3_path, manifest_s3_path):
+    def create_manifest_file_s3(s3_resource, image_s3_path, manifest_s3_path):
         """
         Creates a manifest file and uploads to S3.
-        param: image_s3_path: The S3 path to the images referenced by the manifest file. The images
-        must be in an S3 bucket with the following folder structure.
+        :param image_s3_path: The S3 path to the images referenced by the manifest file.
+        The images must be in an S3 bucket with the following folder structure.
         s3://my-bucket/<train or test>/
             normal/
             anomaly/
-        Place normal images in the normal folder. Anomalous images in the anomaly folder.
+        Place normal images in the normal folder and anomalous images in the anomaly
+        folder.
         https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/create-dataset-s3.html
-        param: manifest_s3_path:  The S3 location in which to store the created manifest file.
+        :param manifest_s3_path: The S3 location in which to store the created
+        manifest file.
         """
 
         try:
             output_manifest_file = "temp.manifest"
 
             # Current date and time in manifest file format
-            #now=datetime.now()
+
             dttm = datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%f")
 
             # get bucket and folder from image and manifest file paths
@@ -132,62 +126,63 @@ def create_manifest_file_s3(image_s3_path, manifest_s3_path):
                 "s3://", ""
             ).split("/", 1)
 
-            s3_client = boto3.client("s3")
-
             # create local temp manifest file
             with open(output_manifest_file, "w") as mfile:
 
                 logger.info("Creating manifest file")
                 # create JSON lines for anomalous images
-                response = s3_client.list_objects_v2(
-                    Bucket=bucket, Prefix=prefix + "anomaly/", Delimiter="/"
-                )
-                for file in response["Contents"]:
-                    image_path = "s3://{}/{}".format(bucket, file["Key"])
-                    manifest = Datasets.create_json_line(image_path, 1, dttm)
+
+                src_bucket = s3_resource.Bucket(bucket)
+                # create json lines for abnormal images.
+
+                for obj in src_bucket.objects.filter(
+                    Prefix=prefix + "anomaly/", Delimiter="/"
+                ):
+                    image_path = f"s3://{src_bucket.name}/{obj.key}"
+                    manifest = Datasets.create_json_line(image_path, "anomaly", dttm)
                     mfile.write(json.dumps(manifest) + "\n")
+
                 # create json lines for normal images
-                response = s3_client.list_objects_v2(
-                    Bucket=bucket, Prefix=prefix + "normal/", Delimiter="/"
-                )
-                for file in response["Contents"]:
-                    image_path = "s3://{}/{}".format(bucket, file["Key"])
-                    manifest = Datasets.create_json_line(image_path, 0, dttm)
+                for obj in src_bucket.objects.filter(
+                    Prefix=prefix + "normal/", Delimiter="/"
+                ):
+                    image_path = f"s3://{src_bucket.name}/{obj.key}"
+                    manifest = Datasets.create_json_line(image_path, "normal", dttm)
                     mfile.write(json.dumps(manifest) + "\n")
 
             # copy local manifest to target S3 location
             logger.info("Uploading manifest file to %s", manifest_s3_path)
-            response = s3_client.upload_file(
-                output_manifest_file, manifest_bucket, manifest_prefix
+            s3_resource.Bucket(manifest_bucket).upload_file(
+                output_manifest_file, manifest_prefix
             )
+
             # delete local manifest file
             os.remove(output_manifest_file)
 
         except ClientError as err:
-            print("S3 Service Error: {}".format(err))
+            logger.exception("S3 Service Error: %s", err)
             raise
 
         except Exception as err:
-            print(err)
+            logger.exception("Couldn't create manifest file: %s", err)
             raise
         else:
             logger.info("Completed manifest file creation and upload.")
 
     @staticmethod
-    def create_json_line(image, label, dttm):
+    def create_json_line(image, class_name, dttm):
         """
         Creates a single JSON line for an image.
-        param: image: The S3 location for the image.
-        param: label: The label for the image (normal or anomaly)
-        param: dttm: The date and time that the JSON is created.
+        :param image: The S3 location for the image.
+        :param class_name: The class name for the image (normal or anomaly).
+        :param dttm: The date and time that the JSON is created.
         """
 
-        class_name = ""
-
-        if label == 0:
-            class_name = "normal"
-        elif label == 1:
-            class_name = "anomaly"
+        label = 0
+        if class_name == "normal":
+            label = 0
+        elif class_name == "anomaly":
+            label = 1
         else:
             logger.error("Unexpected class name: %s for %s", class_name, image)
 
@@ -197,10 +192,10 @@ def create_json_line(image, label, dttm):
 
         manifest = {
             "source-ref": image,
-            "auto-label": label,
-            "auto-label-metadata": {
+            "anomaly-label": label,
+            "anomaly-label-metadata": {
                 "confidence": 1,
-                "job-name": "labeling-job/auto-label",
+                "job-name": "labeling-job/anomaly-label",
                 "class-name": class_name,
                 "human-annotated": "yes",
                 "creation-date": dttm,
@@ -213,38 +208,37 @@ def create_json_line(image, label, dttm):
     def delete_dataset(lookoutvision_client, project_name, dataset_type):
         """
         Deletes an Amazon Lookout for Vision dataset
-        param: lookoutvision_client: The Amazon Lookout for Vision boto 3 client.
-        param: project_name: The name of the project that contains the dataset that
+        :param lookoutvision_client: The Amazon Lookout for Vision Boto3 client.
+        :param project_name: The name of the project that contains the dataset that
         you want to delete.
-        param: dataset_type: The type (train or test) of the dataset that you
+        :param dataset_type: The type (train or test) of the dataset that you
         want to delete.
         """
         try:
-            lookoutvision_client = boto3.client("lookoutvision")
 
             # Delete the dataset
             logger.info(
-                "Deleting the " + dataset_type + " dataset for project " + project_name
+                "Deleting the %s dataset for project %s.", dataset_type, project_name
             )
             lookoutvision_client.delete_dataset(
                 ProjectName=project_name, DatasetType=dataset_type
             )
-            logger.info("Dataset deleted")
+            logger.info("Dataset deleted.")
 
         except ClientError as err:
             logger.exception(
-                "Service error: Couldn't delete dataset: %s", err.response["Message"]
+                "Service error: Couldn't delete dataset: %s.", err.response["Message"]
             )
             raise
 
     @staticmethod
     def describe_dataset(lookoutvision_client, project_name, dataset_type):
         """
         Gets information about an Amazon Lookout for Vision dataset.
-        param: lookoutvision_client: The Amazon Lookout for Vision boto3 client.
-        param: project_name: The name of the project that contains the dataset that
+        :param lookoutvision_client: The Amazon Lookout for Vision Boto3 client.
+        :param project_name: The name of the project that contains the dataset that
         you want to describe.
-        param: dataset_type: The type (train or test) of the dataset that you want
+        :param dataset_type: The type (train or test) of the dataset that you want
         to describe.
         """
 
@@ -254,23 +248,21 @@ def describe_dataset(lookoutvision_client, project_name, dataset_type):
             response = lookoutvision_client.describe_dataset(
                 ProjectName=project_name, DatasetType=dataset_type
             )
-            print("Name: " + response["DatasetDescription"]["ProjectName"])
-            print("Type: " + response["DatasetDescription"]["DatasetType"])
-            print("Status: " + response["DatasetDescription"]["Status"])
-            print("Message: " + response["DatasetDescription"]["StatusMessage"])
+            print(f"Name: {response['DatasetDescription']['ProjectName']}")
+            print(f"Type: {response['DatasetDescription']['DatasetType']}")
+            print(f"Status: {response['DatasetDescription']['Status']}")
+            print(f"Message: {response['DatasetDescription']['StatusMessage']}")
             print(
-                "Images: " + str(response["DatasetDescription"]["ImageStats"]["Total"])
+                f"Images: {str(response['DatasetDescription']['ImageStats']['Total'])}"
             )
             print(
-                "Labeled: "
-                + str(response["DatasetDescription"]["ImageStats"]["Labeled"])
+                f"Labeled: {str(response['DatasetDescription']['ImageStats']['Labeled'])}"
             )
             print(
-                "Normal: " + str(response["DatasetDescription"]["ImageStats"]["Normal"])
+                f"Normal: {str(response['DatasetDescription']['ImageStats']['Normal'])}"
             )
             print(
-                "Anomaly: "
-                + str(response["DatasetDescription"]["ImageStats"]["Anomaly"])
+                f"Anomaly: {str(response['DatasetDescription']['ImageStats']['Anomaly'])}"
             )
 
             print("Done...")