Commit
Add jupyter notebook for AutoQuant 2.0 (#1967)
Signed-off-by: Kyunggeun Lee <[email protected]>
Co-authored-by: Kyunggeun Lee <[email protected]>
quic-akhobare and quic-kyunggeu authored Feb 27, 2023
1 parent 8ff1477 commit 247161b
Showing 4 changed files with 578 additions and 113 deletions.
149 changes: 51 additions & 98 deletions Examples/torch/quantization/autoquant.ipynb
@@ -21,14 +21,17 @@
"This notebook covers the following\n",
"1. Instantiate the example evaluation and training pipeline\n",
"2. Load a pretrained FP32 model\n",
"3. Prepare the model\n",
"4. Validate the model\n",
"5. Determine the baseline FP32 accuracy\n",
"6. Define constants and helper functions\n",
"7. Apply AutoQuant\n",
"3. Determine the baseline FP32 accuracy\n",
"4. Define constants and helper functions\n",
"5. Run AutoQuant\n",
"\n",
"#### What this notebook is not\n",
"* This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly."
"This notebook is not designed to show state-of-the-art AutoQuant results. For example, it uses a relatively quantization-friendly model like Resnet18. Also, some optimization parameters are deliberately chosen to have the notebook execute more quickly.\n",
"\n",
"<div class=\"alert alert-info\">\n",
" NOTE: This notebook is for auto_quant_v2.AutoQuant. For examples of auto_quant.AutoQuant (will be deprecated), see \n",
" <a href=\"autoquant_v1.ipynb\">autoquant_v1.ipynb</a>.\n",
"</div>"
]
},
{
@@ -91,6 +94,8 @@
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"sys.path.append(\"../../..\")\n",
"import torch\n",
"from Examples.common import image_net_config\n",
"from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator\n",
@@ -166,60 +171,19 @@
"source": [
"from torchvision.models import resnet18\n",
"\n",
"model = resnet18(pretrained=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Prepare the Model\n",
"AIMET quantization features require the user's model definition to follow certain guidelines. For example, functionals defined in forward pass should be changed to equivalent torch.nn.Module. AIMET user guide lists all these guidelines. The following ModelPreparer API uses new graph transformation feature available in PyTorch 1.9+ version and automates model definition changes required to comply with the above guidelines."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from aimet_torch.model_preparer import prepare_model\n",
"model = resnet18(pretrained=True).eval()\n",
"\n",
"model = prepare_model(model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Validate the model\n",
"\n",
"AIMET provides a model validator utility to help check whether AIMET feature can be applied on a Pytorch model. The model validator currently checks for the following conditions:\n",
"- No modules are reused\n",
"- Operations have modules associated with them\n",
"- Opeartions are not defined as Functionals\n",
"\n",
"It is recommeneded to use the validate_model() API after using the prepare_model() API to identify any problem areas that were not automatically addressed by the model_preparer() API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from aimet_torch.model_validator.model_validator import ModelValidator\n",
"input_shape = (1, 3, 224, 224)\n",
"result = ModelValidator.validate_model(model, model_input=torch.rand(input_shape))"
"use_cuda = False\n",
"if torch.cuda.is_available():\n",
" use_cuda = True\n",
" model.to(torch.device('cuda'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Determine the baseline FP32 accuracy\n",
"## 3. Determine the baseline FP32 accuracy\n",
"Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine"
]
},
@@ -230,22 +194,6 @@
"We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"use_cuda = False\n",
"if torch.cuda.is_available():\n",
" use_cuda = True\n",
" model.to(torch.device('cuda'))"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -264,7 +212,7 @@
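The evaluation call itself sits in the collapsed cell above. A minimal sketch of what it likely contains, assuming the `ImageNetDataPipeline.evaluate(model, use_cuda)` helper that the `eval_callback` later in this notebook also uses (the cell contents are collapsed, so this is an approximation):

```python
# Evaluate the FP32 model on the labeled ImageNet validation set
# to establish the baseline accuracy (sketch; actual cell is collapsed).
accuracy = ImageNetDataPipeline.evaluate(model, use_cuda)
print(f"FP32 baseline accuracy: {accuracy}")
```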
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Define Constants and Helper functions\n",
"## 4. Define Constants and Helper functions\n",
"\n",
"In this section the constants and helper functions needed to run this eaxmple are defined.\n",
"\n",
@@ -348,9 +296,9 @@
" normalize])\n",
"\n",
"from Examples.torch.utils.image_net_data_loader import ImageFolder\n",
"imagenet_dataset = ImageFolder(root=os.path.join(DATASET_DIR, 'val'), transform=val_transforms, num_samples_per_class=2)\n",
"imagenet_dataset = ImageFolder(root=os.path.join(DATASET_DIR, 'val'), transform=val_transforms)\n",
"unlabeled_imagenet_dataset = UnlabeledDatasetWrapper(imagenet_dataset)\n",
"unlabeled_imagenet_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset, CALIBRATION_DATASET_SIZE)\n"
"unlabeled_imagenet_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset, CALIBRATION_DATASET_SIZE)"
]
},
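The `UnlabeledDatasetWrapper` and `_create_sampled_data_loader` helpers used above, along with constants such as `CALIBRATION_DATASET_SIZE`, are defined in the collapsed cell earlier in this section. A sketch of what those definitions plausibly look like; the names match the visible code, but the concrete values and implementation details are assumptions:

```python
import torch
from torch.utils.data import DataLoader, Dataset, SubsetRandomSampler

CALIBRATION_DATASET_SIZE = 2000  # hypothetical value; the real constant lives in the collapsed cell

class UnlabeledDatasetWrapper(Dataset):
    """Wrap a labeled dataset so that __getitem__ returns only the image, no label."""
    def __init__(self, dataset):
        self._dataset = dataset

    def __len__(self):
        return len(self._dataset)

    def __getitem__(self, index):
        image, _ = self._dataset[index]
        return image

def _create_sampled_data_loader(dataset, num_samples):
    """Build a DataLoader over a random subset of num_samples items."""
    indices = torch.randperm(len(dataset))[:num_samples].tolist()
    return DataLoader(dataset, sampler=SubsetRandomSampler(indices), batch_size=32)
```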
{
@@ -371,19 +319,15 @@
"from typing import Optional\n",
"\n",
"def eval_callback(model: torch.nn.Module, num_samples: Optional[int] = None) -> float:\n",
" return ImageNetDataPipeline.evaluate(model, use_cuda)\n",
" "
" return ImageNetDataPipeline.evaluate(model, use_cuda)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Apply AutoQuant\n",
"\n",
"As a first step, the AutoQuant object is created.\n",
"\n",
"The **allowed_accuracy_drop** parameter is set by the user to convey to the AutoQuant feature, how much accuracy drop is tolerated by the user. AutoQuant applies a series of quantization features. When the allowed accuracy is reached, AutoQuant stops applying any subsequent quantization feature. Please refer AutoQuant User Guide and API documentation for complete details."
"## 5. Run AutoQuant\n",
"This step runs AuotQuant."
]
},
{
@@ -392,18 +336,23 @@
"metadata": {},
"outputs": [],
"source": [
"from aimet_torch.auto_quant import AutoQuant\n",
"from aimet_torch.auto_quant_v2 import AutoQuant\n",
"\n",
"dummy_input = torch.randn((1, 3, 224, 224))\n",
"if use_cuda:\n",
" dummy_input = dummy_input.cuda()\n",
"\n",
"auto_quant = AutoQuant(allowed_accuracy_drop=0.01,\n",
" unlabeled_dataset_iterable=unlabeled_imagenet_data_loader,\n",
"auto_quant = AutoQuant(model,\n",
" dummy_input=dummy_input,\n",
" data_loader=unlabeled_imagenet_data_loader,\n",
" eval_callback=eval_callback)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optionally set AdaRound Parameters\n",
"## Set AdaRound Parameters (optional)\n",
"The AutoQuant feature internally uses default parameters to execute the AdaRound step.\n",
"If and only if necessary, the default AdaRound Parameters should be modified using the API shown below.\n",
"\n",
@@ -434,35 +383,39 @@
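The collapsed cell above holds the actual configuration. A sketch of what setting AdaRound parameters typically looks like with the AutoQuant v2 API, assuming the `AdaroundParameters` class from `aimet_torch.adaround.adaround_weight` and reusing the data loader helper from earlier; the `ADAROUND_DATASET_SIZE` constant and its value are assumptions:

```python
from aimet_torch.adaround.adaround_weight import AdaroundParameters

ADAROUND_DATASET_SIZE = 2000  # hypothetical value; the real constant lives in the collapsed cell
adaround_data_loader = _create_sampled_data_loader(unlabeled_imagenet_dataset,
                                                   ADAROUND_DATASET_SIZE)
adaround_params = AdaroundParameters(adaround_data_loader,
                                     num_batches=len(adaround_data_loader))
auto_quant.set_adaround_params(adaround_params)
```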
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run AutoQuant\n",
"\n",
"This step applies the AutoQuant feature. The best possible quantized model, the associated eval_score and the path to the AdaRound encoding files are returned."
"## Run AutoQuant Inference\n",
"This step runs AutoQuant inference. AutoQuant inference will run evaluation using the **eval_callback** with the vanilla quantized model without applying PTQ techniques. This will be useful for measuring the baseline evaluation score before running AutoQuant optimization."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [],
"source": [
"dummy_input = torch.randn(input_shape)\n",
"model.eval()\n",
"model, accuracy, encoding_path =\\\n",
" auto_quant.apply(model.cuda(),\n",
" dummy_input_on_cpu=dummy_input.cpu(),\n",
" dummy_input_on_gpu=dummy_input.cuda())"
"sim, initial_accuracy = auto_quant.run_inference()\n",
"print(f\"- Quantized Accuracy (before optimization): {initial_accuracy}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run AutoQuant Optimization\n",
"This step runs AutoQuant optimization, which returns the best possible quantized model, corresponding evaluation score and the path to the encoding file.\n",
"The **allowed_accuracy_drop** parameter indicates the tolerable amount of accuracy drop. AutoQuant applies a series of quantization features until the target accuracy (FP32 accuracy - allowed accuracy drop) is satisfied. When the target accuracy is reached, AutoQuant will return immediately without applying furhter PTQ techniques. Please refer AutoQuant User Guide and API documentation for complete details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"print(accuracy)\n",
"print(encoding_path)"
"model, optimized_accuracy, encoding_path = auto_quant.optimize(allowed_accuracy_drop=0.01)\n",
"print(f\"- Quantized Accuracy (after optimization): {optimized_accuracy}\")"
]
},
{
