---
layout: pytorch_hub_detail
background-class: pytorch-hub-background
body-class: pytorch-hub
title: AlexNet
summary: AlexNet competed in the 2012 ImageNet Large Scale Visual Recognition Challenge. The network achieved a top-5 error of 15.3%, more than 10.8 percentage points lower than that of the runner-up.
category: researchers
image: pytorch-logo.png
author: Pytorch Team
tags: [CV, image classification]
github-link:
featured_image_1: alexnet1.png
featured_image_2: alexnet2.png
---

### Model Description

AlexNet competed in the ImageNet Large Scale Visual Recognition Challenge on September 30, 2012. The network achieved a top-5 error of 15.3%, more than 10.8 percentage points lower than that of the runner-up. The original paper's primary result was that the depth of the model was essential for its high performance; this depth made the network computationally expensive to train, but training was made feasible by the use of graphics processing units (GPUs).

The 1-crop error rates on the ImageNet dataset with the pretrained model are listed below.

| Model structure | Top-1 error | Top-5 error |
| --------------- | ----------- | ----------- |
| alexnet         | 43.45       | 20.91       |

### Notes on Inputs

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. You can use the following transform to normalize:


```python
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
```
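
For context, here is a minimal sketch of how this normalization typically fits into a full torchvision preprocessing pipeline that produces a mini-batch of the expected shape. The image path `dog.jpg` is a hypothetical placeholder, not part of this document.

```python
from PIL import Image
from torchvision import transforms

# Hypothetical input image; replace with your own file path.
input_image = Image.open("dog.jpg").convert("RGB")

# Resize, crop, convert to a [0, 1] tensor, then normalize as described above.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(input_image)   # shape (3, 224, 224)
input_batch = input_tensor.unsqueeze(0)  # shape (1, 3, 224, 224), a mini-batch of one
```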

Example:

```python
import torch
model = torch.hub.load('pytorch/vision', 'alexnet', pretrained=True)
```
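
As a follow-up, a minimal inference sketch might look like the following, assuming `input_batch` was prepared as in the preprocessing example above:

```python
import torch

model.eval()  # switch to inference mode (disables dropout)

with torch.no_grad():
    output = model(input_batch)  # raw scores for the 1000 ImageNet classes

# Convert the scores to probabilities and report the top prediction.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
top_prob, top_class = probabilities.max(dim=0)
print(top_class.item(), top_prob.item())
```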

### Resources