This repository contains the essential code, data, and pre-trained models for the paper Out-of-Distribution Detection via Probabilistic Graphical Model of Deep Neural Network Inference. Some of the code for the baseline methods is adapted from odin-pytorch, deep_Mahalanobis_detector, outlier-exposure, and Generalized-ODIN-Implementation.
We ran our experiments on an Ubuntu 18.04 GPU server with Intel Xeon CPUs and Nvidia 3080 cards. We use the Anaconda distribution of Python 3.8; with conda, the following packages should be easy to install:
- torch: Building and evaluating DNN models.
- scikit-learn: Machine learning algorithms such as GMM.
- numpy: Matrix manipulation.
- scipy: Statistics algorithms.
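As a quick sanity check that the environment is working, the snippet below fits a scikit-learn GMM on synthetic features and scores samples by log-likelihood; the feature array is a stand-in, not the features actually used in the paper.

```python
# Minimal GMM sketch with the listed dependencies; data is synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 8))  # placeholder for DNN-layer features

gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
gmm.fit(features)

# Per-sample log-likelihood can serve as an in-distribution score.
scores = gmm.score_samples(features)
print(scores.shape)
```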
We provide pre-trained weights for both the DNN models and the LSGM models for reproducibility.
We use existing pre-trained models where possible: the ResNet and DenseNet models for CIFAR are from deep_Mahalanobis_detector, and for the TinyImageNet dataset we use the training script and parameters from outlier-exposure. We also provide the Generalized-ODIN-specific models, trained using code from Generalized-ODIN-Implementation. These models are saved in the godin_pretrained and pre_trained directories.
The LSGM models are in the checkpoint directory and can be loaded easily via command-line options.
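The option names used by test_lsgm.py appear in the usage commands later in this README; the parser below is only a hypothetical sketch of how such options might be wired up, not the repository's actual code.

```python
# Hypothetical sketch of the command-line interface; flag names are taken
# from the usage examples in this README, defaults are assumptions.
import argparse

parser = argparse.ArgumentParser(description="LSGM OOD testing (sketch)")
parser.add_argument("--architecture", choices=["densenet", "resnet"], default="densenet")
parser.add_argument("--dataset", choices=["cifar10", "cifar100"], default="cifar10")
parser.add_argument("--test_bs", type=int, default=200, help="test batch size")
parser.add_argument("--load", default=None, help="directory holding saved LSGM checkpoints")

# Simulate: python test_lsgm.py --architecture resnet --load=./checkpoint
args = parser.parse_args(["--architecture", "resnet", "--load", "./checkpoint"])
print(args.architecture, args.load)
```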
Some of the out-of-distribution datasets we use come from torchvision, such as the CIFARs. For the others, we provide download links here for convenience (and to reduce the size of this repository):
- Tiny-ImageNet: The miniature of ImageNet dataset.
- LSUN: Large-scale scene images.
- Textures: Textural images in the wild.
- iSUN: Gaze traces on SUN dataset images.
$ cd CIFAR # or TinyImageNet, etc.
# model: DenseNet, in-distribution: CIFAR-100, batch_size: 200
$ python test_lsgm.py --architecture densenet --dataset cifar100 --test_bs 200
# model: ResNet, in-distribution: CIFAR-10, load the pre-trained model
$ python test_lsgm.py --architecture resnet --dataset cifar10 --load=./checkpoint
If successful, the script will test all OOD datasets with the given configuration.
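OOD detection results of this kind are conventionally summarized with AUROC, treating in-distribution samples as positives. The snippet below shows that computation on synthetic scores; the score values are made up and the exact metrics reported by the script are defined by the repository's code, not this sketch.

```python
# AUROC sketch for OOD detection; scores here are synthetic stand-ins.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
id_scores = rng.normal(loc=1.0, size=1000)    # higher = more in-distribution
ood_scores = rng.normal(loc=-1.0, size=1000)  # OOD samples should score lower

labels = np.concatenate([np.ones(1000), np.zeros(1000)])  # 1 = in-distribution
scores = np.concatenate([id_scores, ood_scores])
auc = roc_auc_score(labels, scores)
print(f"AUROC: {auc:.3f}")
```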
We provide thorough experimental results below, averaged over 5 runs. Our method achieves better performance than existing methods, as discussed in detail in our paper.