hxzd5568/TracNe

TracNe

Artifact

Introduction

To diagnose compiler-introduced numerical errors in NN models, we introduce TracNe, an automated approach with two tasks: detecting and tracing DLC-related numerical errors, given an NN model, a valid input range, and specific compilation options. Results on two benchmarks show that TracNe helps both DL compiler users and developers locate erroneous code and passes. It can also serve as a unit test for ensuring a model's robustness.

Demo

[Demo image]

Supported Table

Frontend Models ONNX PyTorch TensorFlow2
TVM
XLA 🔨
GLOW 🔨

✅: Supported; 🔨: Developing;

Contents

  • src: Our approach implementation;
    • op: Operators supported by TracNe; the set can be extended.
    • gencog: Scripts for generating the benchmark of Wang et al.
  • tests: The interfaces of our approach;
    • out: Benchmark and diagnostic reports for the general NN models.
    • dnn: Handbook and recommended workspace for industrial models.
  • bug: Bug list, TVM bug reports;

Dependency (Can be skipped if you use the docker in the artifact)

Python environment

TracNe is written in Python. Run pip install -r requirements.txt to get python dependencies.

Compiler configuration

  • TVM should be built in debugging mode.

Preparing Models and Dataset

One benchmark of 60 general NN models is provided in tests/out.

Industrial models can be downloaded by following tests/dnn/readme.md.

Usage

Find error-triggering inputs given a model

cd tests
python test_fuzzer.py  model_dir --low 0 --high 1 --optlevel 5

The model under test should be placed in out/model_dir. After the script runs, the input that triggers the largest error is stored in this directory. The arguments low and high constrain the input range. Users can control the isolation granularity with the granularity option. The default of 64 is enough for error localization; the value should be greater than 4.

If the model is secure under the selected compilation options and input range, the errors found by the process are zero or below the tolerance. Otherwise, the model is susceptible to the compiler's optimizations.
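The detection step can be pictured with a small, self-contained sketch. It is not TracNe's fuzzer: the two toy functions stand in for the un-optimized and optimized builds of a model, plain random sampling replaces the real search, and the names unoptimized_model, optimized_model, search_max_error and the tolerance value are all assumptions made for illustration.

```python
import math
import random


def unoptimized_model(x):
    # Toy stand-in for the un-optimized build: direct evaluation,
    # which loses precision through cancellation when x is small.
    return (1.0 - math.cos(x)) / (x * x)


def optimized_model(x):
    # Toy stand-in for the optimized build: an algebraically
    # equivalent rewrite of the same expression.
    return 2.0 * math.sin(x / 2.0) ** 2 / (x * x)


def relative_error(expected, actual, eps=1e-12):
    # Relative discrepancy between the two builds; eps avoids division by zero.
    return abs(expected - actual) / (abs(expected) + eps)


def search_max_error(low, high, trials=2000, seed=0):
    # Sample inputs uniformly in [low, high] and keep the one
    # that maximizes the discrepancy between the two builds.
    rng = random.Random(seed)
    best_x, best_err = None, -1.0
    for _ in range(trials):
        x = rng.uniform(low, high)
        err = relative_error(unoptimized_model(x), optimized_model(x))
        if err > best_err:
            best_x, best_err = x, err
    return best_x, best_err


best_x, best_err = search_max_error(low=1e-4, high=1.0)
tolerance = 1e-5  # hypothetical tolerance, not a TracNe default
verdict = "susceptible" if best_err > tolerance else "within tolerance"
print(f"worst input {best_x!r}, relative error {best_err:.3e}: {verdict}")
```

The stored error-triggering input plays the role of best_x here: later steps replay both builds on that input.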

Error tracing and isolating

python test_replay.py  model_dir

It reproduces the errors by running the optimized and un-optimized executable models on the input found by the search. Meanwhile, the process stores the concrete results of each function in the models.

python test_traceerror.py  model_dir

It matches corresponding functions between the symbolic optimized and un-optimized models and compares the results of each pair of equivalent functions. The matching and comparison information is saved in model_dir/trace.json.
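As a rough illustration of the comparison step, the sketch below pairs per-function results from the two builds and records the largest discrepancy for each pair. The function names, the record fields and the JSON layout are hypothetical; the actual schema of trace.json is not described here and may differ.

```python
import json


def compare_paired_functions(unopt_results, opt_results, pairs, tolerance=1e-5):
    # unopt_results / opt_results: function name -> list of output values.
    # pairs: (un-optimized name, optimized name) tuples already matched.
    report = []
    for unopt_name, opt_name in pairs:
        ref = unopt_results[unopt_name]
        out = opt_results[opt_name]
        max_abs = max(abs(r - o) for r, o in zip(ref, out))
        report.append({
            "unoptimized": unopt_name,
            "optimized": opt_name,
            "max_abs_error": max_abs,
            "exceeds_tolerance": max_abs > tolerance,
        })
    return report


# Hypothetical per-function results recorded during replay.
unopt = {"fn_0": [1.0, 2.0], "fn_1": [0.5, 0.25]}
opt = {"fused_0": [1.0, 2.5], "fused_1": [0.5, 0.25]}
pairs = [("fn_0", "fused_0"), ("fn_1", "fused_1")]

report = compare_paired_functions(unopt, opt, pairs)
print(json.dumps(report, indent=2))
```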

python test_propa.py  model_dir

It backtracks the changes in accumulated error along the computation graph. For each discrepant output, it generates an error-accumulation graph that shows where the errors are generated and how they are amplified. If an error arises in function A, developers can learn from trace.json how A is optimized and transformed during compilation.
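The idea of backtracking can be sketched on a toy graph. The code below is an assumption-laden illustration rather than TracNe's algorithm: graph maps each node to its input nodes, errors holds a per-node discrepancy, and a node is reported as an origin when its own error exceeds the tolerance while all of its inputs are still within it.

```python
def find_error_origins(graph, errors, output, tolerance):
    # Walk backwards from the discrepant output, following only
    # nodes whose error exceeds the tolerance.
    origins = []
    visited = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node in visited or errors[node] <= tolerance:
            continue
        visited.add(node)
        bad_inputs = [p for p in graph.get(node, []) if errors[p] > tolerance]
        if bad_inputs:
            # The error was inherited (and possibly amplified); keep going back.
            stack.extend(bad_inputs)
        else:
            # All inputs are clean, so the error is generated at this node.
            origins.append(node)
    return sorted(origins)


# Hypothetical computation graph: node -> list of input nodes.
graph = {"out": ["mul"], "mul": ["add", "w"], "add": ["x"]}
# Hypothetical per-node discrepancies between the two builds.
errors = {"x": 0.0, "w": 0.0, "add": 1e-3, "mul": 5e-2, "out": 5e-2}

origins = find_error_origins(graph, errors, "out", tolerance=1e-5)
print(origins)
```

In this toy example the error is generated in add and amplified in mul, which is the kind of distinction an error-accumulation graph is meant to expose.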

python test_pass.py  model_dir

This process isolates the optimization pass that introduces the numerical errors. Users can disable that pass to ensure the security and robustness of the model.
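One simple way to picture pass isolation is to disable one pass at a time and re-measure the error. The sketch below does exactly that against a toy error model; measure_error is a hypothetical callback standing in for "recompile with these passes disabled and replay the stored input", the pass names are used only as labels, and TracNe's actual isolation strategy may be different.

```python
def isolate_passes(pass_names, measure_error, tolerance):
    # A pass is a suspect if disabling it alone brings the error
    # back under the tolerance.
    suspects = []
    for name in pass_names:
        if measure_error(disabled={name}) <= tolerance:
            suspects.append(name)
    return suspects


def toy_measure_error(disabled):
    # Toy error model (assumption): the discrepancy disappears
    # only when the "FastMath" label is disabled.
    return 0.0 if "FastMath" in disabled else 1e-2


passes = ["FoldConstant", "SimplifyExpr", "FastMath", "FuseOps"]
suspects = isolate_passes(passes, toy_measure_error, tolerance=1e-5)
print(suspects)
```

This one-at-a-time loop needs as many recompilations as there are passes and misses errors caused only by a combination of passes; it is shown for clarity, not efficiency.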

Pipeline

python test_batch.py  model_dir1-model_dir9

The scripts above are integrated into a single file that detects and diagnoses numerical errors in a batch of models.

Evaluation

We provide baseline methods for evaluating the performance and efficiency of TracNe.

Searching algorithm

python test_fuzzer.py model_dir --method MCMC/DEMC/MEGA

MEGA is our detection algorithm, MCMC is from Yu et al., and DEMC was devised by Yi et al.
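For readers unfamiliar with MCMC-style search, here is a generic Metropolis-style sketch that looks for an input maximizing an error function inside [low, high]. It is not the implementation of Yu et al., nor MEGA or DEMC; error_fn, the step size and the temperature are assumptions, and clamping proposals to the range makes this a heuristic rather than an exact sampler.

```python
import math
import random


def mcmc_search(error_fn, low, high, steps=2000, step_size=0.05,
                temperature=0.1, seed=0):
    rng = random.Random(seed)
    x = rng.uniform(low, high)
    err = error_fn(x)
    best_x, best_err = x, err
    for _ in range(steps):
        # Propose a nearby input and clamp it to the valid range.
        cand = x + rng.gauss(0.0, step_size * (high - low))
        cand = min(high, max(low, cand))
        cand_err = error_fn(cand)
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the error drop grows.
        if cand_err >= err or rng.random() < math.exp((cand_err - err) / temperature):
            x, err = cand, cand_err
        if err > best_err:
            best_x, best_err = x, err
    return best_x, best_err


def toy_error(x):
    # Toy error landscape (assumption): a single peak around x = 0.7.
    return math.exp(-((x - 0.7) ** 2) / 0.01)


best_x, best_err = mcmc_search(toy_error, low=0.0, high=1.0)
print(f"best input {best_x:.4f}, error {best_err:.4f}")
```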

Error localization algorithm

python test_pliner.py model_dir

This method is implemented following Guo et al.

Support New DL Compilers

The utilities for the searching and tracing methods, e.g., mutate_utils.py and fuzzer.py, can be reused.

Supporting a new DL compiler requires updating the following:

  • build_workload : function in base_utils.py to compile models and build executable files.
  • run_mod : function in base_utils.py to run executable files.
  • src/pass : passes' name in the DL compiler.
  • src/op : unique operators of the DL compiler.
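The sketch below shows one way such a compiler-specific layer could be shaped. Only the names build_workload and run_mod come from the list above; their signatures, the CompilerBackend protocol, the ToyBackend class and the max_discrepancy helper are hypothetical and do not reflect the real code in base_utils.py.

```python
from typing import Callable, List, Protocol, Sequence


class CompilerBackend(Protocol):
    # Hypothetical interface: compile a model, then run the result.
    def build_workload(self, model: Callable, opt_level: int) -> Callable: ...

    def run_mod(self, executable: Callable, inputs: Sequence[float]) -> List[float]: ...


class ToyBackend:
    """Toy backend: a 'model' is a Python callable over a list of floats."""

    def build_workload(self, model, opt_level):
        if opt_level == 0:
            return model
        # Pretend optimization: rounding simulates a precision-losing rewrite.
        return lambda xs: [round(v, 3) for v in model(xs)]

    def run_mod(self, executable, inputs):
        return executable(list(inputs))


def max_discrepancy(backend: CompilerBackend, model, inputs, opt_level):
    # Run the un-optimized and optimized builds on the same input
    # and report the largest absolute difference.
    ref = backend.run_mod(backend.build_workload(model, 0), inputs)
    out = backend.run_mod(backend.build_workload(model, opt_level), inputs)
    return max(abs(r - o) for r, o in zip(ref, out))


def toy_model(xs):
    return [v / 3.0 for v in xs]


gap = max_discrepancy(ToyBackend(), toy_model, [1.0, 2.0], opt_level=3)
print(f"max discrepancy: {gap:.6f}")
```

Keeping compilation and execution behind two functions like these is what lets the search and tracing utilities stay compiler-agnostic.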
