Release SuperBench v0.5.0
SuperBench 0.5.0 Release Notes
Micro-benchmark Improvements
Support a NIC-only NCCL bandwidth benchmark on a single node in the NCCL/RCCL bandwidth test.
Support bi-directional bandwidth benchmark in GPU copy bandwidth test.
Support data checking in GPU copy bandwidth test.
Update the rccl-tests submodule to fix a divide-by-zero error.
Add GPU-Burn micro-benchmark.
Model-benchmark Improvements
Sync results on root rank for e2e model benchmarks in distributed mode.
Support customized env in local and torch.distributed mode.
Add support for pytorch>=1.9.0.
Keep BatchNorm in fp32 when PyTorch CNN models are cast to fp16.
Exclude the time spent converting samples to FP16 from the measured time.
Support FAMBench.
Inference Benchmark Improvements
Revise the default setting for inference benchmark.
Add percentile metrics for inference benchmarks.
Support T4 and A10 in GEMM benchmark.
Add a configuration that includes the inference benchmarks.
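As an illustration of what percentile metrics report for an inference run, here is a minimal sketch using the nearest-rank method. The function name `percentile`, the sample latencies, and the metric keys are hypothetical and are not taken from SuperBench, whose own computation may differ.

```python
import math


def percentile(samples, p):
    """Return the p-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest 1-based rank covering at least p% of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-step inference latencies in milliseconds, with one slow outlier.
latencies_ms = [12.0, 11.5, 13.2, 11.9, 45.0, 12.3, 12.1, 11.8, 12.6, 12.4]
summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 90, 99)}
print(summary)
```

The single 45.0 ms outlier only shows up in the p99 value, which is why tail percentiles are commonly reported alongside an average for inference workloads.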
Other Improvements
Add command to support listing all optional parameters for benchmarks.
Unify the benchmark naming convention and support multiple tests with the same benchmark and different parameters/options in one configuration file.
Support a timeout to detect benchmark failure and stop the process automatically.
Add a ROCm 5.0 Dockerfile.
Improve output interface.
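The timeout behavior listed above can be pictured with a generic sketch built on Python's standard `subprocess` module. This is an assumption-laden illustration, not SuperBench's implementation: the helper `run_with_timeout`, its return format, and the example commands are all hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report 'ok', 'failed', or 'timeout' (hypothetical helper)."""
    try:
        # subprocess.run kills the child process when the timeout expires,
        # then raises TimeoutExpired.
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "returncode": None}
    status = "ok" if proc.returncode == 0 else "failed"
    return {"status": status, "returncode": proc.returncode}


# A stand-in for a hung benchmark: sleeps far longer than the 1-second limit.
hung = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1)
# A stand-in for a benchmark that finishes normally.
quick = run_with_timeout([sys.executable, "-c", "print('done')"], 30)
print(hung["status"], quick["status"])
```

Treating a timeout as its own status, separate from a non-zero exit code, makes it possible to tell a hung benchmark apart from one that crashed.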
Data Diagnosis and Analysis
Support multi-benchmark check.
Support result summary in md, html and excel formats.
Support data diagnosis in md and html formats.
Support result output for all nodes in data diagnosis.