Release SuperBench v0.5.0

abuccts released this 29 Apr 02:56
· 168 commits to main since this release
7f607e4

SuperBench 0.5.0 Release Notes

Micro-benchmark Improvements

  • Support a NIC-only NCCL bandwidth benchmark on a single node in the NCCL/RCCL bandwidth test.
  • Support bi-directional bandwidth benchmark in GPU copy bandwidth test.
  • Support data checking in GPU copy bandwidth test.
  • Update the rccl-tests submodule to fix a divide-by-zero error.
  • Add GPU-Burn micro-benchmark.

Model-benchmark Improvements

  • Sync results on root rank for e2e model benchmarks in distributed mode.
  • Support customized environment variables in local and torch.distributed mode.
  • Add support for pytorch>=1.9.0.
  • Keep BatchNorm in FP32 for PyTorch CNN models cast to FP16.
  • Exclude the time spent converting samples to FP16 from the measured results.
  • Support FAMBench.

Inference Benchmark Improvements

  • Revise the default setting for inference benchmark.
  • Add percentile metrics for inference benchmarks.
  • Support T4 and A10 in GEMM benchmark.
  • Add a configuration file for inference benchmarks.
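
As an illustration of the percentile metrics item above, here is a minimal sketch of how latency percentiles could be derived from raw per-step measurements. It uses the nearest-rank definition, which is an assumption; the release notes do not say which percentile method or metric names SuperBench uses, so `percentile`, `summarize_latency`, and the `latency_pNN` keys are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so integer percentiles produce an exact rank.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize_latency(latencies_ms, percentiles=(50, 90, 95, 99)):
    """Build a metric dictionary; the key names are illustrative only."""
    return {f"latency_p{p}": percentile(latencies_ms, p) for p in percentiles}


# Synthetic latencies of 1.0 ms through 100.0 ms, not real benchmark data.
latencies = [float(i) for i in range(1, 101)]
summary = summarize_latency(latencies)
print(summary)
```

Tail percentiles such as p99 tend to expose latency outliers that a mean would hide, which is why they are commonly reported for inference workloads.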

Other Improvements

  • Add a command to list all optional parameters for benchmarks.
  • Unify the benchmark naming convention and support multiple tests with the same benchmark but different parameters/options in one configuration file.
  • Support a timeout to detect benchmark failure and stop the process automatically.
  • Add a ROCm 5.0 Dockerfile.
  • Improve output interface.
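
The timeout item above can be pictured with a short sketch: run a benchmark command as a child process, and if it does not finish within a limit, stop it and report a failure. This is only a generic pattern built on Python's standard `subprocess` module, not SuperBench's actual implementation; `run_with_timeout` and the returned status strings are hypothetical names.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'ok', 'failed', or 'timeout'."""
    try:
        # subprocess.run kills the child process if the timeout expires.
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "return_code": None}
    status = "ok" if completed.returncode == 0 else "failed"
    return {"status": status, "return_code": completed.returncode}


# Stand-in commands: one finishes quickly, one simulates a hung benchmark.
fast = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=30)
hung = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1
)
print(fast["status"], hung["status"])
```

Treating a timeout as a distinct status, rather than folding it into a generic failure, makes it easier to tell a hung benchmark apart from one that exited with an error.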

Data Diagnosis and Analysis

  • Support multi-benchmark check.
  • Support result summary in Markdown, HTML, and Excel formats.
  • Support data diagnosis in Markdown and HTML formats.
  • Support result output for all nodes in data diagnosis.