Stars
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Winograd minimal convolution algorithm generator for convolutional neural networks.
This repository contains the Simple As Possible Floating Point Unit design based on the IEEE-754 Standard.
Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.)
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
A Python package for testing hardware (part of the magma ecosystem)
Tutorial notebooks for hls4ml
Brevitas: neural network quantization in PyTorch
[CVPR 2022 Oral] Official repository for "MAXIM: Multi-Axis MLP for Image Processing". SOTA for denoising, deblurring, deraining, dehazing, and enhancement.
NSGA-Net, a Neural Architecture Search Algorithm