TPP MLIR

This is an experiment in using MLIR to automatically select the best Tensor Processing Primitives for linear algebra.

This repository contains an out-of-tree MLIR dialect as well as an opt-like tool to operate on that dialect and a runner-like tool to execute and benchmark MLIR kernels.

It also contains the recipes to use LIBXSMM from inside MLIR and can be used by other tools to drive our passes.

Work is in progress inside IREE to use this project in its pipeline.

This repository was previously called tpp-sandbox. If you have a checkout with the previous name, please follow these instructions to rename the remote locally.


How to set up the environment

Building LLVM and TPP-MLIR requires several development tools, such as git, cmake, and compilers. Because each operating system has its own package manager and package names, these instructions use the user-level package manager conda. This environment has been tested successfully on a Fedora Server minimal installation with fewer than 400 system-wide packages installed.

Initial Setup (using Conda):

export TPPMLIR_WORKSPACE_DIR=/foo
cd ${TPPMLIR_WORKSPACE_DIR}
export ARCH_NAME=$(uname -m)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-${ARCH_NAME}.sh
bash Miniconda3-latest-Linux-${ARCH_NAME}.sh -b -p ${TPPMLIR_WORKSPACE_DIR}/miniconda3
eval "$(${TPPMLIR_WORKSPACE_DIR}/miniconda3/bin/conda shell.bash hook)"
conda activate

conda install -y cmake ninja git clang clangxx llvm lld llvm-openmp llvm-tools binutils
if [ "${ARCH_NAME}" == "aarch64" ]; then
   conda install -y gcc_linux-aarch64 gxx_linux-aarch64
elif [ "${ARCH_NAME}" == "x86_64" ]; then
   conda install -y gcc_linux-64 gxx_linux-64
fi
python -m pip install coloredlogs

Reloading the environment after conda deactivate/logout/reboot:

export TPPMLIR_WORKSPACE_DIR=/foo
cd ${TPPMLIR_WORKSPACE_DIR}
eval "$(${TPPMLIR_WORKSPACE_DIR}/miniconda3/bin/conda shell.bash hook)"
conda activate
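
After activating the environment, it can help to confirm that the tools installed above actually resolve on PATH. The following is a minimal sketch, not part of this repository: the helper name `missing_tools` is hypothetical, and the tool list is an assumption based on the conda packages listed in the setup step.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of tpp-mlir): print every command from the
# argument list that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool names, taken from the packages installed in the setup step.
missing=$(missing_tools cmake ninja git clang clang++ ld.lld)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All build tools found."
fi
```

If anything is reported as missing, re-running the reload commands above in the current shell is the first thing to try.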

How to build LLVM

# Clone
git clone https://github.com/llvm/llvm-project.git

# Check out a tpp-mlir compatible version of llvm-project
wget https://raw.githubusercontent.com/plaidml/tpp-mlir/main/build_tools/llvm_version.txt
pushd llvm-project
git checkout `cat ../llvm_version.txt`
popd
rm llvm_version.txt

# Create build dir
mkdir llvm-project/build
pushd llvm-project/build

# This is important for the next step
export CUSTOM_LLVM_ROOT=`pwd`
echo $CUSTOM_LLVM_ROOT
export PATH=$CUSTOM_LLVM_ROOT/bin:$PATH

# Configure Build
cmake -G Ninja ../llvm \
   -DLLVM_ENABLE_PROJECTS="mlir" \
   -DLLVM_BUILD_EXAMPLES=ON \
   -DLLVM_INSTALL_UTILS=ON \
   -DLLVM_TARGETS_TO_BUILD="host" \
   -DCMAKE_BUILD_TYPE=RelWithDebInfo \
   -DLLVM_ENABLE_ASSERTIONS=ON \
   -DCMAKE_C_COMPILER=clang \
   -DCMAKE_CXX_COMPILER=clang++ \
   -DLLVM_USE_LINKER=lld

# Build
ninja 

popd
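
Before moving on, it may be worth checking that the build tree contains what the TPP-MLIR configure step references: the MLIR CMake package directory and llvm-lit. The sketch below is an illustration, not part of this repository. The function name `check_llvm_build` is hypothetical, and the check for `bin/mlir-opt` is only an assumed indicator that MLIR was built.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of tpp-mlir): report files missing from an
# LLVM build tree. Returns non-zero if any expected file is absent.
check_llvm_build() {
  local root="$1"
  local status=0
  local path
  for path in bin/mlir-opt bin/llvm-lit lib/cmake/mlir/MLIRConfig.cmake; do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $root/$path"
      status=1
    fi
  done
  return "$status"
}

# CUSTOM_LLVM_ROOT is the variable exported in the steps above; the fallback
# path is only a guess for shells where it is not set.
if check_llvm_build "${CUSTOM_LLVM_ROOT:-llvm-project/build}"; then
  echo "LLVM build tree looks complete."
fi
```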

How to build TPP MLIR

This setup assumes that you have built LLVM and MLIR in $CUSTOM_LLVM_ROOT as above.

Note: OpenMP is required for multi-threaded performance. To build without OpenMP, pass the CMake flag -DUSE_OpenMP=False.

# Clone
git clone https://github.com/plaidml/tpp-mlir.git
mkdir tpp-mlir/build
pushd tpp-mlir/build

# Build & test
# Please make sure to use clang to build TPP-MLIR
cmake -G Ninja .. \
   -DCMAKE_BUILD_TYPE=RelWithDebInfo \
   -DMLIR_DIR=$CUSTOM_LLVM_ROOT/lib/cmake/mlir \
   -DLLVM_EXTERNAL_LIT=$CUSTOM_LLVM_ROOT/bin/llvm-lit \
   -DCMAKE_C_COMPILER=clang \
   -DCMAKE_CXX_COMPILER=clang++ 
cmake --build . --target check-all

popd
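
The OpenMP note above can be combined with the configure command as follows. This is a dry-run sketch that only prints the command. The function name `tpp_configure_cmd` and its `no-openmp` argument are hypothetical, and every CMake flag is copied from this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of tpp-mlir): print the configure command
# from this README, optionally appending the flag that disables OpenMP.
# Usage: tpp_configure_cmd <llvm-build-dir> [no-openmp]
tpp_configure_cmd() {
  local llvm_root="$1"
  local extra=""
  if [ "${2:-}" = "no-openmp" ]; then
    extra=" -DUSE_OpenMP=False"
  fi
  printf '%s\n' "cmake -G Ninja .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DMLIR_DIR=$llvm_root/lib/cmake/mlir -DLLVM_EXTERNAL_LIT=$llvm_root/bin/llvm-lit -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++$extra"
}

# Dry run: prints the command only. The fallback path is a placeholder.
tpp_configure_cmd "${CUSTOM_LLVM_ROOT:-/path/to/llvm-project/build}" no-openmp
```

The printed command is meant to be run from tpp-mlir/build, as in the steps above.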

To build the documentation from the TableGen description of the dialect operations, run the following from the tpp-mlir/build directory:

cmake --build . --target mlir-doc

To enable experimental GPU support, see GPU/README.md.

License

This dialect template is made available under the Apache License 2.0 with LLVM Exceptions. See the LICENSE.txt file for more details.

References

BRGEMM: High-Performance Deep Learning via a Single Building Block (2019)

TPP: Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads (2021)

PARLOOPER: Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures (2023)

