Forward is a library for high-performance deep learning inference on NVIDIA GPUs. It provides a well-designed scheme that directly parses TensorFlow/PyTorch/Keras models into high-performance engines based on TensorRT. Compared to using TensorRT directly, it is easy to use and easy to extend. So far, Forward supports not only mainstream deep learning models in the CV, NLP, and recommendation fields, but also some advanced models such as BERT, GAN, FaceSwap, and StyleTransfer.
- Utilize TensorRT API and customized operators for high-performance deep learning inference.
- Support not only mainstream deep learning models in the CV, NLP, and recommendation fields, but also advanced models such as BERT, GAN, FaceSwap, and StyleTransfer.
- Support FLOAT/HALF/INT8 inference modes.
- Easy to use: Directly load TensorFlow (.pb), PyTorch (.pth), or Keras (.h5) models and then run inference with TensorRT.
- Easy to extend: Register customized layers by following add_support_op.md.
- Provide C++ and Python interfaces.
- NVIDIA CUDA >= 10.0, cuDNN >= 7 (Recommended version: CUDA 10.2)
- TensorRT >=, (Recommended version: TensorRT-
- CMake >= 3.10.1
- GCC >= 5.4.0, ld >= 2.26.1
- (PyTorch) PyTorch == 1.3.1
- (TensorFlow) TensorFlow == 1.15.0 (download TensorFlow 1.15.0 and unzip it to )
- (Keras) HDF5
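The minimum versions listed above can be checked with a small script before configuring the build. The following is a minimal sketch, not part of Forward: `parse_version`, `meets_minimum`, and `check_toolchain` are hypothetical helper names, and the installed versions are assumed to be collected by the caller (for example from `nvcc --version` or `cmake --version`). Only the minimums that are stated in full in the list are included.

```python
import re

# Minimum versions copied from the prerequisite list above.
MINIMUMS = {
    "CUDA": "10.0",
    "cuDNN": "7",
    "CMake": "3.10.1",
    "GCC": "5.4.0",
    "ld": "2.26.1",
}


def parse_version(text):
    """Turn a version string such as '3.10.1' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '7' compares equal to '7.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_toolchain(installed):
    """Map each tool name to True/False; tools not reported count as failing."""
    return {
        name: name in installed and meets_minimum(installed[name], minimum)
        for name, minimum in MINIMUMS.items()
    }
```

For example, `check_toolchain({"CUDA": "10.2", "cuDNN": "7.6.5", "CMake": "3.9.0", "GCC": "7.5.0", "ld": "2.30"})` reports `False` only for CMake, because 3.9.0 is older than 3.10.1. The comparison is numeric per component, so it does not handle pre-release suffixes.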
Generate Makefiles or a VS project (Windows) and build. Forward can be built for different frameworks, such as Fwd-Torch, Fwd-Python-Torch, Fwd-Tf, Fwd-Python-Tf, Fwd-Keras, and Fwd-Python-Keras; the target is selected by CMake options. For example, Fwd-Python-Tf is built as below.
mkdir build
cd build
cmake .. \
-DTensorRT_ROOT=/path/to/TensorRT
make -j
- TensorRT_ROOT [Required]: Path to the TensorRT installation directory containing libraries.
- For more CMake options, refer to CMake Options.
Once the project is built, run unit_test to verify that the build succeeded.
cd build/bin
./unit_test --gtest_filter=TestTfNodes.*
When the project is successfully built, the Forward-Python library can be found in the build/bin directory, named forward.cpython.xxx*.so on Linux or forward.xxx*.pyd on Windows. Copy the Forward-Python library to the workspace directory of your Python project. For example, the directory is organized as:
---- workspace
-- test.py
-- forward.cpython.xxx*.so
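The copy step can be scripted. The sketch below assumes the library file name starts with `forward` and ends in `.so` or `.pyd`, as described above; `install_forward_lib` is a hypothetical helper, not something Forward ships.

```python
import glob
import os
import shutil


def install_forward_lib(build_bin, workspace):
    """Copy the built Forward-Python library from build_bin into workspace.

    Returns the list of destination paths. Raises FileNotFoundError if no
    file matching forward*.so or forward*.pyd exists in build_bin.
    """
    matches = []
    for pattern in ("forward*.so", "forward*.pyd"):
        matches.extend(glob.glob(os.path.join(build_bin, pattern)))
    if not matches:
        raise FileNotFoundError(
            "no forward*.so or forward*.pyd found in " + build_bin
        )
    os.makedirs(workspace, exist_ok=True)
    # shutil.copy2 returns the path of the newly created file.
    return [shutil.copy2(path, workspace) for path in sorted(matches)]
```

A typical call would be `install_forward_lib("build/bin", "workspace")`, run from the repository root after the build finishes.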
Then, test.py can import Forward to perform high-performance deep learning inference.
# test.py
import forward
import numpy as np
# 1. BUILD step: load TensorFlow-Bert model to build Forward engine
builder = forward.TfBuilder()
batch_size = 16
infer_mode = 'float32' # Infer mode: 'float32' / 'float16' / 'int8_calib' / 'int8'
# dict_type dummy input
dummy_input = {"input_ids": np.ones([batch_size, 48], dtype='int32'),
               "input_mask": np.ones([batch_size, 48], dtype='int32'),
               "segment_ids": np.ones([batch_size, 48], dtype='int32')}
# build engine
builder.set_mode(infer_mode)  # optional, 'float32' is default.
model_path = 'bert_model.pb'
tf_engine = builder.build(model_path, dummy_input)
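need_save = True
if need_save:
    # save engine
    engine_path = 'path/to/out/engine'
    tf_engine.save(engine_path)  # assumed call: writes the built engine to engine_path
    # load saved engine
    tf_engine = forward.TfEngine()
    tf_engine.load(engine_path)  # assumed call: restores the engine saved above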
# 2. FORWARD step: do inference with Forward engine
inputs = dummy_input
outputs = tf_engine.forward(inputs) # dict_type outputs
Note: The input names of a model can be viewed with a model viewer such as Netron.
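Because the keys of the dict-type input have to match the model's input names, it can help to compare them before building the engine. The following is a minimal sketch: `check_input_names` is a hypothetical helper, not part of Forward, and the expected names are assumed to have been read from a model viewer beforehand.

```python
def check_input_names(inputs, expected_names):
    """Raise KeyError if the input dict keys differ from the expected names."""
    given, expected = set(inputs), set(expected_names)
    missing = sorted(expected - given)      # names the model needs but were not supplied
    unexpected = sorted(given - expected)   # supplied names the model does not declare
    if missing or unexpected:
        raise KeyError(
            "missing inputs: {}; unexpected inputs: {}".format(missing, unexpected)
        )


# Names taken from the BERT example above; the values would normally be NumPy arrays.
expected = ["input_ids", "input_mask", "segment_ids"]
check_input_names({"input_ids": 0, "input_mask": 0, "segment_ids": 0}, expected)
```

Calling the helper on `dummy_input` just before `builder.build(...)` turns a misspelled key into an immediate, readable error.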