Neuropod is a library that provides a uniform interface to run deep learning models from multiple frameworks in C++ and Python.
It currently supports TensorFlow, PyTorch, TorchScript, and Keras.
Running a TensorFlow model looks exactly like running a PyTorch model.
x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])
for model_path in [TF_ADDITION_MODEL_PATH, PYTORCH_ADDITION_MODEL_PATH]:
# Load the model
neuropod = load_neuropod(model_path)
# Run inference
results = neuropod.infer({"x": x, "y": y})
# array([6, 8, 10, 12])
print results["out"]
See the tutorial or Python guide for more examples.
There are many benefits to this:
- All of your inference code is framework agnostic.
- You can easily switch between deep learning frameworks if necessary without changing runtime code.
- Avoid the learning curve of using the C++ libtorch API and the C/C++ TF API
Any Neuropod model can be run from both C++ and Python (even PyTorch models that have not been converted to TorchScript).
This lets you focus more on the problem you're solving rather than the framework you're using to solve it.
For example, if you define a problem API for 2d object detection, any model that implements it can reuse all the existing inference code and infrastructure for that problem.
INPUT_SPEC = [
# BGR image
{"name": "image", "dtype": "uint8", "shape": (1200, 1920, 3)},
]
OUTPUT_SPEC = [
# shape: (num_detections, 4): (xmin, ymin, xmax, ymax)
# These values are in units of pixels. The origin is the top left corner
# with positive X to the right and positive Y towards the bottom of the image
{"name": "boxes", "dtype": "float32", "shape": ("num_detections", 4)},
# The list of classes that the network can output
# This must be some subset of ['vehicle', 'person', 'motorcycle', 'bicycle']
{"name": "supported_object_classes", "dtype": "string", "shape": ("num_classes",)},
# The probability of each class for each detection
# These should all be floats between 0 and 1
{"name": "object_class_probability", "dtype": "float32", "shape": ("num_detections", "num_classes")},
]
This lets you
- Build a single metrics pipeline for a problem
- Easily compare models solving the same problem (even if they're in different frameworks)
- Build optimized inference code that can run any model that solves a particular problem
- Swap out models that solve the same problem at runtime with no code change (even if the models are from different frameworks)
- Run fast experiments
See the tutorial for more details.
If you have several models that take in a similar set of inputs, you can build and optimize one framework-agnostic input generation pipeline and share it across models.
-
Fully self-contained models (including custom ops)
-
Tested on platforms including
- Mac, Linux, Linux (GPU)
- Four versions of each supported frameworks
- Four versions of Python
-
Model isolation with out-of-process execution
- Use multiple different versions of frameworks in the same application
- Ex: Experimental models using Torch nightly along with models using Torch 1.1.0
- Use multiple different versions of frameworks in the same application
-
Switch from running in-process to running out-of-process with one line of code
See the basic introduction tutorial for an overview of how to get started with Neuropod.
The Python guide and C++ guide go into more detail on running Neuropod models.