JavaCPP Presets for Triton Inference Server


License Agreements

By downloading these archives, you agree to the terms of the license agreements for NVIDIA software included in the archives.

Triton Inference Server

To view the license for Triton Inference Server included in these archives, click here

  • Triton Inference Server is a widely used software package for inference serving
  • Triton supports almost all kinds of models generated by different DL frameworks and tools, such as TensorFlow, PyTorch, ONNX Runtime, TensorRT, OpenVINO...
  • Triton supports both CPU and GPU
  • Triton can be used both as a standalone application and as a shared library. If you already have your own inference service framework but want to add more features, try Triton as a shared library.
  • Triton supports Java as a shared library through the JavaCPP Presets (see the sketch after this list)
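
To illustrate the last point, below is a minimal sketch (not the full Simple.java sample) of starting and stopping Triton in-process from Java through the Java-mapped C API. The class name MinimalServer and the hard-coded model repository path are placeholders of our own:

 import org.bytedeco.tritonserver.tritonserver.*;
 import static org.bytedeco.tritonserver.global.tritonserver.*;

 public class MinimalServer {
     public static void main(String[] args) {
         // Create server options and point them at a model repository
         // ("/workspace/models" is only a placeholder path).
         TRITONSERVER_ServerOptions options = new TRITONSERVER_ServerOptions(null);
         if (TRITONSERVER_ServerOptionsNew(options) != null
                 || TRITONSERVER_ServerOptionsSetModelRepositoryPath(options, "/workspace/models") != null) {
             throw new RuntimeException("failed to configure server options");
         }

         // Start Triton in-process, just as the C API does with TRITONSERVER_ServerNew.
         TRITONSERVER_Server server = new TRITONSERVER_Server(null);
         if (TRITONSERVER_ServerNew(server, options) != null) {
             throw new RuntimeException("failed to create server");
         }
         TRITONSERVER_ServerOptionsDelete(options);

         // ... build and submit inference requests here, as Simple.java does ...

         // Shut down and release the server.
         TRITONSERVER_ServerStop(server);
         TRITONSERVER_ServerDelete(server);
     }
 }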

Introduction

This directory contains the JavaCPP Presets module for:

  • Triton Inference Server 2.26.0

Please refer to the parent README.md file for more detailed information about the JavaCPP Presets.

Documentation

Java API documentation is available here:

  • http://bytedeco.org/javacpp-presets/tritonserver/apidocs/

Sample Usage

Here is a simple example of Triton Inference Server ported to Java from the simple.cc sample file available in the Triton Inference Server repository.

We can use Maven 3 to automatically download and install all the class files as well as the native binaries. To run this sample code, after obtaining the pom.xml and Simple.java source files from the samples/simple subdirectory, simply execute on the command line:

 $ mvn compile exec:java -Dexec.args="-r /path/to/models"

This sample is intended to show how to call the Java-mapped C API of Triton to execute inference requests.
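
Concretely, each TRITONSERVER_* function in the Java mapping returns a TRITONSERVER_Error object, which is null on success, mirroring the TRITONSERVER_Error* return of the C API. A small helper in the spirit of the FAIL_IF_ERR macro used by simple.cc keeps the calls readable; the helper name checkErr below is our own placeholder:

 import org.bytedeco.tritonserver.tritonserver.*;
 import static org.bytedeco.tritonserver.global.tritonserver.*;

 public class ErrorCheckDemo {
     // Hypothetical helper: fail fast if a Triton call reports an error.
     static void checkErr(TRITONSERVER_Error err, String what) {
         if (err != null) {
             System.err.println("error: " + what);
             TRITONSERVER_ErrorDelete(err); // errors must be released by the caller
             System.exit(1);
         }
     }

     public static void main(String[] args) {
         // Query the version of the in-process API the bindings were built against.
         int[] major = {0}, minor = {0};
         checkErr(TRITONSERVER_ApiVersion(major, minor), "getting Triton API version");
         System.out.println("Triton in-process API version: " + major[0] + "." + minor[0]);
     }
 }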

Steps to run this sample inside an NGC container

  1. Get the source code of Triton Inference Server to prepare the model repository:
 $ wget https://github.com/triton-inference-server/server/archive/refs/tags/v2.26.0.tar.gz
 $ tar zxvf v2.26.0.tar.gz
 $ cd server-2.26.0/docs/examples/model_repository
 $ mkdir models
 $ cd models; cp -a ../simple .

Now, this models directory will be our model repository.

  2. Start the Docker container to run the sample (assuming we are under the models directory created above):
 $ docker run -it --gpus=all -v $(pwd):/workspace nvcr.io/nvidia/tritonserver:22.09-py3 bash
 $ apt update
 $ apt install -y openjdk-11-jdk
 $ wget https://archive.apache.org/dist/maven/maven-3/3.8.4/binaries/apache-maven-3.8.4-bin.tar.gz
 $ tar zxvf apache-maven-3.8.4-bin.tar.gz
 $ export PATH=/opt/tritonserver/apache-maven-3.8.4/bin:$PATH
 $ git clone https://github.com/bytedeco/javacpp-presets.git
 $ cd javacpp-presets
  3. Compile the tritonserver and tritonserver/platform modules with Maven, which will generate the necessary bindings:
 $ mvn clean install --projects .,tritonserver
 $ mvn clean install -f platform --projects ../tritonserver/platform -Djavacpp.platform=linux-x86_64
  4. Execute Simple.java:
 $ cd tritonserver/samples/simple
 $ mvn compile exec:java -Dexec.mainClass=Simple -Djavacpp.platform=linux-x86_64 -Dexec.args="-r /workspace/models"

This sample is the Java implementation of the simple example written for the C API.

Steps to run your *.java files with Triton Inference Server using Maven inside an NGC container

To run your code, you will need to:

  1. Create the pom.xml and <your code>.java source files (modeled on those for Simple.java; a minimal skeleton is sketched below), and
  2. Execute on the command line, similar to running Simple.java:
 $ mvn compile exec:java
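
As a starting point, <your code>.java can follow the skeleton below. YourCode is a placeholder class name (select it with -Dexec.mainClass=YourCode or configure it in your pom.xml), the imports are the same ones Simple.java uses, and the printout assumes the TRITONSERVER_API_VERSION_* constants from tritonserver.h are mapped, as JavaCPP normally does for such macros:

 import org.bytedeco.javacpp.*;
 import org.bytedeco.tritonserver.tritonserver.*;
 import static org.bytedeco.tritonserver.global.tritonserver.*;

 public class YourCode {
     public static void main(String[] args) throws Exception {
         // Parse your own arguments (e.g. the model repository path) here,
         // then drive the TRITONSERVER_* calls as Simple.java does.
         System.out.println("Triton in-process API version compiled against: "
                 + TRITONSERVER_API_VERSION_MAJOR + "." + TRITONSERVER_API_VERSION_MINOR);
     }
 }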

Steps to run your *.java files with Triton Inference Server using the "uber JAR" inside an NGC container

After generating tritonserver/platform/target/tritonserver-platform-*-shaded.jar by following steps 1 to 3 above, you can then execute the following to run your application directly. This relies on the single-file source launcher available since Java 11, so no separate compilation step is needed:

 $ cd tritonserver/samples/simple
 $ java -cp ../../platform/target/tritonserver-platform-*-shaded.jar Simple.java -r /workspace/models