
MaTEx Docker configurations for single and multi-node environments



MaTEx Docker: Machine Learning Toolkit for Extreme Scale in Docker


MaTEx is a collection of high-performance parallel machine learning and data mining (MLDM) algorithms built on OpenMPI and TensorFlow.

Docker provides operating-system-level virtualization via containerization. For more information, see Docker and Wikipedia.

The matex-docker project supports running MaTEx in Docker, providing a single set of configurations and dependencies to build, ship, and run MaTEx on laptops, data center VMs, or the cloud.

Project Structure


Here is a quick breakdown of how matex-docker is structured. All Dockerfiles are configured to support both single-container and multi-container MPI execution. The default is single-container execution.


MPI4PY benchmark scripts


Dockerfile support for CentOS and Ubuntu


Support for multi-container OpenMPI using SSH and Docker Compose


User-specific OpenMPI configuration parameter files


User-specific SSH configuration files

Docker Build


IMPORTANT: MaTEx Docker must be built from the root of the repository


docker build -t matex-github:latest -f dockerfiles/ubuntu/16.x/Dockerfile .

Single Container MaTEx Execution


Clone matex-docker project

  • cd LOCAL_DIR

Build and Run Docker container

  • cd matex-docker
  • docker build -t matex-github:latest -f DOCKERFILE_DIR/Dockerfile .
    • For DOCKERFILE_DIR see example above
  • Once docker build is complete
    • docker images
  • Take note of the newest Image ID
  • Run the Docker container
    • docker run -i -t IMAGE_ID /bin/bash
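The build and run steps above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `run_command` are hypothetical, and the defaults are copied from the build example in this README. It relies on the fact that `docker run` accepts an image tag as well as an Image ID, so the tag given to `docker build -t` can be reused directly.

```python
import shlex

# Defaults copied from the build example in this README.
DEFAULT_TAG = "matex-github:latest"
DEFAULT_DOCKERFILE = "dockerfiles/ubuntu/16.x/Dockerfile"


def build_command(dockerfile=DEFAULT_DOCKERFILE, tag=DEFAULT_TAG):
    """Return the argv for `docker build`, with '.' (the repo root) as context."""
    return ["docker", "build", "-t", tag, "-f", dockerfile, "."]


def run_command(image=DEFAULT_TAG, shell="/bin/bash"):
    """Return the argv for an interactive container started from `image`.

    `image` may be a tag or an Image ID as listed by `docker images`.
    """
    return ["docker", "run", "-i", "-t", image, shell]


if __name__ == "__main__":
    # Print the commands rather than running them. To execute, pass each
    # list to subprocess.run(argv, check=True) from the repository root.
    for argv in (build_command(), run_command()):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to check the Dockerfile path before starting a long build.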

Execute MaTEx inside container

  • cd matex/src/deeplearning/tensorflow/cpu/py3.x/
  • source


  • Execute MaTEx example code (MNIST)
    • cd /matex/src/deeplearning/tensorflow/examples/glibc_after_2.19/MNIST_KERAS/
    • mpirun --allow-run-as-root -mca btl_vader_single_copy_mechanism none -np 4 python

Multi-Container MaTEx Execution (UNDER CONSTRUCTION)


Container Cluster orchestration uses docker-compose

While containers can in principle be started manually via docker run, we suggest that you use Docker Compose, a command-line tool to define and run multi-container applications.

We provide a sample docker-compose.yml file in the repository:

mpi_head:
  image: openmpi
  ports:
   - "22"
  links:
   - mpi_node

mpi_node:
  image: openmpi

(Note: the above uses the docker-compose file format version 1)

The file defines an mpi_head and an mpi_node. Both containers run the same openmpi image. The only difference is that the mpi_head container exposes its SSH server to the host system, so you can log in to it to start your MPI applications.


The following command, run from the repository's directory, will start one mpi_head container and three mpi_node containers:

$> docker-compose scale mpi_head=1 mpi_node=3

Once all containers are running, you can log in to the mpi_head node and start MPI jobs with mpirun. Alternatively, you can execute a one-shot command on that container with the docker-compose exec syntax, as follows:

docker-compose exec --user mpirun --privileged mpi_head mpirun -n 2 python /home/mpirun/mpi4py_benchmarks/
----------------------------------------- ----------- --------------------------------------------------
1.                                        2.          3.

Breaking the above command down:

  1. Execute command on node mpi_head
  2. Run on 2 MPI ranks
  3. Command to run (NB: the Python script needs to import MPI bindings)
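To illustrate what "import MPI bindings" means for step 3, here is a minimal sketch of a script that could be launched this way. It is not one of the repository's benchmark scripts: the file name `hello_mpi.py` and the function names are hypothetical. It assumes mpi4py is installed in the container and falls back to a single process when it is not.

```python
def local_slice(n_items, rank, size):
    """Return the (start, stop) range of items owned by `rank` out of `size` ranks."""
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def main(n_items=10):
    try:
        from mpi4py import MPI  # the MPI bindings referred to above
    except ImportError:
        # Assumption: without mpi4py, behave like a single-rank job.
        comm, rank, size = None, 0, 1
    else:
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

    # Each rank sums its own slice of 0..n_items-1.
    start, stop = local_slice(n_items, rank, size)
    partial = sum(range(start, stop))

    # Combine the partial sums across ranks when MPI is available.
    total = comm.allreduce(partial, op=MPI.SUM) if comm is not None else partial
    print(f"rank {rank} of {size}: items [{start}, {stop}), total = {total}")
    return total


if __name__ == "__main__":
    main()
```

Saved as `hello_mpi.py`, the script would be started with `mpirun -n 2 python hello_mpi.py`, and each rank prints one line with its own slice and the combined total.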


You can spin up a docker-compose cluster, run a battery of mpi4py tests, and remove the cluster using a recipe provided in the included Makefile (handy for development):

make main



OpenMPI SSH work based on docker.openmpi and dispel4py by O. Weidner and R. Filgueira

