HAMi-core compiles libvgpu.so, which enforces hard limits on GPU resources inside a container

MatheMatrix/HAMi-core

Hook library for CUDA Environments

Build

sh build.sh

Build in Docker

docker build .

Usage

CUDA_DEVICE_MEMORY_LIMIT sets the upper limit of device memory (e.g. 1g, 1024m, 1048576k, 1073741824)

CUDA_DEVICE_SM_LIMIT sets the maximum SM (streaming multiprocessor) utilization of each device, as a percentage

# Limit device memory to 1 GB and cap SM utilization at 50% for all devices
export LD_PRELOAD=./libvgpu.so
export CUDA_DEVICE_MEMORY_LIMIT=1g
export CUDA_DEVICE_SM_LIMIT=50
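
The example values for CUDA_DEVICE_MEMORY_LIMIT all describe the same amount of memory if the suffixes are read as binary multiples. The sketch below illustrates that reading; it is not the parser used by libvgpu.so. The function name `parse_memory_limit` is hypothetical, and the binary multipliers (k = 1024, m = 1024², g = 1024³) are an assumption inferred from the examples.

```python
import re

# Assumed binary multipliers: under this reading the README's examples
# (1g, 1024m, 1048576k, 1073741824) all denote the same byte count.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_memory_limit(value: str) -> int:
    """Convert a limit string such as '1g' or '1024m' to a byte count."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized memory limit: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]


if __name__ == "__main__":
    for text in ("1g", "1024m", "1048576k", "1073741824"):
        print(text, "->", parse_memory_limit(text))
```

A helper like this can be handy for sanity-checking a limit before exporting it, for example to confirm that a planned allocation fits under the configured cap.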

Docker Images

# Make docker image
docker build . -f=dockerfiles/Dockerfile-tf1.8-cu90 -t cuda_vmem:tf1.8-cu90

# Launch the docker image
export DEVICE_MOUNTS="--device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidia-uvm:/dev/nvidia-uvm --device /dev/nvidiactl:/dev/nvidiactl"
export LIBRARY_MOUNTS="-v /usr/cuda_files:/usr/cuda_files -v $(which nvidia-smi):/bin/nvidia-smi"

docker run ${LIBRARY_MOUNTS} ${DEVICE_MOUNTS} -it \
    -e CUDA_DEVICE_MEMORY_LIMIT=2g \
    cuda_vmem:tf1.8-cu90 \
    python -c "import tensorflow; tensorflow.Session()"

Log

Use the environment variable LIBCUDA_LOG_LEVEL to control which log messages are shown

LIBCUDA_LOG_LEVEL   Description
not set             errors, warnings, messages
3                   infos, errors, warnings, messages
4                   debugs, errors, warnings, messages
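
The table above can be expressed as a small lookup, which may be useful when a wrapper script needs to know what output to expect. This sketch only mirrors the three documented rows and does not describe how libvgpu.so reads the variable. The function name `visible_categories` is hypothetical, and rejecting undocumented values is a choice made here.

```python
import os

# Categories per LIBCUDA_LOG_LEVEL value, copied from the documented table.
_BASE = ("errors", "warnings", "messages")
_LEVELS = {
    None: _BASE,                 # variable not set
    "3": ("infos",) + _BASE,
    "4": ("debugs",) + _BASE,
}


def visible_categories(env=None):
    """Return the log categories listed for the current LIBCUDA_LOG_LEVEL."""
    env = os.environ if env is None else env
    level = env.get("LIBCUDA_LOG_LEVEL")
    if level not in _LEVELS:
        # The README does not document other values, so this sketch refuses them.
        raise ValueError(f"undocumented LIBCUDA_LOG_LEVEL: {level!r}")
    return _LEVELS[level]


if __name__ == "__main__":
    print(visible_categories({"LIBCUDA_LOG_LEVEL": "3"}))
```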

Test with Frameworks

Run operations that require at least 4 GB of device memory and therefore fail with an out-of-memory (OOM) error under a 1 GB limit

  • TensorFlow

     python test/python/limit_tensorflow.py --device=0 --tensor_shape=1024,1024,1024
  • TensorFlow 2.0

     python test/python/limit_tensorflow2.py --device=0 --tensor_shape=1024,1024,1024
  • PyTorch

     python test/python/limit_pytorch.py --device=0 --tensor_shape=1024,1024,1024
  • MXNet

     python test/python/limit_mxnet.py --device=0 --tensor_shape=1024,1024,1024
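
The 4 GB figure follows from the tensor shape passed to each script. The sketch below works through that arithmetic under the assumption that each element is a 4-byte value such as float32; the README does not state the element type, and `tensor_bytes` is a hypothetical helper.

```python
from math import prod

GIB = 1024 ** 3


def tensor_bytes(shape, bytes_per_element=4):
    """Memory needed for a dense tensor (4-byte elements assumed by default)."""
    return prod(shape) * bytes_per_element


shape = (1024, 1024, 1024)   # value of --tensor_shape in the commands above
required = tensor_bytes(shape)
limit = 1 * GIB              # CUDA_DEVICE_MEMORY_LIMIT=1g, read as 1 GiB

if __name__ == "__main__":
    print(f"required: {required / GIB:.0f} GiB, limit: {limit / GIB:.0f} GiB")
    print("allocation exceeds limit:", required > limit)
```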

Test Raw APIs

./test/test_alloc
