ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data

ndzip compresses and decompresses multidimensional univariate grids of single- and double-precision IEEE 754 floating-point data. We implement

a single-threaded CPU compressor
an OpenMP-backed multi-threaded compressor
a SYCL-based GPU compressor (currently hipSYCL + NVIDIA only)
a CUDA-based GPU compressor

All variants generate and decode bit-identical compressed streams.

ndzip is currently a research project with the primary use case of speeding up distributed HPC applications by increasing effective interconnect bandwidth.

Prerequisites

CMake >= 3.15
Clang >= 10.0.0
Linux (tested on x86_64 and POWER9)
Boost >= 1.66
Catch2 >= 2.13.3 (optional, for unit tests and microbenchmarks)

Additionally, for GPU support

CUDA >= 11.0 (not officially compatible with Clang 10/11, but older CUDA versions produce insufficiently optimized code)
An NVIDIA GPU with Compute Capability >= 6.1
For the SYCL version: hipSYCL >= 8756087f

Building

Make sure to set the right build type and enable the full instruction set of the target CPU architecture:

-DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native"

If unit tests and microbenchmarks should also be built, add

-DNDZIP_BUILD_TEST=YES

Depending on your system, you might have to configure the correct C/C++ compilers to use (Clang >= 10.0 and GCC >= 8.2 have been known to work in the past):

-DCMAKE_C_COMPILER=/path/to/cc -DCMAKE_CXX_COMPILER=/path/to/c++
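Put together, a CPU-only build might look like the sketch below. It assumes the same `cmake -B build` / `cmake --build build -j` pattern used for the GPU builds in this README and that it is run from the repository root. `build_ndzip_cpu` is a hypothetical helper name, not something shipped with ndzip.

```shell
# Hypothetical helper: configure and build a CPU-only ndzip.
# Extra -D options (for example compiler paths) can be passed as arguments.
build_ndzip_cpu() {
    cmake -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_CXX_FLAGS="-march=native" \
        -DNDZIP_BUILD_TEST=YES \
        "$@" &&
    cmake --build build -j
}
```

For example, `build_ndzip_cpu -DCMAKE_C_COMPILER=/path/to/cc -DCMAKE_CXX_COMPILER=/path/to/c++` would add the compiler overrides from above. Drop the `-DNDZIP_BUILD_TEST=YES` line if tests and microbenchmarks are not needed.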

For GPU support with SYCL

Build and install hipSYCL

git clone https://github.com/illuhad/hipSYCL
cd hipSYCL
cmake -B build -DCMAKE_INSTALL_PREFIX=../hipSYCL-install -DWITH_CUDA_BACKEND=YES -DCMAKE_BUILD_TYPE=Release
cmake --build build --target install -j

Build ndzip with SYCL

cmake -B build -DCMAKE_PREFIX_PATH='../hipSYCL-install/lib/cmake' -DHIPSYCL_PLATFORM=cuda -DCMAKE_CUDA_ARCHITECTURES=75 -DHIPSYCL_GPU_ARCH=sm_75 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-U__FLOAT128__ -U__SIZEOF_FLOAT128__ -march=native"
cmake --build build -j

Replace sm_75 and 75 with the values matching your GPU's Compute Capability. The -U__FLOAT128__ and -U__SIZEOF_FLOAT128__ flags undefine the corresponding macros and are required due to an open bug in Clang.
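One way to find the value for your GPU is to ask the driver. The following is a minimal sketch, assuming an `nvidia-smi` recent enough to support the `compute_cap` query field (older drivers may reject it). `arch_from_cc` is a hypothetical helper name, not part of ndzip.

```shell
# Hypothetical helper: turn a Compute Capability such as "7.5" into "75".
arch_from_cc() {
    printf '%s\n' "$1" | tr -d '.'
}

# Query the first GPU, if nvidia-smi is installed
# (assumption: the driver supports the compute_cap query field).
if command -v nvidia-smi >/dev/null 2>&1; then
    cc="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
    arch="$(arch_from_cc "$cc")"
    echo "-DCMAKE_CUDA_ARCHITECTURES=$arch -DHIPSYCL_GPU_ARCH=sm_$arch"
fi
```

On a machine without `nvidia-smi` the script prints nothing, and the value has to be looked up from the GPU's specification instead.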

For GPU support with CUDA (experimental)

a) Either build ndzip with CUDA + NVCC ...

cmake -B build -DCMAKE_CUDA_ARCHITECTURES=75 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native"
cmake --build build -j

Replace 75 with the value matching your GPU's Compute Capability.

If CMAKE_CXX_COMPILER was redefined above, you also need to specify the CUDA host compiler:

-DCMAKE_CUDA_HOST_COMPILER=/path/to/c++

b) ... or with CUDA + Clang

cmake -B build -DCMAKE_CUDA_COMPILER="$(which clang++)" -DCMAKE_CUDA_ARCHITECTURES=75 -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-U__FLOAT128__ -U__SIZEOF_FLOAT128__ -march=native"
cmake --build build -j

The -U__FLOAT128__ and -U__SIZEOF_FLOAT128__ flags undefine the corresponding macros and are required due to an open bug in Clang.

Compressing and decompressing files

build/compress -n <size> -i <uncompressed-file> -o <compressed-file> [-t float|double]
build/compress -d -n <size> -i <compressed-file> -o <decompressed-file> [-t float|double]

<size> consists of one to three arguments, depending on the dimensionality of the input grid. In the multi-dimensional case, the first number specifies the extent of the slowest-iterating dimension.
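As a sanity check on the size arguments, the product of the extents times the element width should equal the size of the uncompressed file. The sketch below assumes the uncompressed file is a headerless, densely packed array of values, which this README does not state explicitly. `expected_bytes` is a hypothetical helper, not an ndzip tool.

```shell
# Hypothetical helper: expected_bytes <float|double> <extent>...
# Prints the byte size of a dense grid with the given extents.
expected_bytes() {
    case "$1" in
        float)  width=4 ;;   # IEEE 754 single precision
        double) width=8 ;;   # IEEE 754 double precision
        *) echo "unknown type: $1" >&2; return 1 ;;
    esac
    shift
    total=$width
    for extent in "$@"; do
        total=$((total * extent))
    done
    echo "$total"
}

# A 256 x 512 x 1024 grid of floats, slowest-iterating extent first.
expected_bytes float 256 512 1024   # prints 536870912
```

Comparing this number against `wc -c < <uncompressed-file>` catches a wrong extent or a wrong `-t` choice before compressing.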

By default, compress uses the single-threaded CPU compressor. Pass -e cpu-mt to select the multi-threaded CPU compressor, or -e sycl / -e cuda to select the corresponding GPU compressor, if available.
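Since compression is lossless, a round trip should reproduce the input byte for byte. The sketch below wraps the two commands above and compares the result with `cmp`. It assumes single-precision data and that the extents are passed as separate arguments after -n. `roundtrip` and the `NDZIP_COMPRESS` variable are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: roundtrip <uncompressed-file> <extent>...
# Compresses, decompresses, and compares against the original.
# NDZIP_COMPRESS (an assumption, not an ndzip setting) overrides the tool path.
roundtrip() {
    tool="${NDZIP_COMPRESS:-build/compress}"
    input="$1"
    shift
    "$tool" -n "$@" -i "$input" -o "$input.ndz" -t float &&
    "$tool" -d -n "$@" -i "$input.ndz" -o "$input.out" -t float &&
    cmp "$input" "$input.out"
}
```

For a 256 x 512 x 1024 grid stored in a file named `grid.f32`, this would be called as `roundtrip grid.f32 256 512 1024`. It returns a non-zero status if either step fails or the files differ.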

Running unit tests

Only available if tests were enabled during the build (-DNDZIP_BUILD_TEST=YES).

build/encoder_test
build/sycl_bits_test  # only if built with SYCL support
build/sycl_ubench     # GPU microbenchmarks, only if built with SYCL support
build/cuda_bits_test  # only if built with CUDA support

References

If you are using ndzip as part of your research, we kindly ask you to cite

Fabian Knorr, Peter Thoman, and Thomas Fahringer. "ndzip: A High-Throughput Parallel Lossless Compressor for Scientific Data". In 2021 Data Compression Conference (DCC), IEEE, 2021. [DOI] [Preprint PDF]
Fabian Knorr, Peter Thoman, and Thomas Fahringer. "ndzip-gpu: efficient lossless compression of scientific floating-point data on GPUs". In SC'21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ACM, 2021. [DOI] [Preprint PDF]
