This uses autotools. If you're compiling from git, run autogen.sh and then make. Otherwise, use ./configure && make.
To install, run sudo make install.
Run the pmmeter program to measure memory latency and throughput.
This software is derived from the NumaTOP source code. Please refer to https://github.com/intel/numatop
pmmeter is a microbenchmark that measures memory latency and bandwidth. It is derived from the mgen test program in the NumaTOP source code.
common.h is derived from the OptaneStudy project. Please refer to https://github.com/NVSL/OptaneStudy
It requires the following libraries and tools:
- numactl-devel or libnuma-dev(el)
- libncurses
- libpthread
- gcc (version 7 or later)
The recommended kernel version is the latest stable kernel, currently 4.15.
The minimum supported kernel version is 3.16.
For Haswell support on kernel 3.16, please also apply the perf patch kernel_patches/0001-perf-x86-Widen-Haswell-OFFCORE-mask.patch.
The patch can also be found at the following link: http://www.gossamer-threads.com/lists/linux/kernel/1964864
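The minimum kernel version stated above can be checked before building. The following is a minimal sketch, not part of this project: the helper name version_at_least is hypothetical, and it compares only the major and minor fields of the string printed by uname -r.

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeed if version HAVE is at least version WANT.
# Only the numeric major and minor fields are compared (hypothetical helper).
version_at_least() {
    have_major=$(printf '%s\n' "$1" | cut -d. -f1)
    have_minor=$(printf '%s\n' "$1" | cut -d. -f2 | sed 's/[^0-9].*//')
    want_major=$(printf '%s\n' "$2" | cut -d. -f1)
    want_minor=$(printf '%s\n' "$2" | cut -d. -f2 | sed 's/[^0-9].*//')
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
    else
        [ "${have_minor:-0}" -ge "${want_minor:-0}" ]
    fi
}

# Compare the running kernel against the documented minimum of 3.16.
if version_at_least "$(uname -r)" 3.16; then
    echo "kernel $(uname -r): meets the 3.16 minimum"
else
    echo "kernel $(uname -r): older than 3.16, not supported"
fi
```

The fields are compared numerically, so 3.9 is correctly treated as older than 3.16.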
src: pmmeter source code. pmmeter is a microbenchmark that generates memory accesses among CPUs and reports their runtime latency.
common: common code for all platforms.
intel: Intel platform-specific code.
test: test code.
kernel_patches: the required kernel patches.
NumaTOP is supported on Intel Xeon processors: 5500-series, 6500/7500-series, 5600-series, E7-x8xx-series, and E5-16xx/24xx/26xx/46xx-series.
E5-16xx/24xx/26xx/46xx-series processors should be updated to the latest CPU microcode (microcode must be 0x618+ or 0x70c+).
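The microcode requirement above can be checked on a running system. This is a rough sketch under stated assumptions: it assumes /proc/cpuinfo exposes a "microcode" field (as x86 Linux kernels do), the helper name microcode_at_least is hypothetical, and because this README does not say which threshold applies to which CPU model, the minimum is a variable you set yourself.

```shell
#!/bin/sh
# microcode_at_least CURRENT MINIMUM: succeed if CURRENT >= MINIMUM.
# Both arguments are hexadecimal strings such as 0x70c (hypothetical helper).
microcode_at_least() {
    [ "$(( $1 ))" -ge "$(( $2 ))" ]
}

# Revision reported for the first CPU; empty if the field is unavailable.
current=$(awk -F': ' '/^microcode/ { print $2; exit }' /proc/cpuinfo 2>/dev/null || true)

# Assumption: pick the threshold (0x618 or 0x70c) that applies to your CPU.
minimum=0x618

if [ -z "$current" ]; then
    echo "microcode revision not reported by /proc/cpuinfo"
elif microcode_at_least "$current" "$minimum"; then
    echo "microcode $current: at least $minimum"
else
    echo "microcode $current: older than $minimum, update recommended"
fi
```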
To learn about NumaTOP, please visit http://01.org/numatop
- print usage;
./pmmeter -h
- mount the pmem devices in App Direct mode on NUMA node 0 and NUMA node 1, respectively;
$ sudo mount -o dax /dev/pmem0p1 /mnt/pmem_fsdax0/  # on node 0
$ sudo mount -o dax /dev/pmem1p1 /mnt/pmem_fsdax1/  # on node 1
./pmmeter
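After mounting, it can be useful to confirm that the dax option actually took effect. The sketch below is an illustration, not part of pmmeter: the helper name dax_mounts is hypothetical, and it assumes the kernel lists the option in /proc/mounts as either "dax" or "dax=always", which varies with kernel version and filesystem.

```shell
#!/bin/sh
# dax_mounts FILE: print the mount points in a /proc/mounts-style FILE
# whose option field contains "dax" or "dax=always" (hypothetical helper).
dax_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "dax" || opts[i] == "dax=always") {
                print $2
                break
            }
        }
    }' "$1"
}

# List DAX mounts on the running system, if /proc/mounts is readable.
if [ -r /proc/mounts ]; then
    dax_mounts /proc/mounts
fi
```

If the two mount points from the commands above are missing from the output, the filesystem was probably mounted without DAX.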
- pattern name;
[[bw]_[r|s]_][load|store|ntload|nstore]_[clflush|clwb|clflushopt]_[fence|nofence]_[movq|movd|movb]_[64|128|256]
- bw: bandwidth
- r: random access, s: sequential access
- load: load, store: store, ntload: non-temporal load, nstore: non-temporal store
- clflush: clflush, clwb: clwb, clflushopt: clflushopt
- fence: with fence, nofence: without fence
- movq: mov quad word, movd: mov double word, movb: mov byte
- 64: 64 bytes, 128: 128 bytes, 256: 256 bytes
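The field layout above can be turned into a quick check for a pattern name before passing it to pmmeter. This is a minimal sketch under one assumption: the outer brackets are read as making the "bw_r_" or "bw_s_" prefix optional, while every other field is required. The helper name is_valid_pattern is hypothetical, and the check covers only the layout; whether pmmeter accepts every combination is not confirmed here.

```shell
#!/bin/sh
# is_valid_pattern NAME: succeed if NAME follows the field layout listed above.
# Assumption: the leading "bw_<r|s>_" prefix is optional, all other fields required.
is_valid_pattern() {
    printf '%s\n' "$1" | grep -Eq \
        '^(bw_[rs]_)?(load|store|ntload|nstore)_(clflush|clwb|clflushopt)_(fence|nofence)_mov[qdb]_(64|128|256)$'
}

# Try a few candidate names and report which ones fit the layout.
for name in bw_s_load_clwb_fence_movq_64 \
            nstore_clflushopt_nofence_movb_256 \
            bw_load_clwb_fence_movq_64; do
    if is_valid_pattern "$name"; then
        echo "$name: valid"
    else
        echo "$name: invalid"
    fi
done
```

The third name is rejected because the bw prefix is given without an access type (r or s).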