jevents is a C library that makes the Linux kernel perf interface easier to use from C programs. It also includes some example programs that use the library.
- Resolving symbolic event names using downloaded event files
- Reading performance counters from ring 3 (user space) in C programs
- Handling the perf ring buffer (for example to read memory addresses)
For more details see the API reference
cd jevents
make
sudo make install
Event lists must be downloaded before they can be used. Use the pmu-tools event_download.py script for this:
% event_download.py
The following example programs are included:
- listevents: List all named perf and JSON events
- showevent: Convert JSON name or perf alias to perf format and test with perf
- event-rmap: Map low level perf event to named high-level event
- addr: Profile a loadable test kernel with address profiling
- jstat: Simple perf-stat-like tool with JSON event resolution
Functions that access the JSON event data load the JSON file lazily on first
use. This can cause data races when multiple threads call jevents
functions. In such cases, load the event list up front from the main thread,
before any other thread uses jevents, by calling
read_events(NULL);
The rdpmc interface reads performance counters directly in the program, without entering the kernel.
The example below is very simplified. For a real benchmark you almost certainly want some warmup, multiple iterations, possibly context-switch filtering, and some filler code to avoid cache effects.
#include "rdpmc.h"
struct rdpmc_ctx ctx;
unsigned long long start, end;
if (rdpmc_open(PERF_COUNT_HW_CPU_CYCLES, &ctx) < 0) ... error ...
start = rdpmc_read(&ctx);
... your workload ...
end = rdpmc_read(&ctx);
For this to work, user-space RDPMC must be enabled: /sys/devices/cpu/rdpmc must be 1.
http://halobates.de/modern-pmus-yokohama.pdf provides some additional general information on cycle counting. The techniques used with simple-pmu described there can be used with jevents too.
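The warmup and multiple-iteration advice above can be sketched as a small harness. This is only an illustration, not part of jevents: `counter_fn`, `measure_min`, `fake_state`, `fake_read`, and `fake_work` are hypothetical names introduced here. The counter is passed in as a callback so the sketch can be exercised without PMU access, and it reports the minimum delta on the assumption that the fastest run is the least disturbed one.

```c
#include <stddef.h>

/* Hypothetical hook: returns the current value of some counter. */
typedef unsigned long long (*counter_fn)(void *ctx);

/*
 * Run `work` `warmup` times without measuring, then `iters` measured times,
 * and return the smallest observed counter delta (0 if iters is 0).
 */
static unsigned long long measure_min(counter_fn read_counter, void *ctx,
                                      void (*work)(void *), void *work_arg,
                                      size_t warmup, size_t iters)
{
    unsigned long long best = 0;
    size_t i;

    for (i = 0; i < warmup; i++)
        work(work_arg);                 /* warm caches, branch predictors, ... */

    for (i = 0; i < iters; i++) {
        unsigned long long start = read_counter(ctx);
        work(work_arg);
        unsigned long long delta = read_counter(ctx) - start;
        if (i == 0 || delta < best)
            best = delta;               /* keep the least disturbed run */
    }
    return best;
}

/* Stand-ins for a real counter and workload (assumptions for the demo). */
struct fake_state {
    unsigned long long now;   /* simulated counter value */
    unsigned long long cost;  /* simulated cost of the next run */
};

static unsigned long long fake_read(void *ctx)
{
    return ((struct fake_state *)ctx)->now;
}

/* Each run advances the counter; runs get cheaper, as if caches warm up. */
static void fake_work(void *arg)
{
    struct fake_state *s = arg;
    s->now += s->cost;
    if (s->cost > 10)
        s->cost -= 10;
}
```

In a real program the callback would presumably be a thin wrapper that casts `ctx` to `struct rdpmc_ctx *` and returns `rdpmc_read()` on it, with the workload in place of `fake_work`. Context-switch filtering and cache filler code are not covered by this sketch.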
Resolve named events to a perf event and set up reading from the perf ring buffer.
First run event_download.py to download a current event list for your CPU.
#include "jevents.h"
#include "rdpmc.h"
#include <linux/perf_event.h>
struct perf_event_attr attr;
if (resolve_event("cpu_clk_thread_unhalted.ref_xclk", &attr) < 0) {
... error ...
}
/* You can change attr, see the perf_event_open man page for details */
struct rdpmc_ctx ctx;
if (rdpmc_open_attr(&attr, &ctx, NULL) < 0) /* NULL: no group leader */
... error ...
Alternatively, use the resolved attr for sampling: set up the sampling attributes in attr, then use perf_fd_open and the perf_iter_* functions. See examples/addr.c.
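As a rough illustration of "setting up the sampling attributes", the sketch below fills in a few `struct perf_event_attr` fields documented in the perf_event_open man page. The helper name `setup_address_sampling` is hypothetical, and the choice of fields and of the period is an assumption; examples/addr.c may configure the event differently.

```c
#include <linux/perf_event.h>
#include <stdint.h>

/*
 * Turn an already resolved event attr into a sampling event.
 * The fields set here are illustrative assumptions, not the jevents defaults.
 */
static void setup_address_sampling(struct perf_event_attr *attr, uint64_t period)
{
    attr->freq = 0;                 /* interpret sample_period as a period */
    attr->sample_period = period;   /* one sample every `period` events */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR;
    attr->exclude_kernel = 1;       /* only sample user space */
    attr->wakeup_events = 1;        /* wake up a poller after every sample */
}
```

The helper would be called on the attr filled in by resolve_event(), before the event is opened. Whether PERF_SAMPLE_ADDR yields useful addresses depends on the event and on the CPU.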