ring-buffer: Call trace_clock_local() directly for RETPOLINE kernels
After doing some benchmarks and examining the code, I found that the ring
buffer clock calls were quite expensive, and noticed that they go through
retpolines. This is because the ring buffer clock is programmable: it is
called through the buffer->clock function pointer, and RETPOLINE builds
turn every such indirect call into a retpoline. But in most cases the clock
is simply the fastest ns unit clock, which is trace_clock_local(). For
RETPOLINE builds, checking if the ring buffer clock is set to
trace_clock_local() and then calling it directly has brought the average
time per event on my i7 box from 93 nanoseconds down to 83 nanoseconds, and
the minimum time from 81 nanoseconds to 68 nanoseconds!

Suggested-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
rostedt committed Jul 2, 2020
1 parent 74e8793 commit bbeba3e
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion kernel/trace/ring_buffer.c
@@ -970,8 +970,16 @@ __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu,

 static inline u64 rb_time_stamp(struct trace_buffer *buffer)
 {
+	u64 ts;
+
+	/* Skip retpolines :-( */
+	if (IS_ENABLED(CONFIG_RETPOLINE) && likely(buffer->clock == trace_clock_local))
+		ts = trace_clock_local();
+	else
+		ts = buffer->clock();
+
 	/* shift to debug/test normalization and TIME_EXTENTS */
-	return buffer->clock() << DEBUG_SHIFT;
+	return ts << DEBUG_SHIFT;
 }

 u64 ring_buffer_time_stamp(struct trace_buffer *buffer, int cpu)
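
The pattern in this change (compare the function pointer against the expected common target, and call that target directly when it matches) can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: `demo_buffer`, `local_clock_ns`, `other_clock_ns`, `fake_now`, and `DEMO_SHIFT` are hypothetical stand-ins for `struct trace_buffer`, `trace_clock_local()`, an alternative clock, a time source, and `DEBUG_SHIFT`. It leaves out the `IS_ENABLED(CONFIG_RETPOLINE)` and `likely()` parts of the real check.

```c
#include <stdint.h>

/* Hypothetical stand-in for the programmable clock callback type. */
typedef uint64_t (*clock_fn)(void);

/* Fake time source so the example is deterministic. */
static uint64_t fake_now;

/* Stand-in for the common, fast clock (trace_clock_local() in the patch). */
static uint64_t local_clock_ns(void)
{
	return ++fake_now;
}

/* Stand-in for any other clock a user might select. */
static uint64_t other_clock_ns(void)
{
	return 1000 + ++fake_now;
}

/* Stand-in for struct trace_buffer: only the clock pointer matters here. */
struct demo_buffer {
	clock_fn clock;
};

#define DEMO_SHIFT 0	/* assumed value; stands in for DEBUG_SHIFT */

static inline uint64_t demo_time_stamp(const struct demo_buffer *b)
{
	uint64_t ts;

	/*
	 * Fast path: when the pointer matches the common clock, call it by
	 * name so the compiler emits a direct call (and may inline it).
	 */
	if (b->clock == local_clock_ns)
		ts = local_clock_ns();
	else
		ts = b->clock();	/* indirect call for everything else */

	return ts << DEMO_SHIFT;
}
```

Both branches return the same value for a given clock; only the kind of call differs. The comparison costs one predictable branch, which is the trade the commit message describes as worthwhile on RETPOLINE builds.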