Skip to content

Latest commit

 

History

History
150 lines (128 loc) · 10.8 KB

performance_counters.rst

File metadata and controls

150 lines (128 loc) · 10.8 KB

Performance Counters

Ibex implements performance counters according to the RISC-V Privileged Specification, version 1.11 (see Hardware Performance Monitor, Section 3.1.11). The performance counters are placed inside the Control and Status Registers (CSRs) and can be accessed with the CSRRW(I) and CSRRS/C(I) instructions.

Ibex implements the clock cycle counter mcycle(h), the retired instruction counter minstret(h), as well as the 29 event counters mhpmcounter3(h) - mhpmcounter31(h) and the corresponding event selector CSRs mhpmevent3 - mhpmevent31, and the mcountinhibit CSR to individually enable/disable the counters. mcycle(h) and minstret(h) are always available and 64 bit wide. The mhpmcounter performance counters are optional (unavailable by default) and parametrizable in width.

Event Selector

The following events can be monitored using the performance counters of Ibex.

Event ID/Bit Event Name Event Description
0 NumCycles Number of cycles
2 NumInstrRet Number of instructions retired
3 NumCyclesLSU Number of cycles waiting for data memory
4 NumCyclesIF Cycles waiting for instruction fetches, i.e., number of instructions wasted due to non-ideal caching
5 NumLoads Number of data memory loads. Misaligned accesses are counted as two accesses
6 NumStores Number of data memory stores. Misaligned accesses are counted as two accesses
7 NumJumps Number of unconditional jumps (j, jal, jr, jalr)
8 NumBranches Number of branches (conditional)
9 NumBranchesTaken Number of taken branches (conditional)
10 NumInstrRetC Number of compressed instructions retired
11 NumCyclesMulWait Cycles waiting for multiply to complete
12 NumCyclesDivWait Cycles waiting for divide to complete

The event selector CSRs mhpmevent3 - mhpmevent31 define which of these events are counted by the event counters mhpmcounter3(h) - mhpmcounter31(h). If a specific bit in an event selector CSR is set to 1, this means that events with this ID are being counted by the counter associated with that selector CSR. If an event selector CSR is 0, this means that the corresponding counter is not counting any event.

Controlling the counters from software

By default, all available counters are enabled after reset. They can be individually enabled/disabled by overwriting the corresponding bit in the mcountinhibit CSR at address 0x320 as described in the RISC-V Privileged Specification, version 1.11 (see Machine Counter-Inhibit CSR, Section 3.1.13). In particular, to enable/disable mcycle(h), bit 0 must be written. For minstret(h), it is bit 2. For event counter mhpmcounterX(h), it is bit X.

The lower 32 bits of all counters can be accessed through the base register, whereas the upper 32 bits are accessed through the h-register. Reads to all these registers are non-destructive.

Parametrization at synthesis time

The mcycle(h) and minstret(h) counters are always available and 64 bit wide.

The event counters mhpmcounter3(h) - mhpmcounter31(h) are parametrizable. Their width can be parametrized between 1 and 64 bit through the WidthMHPMCounters parameter, which defaults to 40 bit wide counters.

The number of available event counters mhpmcounterX(h) can be controlled via the NumMHPMCounters parameter. By default (NumMHPMCounters set to 0), no counters are available to software. Set NumMHPMCounters to a value between 1 and 8 to make the counters mhpmcounter3(h) - mhpmcounter10(h) available as listed below. Setting NumMHPMCounters to values larger than 8 does not result in any more performance counters.

Unavailable counters always read 0.

The association of events with the mphmcounter registers is hardwired as listed in the following table.

Event Counter CSR Address Event ID/Bit Event Name
mcycle(h) 0xB00 (0xB80) 0 NumCycles
minstret(h) 0xB02 (0xB82) 2 NumInstrRet
mhpmcounter3(h) 0xB03 (0xB83) 3 NumCyclesLSU
mhpmcounter4(h) 0xB04 (0xB84) 4 NumCyclesIF
mhpmcounter5(h) 0xB05 (0xB85) 5 NumLoads
mhpmcounter6(h) 0xB06 (0xB86) 6 NumStores
mhpmcounter7(h) 0xB07 (0xB87) 7 NumJumps
mhpmcounter8(h) 0xB08 (0xB88) 8 NumBranches
mhpmcounter9(h) 0xB09 (0xB89) 9 NumBranchesTaken
mhpmcounter10(h) 0xB0A (0xB8A) 10 NumInstrRetC
mhpmcounter11(h) 0xB0B (0xB8B) 11 NumCyclesMulWait
mhpmcounter12(h) 0xB0C (0xB8C) 12 NumCyclesDivWait

Similarly, the event selector CSRs are hardwired as follows. The remaining event selector CSRs are tied to 0, i.e., no events are counted by the corresponding counters.

Event Selector CSR Address Reset Value Event ID/Bit
mhpmevent3(h) 0x323 0x0000_0008 3
mhpmevent4(h) 0x324 0x0000_0010 4
mhpmevent5(h) 0x325 0x0000_0020 5
mhpmevent6(h) 0x326 0x0000_0040 6
mhpmevent7(h) 0x327 0x0000_0080 7
mhpmevent8(h) 0x328 0x0000_0100 8
mhpmevent9(h) 0x329 0x0000_0200 9
mhpmevent10(h) 0x32A 0x0000_0400 10
mhpmevent11(h) 0x32B 0x0000_0800 11
mhpmevent12(h) 0x32C 0x0000_1000 12

FPGA Targets

For FPGA targets the performance counters constitute a particularily large structure. Implementing the maximum 29 event counters 32, 48 and 64 bit wide results in relative logic utilizations of the core of 100%, 111% and 129% respectively. The relative numbers of flip-flops are 100%, 125% and 150%. It is recommended to implement event counters of 32 bit width where possible.

For Xilinx FPGA devices featuring the DSP48E1 DSP slice or similar, counter logic can be absorbed into the DSP slice for widths up to 48 bits. The resulting relative logic utilizations with respect to the non-DSP 32 bit counter implementation are 83% and 89% respectively for 32 and 48 bit DSP counters. This comes at the expense of 1 DSP slice per counter. For 32 bit counters only, the corresponding flip-flops can be incorporated into the DSP's output pipeline register, resulting in a reduction of the number of flip-flops to 50%. In order to infer DSP slices for performance counters, define the preprocessor variable FPGA_XILINX.