⚠️ Rework hardware performance monitor (HPM) events (#811)
stnolting committed Feb 17, 2024
2 parents 772e0e2 + 2839ab4 commit d375493
Showing 9 changed files with 179 additions and 209 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -30,6 +30,7 @@ mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12

| Date | Version | Comment | Link |
|:----:|:-------:|:--------|:----:|
+ | 17.02.2024 | 1.9.5.3 | :warning: reworked CPU's hardware performance monitor (HPMs) events | [#811](https://github.com/stnolting/neorv32/pull/811) |
| 16.02.2024 | 1.9.5.2 | :warning: **revert** support for page faults (keep that in mmu branch for now) | [#809](https://github.com/stnolting/neorv32/pull/809) |
| 16.02.2024 | 1.9.5.1 | :sparkles: add two new generics to exclude certain PMP modes from synthesis | [#808](https://github.com/stnolting/neorv32/pull/808) |
| 16.02.2024 | [**:rocket:1.9.5**](https://github.com/stnolting/neorv32/releases/tag/v1.9.5) | **New release** | |
39 changes: 22 additions & 17 deletions docs/datasheet/cpu_csr.adoc
@@ -757,23 +757,28 @@ cycle even if more than one trigger event is observed.
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
- | Bit | Name [C] | R/W | Event Description
- | 0 | `HPMCNT_EVENT_CY` | r/w | active clock cycle (CPU not in sleep mode)
- | 1 | - | r/- | _not implemented, always read as zero_
- | 2 | `HPMCNT_EVENT_IR` | r/w | retired instruction (compressed or uncompressed)
- | 3 | `HPMCNT_EVENT_CIR` | r/w | retired compressed instruction
- | 4 | `HPMCNT_EVENT_WAIT_IF` | r/w | instruction fetch memory wait cycle
- | 5 | `HPMCNT_EVENT_WAIT_II` | r/w | instruction issue pipeline wait cycle
- | 6 | `HPMCNT_EVENT_WAIT_MC` | r/w | multi-cycle ALU operation wait cycle (like iterative shift operation)
- | 7 | `HPMCNT_EVENT_LOAD` | r/w | memory data load operation
- | 8 | `HPMCNT_EVENT_STORE` | r/w | memory data store operation
- | 9 | `HPMCNT_EVENT_WAIT_LS` | r/w | load/store memory wait cycle
- | 10 | `HPMCNT_EVENT_JUMP` | r/w | unconditional jump / jump-and-link
- | 11 | `HPMCNT_EVENT_BRANCH` | r/w | conditional branch (_taken_ or _not taken_)
- | 12 | `HPMCNT_EVENT_TBRANCH` | r/w | _taken_ conditional branch
- | 13 | `HPMCNT_EVENT_TRAP` | r/w | entered trap (synchronous exception or interrupt)
- | 14 | `HPMCNT_EVENT_ILLEGAL` | r/w | illegal instruction exception
- |=======================
+ | Bit | Name [C] | R/W | Event Description
+ 4+^| **RISC-V-compatible**
+ | 0 | `HPMCNT_EVENT_CY` | r/w | active clock cycle (CPU not in <<_sleep_mode>>)
+ | 1 | `HPMCNT_EVENT_TM` | r/- | _not implemented_, hardwired to zero
+ | 2 | `HPMCNT_EVENT_IR` | r/w | any executed instruction (16-bit/compressed or 32-bit/uncompressed)
+ 4+^| **NEORV32-specific**
+ | 3 | `HPMCNT_EVENT_COMPR` | r/w | any executed 16-bit/compressed (<<_c_isa_extension>>) instruction
+ | 4 | `HPMCNT_EVENT_WAIT_DIS` | r/w | instruction dispatch wait cycle (wait for instruction prefetch-buffer refill (<<_cpu_control_unit>> IPB);
+ caused by a fence instruction, a control flow transfer or an instruction fetch bus wait cycle)
+ | 5 | `HPMCNT_EVENT_WAIT_ALU` | r/w | any delay/wait cycle caused by a _multi-cycle_ <<_cpu_arithmetic_logic_unit>> operation
+ | 6 | `HPMCNT_EVENT_BRANCH` | r/w | any executed branch instruction (unconditional, conditional-taken or conditional-not-taken)
+ | 7 | `HPMCNT_EVENT_BRANCHED` | r/w | any control transfer operation (unconditional jump, taken conditional branch or trap entry/exit)
+ | 8 | `HPMCNT_EVENT_LOAD` | r/w | any executed load operation (including atomic memory operations, <<_a_isa_extension>>)
+ | 9 | `HPMCNT_EVENT_STORE` | r/w | any executed store operation (including atomic memory operations, <<_a_isa_extension>>)
+ | 10 | `HPMCNT_EVENT_WAIT_LSU` | r/w | any memory/bus/cache/etc. delay/wait cycle while executing any load or store operation (caused by a data bus wait cycle)
+ | 11 | `HPMCNT_EVENT_TRAP` | r/w | starting processing of any trap (<<_traps_exceptions_and_interrupts>>)
+ |=======================
+
+ .Instruction Retiring ("Retired == Executed")
+ [IMPORTANT]
+ The CPU HPM/counter logic treats all executed instructions as "retired" even if they raise an exception,
+ cause an interrupt, trigger a privilege mode change or were not meant to retire (by the RISC-V spec.).


{empty} +
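Reader's sketch (not part of the commit): the reworked table assigns one selector bit per event, and the surrounding datasheet text says a counter advances by one per cycle even if several selected events fire. A minimal C sketch of building such a selector mask from the new bit positions might look like the following. The enum mirrors the "Name [C]" column, but its definition here, `HPMCNT_EVENT_SIZE`, and both helper functions are illustrative assumptions, not the NEORV32 software API.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the reworked event table (v1.9.5.3).
 * ASSUMPTION: this enum is an illustration; the real NEORV32 headers
 * may define these names differently. */
enum hpmcnt_event {
    HPMCNT_EVENT_CY       = 0,  /* active clock cycle */
    HPMCNT_EVENT_TM       = 1,  /* not implemented, hardwired to zero */
    HPMCNT_EVENT_IR       = 2,  /* any executed instruction */
    HPMCNT_EVENT_COMPR    = 3,  /* executed compressed instruction */
    HPMCNT_EVENT_WAIT_DIS = 4,  /* instruction dispatch wait cycle */
    HPMCNT_EVENT_WAIT_ALU = 5,  /* multi-cycle ALU wait cycle */
    HPMCNT_EVENT_BRANCH   = 6,  /* executed branch instruction */
    HPMCNT_EVENT_BRANCHED = 7,  /* control transfer operation */
    HPMCNT_EVENT_LOAD     = 8,  /* executed load operation */
    HPMCNT_EVENT_STORE    = 9,  /* executed store operation */
    HPMCNT_EVENT_WAIT_LSU = 10, /* load/store wait cycle */
    HPMCNT_EVENT_TRAP     = 11, /* trap entry */
    HPMCNT_EVENT_SIZE     = 12  /* hypothetical; mirrors hpmcnt_event_size_c */
};

/* Hypothetical helper: OR one event into a selector mask. */
static inline uint32_t hpm_event_add(uint32_t mask, enum hpmcnt_event ev)
{
    assert((unsigned)ev < HPMCNT_EVENT_SIZE);
    return mask | (UINT32_C(1) << ev);
}

/* Hypothetical helper: drop the one bit the table marks as read-only zero. */
static inline uint32_t hpm_event_writable(uint32_t mask)
{
    return mask & ~(UINT32_C(1) << HPMCNT_EVENT_TM);
}
```

Selecting both `HPMCNT_EVENT_LOAD` and `HPMCNT_EVENT_STORE` this way would count every executed memory access with a single counter.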
83 changes: 34 additions & 49 deletions rtl/core/neorv32_cpu_control.vhd
@@ -132,13 +132,12 @@ architecture neorv32_cpu_control_rtl of neorv32_cpu_control is
-- instruction fetch engine --
type fetch_engine_state_t is (IF_RESTART, IF_REQUEST, IF_PENDING);
type fetch_engine_t is record
- state : fetch_engine_state_t;
- state_prev : fetch_engine_state_t;
- restart : std_ulogic; -- buffered restart request (after branch)
- pc : std_ulogic_vector(XLEN-1 downto 0);
- reset : std_ulogic; -- restart request (after branch)
- resp : std_ulogic; -- bus response
- priv : std_ulogic; -- fetch privilege level
+ state : fetch_engine_state_t;
+ restart : std_ulogic; -- buffered restart request (after branch)
+ pc : std_ulogic_vector(XLEN-1 downto 0);
+ reset : std_ulogic; -- restart request (after branch)
+ resp : std_ulogic; -- bus response
+ priv : std_ulogic; -- fetch privilege level
end record;
signal fetch_engine : fetch_engine_t;

@@ -190,8 +189,6 @@ architecture neorv32_cpu_control_rtl of neorv32_cpu_control is
type execute_engine_t is record
state : execute_engine_state_t;
state_nxt : execute_engine_state_t;
- state_prev : execute_engine_state_t;
- state_prev2 : execute_engine_state_t;
ir : std_ulogic_vector(31 downto 0);
ir_nxt : std_ulogic_vector(31 downto 0);
is_ci : std_ulogic; -- current instruction is de-compressed instruction
@@ -359,15 +356,11 @@ begin
fetch_engine_fsm: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
- fetch_engine.state <= IF_RESTART;
- fetch_engine.state_prev <= IF_RESTART;
- fetch_engine.restart <= '1'; -- set to reset IPB
- fetch_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
- fetch_engine.priv <= priv_mode_m_c; -- start in machine mode
+ fetch_engine.state <= IF_RESTART;
+ fetch_engine.restart <= '1'; -- set to reset IPB
+ fetch_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
+ fetch_engine.priv <= priv_mode_m_c; -- start in machine mode
elsif rising_edge(clk_i) then
- -- previous state (for HPMs only) --
- fetch_engine.state_prev <= fetch_engine.state;
-
-- restart request --
if (fetch_engine.state = IF_RESTART) then -- restart done
fetch_engine.restart <= '0';
@@ -620,25 +613,21 @@ begin
execute_engine_fsm_sync: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
- ctrl <= ctrl_bus_zero_c;
- execute_engine.state <= RESTART;
- execute_engine.state_prev <= RESTART;
- execute_engine.state_prev2 <= RESTART;
- execute_engine.ir <= (others => '0');
- execute_engine.is_ci <= '0';
- execute_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
- execute_engine.next_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
- execute_engine.link_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
+ ctrl <= ctrl_bus_zero_c;
+ execute_engine.state <= RESTART;
+ execute_engine.ir <= (others => '0');
+ execute_engine.is_ci <= '0';
+ execute_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
+ execute_engine.next_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
+ execute_engine.link_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
elsif rising_edge(clk_i) then
-- control bus --
ctrl <= ctrl_nxt;

-- execute engine arbiter --
- execute_engine.state <= execute_engine.state_nxt;
- execute_engine.state_prev <= execute_engine.state;
- execute_engine.state_prev2 <= execute_engine.state_prev;
- execute_engine.ir <= execute_engine.ir_nxt;
- execute_engine.is_ci <= execute_engine.is_ci_nxt;
+ execute_engine.state <= execute_engine.state_nxt;
+ execute_engine.ir <= execute_engine.ir_nxt;
+ execute_engine.is_ci <= execute_engine.is_ci_nxt;

-- current PC: address of instruction being executed --
if (execute_engine.pc_we = '1') then
@@ -2358,29 +2347,25 @@ begin
((csr.privilege = priv_mode_m_c) and (csr.mcyclecfg_minh = '0')) or -- not inhibited when in machine-mode
((csr.privilege = priv_mode_u_c) and (csr.mcyclecfg_uinh = '0')) -- not inhibited when in user-mode
) else '0';
- cnt_event(hpmcnt_event_ir_c) <= '1' when (execute_engine.state = EXECUTE) and ( -- retired (=executed) instruction
+ cnt_event(hpmcnt_event_tm_c) <= '0'; -- unused/reserved (time)
+ cnt_event(hpmcnt_event_ir_c) <= '1' when (execute_engine.state = EXECUTE) and ( -- retired (==executed) instruction
((csr.privilege = priv_mode_m_c) and (csr.minstretcfg_minh = '0')) or -- not inhibited when in machine-mode
((csr.privilege = priv_mode_u_c) and (csr.minstretcfg_uinh = '0')) -- not inhibited when in user-mode
) else '0';
- cnt_event(hpmcnt_event_tm_c) <= '0'; -- unused/reserved (time)

-- NEORV32-specific counter events (for HPM counters only) --
- cnt_event(hpmcnt_event_cir_c) <= '1' when (execute_engine.state = EXECUTE) and (execute_engine.is_ci = '1') else '0'; -- executed compressed instruction
- cnt_event(hpmcnt_event_wait_if_c) <= '1' when (fetch_engine.state = IF_PENDING) and (fetch_engine.state_prev = IF_PENDING) else '0'; -- instruction fetch memory wait cycle
- cnt_event(hpmcnt_event_wait_ii_c) <= '1' when (execute_engine.state = DISPATCH) and (execute_engine.state_prev = DISPATCH) else '0'; -- instruction issue wait cycle
- cnt_event(hpmcnt_event_wait_mc_c) <= '1' when (execute_engine.state = ALU_WAIT) else '0'; -- multi-cycle alu-operation wait cycle
-
- cnt_event(hpmcnt_event_load_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '0') else '0'; -- load operation
- cnt_event(hpmcnt_event_store_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '1') else '0'; -- store operation
- cnt_event(hpmcnt_event_wait_ls_c) <= '1' when (execute_engine.state = MEM_WAIT) and (execute_engine.state_prev2 = MEM_WAIT) else '0'; -- load/store memory wait cycle
-
- cnt_event(hpmcnt_event_jump_c) <= '1' when (execute_engine.state = BRANCH) and (execute_engine.ir(instr_opcode_lsb_c+2) = '1') else '0'; -- jump (unconditional)
- cnt_event(hpmcnt_event_branch_c) <= '1' when (execute_engine.state = BRANCH) and (execute_engine.ir(instr_opcode_lsb_c+2) = '0') else '0'; -- branch (conditional, taken or not taken)
- cnt_event(hpmcnt_event_tbranch_c) <= '1' when (execute_engine.state = BRANCHED) and (execute_engine.state_prev = BRANCH) and
- (execute_engine.ir(instr_opcode_lsb_c+2) = '0') else '0'; -- taken branch (conditional)
-
- cnt_event(hpmcnt_event_trap_c) <= '1' when (trap_ctrl.env_enter = '1') else '0'; -- entered trap
- cnt_event(hpmcnt_event_illegal_c) <= '1' when (trap_ctrl.env_enter = '1') and (trap_ctrl.cause = trap_iil_c) else '0'; -- illegal operation
+ cnt_event(hpmcnt_event_compr_c) <= '1' when (execute_engine.state = EXECUTE) and (execute_engine.is_ci = '1') else '0'; -- executed compressed instruction
+ cnt_event(hpmcnt_event_wait_dis_c) <= '1' when (execute_engine.state = DISPATCH) and (issue_engine.valid = "00") else '0'; -- instruction dispatch wait cycle
+ cnt_event(hpmcnt_event_wait_alu_c) <= '1' when (execute_engine.state = ALU_WAIT) else '0'; -- multi-cycle ALU co-processor wait cycle
+
+ cnt_event(hpmcnt_event_branch_c) <= '1' when (execute_engine.state = BRANCH) else '0'; -- executed branch instruction
+ cnt_event(hpmcnt_event_branched_c) <= '1' when (execute_engine.state = BRANCHED) else '0'; -- control flow transfer
+
+ cnt_event(hpmcnt_event_load_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '0') else '0'; -- executed load operation
+ cnt_event(hpmcnt_event_store_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '1') else '0'; -- executed store operation
+ cnt_event(hpmcnt_event_wait_lsu_c) <= '1' when (ctrl.lsu_req = '0') and (execute_engine.state = MEM_WAIT) else '0'; -- load/store unit memory wait cycle
+
+ cnt_event(hpmcnt_event_trap_c) <= '1' when (trap_ctrl.env_enter = '1') else '0'; -- entered trap


-- ****************************************************************************************************************************
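Reader's sketch (not part of the commit): after the rework, every NEORV32-specific event is a pure function of the current cycle's signals, which is why the `state_prev`/`state_prev2` registers could be removed. The C model below restates the new equations for bits 3 to 11 only, as a way to reason about them in a testbench or host-side checker. All type and function names are hypothetical, `ST_OTHER` stands in for states the diff does not mention, and the privilege-dependent bits 0 to 2 are deliberately left out because their full conditions are not visible in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Execute-engine states referenced by the new event logic (hypothetical C mirror). */
typedef enum {
    ST_DISPATCH, ST_EXECUTE, ST_ALU_WAIT, ST_BRANCH, ST_BRANCHED, ST_MEM_WAIT, ST_OTHER
} exec_state_t;

/* One-cycle snapshot of the signals the new equations read. */
typedef struct {
    exec_state_t state;       /* execute_engine.state */
    bool         is_ci;       /* execute_engine.is_ci */
    uint8_t      issue_valid; /* issue_engine.valid (2 bits) */
    bool         lsu_req;     /* ctrl.lsu_req */
    bool         lsu_rw;      /* ctrl.lsu_rw: false = load, true = store */
    bool         env_enter;   /* trap_ctrl.env_enter */
} cpu_snapshot_t;

/* Bit positions as listed in neorv32_package.vhd after this commit. */
enum {
    EV_COMPR = 3, EV_WAIT_DIS = 4, EV_WAIT_ALU = 5, EV_BRANCH = 6, EV_BRANCHED = 7,
    EV_LOAD = 8, EV_STORE = 9, EV_WAIT_LSU = 10, EV_TRAP = 11
};

static uint32_t bit_if(bool cond, unsigned pos)
{
    return cond ? (UINT32_C(1) << pos) : 0;
}

/* Returns the modeled cnt_event vector (bits 3..11 only) for one cycle. */
static uint32_t hpm_events(const cpu_snapshot_t *s)
{
    assert(s != NULL);
    uint32_t ev = 0;
    ev |= bit_if(s->state == ST_EXECUTE && s->is_ci, EV_COMPR);
    ev |= bit_if(s->state == ST_DISPATCH && (s->issue_valid & 0x3u) == 0, EV_WAIT_DIS);
    ev |= bit_if(s->state == ST_ALU_WAIT, EV_WAIT_ALU);
    ev |= bit_if(s->state == ST_BRANCH, EV_BRANCH);
    ev |= bit_if(s->state == ST_BRANCHED, EV_BRANCHED);
    ev |= bit_if(s->lsu_req && !s->lsu_rw, EV_LOAD);
    ev |= bit_if(s->lsu_req && s->lsu_rw, EV_STORE);
    ev |= bit_if(!s->lsu_req && s->state == ST_MEM_WAIT, EV_WAIT_LSU);
    ev |= bit_if(s->env_enter, EV_TRAP);
    return ev;
}
```

Note how the `lsu_req = '0'` term keeps the request cycle itself out of the wait-cycle count, replacing the old two-cycle state history.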
35 changes: 17 additions & 18 deletions rtl/core/neorv32_package.vhd
@@ -53,7 +53,7 @@ package neorv32_package is

-- Architecture Constants -----------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
- constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01090502"; -- hardware version
+ constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01090503"; -- hardware version
constant archid_c : natural := 19; -- official RISC-V architecture ID
constant XLEN : natural := 32; -- native data path width

@@ -688,25 +688,24 @@ package neorv32_package is
constant priv_mode_m_c : std_ulogic := '1'; -- machine mode
constant priv_mode_u_c : std_ulogic := '0'; -- user mode

- -- HPM Event System -----------------------------------------------------------------------
+ -- HPM Events -----------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
- constant hpmcnt_event_cy_c : natural := 0; -- Active cycle
- constant hpmcnt_event_tm_c : natural := 1; -- Time (unused/reserved)
- constant hpmcnt_event_ir_c : natural := 2; -- Retired instruction
- constant hpmcnt_event_cir_c : natural := 3; -- Retired compressed instruction
- constant hpmcnt_event_wait_if_c : natural := 4; -- Instruction fetch memory wait cycle
- constant hpmcnt_event_wait_ii_c : natural := 5; -- Instruction issue wait cycle
- constant hpmcnt_event_wait_mc_c : natural := 6; -- Multi-cycle ALU-operation wait cycle
- constant hpmcnt_event_load_c : natural := 7; -- Load operation
- constant hpmcnt_event_store_c : natural := 8; -- Store operation
- constant hpmcnt_event_wait_ls_c : natural := 9; -- Load/store memory wait cycle
- constant hpmcnt_event_jump_c : natural := 10; -- Unconditional jump
- constant hpmcnt_event_branch_c : natural := 11; -- Conditional branch (taken or not taken)
- constant hpmcnt_event_tbranch_c : natural := 12; -- Conditional taken branch
- constant hpmcnt_event_trap_c : natural := 13; -- Entered trap
- constant hpmcnt_event_illegal_c : natural := 14; -- Illegal instruction exception
+ -- RISC-V-compliant --
+ constant hpmcnt_event_cy_c : natural := 0; -- active cycle
+ constant hpmcnt_event_tm_c : natural := 1; -- time (unused/reserved)
+ constant hpmcnt_event_ir_c : natural := 2; -- retired instruction
+ -- NEORV32-specific --
+ constant hpmcnt_event_compr_c : natural := 3; -- executed compressed instruction
+ constant hpmcnt_event_wait_dis_c : natural := 4; -- instruction dispatch wait cycle
+ constant hpmcnt_event_wait_alu_c : natural := 5; -- multi-cycle ALU co-processor wait cycle
+ constant hpmcnt_event_branch_c : natural := 6; -- executed branch instruction
+ constant hpmcnt_event_branched_c : natural := 7; -- control flow transfer
+ constant hpmcnt_event_load_c : natural := 8; -- load operation
+ constant hpmcnt_event_store_c : natural := 9; -- store operation
+ constant hpmcnt_event_wait_lsu_c : natural := 10; -- load-store unit memory wait cycle
+ constant hpmcnt_event_trap_c : natural := 11; -- entered trap
--
- constant hpmcnt_event_size_c : natural := 15; -- length of this list
+ constant hpmcnt_event_size_c : natural := 12; -- length of this list

-- ****************************************************************************************************************************
-- Helper Functions
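Reader's sketch (not part of the commit): because the bit positions move (for example, load goes from 7 to 8 and trap from 13 to 11), an event mask written for v1.9.5.2 selects different events on v1.9.5.3. The C sketch below remaps an old mask to the new layout based on a side-by-side reading of the two constant lists. The mapping table and function name are assumptions made for illustration, not an official migration path: old jump and branch both fold into the new branch event, the old fetch and issue wait events are only approximated by the new dispatch wait event, and the old taken-branch and illegal-instruction events have no direct counterpart, so they are reported as dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Old bit index -> new bit index, or -1 where no counterpart exists.
 * ASSUMPTION: derived by comparing the removed and added constants above. */
static const int8_t hpm_old_to_new[15] = {
     0, /* cy      -> cy */
    -1, /* bit 1 was never implemented */
     2, /* ir      -> ir */
     3, /* cir     -> compr */
     4, /* wait_if -> wait_dis (approximation) */
     4, /* wait_ii -> wait_dis (approximation) */
     5, /* wait_mc -> wait_alu */
     8, /* load    -> load */
     9, /* store   -> store */
    10, /* wait_ls -> wait_lsu (condition changed) */
     6, /* jump    -> branch (now includes conditional branches) */
     6, /* branch  -> branch (now includes unconditional jumps) */
    -1, /* tbranch: no direct counterpart */
    11, /* trap    -> trap */
    -1  /* illegal: removed */
};
static_assert(sizeof(hpm_old_to_new) / sizeof(hpm_old_to_new[0]) == 15,
              "one entry per old event bit");

/* Remaps an old-layout mask; bits without a counterpart land in *dropped. */
static uint32_t hpm_remap_old_mask(uint32_t old_mask, uint32_t *dropped)
{
    uint32_t new_mask = 0, lost = 0;
    for (unsigned bit = 0; bit < 32; bit++) {
        if (!(old_mask & (UINT32_C(1) << bit)))
            continue;
        if (bit < 15 && hpm_old_to_new[bit] >= 0)
            new_mask |= UINT32_C(1) << hpm_old_to_new[bit];
        else
            lost |= UINT32_C(1) << bit;
    }
    if (dropped)
        *dropped = lost;
    return new_mask;
}
```

A non-zero `dropped` value signals that the measurement needs to be rethought rather than just renumbered.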