⚠️ Rework hardware performance monitor (HPM) events #811

Merged: 6 commits, Feb 17, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -30,6 +30,7 @@ mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12

| Date | Version | Comment | Link |
|:----:|:-------:|:--------|:----:|
| 17.02.2024 | 1.9.5.3 | :warning: reworked CPU's hardware performance monitor (HPMs) events | [#811](https://github.com/stnolting/neorv32/pull/811) |
| 16.02.2024 | 1.9.5.2 | :warning: **revert** support for page faults (keep that in mmu branch for now) | [#809](https://github.com/stnolting/neorv32/pull/809) |
| 16.02.2024 | 1.9.5.1 | :sparkles: add two new generics to exclude certain PMP modes from synthesis | [#808](https://github.com/stnolting/neorv32/pull/808) |
| 16.02.2024 | [**:rocket:1.9.5**](https://github.com/stnolting/neorv32/releases/tag/v1.9.5) | **New release** | |
39 changes: 22 additions & 17 deletions docs/datasheet/cpu_csr.adoc
@@ -757,23 +757,28 @@ cycle even if more than one trigger event is observed.
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Event Description
| 0 | `HPMCNT_EVENT_CY` | r/w | active clock cycle (CPU not in sleep mode)
| 1 | - | r/- | _not implemented, always read as zero_
| 2 | `HPMCNT_EVENT_IR` | r/w | retired instruction (compressed or uncompressed)
| 3 | `HPMCNT_EVENT_CIR` | r/w | retired compressed instruction
| 4 | `HPMCNT_EVENT_WAIT_IF` | r/w | instruction fetch memory wait cycle
| 5 | `HPMCNT_EVENT_WAIT_II` | r/w | instruction issue pipeline wait cycle
| 6 | `HPMCNT_EVENT_WAIT_MC` | r/w | multi-cycle ALU operation wait cycle (like iterative shift operation)
| 7 | `HPMCNT_EVENT_LOAD` | r/w | memory data load operation
| 8 | `HPMCNT_EVENT_STORE` | r/w | memory data store operation
| 9 | `HPMCNT_EVENT_WAIT_LS` | r/w | load/store memory wait cycle
| 10 | `HPMCNT_EVENT_JUMP` | r/w | unconditional jump / jump-and-link
| 11 | `HPMCNT_EVENT_BRANCH` | r/w | conditional branch (_taken_ or _not taken_)
| 12 | `HPMCNT_EVENT_TBRANCH` | r/w | _taken_ conditional branch
| 13 | `HPMCNT_EVENT_TRAP` | r/w | entered trap (synchronous exception or interrupt)
| 14 | `HPMCNT_EVENT_ILLEGAL` | r/w | illegal instruction exception
|=======================
| Bit | Name [C] | R/W | Event Description
4+^| **RISC-V-compatible**
| 0 | `HPMCNT_EVENT_CY` | r/w | active clock cycle (CPU not in <<_sleep_mode>>)
| 1 | `HPMCNT_EVENT_TM` | r/- | _not implemented_, hardwired to zero
| 2 | `HPMCNT_EVENT_IR` | r/w | any executed instruction (16-bit/compressed or 32-bit/uncompressed)
4+^| **NEORV32-specific**
| 3 | `HPMCNT_EVENT_COMPR` | r/w | any executed 16-bit/compressed (<<_c_isa_extension>>) instruction
| 4 | `HPMCNT_EVENT_WAIT_DIS` | r/w | instruction dispatch wait cycle (wait for instruction prefetch-buffer refill (<<_cpu_control_unit>> IPB);
caused by a fence instruction, a control flow transfer or an instruction fetch bus wait cycle)
| 5 | `HPMCNT_EVENT_WAIT_ALU` | r/w | any delay/wait cycle caused by a _multi-cycle_ <<_cpu_arithmetic_logic_unit>> operation
| 6 | `HPMCNT_EVENT_BRANCH` | r/w | any executed branch instruction (unconditional, conditional-taken or conditional-not-taken)
| 7 | `HPMCNT_EVENT_BRANCHED` | r/w | any control transfer operation (unconditional jump, taken conditional branch or trap entry/exit)
| 8 | `HPMCNT_EVENT_LOAD` | r/w | any executed load operation (including atomic memory operations, <<_a_isa_extension>>)
| 9 | `HPMCNT_EVENT_STORE` | r/w | any executed store operation (including atomic memory operations, <<_a_isa_extension>>)
| 10 | `HPMCNT_EVENT_WAIT_LSU` | r/w | any memory/bus/cache/etc. delay/wait cycle while executing any load or store operation (caused by a data bus wait cycle)
| 11 | `HPMCNT_EVENT_TRAP` | r/w | starting processing of any trap (<<_traps_exceptions_and_interrupts>>)
|=======================

.Instruction Retiring ("Retired == Executed")
[IMPORTANT]
The CPU HPM/counter logic treats all executed instructions as "retired" even if they raise an exception,
cause an interrupt, trigger a privilege mode change or were not meant to retire (by the RISC-V spec.).


{empty} +
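
Reviewer sketch (not part of the diff): the reworked event table in `docs/datasheet/cpu_csr.adoc` assigns one bit per event, so an event selection is a bitwise OR of one-hot masks. The C fragment below is a minimal illustration under stated assumptions. The enum values are copied from the table, but the enum type and both helper functions are hypothetical names, not NEORV32 library API. Treating bits 12 and above as unimplemented is also an assumption, based on `hpmcnt_event_size_c` being 12.

```c
#include <stdint.h>

/* Event bit positions copied from the reworked table (v1.9.5.3).
 * Redefined locally for illustration only; real software would take them
 * from the project's own headers. */
enum hpm_event_bit {
    HPMCNT_EVENT_CY       = 0,  /* active clock cycle */
    HPMCNT_EVENT_TM       = 1,  /* not implemented, hardwired to zero */
    HPMCNT_EVENT_IR       = 2,  /* any executed instruction */
    HPMCNT_EVENT_COMPR    = 3,  /* any executed compressed instruction */
    HPMCNT_EVENT_WAIT_DIS = 4,  /* instruction dispatch wait cycle */
    HPMCNT_EVENT_WAIT_ALU = 5,  /* multi-cycle ALU wait cycle */
    HPMCNT_EVENT_BRANCH   = 6,  /* any executed branch instruction */
    HPMCNT_EVENT_BRANCHED = 7,  /* any control transfer operation */
    HPMCNT_EVENT_LOAD     = 8,  /* any executed load operation */
    HPMCNT_EVENT_STORE    = 9,  /* any executed store operation */
    HPMCNT_EVENT_WAIT_LSU = 10, /* load/store wait cycle */
    HPMCNT_EVENT_TRAP     = 11  /* start of trap processing */
};

/* Hypothetical helper: one-hot mask for a single event bit. */
static uint32_t hpm_event_bit_mask(enum hpm_event_bit bit)
{
    return (uint32_t)1 << bit;
}

/* Hypothetical helper: drop bits that cannot be set according to the table.
 * Assumption: only bits 0..11 exist, and bit 1 (TM) is read-only zero. */
static uint32_t hpm_event_mask_writable(uint32_t mask)
{
    const uint32_t implemented = ((uint32_t)1 << 12) - 1; /* bits 0..11 */
    return mask & implemented & ~hpm_event_bit_mask(HPMCNT_EVENT_TM);
}
```

For example, `hpm_event_bit_mask(HPMCNT_EVENT_LOAD) | hpm_event_bit_mask(HPMCNT_EVENT_STORE)` selects both memory access events for one counter. Writing the resulting value to an event-selector CSR is platform code and is left out here.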
83 changes: 34 additions & 49 deletions rtl/core/neorv32_cpu_control.vhd
@@ -132,13 +132,12 @@ architecture neorv32_cpu_control_rtl of neorv32_cpu_control is
-- instruction fetch engine --
type fetch_engine_state_t is (IF_RESTART, IF_REQUEST, IF_PENDING);
type fetch_engine_t is record
state : fetch_engine_state_t;
state_prev : fetch_engine_state_t;
restart : std_ulogic; -- buffered restart request (after branch)
pc : std_ulogic_vector(XLEN-1 downto 0);
reset : std_ulogic; -- restart request (after branch)
resp : std_ulogic; -- bus response
priv : std_ulogic; -- fetch privilege level
state : fetch_engine_state_t;
restart : std_ulogic; -- buffered restart request (after branch)
pc : std_ulogic_vector(XLEN-1 downto 0);
reset : std_ulogic; -- restart request (after branch)
resp : std_ulogic; -- bus response
priv : std_ulogic; -- fetch privilege level
end record;
signal fetch_engine : fetch_engine_t;

@@ -190,8 +189,6 @@ architecture neorv32_cpu_control_rtl of neorv32_cpu_control is
type execute_engine_t is record
state : execute_engine_state_t;
state_nxt : execute_engine_state_t;
state_prev : execute_engine_state_t;
state_prev2 : execute_engine_state_t;
ir : std_ulogic_vector(31 downto 0);
ir_nxt : std_ulogic_vector(31 downto 0);
is_ci : std_ulogic; -- current instruction is de-compressed instruction
@@ -359,15 +356,11 @@ begin
fetch_engine_fsm: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
fetch_engine.state <= IF_RESTART;
fetch_engine.state_prev <= IF_RESTART;
fetch_engine.restart <= '1'; -- set to reset IPB
fetch_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
fetch_engine.priv <= priv_mode_m_c; -- start in machine mode
fetch_engine.state <= IF_RESTART;
fetch_engine.restart <= '1'; -- set to reset IPB
fetch_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
fetch_engine.priv <= priv_mode_m_c; -- start in machine mode
elsif rising_edge(clk_i) then
-- previous state (for HPMs only) --
fetch_engine.state_prev <= fetch_engine.state;

-- restart request --
if (fetch_engine.state = IF_RESTART) then -- restart done
fetch_engine.restart <= '0';
@@ -620,25 +613,21 @@ begin
execute_engine_fsm_sync: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
ctrl <= ctrl_bus_zero_c;
execute_engine.state <= RESTART;
execute_engine.state_prev <= RESTART;
execute_engine.state_prev2 <= RESTART;
execute_engine.ir <= (others => '0');
execute_engine.is_ci <= '0';
execute_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
execute_engine.next_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
execute_engine.link_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
ctrl <= ctrl_bus_zero_c;
execute_engine.state <= RESTART;
execute_engine.ir <= (others => '0');
execute_engine.is_ci <= '0';
execute_engine.pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
execute_engine.next_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
execute_engine.link_pc <= CPU_BOOT_ADDR(XLEN-1 downto 2) & "00"; -- 32-bit aligned boot address
elsif rising_edge(clk_i) then
-- control bus --
ctrl <= ctrl_nxt;

-- execute engine arbiter --
execute_engine.state <= execute_engine.state_nxt;
execute_engine.state_prev <= execute_engine.state;
execute_engine.state_prev2 <= execute_engine.state_prev;
execute_engine.ir <= execute_engine.ir_nxt;
execute_engine.is_ci <= execute_engine.is_ci_nxt;
execute_engine.state <= execute_engine.state_nxt;
execute_engine.ir <= execute_engine.ir_nxt;
execute_engine.is_ci <= execute_engine.is_ci_nxt;

-- current PC: address of instruction being executed --
if (execute_engine.pc_we = '1') then
@@ -2358,29 +2347,25 @@ begin
((csr.privilege = priv_mode_m_c) and (csr.mcyclecfg_minh = '0')) or -- not inhibited when in machine-mode
((csr.privilege = priv_mode_u_c) and (csr.mcyclecfg_uinh = '0')) -- not inhibited when in user-mode
) else '0';
cnt_event(hpmcnt_event_ir_c) <= '1' when (execute_engine.state = EXECUTE) and ( -- retired (=executed) instruction
cnt_event(hpmcnt_event_tm_c) <= '0'; -- unused/reserved (time)
cnt_event(hpmcnt_event_ir_c) <= '1' when (execute_engine.state = EXECUTE) and ( -- retired (==executed) instruction
((csr.privilege = priv_mode_m_c) and (csr.minstretcfg_minh = '0')) or -- not inhibited when in machine-mode
((csr.privilege = priv_mode_u_c) and (csr.minstretcfg_uinh = '0')) -- not inhibited when in user-mode
) else '0';
cnt_event(hpmcnt_event_tm_c) <= '0'; -- unused/reserved (time)

-- NEORV32-specific counter events (for HPM counters only) --
cnt_event(hpmcnt_event_cir_c) <= '1' when (execute_engine.state = EXECUTE) and (execute_engine.is_ci = '1') else '0'; -- executed compressed instruction
cnt_event(hpmcnt_event_wait_if_c) <= '1' when (fetch_engine.state = IF_PENDING) and (fetch_engine.state_prev = IF_PENDING) else '0'; -- instruction fetch memory wait cycle
cnt_event(hpmcnt_event_wait_ii_c) <= '1' when (execute_engine.state = DISPATCH) and (execute_engine.state_prev = DISPATCH) else '0'; -- instruction issue wait cycle
cnt_event(hpmcnt_event_wait_mc_c) <= '1' when (execute_engine.state = ALU_WAIT) else '0'; -- multi-cycle alu-operation wait cycle

cnt_event(hpmcnt_event_load_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '0') else '0'; -- load operation
cnt_event(hpmcnt_event_store_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '1') else '0'; -- store operation
cnt_event(hpmcnt_event_wait_ls_c) <= '1' when (execute_engine.state = MEM_WAIT) and (execute_engine.state_prev2 = MEM_WAIT) else '0'; -- load/store memory wait cycle

cnt_event(hpmcnt_event_jump_c) <= '1' when (execute_engine.state = BRANCH) and (execute_engine.ir(instr_opcode_lsb_c+2) = '1') else '0'; -- jump (unconditional)
cnt_event(hpmcnt_event_branch_c) <= '1' when (execute_engine.state = BRANCH) and (execute_engine.ir(instr_opcode_lsb_c+2) = '0') else '0'; -- branch (conditional, taken or not taken)
cnt_event(hpmcnt_event_tbranch_c) <= '1' when (execute_engine.state = BRANCHED) and (execute_engine.state_prev = BRANCH) and
(execute_engine.ir(instr_opcode_lsb_c+2) = '0') else '0'; -- taken branch (conditional)

cnt_event(hpmcnt_event_trap_c) <= '1' when (trap_ctrl.env_enter = '1') else '0'; -- entered trap
cnt_event(hpmcnt_event_illegal_c) <= '1' when (trap_ctrl.env_enter = '1') and (trap_ctrl.cause = trap_iil_c) else '0'; -- illegal operation
cnt_event(hpmcnt_event_compr_c) <= '1' when (execute_engine.state = EXECUTE) and (execute_engine.is_ci = '1') else '0'; -- executed compressed instruction
cnt_event(hpmcnt_event_wait_dis_c) <= '1' when (execute_engine.state = DISPATCH) and (issue_engine.valid = "00") else '0'; -- instruction dispatch wait cycle
cnt_event(hpmcnt_event_wait_alu_c) <= '1' when (execute_engine.state = ALU_WAIT) else '0'; -- multi-cycle ALU co-processor wait cycle

cnt_event(hpmcnt_event_branch_c) <= '1' when (execute_engine.state = BRANCH) else '0'; -- executed branch instruction
cnt_event(hpmcnt_event_branched_c) <= '1' when (execute_engine.state = BRANCHED) else '0'; -- control flow transfer

cnt_event(hpmcnt_event_load_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '0') else '0'; -- executed load operation
cnt_event(hpmcnt_event_store_c) <= '1' when (ctrl.lsu_req = '1') and (ctrl.lsu_rw = '1') else '0'; -- executed store operation
cnt_event(hpmcnt_event_wait_lsu_c) <= '1' when (ctrl.lsu_req = '0') and (execute_engine.state = MEM_WAIT) else '0'; -- load/store unit memory wait cycle

cnt_event(hpmcnt_event_trap_c) <= '1' when (trap_ctrl.env_enter = '1') else '0'; -- entered trap


-- ****************************************************************************************************************************
35 changes: 17 additions & 18 deletions rtl/core/neorv32_package.vhd
@@ -53,7 +53,7 @@ package neorv32_package is

-- Architecture Constants -----------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01090502"; -- hardware version
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01090503"; -- hardware version
constant archid_c : natural := 19; -- official RISC-V architecture ID
constant XLEN : natural := 32; -- native data path width

@@ -688,25 +688,24 @@ package neorv32_package is
constant priv_mode_m_c : std_ulogic := '1'; -- machine mode
constant priv_mode_u_c : std_ulogic := '0'; -- user mode

-- HPM Event System -----------------------------------------------------------------------
-- HPM Events -----------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant hpmcnt_event_cy_c : natural := 0; -- Active cycle
constant hpmcnt_event_tm_c : natural := 1; -- Time (unused/reserved)
constant hpmcnt_event_ir_c : natural := 2; -- Retired instruction
constant hpmcnt_event_cir_c : natural := 3; -- Retired compressed instruction
constant hpmcnt_event_wait_if_c : natural := 4; -- Instruction fetch memory wait cycle
constant hpmcnt_event_wait_ii_c : natural := 5; -- Instruction issue wait cycle
constant hpmcnt_event_wait_mc_c : natural := 6; -- Multi-cycle ALU-operation wait cycle
constant hpmcnt_event_load_c : natural := 7; -- Load operation
constant hpmcnt_event_store_c : natural := 8; -- Store operation
constant hpmcnt_event_wait_ls_c : natural := 9; -- Load/store memory wait cycle
constant hpmcnt_event_jump_c : natural := 10; -- Unconditional jump
constant hpmcnt_event_branch_c : natural := 11; -- Conditional branch (taken or not taken)
constant hpmcnt_event_tbranch_c : natural := 12; -- Conditional taken branch
constant hpmcnt_event_trap_c : natural := 13; -- Entered trap
constant hpmcnt_event_illegal_c : natural := 14; -- Illegal instruction exception
-- RISC-V-compliant --
constant hpmcnt_event_cy_c : natural := 0; -- active cycle
constant hpmcnt_event_tm_c : natural := 1; -- time (unused/reserved)
constant hpmcnt_event_ir_c : natural := 2; -- retired instruction
-- NEORV32-specific --
constant hpmcnt_event_compr_c : natural := 3; -- executed compressed instruction
constant hpmcnt_event_wait_dis_c : natural := 4; -- instruction dispatch wait cycle
constant hpmcnt_event_wait_alu_c : natural := 5; -- multi-cycle ALU co-processor wait cycle
constant hpmcnt_event_branch_c : natural := 6; -- executed branch instruction
constant hpmcnt_event_branched_c : natural := 7; -- control flow transfer
constant hpmcnt_event_load_c : natural := 8; -- load operation
constant hpmcnt_event_store_c : natural := 9; -- store operation
constant hpmcnt_event_wait_lsu_c : natural := 10; -- load-store unit memory wait cycle
constant hpmcnt_event_trap_c : natural := 11; -- entered trap
--
constant hpmcnt_event_size_c : natural := 15; -- length of this list
constant hpmcnt_event_size_c : natural := 12; -- length of this list

-- ****************************************************************************************************************************
-- Helper Functions
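
Reviewer sketch (not part of the diff): because the `hpmcnt_event_*_c` indices were renumbered, software written against the old event list needs its bit positions translated. The C function below is a hedged illustration of such a mapping, derived only from the old and new constant lists in this hunk. The function name and the `HPM_EVENT_NONE` sentinel are hypothetical. Mapping the old instruction issue wait event (5) to the new dispatch wait event (4) is an assumption, since the trigger condition changed. Old events whose meaning was merged or removed (fetch wait, jump, conditional branch, taken branch, illegal instruction) are deliberately reported as having no exact equivalent.

```c
#include <stdint.h>

#define HPM_EVENT_NONE (-1) /* hypothetical sentinel: no exact new equivalent */

/* Translate an event index from the pre-1.9.5.3 list to the reworked list. */
static int hpm_event_old_to_new(int old_bit)
{
    switch (old_bit) {
        case 0:  return 0;  /* cy      -> cy (unchanged) */
        case 1:  return 1;  /* tm      -> tm (unchanged, unused/reserved) */
        case 2:  return 2;  /* ir      -> ir (unchanged) */
        case 3:  return 3;  /* cir     -> compr (renamed) */
        case 5:  return 4;  /* wait_ii -> wait_dis (assumed closest match) */
        case 6:  return 5;  /* wait_mc -> wait_alu (renamed) */
        case 7:  return 8;  /* load    -> load (moved) */
        case 8:  return 9;  /* store   -> store (moved) */
        case 9:  return 10; /* wait_ls -> wait_lsu (renamed, moved) */
        case 13: return 11; /* trap    -> trap (moved) */
        default:
            /* wait_if (4), jump (10), branch (11), tbranch (12) and
             * illegal (14) have no event with the same meaning anymore. */
            return HPM_EVENT_NONE;
    }
}
```

A caller would check for `HPM_EVENT_NONE` and decide case by case whether one of the broader new events (`branch`, `branched`, `wait_dis`) is an acceptable substitute.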