Merge pull request #281 from stnolting/rework_pmp
⚠️ Rework physical memory protection (PMP) [NAPOT -> TOR]
stnolting authored Feb 27, 2022
2 parents 4fb9ab1 + 099ba20 commit d616301
Showing 27 changed files with 700 additions and 806 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -33,6 +33,7 @@ The version number is globally defined by the `hw_version_c` constant in the mai

| Date (*dd.mm.yyyy*) | Version | Comment |
|:----------:|:-------:|:--------|
| 26.02.2022 | 1.6.8.6 | :warning: :lock: **reworked Physical Memory Protection (PMP)**: replacing `NAPOT` mode by `TOR` mode and fixing several minor PMP CSR-access bugs; maximum number of PMP regions is now limited to 16 entries; :warning: removed **BUSKEEPER's NULL address check** (introduced in version `1.6.5.4`) - use a single PMP entry instead; see [PR #281](https://github.com/stnolting/neorv32/pull/281) |
| 25.02.2022 | 1.6.8.5 | minor BUSMUX (bus multiplexer for CPU's instruction and data buses) and CPU control edits (pipeline front-end) |
| 24.02.2022 | 1.6.8.4 | :bug: **fixed bug in `mip` CSR** (introduced in version `1.6.4.6` with [#236](https://github.com/stnolting/neorv32/pull/236)): to clear/ack a pending interrupt software needs to **clear** the according `mip` bit; see [PR #280](https://github.com/stnolting/neorv32/pull/280) |
| 24.02.2022 | 1.6.8.3 | reworked CPU's data path (use a few _wide_ multiplexers instead of many small ones); [PR #279](https://github.com/stnolting/neorv32/pull/279) |
87 changes: 43 additions & 44 deletions docs/datasheet/cpu.adoc
@@ -258,8 +258,9 @@ side-effects to maintain RISC-V compatibility.

.Physical Memory Protection
[IMPORTANT]
The physical memory protection (see section <<_machine_physical_memory_protection_csrs>>)
only supports the modes _OFF_ and _NAPOT_ yet and a minimal granularity of 8 bytes per region.
The RISC-V-compatible NEORV32 <<_machine_physical_memory_protection_csrs>> only implements the **TOR**
(top of region) mode and only up to 16 PMP regions. Furthermore, the <<_pmpcfg>>'s
_lock bits_ only lock the according PMP entry and not the entries below.
.Atomic Memory Operations
[IMPORTANT]
@@ -715,62 +716,60 @@ the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>.
==== **`PMP`** Physical Memory Protection
The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used
to constrain memory read/write/execute rights for each available privilege level.
The NEORV32 physical memory protection (PMP) provides an elementary memory protection mechanism that can be used
to constrain read, write and execute rights of arbitrary memory regions. The PMP is compatible
with the _RISC-V Privileged Architecture Specification_; see the corresponding sections of that specification for details.
The NEORV32 PMP only supports _NAPOT_ mode yet and a minimal region size (granularity) of 8 bytes. Larger
minimal sizes can be configured via the top `PMP_MIN_GRANULARITY` generic to reduce hardware requirements.
The physical memory protection system is implemented when the `PMP_NUM_REGIONS` configuration generic is >0.
In this case the following additional CSRs are available:
[IMPORTANT]
The NEORV32 PMP only supports **TOR** (top of region) mode, which basically is a "base-and-bound" concept, and only
up to 16 PMP regions.
* `pmpcfg*` (0..15, depending on configuration): PMP configuration registers
* `pmpaddr*` (0..63, depending on configuration): PMP address registers
The physical memory protection logic is implemented if the <<_pmp_num_regions>> configuration generic is greater
than zero. This generic also defines the total number of available configurable protection
regions. The minimal granularity of a protected region is defined by the <<_pmp_min_granularity>> generic. A larger
granularity reduces hardware complexity but makes the protection coarser, because the minimal region size increases.
The default granularity is 4 bytes, which is also the smallest configurable region size.
[TIP]
See section <<_machine_physical_memory_protection_csrs>> for more information regarding the PMP CSRs.
If implemented the PMP provides the following additional CSRs:
The actual number of regions and the minimal region granularity are defined via the top entity
`PMP_MIN_GRANULARITY` and `PMP_NUM_REGIONS` generics. `PMP_MIN_GRANULARITY` defines the minimal available
granularity of each region in bytes. `PMP_NUM_REGIONS` defines the total number of implemented regions and thus, the
number of available `pmpcfg*` and `pmpaddr*` CSRs.
* <<_pmpcfg>> 0..3 (depending on configuration): PMP configuration registers, 4 entries per CSR
* <<_pmpaddr>> 0..15 (depending on configuration): PMP address registers
When implementing more PMP regions than a _certain critical limit_ *an additional register stage
is automatically inserted* into the CPU's memory interfaces to reduce critical path length. Unfortunately, this will also
increase the latency of instruction fetches and data access by +1 cycle.
The critical limit can be adapted for custom use by a constant from the main VHDL package file
(`rtl/core/neorv32_package.vhd`). The default value is 8:
**Operation Summary**
[source,vhdl]
----
-- "critical" number of PMP regions --
constant pmp_num_regions_critical_c : natural := 8;
----
Any CPU access address (from the instruction fetch or data access interface) is tested if it matches _any_
of the specified PMP regions. If there is a match, the configured access rights are enforced:
**Operation**
* a write access (store) will fail if no **write** attribute is set
* a read access (load) will fail if no **read** attribute is set
* an instruction fetch access will fail if no **execute** attribute is set
Any CPU memory access address (from the instruction fetch or data access interface) is tested if it is accessing _any_
of the specified PMP regions (configured via `pmpaddr*` and enabled via `pmpcfg*`). If an
address matches one of these regions, the configured access rights (attributes in `pmpcfg*`) are enforced:
If an access to a protected region does not have the according access rights it will raise the according
instruction/load/store _bus access fault_ exception.
* a write access (store) will fail if no write attribute is set
* a read access (load) will fail if no read attribute is set
* an instruction fetch access will fail if no execute attribute is set
By default, all PMP checks are enforced for user-mode only. However, PMP rules can also be enforced for
machine-mode when the according PMP region has the "LOCK" bit set. This will also prevent any write access
to according region's PMP CSRs until the CPU is reset.
If an access to a protected region does not have the according access rights it will raise the according
instruction/load/store _access fault_ exception.
.PMP Example Program
[TIP]
A simple PMP example program can be found in `sw/example/demo_pmp`.
By default, all PMP checks are enforced for user-level programs only. If you wish to enforce the physical
memory protection also for machine-level programs you need to set the _locked bit_ in the according
`pmpcfg*` configuration CSR.
[IMPORTANT]
After updating the address configuration registers `pmpaddr*` the system requires up to 33 cycles for
internal (iterative) computations before the configuration becomes valid.
**Impact on Critical Path**
[NOTE]
For more information regarding RISC-V physical memory protection see the official _The RISC-V
Instruction Set Manual - Volume II: Privileged Architecture_ specifications.
When implementing more PMP regions than a "_certain critical limit_", an **additional register stage** is automatically
inserted into the CPU's memory interfaces to keep the impact on the critical path as small as possible.
Unfortunately, this also increases the latency of instruction fetches and data accesses by one cycle.
The _critical limit_ can be modified by a constant from the main VHDL package file
(`rtl/core/neorv32_package.vhd`, default value = 8):
[source,vhdl]
----
-- "critical" number of PMP regions --
constant pmp_num_regions_critical_c : natural := 8;
----
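
The matching and permission rules from the operation summary can be illustrated with a small host-side model.
This is only a sketch and not the RTL implementation: it assumes the TOR semantics of the RISC-V privileged
specification (region `i` spans from the previous entry's address up to, but excluding, its own address, and the
lowest-numbered matching entry wins), and it compares plain byte addresses for simplicity. All names
(`pmp_model_t`, `pmp_check`, the `PMP_*` macros) are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* pmpcfg entry bits (one 8-bit entry per region) */
#define PMP_R      0x01u /* bit 0: read permission */
#define PMP_W      0x02u /* bit 1: write permission */
#define PMP_X      0x04u /* bit 2: execute permission */
#define PMP_A_MASK 0x18u /* bits 4:3: mode field A */
#define PMP_A_TOR  0x08u /* A = 01: top-of-region mode */
#define PMP_L      0x80u /* bit 7: lock bit */

typedef enum { PMP_NO_MATCH, PMP_ALLOW, PMP_DENY } pmp_result_t;

/* Hypothetical software model of the PMP state (not a hardware register map) */
typedef struct {
  unsigned num_regions; /* models PMP_NUM_REGIONS (0..16) */
  uint8_t  cfg[16];     /* one configuration entry per region */
  uint32_t top[16];     /* exclusive upper bound, modelled as a byte address */
} pmp_model_t;

/* 'access' must be exactly one of PMP_R, PMP_W or PMP_X. */
static pmp_result_t pmp_check(const pmp_model_t *pmp, uint32_t addr,
                              uint8_t access, bool machine_mode) {
  assert(pmp->num_regions <= 16);
  for (unsigned i = 0; i < pmp->num_regions; i++) {
    if ((pmp->cfg[i] & PMP_A_MASK) != PMP_A_TOR) {
      continue; /* entry is OFF */
    }
    /* TOR: the lower bound is the previous entry's address (0 for entry 0) */
    const uint32_t base = (i == 0) ? 0u : pmp->top[i - 1];
    if ((addr >= base) && (addr < pmp->top[i])) {
      /* machine-mode accesses are only checked if the entry is locked */
      if (machine_mode && ((pmp->cfg[i] & PMP_L) == 0)) {
        return PMP_ALLOW;
      }
      return ((pmp->cfg[i] & access) != 0) ? PMP_ALLOW : PMP_DENY;
    }
  }
  return PMP_NO_MATCH; /* the policy for unmatched accesses is left to the caller */
}
```

With entry 0 set to `PMP_A_TOR | PMP_L`, no access rights and an upper bound of 4, the model rejects every access
to address zero, even from machine mode. This mirrors the changelog's suggestion to replace the removed BUSKEEPER
NULL address check with a single PMP entry.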
47 changes: 24 additions & 23 deletions docs/datasheet/cpu_csr.adoc
@@ -85,8 +85,8 @@ CSRs with the following notes ...
| 0x343 | <<_mtval>> | _CSR_MTVAL_ | r/- | Machine bad address or instruction | `R`
| 0x344 | <<_mip>> | _CSR_MIP_ | r/w | Machine interrupt pending register | `X`
6+^| **<<_machine_physical_memory_protection_csrs>>**
| 0x3a0 .. 0x3af | <<_pmpcfg, `pmpcfg0`>> .. <<_pmpcfg, `pmpcfg15`>> | _CSR_PMPCFG0_ .. _CSR_PMPCFG15_ | r/w | Physical memory protection config. for region 0..63 | `C`
| 0x3b0 .. 0x3ef | <<_pmpaddr, `pmpaddr0`>> .. <<_pmpaddr, `pmpaddr63`>> | _CSR_PMPADDR0_ .. _CSR_PMPADDR63_ | r/w | Physical memory protection addr. register region 0..63 |
| 0x3a0 .. 0x3a3 | <<_pmpcfg, `pmpcfg0`>> .. <<_pmpcfg, `pmpcfg3`>> | _CSR_PMPCFG0_ .. _CSR_PMPCFG3_ | r/w | Physical memory protection config. for region 0..15 | `C`
| 0x3b0 .. 0x3bf | <<_pmpaddr, `pmpaddr0`>> .. <<_pmpaddr, `pmpaddr15`>> | _CSR_PMPADDR0_ .. _CSR_PMPADDR15_ | r/w | Physical memory protection addr. register region 0..15 |
6+^| **<<_machine_counter_and_timer_csrs>>**
| 0xb00 | <<_mcycleh, `mcycle`>> | _CSR_MCYCLE_ | r/w | Machine cycle counter low word |
| 0xb02 | <<_minstreth, `minstret`>> | _CSR_MINSTRET_ | r/w | Machine instruction-retired counter low word |
@@ -476,43 +476,43 @@ interrupt-triggering processor module.
:sectnums:
==== Machine Physical Memory Protection CSRs

The available physical memory protection logic is configured via the _PMP_NUM_REGIONS_ and
_PMP_MIN_GRANULARITY_ top entity generics. _PMP_NUM_REGIONS_ defines the number of implemented
The available physical memory protection logic is configured via the <<_pmp_num_regions>> and
<<_pmp_min_granularity>> top entity generics. <<_pmp_num_regions>> defines the number of implemented
protection regions and thus, the availability of the according `pmpcfg*` and `pmpaddr*` CSRs.
See section <<_pmp_physical_memory_protection>> for more information.

[NOTE]
If trying to access a PMP-related CSR beyond _PMP_NUM_REGIONS_ **no illegal instruction
exception** is triggered. The according CSRs are read-only (writes are ignored) and always return zero.

[IMPORTANT]
The RISC-V-compatible NEORV32 physical memory protection only implements the **NAPOT**
(naturally aligned power-of-two region) mode yet with a minimal region granularity of 8 bytes.
If trying to access a PMP-related CSR beyond <<_pmp_num_regions>> **no illegal instruction
exception** is triggered. The according CSRs are read-only (writes are ignored) and always return zero.
However, any access beyond `pmpcfg3` or `pmpaddr15` (the highest CSRs, available when <<_pmp_num_regions>>
is at its maximum of 16) will raise an illegal instruction exception.


:sectnums!:
===== **`pmpcfg`**

[cols="4,27,>7"]
[frame="topbot",grid="none"]
|=======================
| 0x3a0 - 0x3af| **Physical memory protection configuration registers** | `pmpcfg0` - `pmpcfg15`
| 0x3a0 - 0x3a3| **Physical memory protection configuration registers** | `pmpcfg0` - `pmpcfg3`
3+| Reset value: _0x00000000_
3+| The `pmpcfg*` CSRs are compatible to the RISC-V specifications. They are used to configure the protected
regions, where each `pmpcfg*` CSR provides configuration bits for four regions. The following bits (for the
first PMP configuration entry) are implemented (all remaining bits are always zero and are read-only):
regions, where each `pmpcfg*` CSR provides configuration bits for four regions (8-bit per region).
The actual number of available `pmpcfg` CSRs and CSR entries is defined by the <<_pmp_num_regions>> generic.
|=======================

.Physical memory protection configuration register entry
.Physical memory protection configuration register entry (1 out of 4)
[cols="^1,^3,^1,<11"]
[options="header",grid="rows"]
|=======================
| Bit | RISC-V name | R/W | Function
| 7 | _L_ | r/w | lock bit, can only be cleared by CPU reset
| 6:5 | - | r/- | reserved, read as zero
| 4:3 | _A_ | r/w | mode configuration; only OFF (`00`) and NAPOT (`11`) are supported
| 2 | _X_ | r/w | execute permission
| 1 | _W_ | r/w | write permission
| 0 | _R_ | r/w | read permission
| 7 | `L` | r/w | lock bit, prevents further write accesses, also enforces access rights in machine-mode, can only be cleared by CPU reset
| 6:5 | - | r/- | reserved, read as zero
| 4:3 | `A` | r/w | mode configuration; only **OFF** (`00`) and **TOR** (`01`) modes are supported, any other value will map back to OFF/TOR
| 2 | `X` | r/w | execute permission
| 1 | `W` | r/w | write permission
| 0 | `R` | r/w | read permission
|=======================
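
Since each `pmpcfg*` CSR packs four 8-bit entries, software has to select the right CSR and byte lane for a given
region. The following helpers are a minimal sketch of that arithmetic, assuming the RV32 layout of the RISC-V
privileged specification (entry `i` lives in `pmpcfg(i/4)`, bits `8*(i%4)+7 : 8*(i%4)`). The function and macro
names are hypothetical and not part of the NEORV32 software library.

```c
#include <assert.h>
#include <stdint.h>

/* Implemented bits of one entry: L (7), A (4:3), X (2), W (1), R (0).
   Bits 6:5 are reserved and read as zero, so they are masked out here. */
#define PMP_CFG_IMPL_MASK 0x9Fu

/* Index of the pmpcfg CSR (0..3) that holds the entry of 'region' (0..15) */
static unsigned pmpcfg_csr_index(unsigned region) {
  assert(region < 16);
  return region / 4u;
}

/* Return 'csr_word' with the 8-bit entry of 'region' replaced by 'entry';
   the other three entries in the word are left untouched. */
static uint32_t pmpcfg_set_entry(uint32_t csr_word, unsigned region, uint8_t entry) {
  assert(region < 16);
  const unsigned shift = 8u * (region % 4u);
  const uint32_t mask  = (uint32_t)0xFFu << shift;
  const uint32_t field = (uint32_t)(entry & PMP_CFG_IMPL_MASK) << shift;
  return (csr_word & ~mask) | field;
}
```

On the target, the returned word would be written back to the selected CSR with a regular CSR write. The
read-modify-write style keeps the neighbouring entries in the same CSR intact.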


@@ -522,14 +522,15 @@ first PMP configuration entry) are implemented (all remaining bits are always ze
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|=======================
| 0x3b0 - 0x3ef| **Physical memory protection address registers** | `pmpaddr0` - `pmpaddr63`
| 0x3b0 - 0x3bf| **Physical memory protection address registers** | `pmpaddr0` - `pmpaddr15`
3+| Reset value: _UNDEFINED_
3+| The `pmpaddr*` CSRs are compatible to the RISC-V specifications. They are used to configure the PMP region's base
address and the region size.
address and the region size. Note that the two LSBs (`1:0`) of each `pmpaddr` register are hardwired to zero. Hence, the minimal
region size is 4 bytes. The actual number of available `pmpaddr` CSRs is defined by the <<_pmp_num_regions>> generic.
|=======================

[NOTE]
When configuring PMP make sure to set `pmpaddr*` before activating the according region via
When configuring the PMP make sure to set `pmpaddr*` before activating the according region via
`pmpcfg*`. When changing the PMP configuration, deactivate the according region via `pmpcfg*`
before modifying `pmpaddr*`.
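
The recommended ordering from the note above can be sketched as a three-step update. The example operates on a
plain struct that merely stands in for the CSRs, so it can be run on a host; on the target each assignment would be
a CSR write instead. `pmp_csrs_t` and `pmp_reconfigure` are hypothetical names, and the value written to
`pmpaddr*` is passed through unchanged because its encoding is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the PMP CSRs (assumption: not the real register access) */
typedef struct {
  uint32_t pmpcfg[4];   /* four 8-bit entries per word */
  uint32_t pmpaddr[16]; /* one address register per region */
} pmp_csrs_t;

static void pmp_reconfigure(pmp_csrs_t *csrs, unsigned region,
                            uint32_t addr_value, uint8_t cfg_entry) {
  assert(region < 16);
  const unsigned word  = region / 4u;
  const unsigned shift = 8u * (region % 4u);
  const uint32_t mask  = (uint32_t)0xFFu << shift;

  /* 1. deactivate the region first (entry = 0 selects mode OFF) */
  csrs->pmpcfg[word] &= ~mask;
  /* 2. update the address while the region is inactive */
  csrs->pmpaddr[region] = addr_value;
  /* 3. activate the region with its new configuration */
  csrs->pmpcfg[word] |= (uint32_t)cfg_entry << shift;
}
```

Note that this sequence cannot work for an entry whose lock bit is already set, since locked entries reject further
writes until the CPU is reset.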

7 changes: 4 additions & 3 deletions docs/datasheet/soc.adoc
@@ -490,7 +490,7 @@ See section <<_pmp_physical_memory_protection>> for more information.
[frame="all",grid="none"]
|======
| **PMP_NUM_REGIONS** | _natural_ | 0
3+| Total number of implemented protection regions (0..64). If this generic is zero, no physical memory
3+| Total number of implemented protection regions (0..16). If this generic is zero, no physical memory
protection logic will be implemented at all.
|======

@@ -501,8 +501,9 @@ protection logic will be implemented at all.
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **PMP_MIN_GRANULARITY** | _natural_ | 64*1024
3+| Minimal region granularity in bytes. Has to be a power of two. Has to be at least 8 bytes.
| **PMP_MIN_GRANULARITY** | _natural_ | 4
3+| Minimal region granularity in bytes. Has to be a power of two and has to be at least 4 bytes. A larger granularity
will reduce hardware utilization and impact on critical path but will also reduce the minimal region size.
|======


21 changes: 2 additions & 19 deletions docs/datasheet/soc_buskeeper.adoc
@@ -46,28 +46,11 @@ the _BUSKEEPER_ERR_FLAG_ bit remains zero (since the error signal is not trigger
the CPU's PMP logic).


**NULL Address Check**

The bus keeper can ensure that no accesses are permitted to NULL addresses (`addr = 0x00000000`). These kind of
access often occur when using uninitialized pointers. If the _BUSKEEPER_NULL_CHECK_EN_ bit is set, any access to
address zero (instruction fetch, load data, store data) will raise an according bus exception. This flag
automatically clears on a hardware reset.

If a NULL address access has been detected the _BUSKEEPER_ERR_FLAG_ flag is set and the _BUSKEEPER_ERR_TYPE_
flag is cleared indicating a "Device Error".

[NOTE]
Address 0 is normally used by the IMEM and contains boot code instructions that are executed _once_ right after
hardware reset. Hence, activating the bus keeper's NULL check in application code will not corrupt code execution
at all.


.BUSKEEPER register map (`struct NEORV32_BUSKEEPER`)
[cols="<2,<2,<4,^1,<4"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.3+<| `0xffffff7C` .3+<| `NEORV32_BUSKEEPER.CTRL` <|`0` _BUSKEEPER_ERR_TYPE_ ^| r/- | Bus error type, valid if _BUSKEEPER_ERR_FLAG_
<|`16` _BUSKEEPER_NULL_CHECK_EN_ ^| r/w <| Enable NULL address check when set
<|`31` _BUSKEEPER_ERR_FLAG_ ^| r/- <| Sticky error flag, clears after read or write access
.2+<| `0xffffff7C` .2+<| `NEORV32_BUSKEEPER.CTRL` <|`0` _BUSKEEPER_ERR_TYPE_ ^| r/- <| Bus error type, valid if _BUSKEEPER_ERR_FLAG_
<|`31` _BUSKEEPER_ERR_FLAG_ ^| r/c <| Sticky error flag, clears after read or write access
|=======================
37 changes: 9 additions & 28 deletions rtl/core/neorv32_bus_keeper.vhd
@@ -74,11 +74,8 @@ architecture neorv32_bus_keeper_rtl of neorv32_bus_keeper is
constant lo_abb_c : natural := index_size_f(buskeeper_size_c); -- low address boundary bit

-- Control register --
constant ctrl_err_type_c : natural := 0; -- r/-: error type LSB: 0=device error, 1=access timeout
constant ctrl_nul_check_en_c : natural := 16; -- r/w: enable NULL address check
constant ctrl_err_flag_c : natural := 31; -- r/c: bus error encountered, sticky; cleared by writing zero
--
signal ctrl_null_check_en : std_ulogic;
constant ctrl_err_type_c : natural := 0; -- r/-: error type LSB: 0=device error, 1=access timeout
constant ctrl_err_flag_c : natural := 31; -- r/c: bus error encountered, sticky; cleared by writing zero

-- error codes --
constant err_device_c : std_ulogic := '0'; -- device access error
@@ -88,9 +85,6 @@ architecture neorv32_bus_keeper_rtl of neorv32_bus_keeper is
signal err_flag : std_ulogic;
signal err_type : std_ulogic;

-- NULL address check --
signal null_check : std_ulogic;

-- access control --
signal acc_en : std_ulogic; -- module access enable
signal wren : std_ulogic; -- word write enable
@@ -124,26 +118,19 @@ begin
rw_access: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
ack_o <= '-';
data_o <= (others => '-');
ctrl_null_check_en <= '0'; -- required
err_flag <= '0'; -- required
err_type <= '0';
ack_o <= '-';
data_o <= (others => '-');
err_flag <= '0'; -- required
err_type <= '0';
elsif rising_edge(clk_i) then
-- bus handshake --
ack_o <= wren or rden;

-- write access --
if (wren = '1') then
ctrl_null_check_en <= data_i(ctrl_nul_check_en_c);
end if;

-- read access --
data_o <= (others => '0');
if (rden = '1') then
data_o(ctrl_err_type_c) <= err_type;
data_o(ctrl_nul_check_en_c) <= ctrl_null_check_en;
data_o(ctrl_err_flag_c) <= err_flag;
data_o(ctrl_err_type_c) <= err_type;
data_o(ctrl_err_flag_c) <= err_flag;
end if;
--
if (control.bus_err = '1') then -- sticky error flag
@@ -176,14 +163,11 @@ begin
control.timeout <= std_ulogic_vector(to_unsigned(max_proc_int_response_time_c, index_size_f(max_proc_int_response_time_c)+1));
if (bus_rden_i = '1') or (bus_wren_i = '1') then
control.pending <= '1';
if (null_check = '1') then -- invalid access to NULL address
control.bus_err <= '1';
end if;
end if;
-- access monitor: PENDING --
else
control.timeout <= std_ulogic_vector(unsigned(control.timeout) - 1); -- countdown timer
if (bus_err_i = '1') or (control.bus_err = '1') then -- error termination by bus system
if (bus_err_i = '1') then -- error termination by bus system
control.err_type <= err_device_c; -- device error
control.bus_err <= '1';
control.pending <= '0';
@@ -201,9 +185,6 @@ begin
end if;
end process keeper_control;

-- NULL address check --
null_check <= '1' when (ctrl_null_check_en = '1') and (or_reduce_f(addr_i) = '0') else '0';

-- signal bus error to CPU --
err_o <= control.bus_err;
