⚠️ Rework SYSINFO module #659

Merged
merged 11 commits into from
Jul 28, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -32,6 +32,7 @@ mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12

| Date (*dd.mm.yyyy*) | Version | Comment |
|:-------------------:|:-------:|:--------|
| 28.07.2023 | 1.8.7.3 | :warning: reworked **SYSINFO** module; clean-up address space layout; clean-up assertion notes; [#659](https://github.com/stnolting/neorv32/pull/659) |
| 27.07.2023 | 1.8.7.2 | :bug: make sure that IMEM/DMEM size is always a power of two; [#658](https://github.com/stnolting/neorv32/pull/658) |
| 27.07.2023 | 1.8.7.1 | :warning: remove `CUSTOM_ID` generic; cleanup and re-layout `NEORV32_SYSINFO.SOC` bits; (:bug:) fix gateway's generics (`positive` -> `natural` as these generics are allowed to be zero); [#657](https://github.com/stnolting/neorv32/pull/657) |
| 26.07.2023 | [**:rocket:1.8.7**](https://github.com/stnolting/neorv32/releases/tag/v1.8.7) | **New release** |
227 changes: 111 additions & 116 deletions docs/datasheet/cpu.adoc
@@ -166,9 +166,7 @@ CPU back-end for actual execution. Execution is conducted by a state-machine tha
includes the <<_control_and_status_registers_csrs>> as well as the trap controller.


// ####################################################################################################################
:sectnums:
=== Sleep Mode
==== Sleep Mode

The NEORV32 CPU provides a single sleep mode that can be entered to power down the core, reducing dynamic
power consumption. Sleep mode is entered by executing the `wfi` instruction. When in sleep mode, all CPU-internal
@@ -184,9 +182,7 @@ The CPU automatically wakes up from sleep mode if a debug session is started via
a simple `nop` when the CPU is _in_ debug-mode or during single-stepping.


// ####################################################################################################################
:sectnums:
=== Full Virtualization
==== Full Virtualization

Just like the RISC-V ISA, the NEORV32 aims to provide _maximum virtualization_ capabilities on CPU and SoC level to
allow a high standard of **execution safety**. The CPU supports **all** traps specified by the official RISC-V
@@ -197,6 +193,115 @@ out-of-order operations that might have to be reverted). This allows a defined a
at any time improving overall execution safety.


<<<
// ####################################################################################################################
:sectnums:
=== Bus Interface

The NEORV32 CPU provides separate instruction fetch and data access interfaces, making it a **Harvard Architecture**:
the instruction fetch interface (`i_bus_*` signals) is used for fetching instructions and the data access interface
(`d_bus_*` signals) is used to access data via load and store operations. Each of these interfaces can access an address
space of up to 2^32^ bytes (4 GiB).

The bus interface uses two custom interface types: `bus_req_t` is used to propagate the bus access **requests**. These
signals are driven by the _accessing_ device (i.e. the CPU core). `bus_rsp_t` is used to return the bus **response** and
is driven by the _accessed_ device or bus system (i.e. a processor-internal memory or IO device).

.Bus Interface - Request Bus (`bus_req_t`)
[cols="^1,^1,<6"]
[options="header",grid="rows"]
|=======================
| Signal | Width | Description
| `addr` | 32 | Access address (byte addressing)
| `data` | 32 | Write data
| `ben` | 4 | Byte-enable for each byte in `data`
| `we` | 1 | **Write** request trigger (single-shot)
| `re` | 1 | **Read** request trigger (single-shot)
| `src` | 1 | Access source (`0` = instruction fetch, `1` = load/store)
| `priv` | 1 | Set if privileged (M-mode) access
| `rvso` | 1 | Set if current access is a reservation-set operation (atomic `lr` or `sc` instruction)
|=======================

.Bus Interface - Response Bus (`bus_rsp_t`)
[cols="^1,^1,<6"]
[options="header",grid="rows"]
|=======================
| Signal | Width | Description
| `data` | 32 | Read data (single-shot)
| `ack` | 1 | Transfer acknowledge / success (single-shot)
| `err` | 1 | Transfer error / fail (single-shot)
|=======================
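
As a rough software analogy, the two interface types can be mirrored as plain C structs, for example in a host-side bus model. This is only a sketch: the field names follow the tables above, but the C representation, the `bus_apply_ben` helper and the assumed mapping of `ben` bit _i_ to data byte _i_ are illustrative assumptions, not part of the NEORV32 sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Request bus, driven by the accessing device (mirrors `bus_req_t`). */
typedef struct {
    uint32_t addr; /* access address (byte addressing) */
    uint32_t data; /* write data */
    uint8_t  ben;  /* byte-enable, one bit per byte of `data` (4 bits used) */
    bool     we;   /* write request trigger (single-shot) */
    bool     re;   /* read request trigger (single-shot) */
    bool     src;  /* false = instruction fetch, true = load/store */
    bool     priv; /* set if privileged (M-mode) access */
    bool     rvso; /* set for reservation-set operations (lr/sc) */
} bus_req_t;

/* Response bus, driven by the accessed device (mirrors `bus_rsp_t`). */
typedef struct {
    uint32_t data; /* read data (single-shot) */
    bool     ack;  /* transfer acknowledge / success (single-shot) */
    bool     err;  /* transfer error / fail (single-shot) */
} bus_rsp_t;

/* Hypothetical helper: merge the write data of a request into an existing
 * memory word, honoring the byte enables.
 * Assumption: `ben` bit i selects data bits 8*i+7 .. 8*i. */
uint32_t bus_apply_ben(uint32_t old_word, const bus_req_t *req) {
    assert((req->ben & 0xF0u) == 0u); /* only four byte enables exist */
    uint32_t mask = 0u;
    for (int i = 0; i < 4; i++) {
        if (req->ben & (1u << i)) {
            mask |= 0xFFu << (8 * i);
        }
    }
    return (old_word & ~mask) | (req->data & mask);
}
```

With `ben = 0x5` only bytes 0 and 2 of the stored word are replaced by the request's write data.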

.Processor Bus System
[NOTE]
This type of bus system is used throughout the entire NEORV32 processor to construct the <<_address_space>>.


:sectnums:
==== Bus Interface Protocol

Bus transactions are triggered entirely by the request bus. A new bus request is initiated either by the `re` signal
(= read request) or by the `we` signal (= write request). These signals are mutually exclusive. In case of a request,
the corresponding signal is high for exactly one clock cycle. The transaction is completed when the accessed device returns
a response via the response interface: `ack` is high for exactly one cycle if the transaction was completed successfully.
`err` is high for exactly one cycle if the transaction failed to complete. These two signals are also mutually exclusive.
In case of a read access the read data is returned together with the `ack` signal. Otherwise, the return data signal is
kept at all-zero, allowing wired-OR interconnection of all response buses.
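
The wired-OR property can be illustrated with a small C model. This is a minimal sketch under the assumption that every idle device drives an all-zero response. The `bus_rsp_or` function is a hypothetical name and the struct merely mirrors the `bus_rsp_t` signals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software mirror of the response bus signals. */
typedef struct {
    uint32_t data;
    bool     ack;
    bool     err;
} bus_rsp_t;

/* Combine the responses of `n` devices by OR-ing them together.
 * This only works because idle devices keep their response at all-zero. */
bus_rsp_t bus_rsp_or(const bus_rsp_t *rsp, size_t n) {
    bus_rsp_t out = { 0u, false, false };
    for (size_t i = 0; i < n; i++) {
        out.data |= rsp[i].data;
        out.ack = out.ack || rsp[i].ack;
        out.err = out.err || rsp[i].err;
    }
    assert(!(out.ack && out.err)); /* `ack` and `err` are mutually exclusive */
    return out;
}
```

Because at most one device responds to a given request, the combined response equals the response of that single device and no multiplexer select logic is required.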

The figure below shows three exemplary bus accesses:

[start=1]
. A read access to address `A_addr` returning `rdata` after several cycles (slow response; `ACK` arrives after several cycles).
. A write access to address `B_addr` writing `wdata` (fastest response; `ACK` arrives right in the next cycle).
. A failing read access to address `C_addr` (slow response; `ERR` arrives after several cycles).

.Three Exemplary Bus Transactions
image::bus_interface.png[700]

.Signal State
[NOTE]
All signals of the request bus interface (except for the read/write transfer triggers)
remain stable until the bus access is completed.
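
The completion rule of the protocol can be sketched as a cycle-by-cycle scan of the response bus. The following C model is an illustration only: `bus_wait`, `bus_status_t` and the idea of passing the sampled responses as an array are assumptions made for this example. The first array element is the response sampled in the cycle right after the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t data;
    bool     ack;
    bool     err;
} bus_rsp_t;

typedef enum { BUS_PENDING, BUS_OK, BUS_FAIL } bus_status_t;

/* Scan the per-cycle response samples that follow a single-shot request.
 * On completion, `*latency` holds the number of cycles after the request;
 * `*rdata` is only updated for an acknowledged access. */
bus_status_t bus_wait(const bus_rsp_t *trace, size_t n,
                      uint32_t *rdata, size_t *latency) {
    for (size_t i = 0; i < n; i++) {
        assert(!(trace[i].ack && trace[i].err)); /* mutually exclusive */
        if (trace[i].ack) {
            *rdata = trace[i].data;
            *latency = i + 1;
            return BUS_OK;
        }
        if (trace[i].err) {
            *latency = i + 1;
            return BUS_FAIL;
        }
    }
    return BUS_PENDING; /* no response within the observed cycles */
}
```

A write that is acknowledged in the cycle after the request yields a latency of 1, which corresponds to the fastest response described above.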


:sectnums:
==== Atomic Accesses

The load-reserved (`lr.w`) and store-conditional (`sc.w`) instructions from the <<_a_isa_extension>> execute as standard
load/store bus transactions but with the `rvso` ("reservation set operation") signal being set. It is the task of the
<<_reservation_set_controller>> to handle these LR/SC bus transactions accordingly.

.Reservation Set Controller
[NOTE]
See section <<_address_space>> / <<_reservation_set_controller>> for more information.

.Read-Modify-Write Operations
[IMPORTANT]
Read-modify-write operations (like an atomic swap / `amoswap.w`) are **not** supported. However, the NEORV32
<<_core_libraries>> provide an emulation wrapper for those unsupported instructions that is
based on LR/SC pairs. A demo program can be found in `sw/example/atomic_test`.
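
The general shape of such an emulation is a retry loop: load the old value, attempt a conditional store and repeat until the store succeeds. The sketch below is **not** the wrapper from the NEORV32 core libraries. It uses the portable C11 `atomic_compare_exchange_weak` function as a stand-in for the LR/SC pair, and the name `emulated_amoswap` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Emulated atomic swap: store `new_value` to `*addr` and return the
 * previous content. The load plays the role of `lr.w`, the
 * compare-exchange plays the role of `sc.w`. */
uint32_t emulated_amoswap(_Atomic uint32_t *addr, uint32_t new_value) {
    assert(addr != NULL);
    uint32_t old = atomic_load(addr);
    while (!atomic_compare_exchange_weak(addr, &old, new_value)) {
        /* The conditional store failed: `old` now holds the refreshed
         * memory content, so simply try again. */
    }
    return old;
}
```

The loop terminates as soon as no other agent modifies the location between the load and the conditional store.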

The figure below shows three exemplary bus accesses. For easier understanding, the current state of the reservation set
is shown as the `rvs_valid` signal.

[start=1]
. A load-reserved (LR) instruction using `addr` as address. This instruction returns the loaded data `rdata` and also
registers a reservation for the address `addr` (`rvs_valid` becomes set).
. A store-conditional (SC) instruction attempts to write `wdata1` to `addr`. This SC operation **succeeds**, so `wdata1`
is actually written to `addr`. The successful operation is indicated by a 1 being returned via the `rsp.data` signal
together with the `ack`. As the LR/SC sequence is completed, the registered reservation is invalidated (`rvs_valid` becomes cleared).
. Another store-conditional (SC) instruction attempts to write `wdata2` to `addr`. As the reservation set is already invalidated
(`rvs_valid` is `0`), the store access fails, so `wdata2` is **not** written to `addr`. The failed operation is indicated by a 0
being returned via the `rsp.data` signal together with the `ack`.

.Three Exemplary LR/SC Bus Transactions
image::bus_interface_atomic.png[700]

.SC Status
[NOTE]
The normal "load data" mechanism is used to return success/failure of the `sc.w` instruction to the CPU (via `rsp.data`).
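
The three transactions described above can be replayed with a small C model of a single reservation. This is a sketch under stated assumptions: the names `rvs_t`, `rvs_lr` and `rvs_sc` are hypothetical, memory is a plain word array, the reservation covers exactly one word, and every SC clears the reservation regardless of its outcome. The returned status values (1 = success, 0 = failure) follow the description of `rsp.data` in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* State of the single, global reservation. */
typedef struct {
    bool     valid; /* corresponds to `rvs_valid` */
    uint32_t addr;  /* reserved (word-aligned) address */
} rvs_t;

/* LR: return the loaded word and register a reservation for `addr`. */
uint32_t rvs_lr(rvs_t *rvs, const uint32_t *mem, uint32_t addr) {
    assert((addr & 3u) == 0u); /* word-aligned access */
    rvs->valid = true;
    rvs->addr = addr;
    return mem[addr / 4u];
}

/* SC: write `wdata` only if a matching reservation exists.
 * Returns the status reported via `rsp.data`: 1 = success, 0 = failure. */
uint32_t rvs_sc(rvs_t *rvs, uint32_t *mem, uint32_t addr, uint32_t wdata) {
    assert((addr & 3u) == 0u);
    bool ok = rvs->valid && (rvs->addr == addr);
    if (ok) {
        mem[addr / 4u] = wdata;
    }
    rvs->valid = false; /* assumption: any SC invalidates the reservation */
    return ok ? 1u : 0u;
}
```

Calling `rvs_lr` followed by two `rvs_sc` calls on the same address reproduces the sequence of the figure: the first SC succeeds and writes its data, the second one fails and leaves memory untouched.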


<<<
// ####################################################################################################################
:sectnums:
@@ -796,113 +901,3 @@ address misaligned" exception are not resumable in most cases. These exception m
For 32-bit-only instructions (= no `C` extension) the misaligned instruction exception is raised if bit 1 of the fetch
address is set (i.e. not on a 32-bit boundary). If the `C` extension is implemented there will never be a misaligned
instruction exception _at all_. In both cases bit 0 of the program counter (and all related CSRs) is hardwired to zero.
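
The alignment rule can be expressed as a small predicate. The C sketch below is purely illustrative: the function name `ifetch_misaligned` and the `c_ext` flag are assumptions for this example and do not correspond to identifiers in the CPU sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if fetching from `target` raises a misaligned instruction
 * exception. `c_ext` states whether the `C` extension is implemented. */
bool ifetch_misaligned(uint32_t target, bool c_ext) {
    uint32_t pc = target & ~1u;  /* bit 0 is hardwired to zero */
    assert((pc & 1u) == 0u);
    if (c_ext) {
        return false;            /* 16-bit alignment is always legal */
    }
    return (pc & 2u) != 0u;      /* bit 1 set: not on a 32-bit boundary */
}
```

For example, a jump to `0x00000102` is misaligned without the `C` extension and legal with it.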



25 changes: 13 additions & 12 deletions docs/datasheet/soc.adoc
@@ -468,12 +468,12 @@ image::address_space.png[900]
[options="header",grid="rows"]
|=======================
| # | Region | PMAs | Description
| 1 | Instruction address space | `rwx` | For instructions (=code) and constants. A configurable section of this address space can be used by the internal <<_instruction_memory_imem>>.
| 2 | Data address space | `rwx` | For application runtime data (heap, stack, etc.). A configurable section of this address space can be used by the internal <<_data_memory_dmem>>. Code can also be executed from data memory.
| 1 | Internal IMEM address space | `rwx` | For instructions (=code) and constants; mapped to the internal <<_instruction_memory_imem>>.
| 2 | Internal DMEM address space | `rwx` | For application runtime data (heap, stack, etc.); mapped to the internal <<_data_memory_dmem>>.
| 3 | Memory-mapped XIP flash | `r-x` | Memory-mapped access to the <<_execute_in_place_module_xip>> SPI flash.
| 4 | Bootloader address space | `r-x` | Read-only memory for the internal <<_bootloader_rom_bootrom>> containing the default <<_bootloader>>.
| 5 | IO/peripheral address space | `rwx` | Processor-internal peripherals / IO devices.
| 6 | The "void" | `rwx` | Unmapped address space. All accesses to these regions are redirected to the <<_processor_external_memory_interface_wishbone>>.
| 6 | The "**void**" | `rwx` | Unmapped address space. All accesses to these regions are redirected to the <<_processor_external_memory_interface_wishbone>> (if implemented).
|=======================

The CPU can access all of the 32-bit address space from the instruction fetch interface and also from the data access
@@ -503,14 +503,14 @@ customizable memory map implemented via VHDL constants in the main package file
[source,vhdl]
----
-- Main Address Regions ---
constant mem_ispace_base_c : std_ulogic_vector(31 downto 0) := x"00000000";
constant mem_dspace_base_c : std_ulogic_vector(31 downto 0) := x"80000000";
constant mem_xip_base_c : std_ulogic_vector(31 downto 0) := x"e0000000";
constant mem_xip_size_c : natural := 256*1024*1024;
constant mem_boot_base_c : std_ulogic_vector(31 downto 0) := x"ffffc000";
constant mem_boot_size_c : natural := 8*1024;
constant mem_io_base_c : std_ulogic_vector(31 downto 0) := x"ffffe000";
constant mem_io_size_c : natural := 8*1024;
constant mem_imem_base_c : std_ulogic_vector(31 downto 0) := x"00000000"; -- IMEM size via generic
constant mem_dmem_base_c : std_ulogic_vector(31 downto 0) := x"80000000"; -- DMEM size via generic
constant mem_xip_base_c : std_ulogic_vector(31 downto 0) := x"e0000000";
constant mem_xip_size_c : natural := 256*1024*1024;
constant mem_boot_base_c : std_ulogic_vector(31 downto 0) := x"ffffc000";
constant mem_boot_size_c : natural := 8*1024;
constant mem_io_base_c : std_ulogic_vector(31 downto 0) := x"ffffe000";
constant mem_io_size_c : natural := 8*1024;
----
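
To illustrate how these constants partition the address space, the following C sketch classifies an address into one of the regions. It is a host-side illustration, not the gateway implementation: the macro names are C renditions of the VHDL constants above, `region_t` and `decode` are hypothetical, and the IMEM/DMEM sizes are passed as parameters because they are configured via generics.

```c
#include <assert.h>
#include <stdint.h>

/* C renditions of the VHDL base/size constants shown above. */
#define MEM_IMEM_BASE 0x00000000u
#define MEM_DMEM_BASE 0x80000000u
#define MEM_XIP_BASE  0xE0000000u
#define MEM_XIP_SIZE  (256u * 1024u * 1024u)
#define MEM_BOOT_BASE 0xFFFFC000u
#define MEM_BOOT_SIZE (8u * 1024u)
#define MEM_IO_BASE   0xFFFFE000u
#define MEM_IO_SIZE   (8u * 1024u)

typedef enum {
    REGION_IMEM, REGION_DMEM, REGION_XIP,
    REGION_BOOT, REGION_IO, REGION_VOID
} region_t;

/* True if `addr` lies in [base, base + size). The unsigned subtraction
 * avoids overflow for regions that end at the top of the address space. */
static int in_range(uint32_t addr, uint32_t base, uint32_t size) {
    return (uint32_t)(addr - base) < size;
}

/* Classify `addr`; a size of zero means "memory not implemented". */
region_t decode(uint32_t addr, uint32_t imem_size, uint32_t dmem_size) {
    /* IMEM/DMEM sizes are expected to be powers of two (or zero). */
    assert((imem_size & (imem_size - 1u)) == 0u);
    assert((dmem_size & (dmem_size - 1u)) == 0u);
    if (in_range(addr, MEM_IMEM_BASE, imem_size)) return REGION_IMEM;
    if (in_range(addr, MEM_DMEM_BASE, dmem_size)) return REGION_DMEM;
    if (in_range(addr, MEM_XIP_BASE,  MEM_XIP_SIZE))  return REGION_XIP;
    if (in_range(addr, MEM_BOOT_BASE, MEM_BOOT_SIZE)) return REGION_BOOT;
    if (in_range(addr, MEM_IO_BASE,   MEM_IO_SIZE))   return REGION_IO;
    return REGION_VOID; /* unmapped: forwarded to the external interface */
}
```

With a 16kB IMEM, address `0x00004000` already falls into the void because it lies directly behind the end of the internal instruction memory.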

Besides the delegation of bus requests the gateway also implements a bus monitor (aka "the bus keeper") that tracks all
Expand Down Expand Up @@ -585,7 +585,8 @@ Context changes, interrupts, traps, etc. do not effect nor invalidate the reserv
The controller supports only a single global reservation set. By default this reservation set "monitors" a word-aligned
4-byte granule. However, the granularity can be customized via the `AMO_RVS_GRANULARITY` top entity generic (see
<<_processor_top_entity_generics>>) to cover an arbitrarily large naturally aligned address region. The only constraint is
that the size of the address region has to be a power of two.
that the size of the address region has to be a power of two. The configured granularity can be determined by software via
the <<_system_configuration_information_memory_sysinfo>> module.
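
Because the granule is a naturally aligned power-of-two region, checking whether two addresses share a reservation granule reduces to comparing their upper address bits. The C sketch below shows this under stated assumptions: the function name `rvs_same_granule` is hypothetical and the minimum granularity of 4 bytes is taken from the default described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addresses `a` and `b` lie within the same naturally aligned
 * granule of `granularity` bytes (must be a power of two, at least 4). */
bool rvs_same_granule(uint32_t a, uint32_t b, uint32_t granularity) {
    assert(granularity >= 4u);
    assert((granularity & (granularity - 1u)) == 0u); /* power of two */
    uint32_t mask = ~(granularity - 1u); /* keeps the granule-index bits */
    return ((a ^ b) & mask) == 0u;
}
```

With the default 4-byte granule, `0x100` and `0x103` match while `0x104` does not. With a 64-byte granule every address from `0x100` to `0x13F` matches.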

.Physical Memory Attributes
[NOTE]
10 changes: 5 additions & 5 deletions docs/datasheet/soc_bootrom.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,12 @@ This boot ROM module provides a read-only memory that contain the executable ima
is automatically set to the beginning of the bootloader ROM. See sections <<_address_space>> and
<<_boot_configuration>> for more information regarding the processor's different boot scenarios.

.Memory Size
[IMPORTANT]
If the configured boot ROM size is **not** a power of two the actual memory size will be auto-adjusted to
the next power of two (e.g. configuring a memory size of 6kB will result in a physical memory size of 8kB).
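
The size adjustment corresponds to rounding up to the next power of two. A minimal C sketch of that computation is shown below. The function name `next_pow2` is hypothetical and the code only illustrates the arithmetic, not the VHDL that performs the adjustment.

```c
#include <assert.h>
#include <stdint.h>

/* Round `size` (in bytes) up to the next power of two.
 * Sizes that already are a power of two are returned unchanged. */
uint32_t next_pow2(uint32_t size) {
    assert(size >= 1u && size <= 0x80000000u); /* result must fit 32 bits */
    uint32_t p = 1u;
    while (p < size) {
        p <<= 1;
    }
    return p;
}
```

For the example from the note, `next_pow2(6 * 1024)` yields `8 * 1024`, i.e. a configured size of 6kB results in a physical size of 8kB.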

.Bootloader Image
[IMPORTANT]
The boot ROM is initialized during synthesis with the default bootloader image
(`rtl/core/neorv32_bootloader_image.vhd`).


.Read-Only Access
[NOTE]
Any write access to the BOOTROM will raise a _store access fault_ exception.
5 changes: 2 additions & 3 deletions docs/datasheet/soc_dmem.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -16,9 +16,8 @@

Implementation of the processor-internal data memory is enabled via the processor's `MEM_INT_DMEM_EN`
generic. The size in bytes is defined via the `MEM_INT_DMEM_SIZE` generic. If the DMEM is implemented,
the memory is mapped into the data memory space and located right at the beginning of the data memory
space (default `dspace_base_c` = 0x80000000), see <<_address_space>>. The DMEM is always implemented
as true RAM.
it is mapped to base address `0x80000000` by default (see section <<_address_space>>).
The DMEM is always implemented as true RAM.

.Memory Size
[IMPORTANT]
3 changes: 1 addition & 2 deletions docs/datasheet/soc_imem.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,7 @@

Implementation of the processor-internal instruction memory is enabled via the processor's
`MEM_INT_IMEM_EN` generic. The size in bytes is defined via the `MEM_INT_IMEM_SIZE` generic. If the
IMEM is implemented, the memory is mapped into the instruction memory space and located right at the
beginning of the instruction memory space (default `ispace_base_c` = 0x00000000), see <<_address_space>>.
IMEM is implemented, it is mapped to base address `0x00000000` by default (see section <<_address_space>>).

By default the IMEM is implemented as true RAM so the content can be modified during run time. This is
required when using a bootloader that can update the content of the IMEM at any time. If you do not need