OrangeCrab #98

Merged
merged 6 commits into from
Jul 7, 2021

umarcor
Collaborator

@umarcor umarcor commented Jul 4, 2021

This is work in progress for supporting ECP5 devices, and specifically the OrangeCrab board.

@jeremyherbert, can you please have a look? Currently, the implementation is successful and a bitstream is generated in CI. However, please, do NOT try that bitstream yet. First, we need to ensure that the design is correct for your board:

Jeremy, can you review neorv32_OrangeCrab_BoardTop_MinimalBoot.vhd and enhance it so that it makes sense (the UART works)? The command for generating the bitstream is make -C setups/examples BOARD=OrangeCrab MinimalBoot.

If you instantiate hard IP (such as a PLL), we need to add setups/osflow/devices/sb_ecp5_components.* sources. That might need overriding DEVICE_SRC, when calling setups/OrangeCrab/Makefile from setups/examples/Makefile. Please, let me know if you need help with that.

This was referenced Jul 4, 2021
@umarcor umarcor force-pushed the orangecrab branch 2 times, most recently from 45d48a3 to 203b628 Compare July 4, 2021 20:15
@umarcor
Collaborator Author

umarcor commented Jul 4, 2021

This PR is now rebased on top of #99. The last three commits in this PR are the changes required for adding any new board or a new example for an already supported board.

@jeremyherbert
Contributor

I had a little bit of time to look at this. Most of the orangecrab r0.2 boards in the wild (including mine) use the LFE5U-25F-8MG285C device, so the correct PNRFLAGS would be:

PNRFLAGS    ?= --25k --package CSFBGA285 --ignore-loops --timing-allow-fail

At this point, the PnR fails because all of the dual port RAM blocks are used up in this device:

...
Info: Promoting globals...
Info:     promoting clock net OrangeCrab_CLK$TRELLIS_IO_IN to global network
Info: Checksum: 0x6e8541b6

Info: Annotating ports with timing budgets for target frequency 12.00 MHz
Info: Checksum: 0x9e07b48a

Info: Device utilisation:
Info: 	       TRELLIS_SLICE:  3841/12144    31%
Info: 	          TRELLIS_IO:    10/  197     5%
Info: 	                DCCA:     1/   56     1%
Info: 	              DP16KD:    66/   56   117%
Info: 	          MULT18X18D:     0/   28     0%
Info: 	              ALU54B:     0/   14     0%
Info: 	             EHXPLLL:     0/    2     0%
Info: 	             EXTREFB:     0/    1     0%
Info: 	                DCUA:     0/    1     0%
Info: 	           PCSCLKDIV:     0/    2     0%
Info: 	             IOLOGIC:     0/  128     0%
Info: 	            SIOLOGIC:     0/   69     0%
Info: 	                 GSR:     0/    1     0%
Info: 	               JTAGG:     0/    1     0%
Info: 	                OSCG:     0/    1     0%
Info: 	               SEDGA:     0/    1     0%
Info: 	                 DTR:     0/    1     0%
Info: 	             USRMCLK:     0/    1     0%
Info: 	             CLKDIVF:     0/    4     0%
Info: 	           ECLKSYNCB:     0/   10     0%
Info: 	             DLLDELD:     0/    8     0%
Info: 	              DDRDLL:     0/    4     0%
Info: 	             DQSBUFM:     0/    8     0%
Info: 	     TRELLIS_ECLKBUF:     0/    8     0%
Info: 	        ECLKBRIDGECS:     0/    2     0%

Info: Placed 10 cells based on constraints.
ERROR: Unable to place cell 'neorv32_inst.neorv32_inst.neorv32_int_dmem_inst_true.neorv32_int_dmem_inst.1434.1.0.0', no BELs remaining to implement cell type 'DP16KD'
0 warnings, 1 error
ecppack neorv32_OrangeCrab_r02-25F_MinimalBoot.cfg neorv32_OrangeCrab_r02-25F_MinimalBoot.bit
Failed to open input file
make[3]: *** [PnR_Bit.mk:9: neorv32_OrangeCrab_r02-25F_MinimalBoot.bit] Error 1
make[3]: Leaving directory '/home/jeremy/Desktop/umarcor/neorv32/setups/osflow'
make[2]: *** [Makefile:17: run] Error 2
make[2]: Leaving directory '/home/jeremy/Desktop/umarcor/neorv32/setups/examples'
make[1]: *** [Makefile:50: OrangeCrab] Error 2
make[1]: Leaving directory '/home/jeremy/Desktop/umarcor/neorv32/setups/examples'
make: *** [Makefile:64: MinimalBoot] Error 2
make: Leaving directory '/home/jeremy/Desktop/umarcor/neorv32/setups/examples'

I tried to find the best place to adjust this size, but my VHDL is very limited (I'm more of a Verilog guy).
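
For readers hitting the same failure, the over-subscribed resource can be picked out of the nextpnr utilisation table mechanically. Below is a minimal Python sketch, assuming the `Info: <name>: <used>/<available> <percent>%` line layout seen in the log above. The function name `overused` and the regular expression are illustrative and not part of nextpnr or this repository.

```python
import re

# Matches lines such as "Info:   DP16KD:    66/   56   117%".
# The layout is assumed from the log excerpt in this thread.
UTIL_RE = re.compile(r"Info:\s+(\w+):\s+(\d+)/\s*(\d+)\s+\d+%")

def overused(log_text):
    """Return {resource: (used, available)} for rows where used > available."""
    result = {}
    for line in log_text.splitlines():
        match = UTIL_RE.search(line)
        if match:
            name = match.group(1)
            used, avail = int(match.group(2)), int(match.group(3))
            if used > avail:
                result[name] = (used, avail)
    return result

# A few rows copied from the report above.
sample = """\
Info: Device utilisation:
Info: \t       TRELLIS_SLICE:  3841/12144    31%
Info: \t              DP16KD:    66/   56   117%
Info: \t          MULT18X18D:     0/   28     0%
"""
print(overused(sample))
```

On the rows above, only the DP16KD entry is reported, which is the block RAM shortage that stops placement.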

@jeremyherbert
Contributor

I should also add that this is obviously an issue with the inferred RAMs being way too large; the 25F device should have plenty of RAM. See specs here: https://1bitsquared.com/products/orangecrab

@umarcor
Collaborator Author

umarcor commented Jul 5, 2021

@jeremyherbert, you are correct. The default sizes of dmem and imem (64*1024) are too large. Those are to be used with the SPRAM hard blocks in UP5K devices. I updated the instantiation in neorv32_OrangeCrab_BoardTop_MinimalBoot for setting those to 32*1024. It now seems to succeed.

the 25F device should have plenty of RAM

The 25F has 126KB of BRAM, which is very slightly below the default 128KB in the MinimalBoot template. That is still much more than the 2KB minimum we have run tests with. Hence, I set both to 32KB arbitrarily.

@umarcor
Collaborator Author

umarcor commented Jul 5, 2021

I tried to find the best place to adjust this size, but my VHDL is very limited (I'm more of a verilog guy).

See https://github.com/stnolting/neorv32/blob/master/rtl/templates/processor/neorv32_ProcessorTop_MinimalBoot.vhd#L42-L91. Those are the generic parameters defined in the template (entity) we are using.

When instantiating that template/entity, BoardTop files typically override the clock frequency only (see https://github.com/stnolting/neorv32/blob/master/setups/examples/neorv32_Fomu_BoardTop_MinimalBoot.vhd#L133), but you can override any of those parameters: https://github.com/stnolting/neorv32/pull/98/files#diff-6b40cb886766bdba185127a98b9d943808cbe81c670ff6ff8cbf3423fa40ed0bR80-R84.

@umarcor
Collaborator Author

umarcor commented Jul 5, 2021

The OrangeCrab has an external DDR with 128 MB. Hence, the mid-term goal should be to append a controller to the external bus of the NEORV32 and have the IMEM and/or DMEM assigned there.

Note that we can have mixed-language designs. Therefore, we might use the same strategy as in antonblanchard/microwatt: they use litedram (see enjoy-digital/litedram).

@stnolting
Owner

This is work in progress for supporting ECP5 devices, and specifically the OrangeCrab board.

Awesome work! 👍

@jeremyherbert
Thanks for testing this! Btw, the utilization report is interesting since this is the first time somebody (as far as I know) has used an ECP5 FPGA.
I am wondering: ECP5 provides distributed RAM, I think. Is this included in TRELLIS_SLICE somehow? Or did the synthesis not use distributed RAM at all? 🤔

@umarcor

The 25F has 126KB of BRAM, which is very slightly below the default 128KB in the MinimalBoot template. That is still much more than the 2KB minimum we have run tests with. Hence, I set both to 32KB arbitrarily.

As far as I know, sysMEM primitives in the ECP5 are 2 KB each (plus extra bits for something like ECC, but I think synthesis cannot use these extra bits for anything meaningful here). According to the report, Jeremy's FPGA provides 56 of these blocks, making 56*2 = 112 kB. 🤔

When instantiating that template/entity, BoardTop files typically override the clock frequency only (see https://github.com/stnolting/neorv32/blob/master/setups/examples/neorv32_Fomu_BoardTop_MinimalBoot.vhd#L133), but you can override any of those parameters: https://github.com/stnolting/neorv32/pull/98/files#diff-6b40cb886766bdba185127a98b9d943808cbe81c670ff6ff8cbf3423fa40ed0bR80-R84.

Thanks for clearing this up. @jeremyherbert The core's configuration parameters for memory size are:

MEM_INT_IMEM_SIZE : natural := 16*1024; -- size of processor-internal instruction memory in bytes

and
MEM_INT_DMEM_SIZE : natural := 8*1024; -- size of processor-internal data memory in bytes

@stnolting
Owner

The OrangeCrab has an external DDR with 128 MB. Hence, the mid-term goal should be to append a controller to the external bus of the NEORV32 and have the IMEM and/or DMEM assigned there.

That would be awesome!

Note that we can have mixed-language designs. Therefore, we might use the same strategy as in antonblanchard/microwatt: they use litedram (see enjoy-digital/litedram).

That looks very good! I'm very happy that the memory controller supports Wishbone so there is no need for extra bridging logic.
I will check out the documentation. That would also be a nice add-on for the current Quartus and Vivado examples.

@umarcor
Collaborator Author

umarcor commented Jul 5, 2021

I am wondering: ECP5 provides distributed RAM, I think. Is this included in TRELLIS_SLICE somehow? Or did the synthesis not use distributed RAM at all? 🤔

I believe that is decided by the synthesis tool. If it sees it fit as a BRAM, it does not try to use the distributed RAM. If it cannot fit into a BRAM, then distributed RAM is used (e.g. reading multiple registers from a bank at the same time).

As far as I know, sysMEM primitives in the ECP5 are 2 KB each (plus extra bits for something like ECC, but I think synthesis cannot use these extra bits for anything meaningful here). According to the report, Jeremy's FPGA provides 56 of these blocks, making 56*2 = 112 kB. 🤔

I took the number from 1008 Kb - Embedded Block RAM in https://1bitsquared.com/products/orangecrab: 1008 / 8 = 126.

The nextpnr log in the CI run reports 18/56 to be used. If we are using 2 x 32KB, that should be 8 BRAMs for each of imem/dmem, and other 2 BRAMs for the bootrom? So, 32KB == 8 BRAMs? Something does not seem correct...

@umarcor
Collaborator Author

umarcor commented Jul 5, 2021

@stnolting: (112 / 8) * 9 = 126. Hence the maximum usable size is 112KB, which takes 126KB physically, due to the additional ECC bit for each byte.
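
The two figures can be reconciled with a few lines of arithmetic. Below is a small Python sketch, assuming (as inferred in this thread from 1008 Kb spread over 56 blocks) that each block holds 18 Kbit, of which 16 Kbit are data and the remainder is the extra bit per byte. The constant names are illustrative.

```python
BLOCKS = 56               # DP16KD count reported by nextpnr for the 25F
RAW_KBIT_PER_BLOCK = 18   # assumption: 1008 Kbit / 56 blocks
DATA_KBIT_PER_BLOCK = 16  # assumption: 8 of every 9 bits hold data

raw_kbyte = BLOCKS * RAW_KBIT_PER_BLOCK / 8
usable_kbyte = BLOCKS * DATA_KBIT_PER_BLOCK / 8

print(f"raw:    {raw_kbyte:.0f} KB")
print(f"usable: {usable_kbyte:.0f} KB")
```

Under these assumptions the raw figure comes out at 126 KB and the usable figure at 112 KB, so the product page and the per-block estimate describe the same hardware.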

@stnolting
Owner

stnolting commented Jul 5, 2021

I believe that is decided by the synthesis tool. If it sees it fit as a BRAM, it does not try to use the distributed RAM. If it cannot fit into a BRAM, then distributed RAM is used (e.g. reading multiple registers from a bank at the same time).

I am not sure here. I think Vivado does it the other way around: if a RAM is larger than a certain threshold, it is mapped to BRAM. If a BRAM description does some non-default things (like resetting the entire memory), it is mapped to (lots and lots of) distributed RAM.

I took the number from 1008 Kb - Embedded Block RAM in https://1bitsquared.com/products/orangecrab: 1008 / 8 = 126.

😄 👍

The nextpnr log in the CI run reports 18/56 to be used. If we are using 2 x 32KB, that should be 8 BRAMs for each of imem/dmem, and other 2 BRAMs for the bootrom? So, 32KB == 8 BRAMs? Something does not seem correct...

The default bootloader uses 4kB of memory making a total of 32kB + 32kB + 4kB = 68kB. With 2kB in each sysMEM this should be 34 BRAMs. So yeah, there is something going wrong. I will also have a look at the implementation report.

edit
Maybe synthesis needs 4 additional BRAMs for the CPU register file...

@umarcor
Collaborator Author

umarcor commented Jul 5, 2021

I believe that is decided by the synthesis tool. If it sees it fit as a BRAM, it does not try to use the distributed RAM. If it cannot fit into a BRAM, then distributed RAM is used (e.g. reading multiple registers from a bank at the same time).

I am not sure here. I think Vivado does it the other way around: if a RAM is larger than a certain threshold, it is mapped to BRAM. If a BRAM description does some non-default things (like resetting the entire memory), it is mapped to (lots and lots of) distributed RAM.

Note that synthesis and implementation tools do support specifying it explicitly, either as a global option or through attributes in the VHDL sources. However, the specific mechanism depends on the tool. See the Tip in https://ghdl.github.io/ghdl/using/Synthesis.html#synthesis-options (https://github.com/ghdl/ghdl/blob/master/src/ghdldrv/ghdlsynth.adb#L138-L239).

@stnolting
Owner

DMEM and bootloader ROM mappings seem right. But I cannot find any IMEM related mapping results 🤔

neorv32_inst: entity work.neorv32_ProcessorTop_MinimalBoot
generic map (
CLOCK_FREQUENCY => f_clock_c, -- clock frequency of clk_i in Hz
MEM_INT_IMEM_EN => false,
Owner

False??? Is this a typo?

Collaborator Author

@umarcor umarcor Jul 5, 2021

It is. My bad 😟 . I'll fix that immediately.

Collaborator Author

@umarcor umarcor Jul 5, 2021

Done. Now synthesis results are ok! 34/56. Thanks for finding that stupid mistake...

Owner

No worries 😄
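
The three DP16KD counts seen in this thread (66, 18 and 34) are consistent with a simple estimate of one block per started 2 KB of each memory. Below is a rough Python sketch of that estimate. It assumes 2 KB of data per block and the 4 kB bootloader ROM quoted earlier, and it ignores how the synthesis tool actually chooses block widths. The helper name `dp16kd_blocks` is illustrative.

```python
import math

BLOCK_BYTES = 2 * 1024  # assumption: 2 KB of data per DP16KD, as discussed above

def dp16kd_blocks(*memory_sizes_bytes):
    """Estimate DP16KD usage as one block per started 2 KB of each memory."""
    return sum(math.ceil(size / BLOCK_BYTES) for size in memory_sizes_bytes)

KB = 1024
BOOTROM = 4 * KB  # default bootloader size quoted in this thread

print(dp16kd_blocks(64 * KB, 64 * KB, BOOTROM))  # template defaults
print(dp16kd_blocks(32 * KB, BOOTROM))           # IMEM disabled by the typo
print(dp16kd_blocks(32 * KB, 32 * KB, BOOTROM))  # after the fix
```

The three calls give 66, 18 and 34, matching the failing first report, the CI run with the IMEM disabled, and the result after the fix.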

@stnolting
Owner

Looks fine now! 🎉

Seems like the register file is not mapped to BRAM, but it is identified as RAM:

../../rtl/core/neorv32_cpu_regfile.vhd:75:10:note: found RAM "reg_file", width: 32 bits, depth: 32
  signal reg_file     : reg_file_t;

I wonder if it is mapped to distributed RAM (but there is nothing in the report) or if it is implemented using plain logic? 🤔

@umarcor
Collaborator Author

umarcor commented Jul 6, 2021

I wonder if it is mapped to distributed RAM (but there is nothing in the report) or if it is implemented using plain logic? 🤔

Search for regfile in the log. Results 14, 15 and 16 are the following:

2021-07-05T14:54:01.1540932Z 2.24.4. Executing OPT_MEM_FEEDBACK pass (finding memory read-to-write feedback paths).
2021-07-05T14:54:01.1594152Z 
2021-07-05T14:54:01.1594758Z 2.24.5. Executing MEMORY_SHARE pass (consolidating $memrd/$memwr cells).
2021-07-05T14:54:01.1631434Z Consolidating read ports of memory neorv32_OrangeCrab_BoardTop_MinimalBoot.neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150 by address:
2021-07-05T14:54:01.3021972Z Processing neorv32_OrangeCrab_BoardTop_MinimalBoot.neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150:
2021-07-05T14:54:01.3022962Z   Properties: ports=3 bits=1024 rports=2 wports=1 dbits=32 abits=5 words=32
2021-07-05T14:54:01.3023626Z   Checking rule #1 for bram type $__ECP5_PDPW16KD (variant 1):
2021-07-05T14:54:01.3024706Z     Bram geometry: abits=9 dbits=36 wports=0 rports=0
2021-07-05T14:54:01.3025265Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3026598Z     Metrics for $__ECP5_PDPW16KD: awaste=480 dwaste=4 bwaste=17408 waste=17408 efficiency=5
2021-07-05T14:54:01.3027844Z     Rule #1 for bram type $__ECP5_PDPW16KD (variant 1) rejected: requirement 'min bits 2048' not met.
2021-07-05T14:54:01.3028609Z   Checking rule #2 for bram type $__ECP5_PDPW16KD (variant 1):
2021-07-05T14:54:01.3029396Z     Bram geometry: abits=9 dbits=36 wports=0 rports=0
2021-07-05T14:54:01.3031972Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3032854Z     Metrics for $__ECP5_PDPW16KD: awaste=480 dwaste=4 bwaste=17408 waste=17408 efficiency=5
2021-07-05T14:54:01.3034404Z     Rule for bram type $__ECP5_PDPW16KD (variant 1) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met.
2021-07-05T14:54:01.3035566Z   Checking rule #3 for bram type $__ECP5_PDPW16KD (variant 1):
2021-07-05T14:54:01.3036399Z     Bram geometry: abits=9 dbits=36 wports=0 rports=0
2021-07-05T14:54:01.3037350Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3038237Z     Metrics for $__ECP5_PDPW16KD: awaste=480 dwaste=4 bwaste=17408 waste=17408 efficiency=5
2021-07-05T14:54:01.3040042Z     Rule #3 for bram type $__ECP5_PDPW16KD (variant 1) rejected: requirement 'max wports 0' not met.
2021-07-05T14:54:01.3041078Z   Checking rule #4 for bram type $__ECP5_DP16KD (variant 1):
2021-07-05T14:54:01.3041652Z     Bram geometry: abits=10 dbits=18 wports=0 rports=0
2021-07-05T14:54:01.3042447Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3043124Z     Metrics for $__ECP5_DP16KD: awaste=992 dwaste=4 bwaste=17984 waste=17984 efficiency=2
2021-07-05T14:54:01.3044159Z     Rule #4 for bram type $__ECP5_DP16KD (variant 1) rejected: requirement 'min efficiency 5' not met.
2021-07-05T14:54:01.3044902Z   Checking rule #4 for bram type $__ECP5_DP16KD (variant 2):
2021-07-05T14:54:01.3045622Z     Bram geometry: abits=11 dbits=9 wports=0 rports=0
2021-07-05T14:54:01.3046398Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3047224Z     Metrics for $__ECP5_DP16KD: awaste=2016 dwaste=4 bwaste=18272 waste=18272 efficiency=1
2021-07-05T14:54:01.3048533Z     Rule #4 for bram type $__ECP5_DP16KD (variant 2) rejected: requirement 'min efficiency 5' not met.
2021-07-05T14:54:01.3049420Z   Checking rule #4 for bram type $__ECP5_DP16KD (variant 3):
2021-07-05T14:54:01.3050172Z     Bram geometry: abits=12 dbits=4 wports=0 rports=0
2021-07-05T14:54:01.3050784Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3051494Z     Metrics for $__ECP5_DP16KD: awaste=4064 dwaste=0 bwaste=16256 waste=16256 efficiency=0
2021-07-05T14:54:01.3052686Z     Rule #4 for bram type $__ECP5_DP16KD (variant 3) rejected: requirement 'min efficiency 5' not met.
2021-07-05T14:54:01.3053700Z   Checking rule #4 for bram type $__ECP5_DP16KD (variant 4):
2021-07-05T14:54:01.3054237Z     Bram geometry: abits=13 dbits=2 wports=0 rports=0
2021-07-05T14:54:01.3054797Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3055451Z     Metrics for $__ECP5_DP16KD: awaste=8160 dwaste=0 bwaste=16320 waste=16320 efficiency=0
2021-07-05T14:54:01.3056603Z     Rule #4 for bram type $__ECP5_DP16KD (variant 4) rejected: requirement 'min efficiency 5' not met.
2021-07-05T14:54:01.3057545Z   Checking rule #4 for bram type $__ECP5_DP16KD (variant 5):
2021-07-05T14:54:01.3058112Z     Bram geometry: abits=14 dbits=1 wports=0 rports=0
2021-07-05T14:54:01.3058707Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3059545Z     Metrics for $__ECP5_DP16KD: awaste=16352 dwaste=0 bwaste=16352 waste=16352 efficiency=0
2021-07-05T14:54:01.3060822Z     Rule #4 for bram type $__ECP5_DP16KD (variant 5) rejected: requirement 'min efficiency 5' not met.
2021-07-05T14:54:01.3061479Z   Checking rule #5 for bram type $__ECP5_DP16KD (variant 1):
2021-07-05T14:54:01.3061988Z     Bram geometry: abits=10 dbits=18 wports=0 rports=0
2021-07-05T14:54:01.3062547Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3063352Z     Metrics for $__ECP5_DP16KD: awaste=992 dwaste=4 bwaste=17984 waste=17984 efficiency=2
2021-07-05T14:54:01.3064741Z     Rule for bram type $__ECP5_DP16KD (variant 1) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met.
2021-07-05T14:54:01.3065526Z   Checking rule #5 for bram type $__ECP5_DP16KD (variant 2):
2021-07-05T14:54:01.3066250Z     Bram geometry: abits=11 dbits=9 wports=0 rports=0
2021-07-05T14:54:01.3067221Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3067916Z     Metrics for $__ECP5_DP16KD: awaste=2016 dwaste=4 bwaste=18272 waste=18272 efficiency=1
2021-07-05T14:54:01.3069106Z     Rule for bram type $__ECP5_DP16KD (variant 2) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met.
2021-07-05T14:54:01.3070519Z   Checking rule #5 for bram type $__ECP5_DP16KD (variant 3):
2021-07-05T14:54:01.3071085Z     Bram geometry: abits=12 dbits=4 wports=0 rports=0
2021-07-05T14:54:01.3071815Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3072632Z     Metrics for $__ECP5_DP16KD: awaste=4064 dwaste=0 bwaste=16256 waste=16256 efficiency=0
2021-07-05T14:54:01.3073850Z     Rule for bram type $__ECP5_DP16KD (variant 3) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met.
2021-07-05T14:54:01.3074675Z   Checking rule #5 for bram type $__ECP5_DP16KD (variant 4):
2021-07-05T14:54:01.3075207Z     Bram geometry: abits=13 dbits=2 wports=0 rports=0
2021-07-05T14:54:01.3075765Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3076422Z     Metrics for $__ECP5_DP16KD: awaste=8160 dwaste=0 bwaste=16320 waste=16320 efficiency=0
2021-07-05T14:54:01.3077450Z     Rule for bram type $__ECP5_DP16KD (variant 4) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met.
2021-07-05T14:54:01.3078161Z   Checking rule #5 for bram type $__ECP5_DP16KD (variant 5):
2021-07-05T14:54:01.3078693Z     Bram geometry: abits=14 dbits=1 wports=0 rports=0
2021-07-05T14:54:01.3079252Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3079905Z     Metrics for $__ECP5_DP16KD: awaste=16352 dwaste=0 bwaste=16352 waste=16352 efficiency=0
2021-07-05T14:54:01.3081081Z     Rule for bram type $__ECP5_DP16KD (variant 5) rejected: requirement 'attribute syn_ramstyle="block_ram" ...' not met.
2021-07-05T14:54:01.3082195Z   Checking rule #6 for bram type $__ECP5_DP16KD (variant 1):
2021-07-05T14:54:01.3082783Z     Bram geometry: abits=10 dbits=18 wports=0 rports=0
2021-07-05T14:54:01.3083547Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3084400Z     Metrics for $__ECP5_DP16KD: awaste=992 dwaste=4 bwaste=17984 waste=17984 efficiency=2
2021-07-05T14:54:01.3085685Z     Rule #6 for bram type $__ECP5_DP16KD (variant 1) rejected: requirement 'max wports 0' not met.
2021-07-05T14:54:01.3086681Z   Checking rule #6 for bram type $__ECP5_DP16KD (variant 2):
2021-07-05T14:54:01.3087407Z     Bram geometry: abits=11 dbits=9 wports=0 rports=0
2021-07-05T14:54:01.3088438Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3089260Z     Metrics for $__ECP5_DP16KD: awaste=2016 dwaste=4 bwaste=18272 waste=18272 efficiency=1
2021-07-05T14:54:01.3090396Z     Rule #6 for bram type $__ECP5_DP16KD (variant 2) rejected: requirement 'max wports 0' not met.
2021-07-05T14:54:01.3091226Z   Checking rule #6 for bram type $__ECP5_DP16KD (variant 3):
2021-07-05T14:54:01.3091882Z     Bram geometry: abits=12 dbits=4 wports=0 rports=0
2021-07-05T14:54:01.3092857Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3093844Z     Metrics for $__ECP5_DP16KD: awaste=4064 dwaste=0 bwaste=16256 waste=16256 efficiency=0
2021-07-05T14:54:01.3094729Z     Rule #6 for bram type $__ECP5_DP16KD (variant 3) rejected: requirement 'max wports 0' not met.
2021-07-05T14:54:01.3095371Z   Checking rule #6 for bram type $__ECP5_DP16KD (variant 4):
2021-07-05T14:54:01.3095878Z     Bram geometry: abits=13 dbits=2 wports=0 rports=0
2021-07-05T14:54:01.3096433Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3097052Z     Metrics for $__ECP5_DP16KD: awaste=8160 dwaste=0 bwaste=16320 waste=16320 efficiency=0
2021-07-05T14:54:01.3097947Z     Rule #6 for bram type $__ECP5_DP16KD (variant 4) rejected: requirement 'max wports 0' not met.
2021-07-05T14:54:01.3098574Z   Checking rule #6 for bram type $__ECP5_DP16KD (variant 5):
2021-07-05T14:54:01.3099097Z     Bram geometry: abits=14 dbits=1 wports=0 rports=0
2021-07-05T14:54:01.3099655Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:01.3100367Z     Metrics for $__ECP5_DP16KD: awaste=16352 dwaste=0 bwaste=16352 waste=16352 efficiency=0
2021-07-05T14:54:01.3101490Z     Rule #6 for bram type $__ECP5_DP16KD (variant 5) rejected: requirement 'max wports 0' not met.
2021-07-05T14:54:01.3102106Z   No acceptable bram resources found.
2021-07-05T14:54:02.9432064Z 2.28. Executing MEMORY_BRAM pass (mapping $mem cells to block memories).
2021-07-05T14:54:02.9476499Z Processing neorv32_OrangeCrab_BoardTop_MinimalBoot.neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150:
2021-07-05T14:54:02.9477776Z   Properties: ports=3 bits=1024 rports=2 wports=1 dbits=32 abits=5 words=32
2021-07-05T14:54:02.9478434Z   Checking rule #1 for bram type $__TRELLIS_DPR16X4 (variant 1):
2021-07-05T14:54:02.9479210Z     Bram geometry: abits=4 dbits=4 wports=0 rports=0
2021-07-05T14:54:02.9480026Z     Estimated number of duplicates for more read ports: dups=1
2021-07-05T14:54:02.9481293Z     Metrics for $__TRELLIS_DPR16X4: awaste=0 dwaste=0 bwaste=0 waste=0 efficiency=100
2021-07-05T14:54:02.9482011Z     Rule #1 for bram type $__TRELLIS_DPR16X4 (variant 1) accepted.
2021-07-05T14:54:02.9482766Z     Mapping to bram type $__TRELLIS_DPR16X4 (variant 1):
2021-07-05T14:54:02.9483337Z       Write port #0 is in clock domain \OrangeCrab_CLK.
2021-07-05T14:54:02.9484241Z         Mapped to bram port B1.
2021-07-05T14:54:02.9484737Z       Read port #0 is in clock domain \OrangeCrab_CLK.
2021-07-05T14:54:02.9485210Z         Mapped to bram port A1.1.
2021-07-05T14:54:02.9485698Z       Read port #1 is in clock domain \OrangeCrab_CLK.
2021-07-05T14:54:02.9486191Z         Failed to map read port #1.
2021-07-05T14:54:02.9486704Z       Growing more read ports by duplicating bram cells.
2021-07-05T14:54:02.9487288Z       Read port #0 is in clock domain \OrangeCrab_CLK.
2021-07-05T14:54:02.9488059Z         Mapped to bram port A1.1.
2021-07-05T14:54:02.9488727Z       Read port #1 is in clock domain \OrangeCrab_CLK.
2021-07-05T14:54:02.9489157Z         Mapped to bram port A1.2.
2021-07-05T14:54:02.9489639Z       Updated properties: dups=2 waste=0 efficiency=50
2021-07-05T14:54:02.9491217Z Extracted data FF from read port 0 of neorv32_OrangeCrab_BoardTop_MinimalBoot.neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150: $\neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150$rdreg[0]
2021-07-05T14:54:02.9493079Z Extracted data FF from read port 1 of neorv32_OrangeCrab_BoardTop_MinimalBoot.neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150: $\neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150$rdreg[1]
2021-07-05T14:54:02.9495247Z       Creating $__TRELLIS_DPR16X4 cell at grid position <0 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.0.0.0
2021-07-05T14:54:02.9496630Z       Creating $__TRELLIS_DPR16X4 cell at grid position <0 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.0.0.1
2021-07-05T14:54:02.9498119Z       Creating $__TRELLIS_DPR16X4 cell at grid position <0 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.0.1.0
2021-07-05T14:54:02.9500121Z       Creating $__TRELLIS_DPR16X4 cell at grid position <0 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.0.1.1
2021-07-05T14:54:02.9501410Z       Creating $__TRELLIS_DPR16X4 cell at grid position <1 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.1.0.0
2021-07-05T14:54:02.9502675Z       Creating $__TRELLIS_DPR16X4 cell at grid position <1 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.1.0.1
2021-07-05T14:54:02.9503967Z       Creating $__TRELLIS_DPR16X4 cell at grid position <1 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.1.1.0
2021-07-05T14:54:02.9505389Z       Creating $__TRELLIS_DPR16X4 cell at grid position <1 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.1.1.1
2021-07-05T14:54:02.9516714Z       Creating $__TRELLIS_DPR16X4 cell at grid position <2 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.2.0.0
2021-07-05T14:54:02.9517768Z       Creating $__TRELLIS_DPR16X4 cell at grid position <2 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.2.0.1
2021-07-05T14:54:02.9518796Z       Creating $__TRELLIS_DPR16X4 cell at grid position <2 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.2.1.0
2021-07-05T14:54:02.9519829Z       Creating $__TRELLIS_DPR16X4 cell at grid position <2 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.2.1.1
2021-07-05T14:54:02.9520851Z       Creating $__TRELLIS_DPR16X4 cell at grid position <3 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.3.0.0
2021-07-05T14:54:02.9521880Z       Creating $__TRELLIS_DPR16X4 cell at grid position <3 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.3.0.1
2021-07-05T14:54:02.9522907Z       Creating $__TRELLIS_DPR16X4 cell at grid position <3 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.3.1.0
2021-07-05T14:54:02.9523918Z       Creating $__TRELLIS_DPR16X4 cell at grid position <3 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.3.1.1
2021-07-05T14:54:02.9524945Z       Creating $__TRELLIS_DPR16X4 cell at grid position <4 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.4.0.0
2021-07-05T14:54:02.9525959Z       Creating $__TRELLIS_DPR16X4 cell at grid position <4 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.4.0.1
2021-07-05T14:54:02.9526993Z       Creating $__TRELLIS_DPR16X4 cell at grid position <4 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.4.1.0
2021-07-05T14:54:02.9528021Z       Creating $__TRELLIS_DPR16X4 cell at grid position <4 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.4.1.1
2021-07-05T14:54:02.9529033Z       Creating $__TRELLIS_DPR16X4 cell at grid position <5 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.5.0.0
2021-07-05T14:54:02.9532341Z       Creating $__TRELLIS_DPR16X4 cell at grid position <5 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.5.0.1
2021-07-05T14:54:02.9533395Z       Creating $__TRELLIS_DPR16X4 cell at grid position <5 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.5.1.0
2021-07-05T14:54:02.9534608Z       Creating $__TRELLIS_DPR16X4 cell at grid position <5 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.5.1.1
2021-07-05T14:54:02.9535791Z       Creating $__TRELLIS_DPR16X4 cell at grid position <6 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.6.0.0
2021-07-05T14:54:02.9536857Z       Creating $__TRELLIS_DPR16X4 cell at grid position <6 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.6.0.1
2021-07-05T14:54:02.9537910Z       Creating $__TRELLIS_DPR16X4 cell at grid position <6 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.6.1.0
2021-07-05T14:54:02.9538968Z       Creating $__TRELLIS_DPR16X4 cell at grid position <6 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.6.1.1
2021-07-05T14:54:02.9540101Z       Creating $__TRELLIS_DPR16X4 cell at grid position <7 0 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.7.0.0
2021-07-05T14:54:02.9541150Z       Creating $__TRELLIS_DPR16X4 cell at grid position <7 0 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.7.0.1
2021-07-05T14:54:02.9542214Z       Creating $__TRELLIS_DPR16X4 cell at grid position <7 1 0>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.7.1.0
2021-07-05T14:54:02.9544130Z       Creating $__TRELLIS_DPR16X4 cell at grid position <7 1 1>: neorv32_inst.neorv32_inst.neorv32_cpu_inst.neorv32_cpu_regfile_inst.7150.7.1.1

See also:

2021-07-05T14:54:16.9078397Z Info: Logic utilisation before packing:
2021-07-05T14:54:16.9078985Z Info:     Total LUT4s:      5736/24288    23%
2021-07-05T14:54:16.9079598Z Info:         logic LUTs:   4916/24288    20%
2021-07-05T14:54:16.9079988Z Info:         carry LUTs:    628/24288     2%
2021-07-05T14:54:16.9080389Z Info:           RAM LUTs:    128/12144     1%
2021-07-05T14:54:16.9080953Z Info:          RAMW LUTs:     64/ 6072     1%
2021-07-05T14:54:16.9081231Z 
2021-07-05T14:54:16.9081768Z Info:      Total DFFs:      3484/24288    14%
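
As a quick cross-check of the figures above, here is a minimal Python sketch that parses the utilisation lines and confirms that the four LUT categories add up to the reported total. The regular expression and the variable names are my own and are not part of nextpnr.

```python
import re

# Utilisation lines copied from the log above, with the timestamps dropped.
LOG = """\
Info:     Total LUT4s:      5736/24288    23%
Info:         logic LUTs:   4916/24288    20%
Info:         carry LUTs:    628/24288     2%
Info:           RAM LUTs:    128/12144     1%
Info:          RAMW LUTs:     64/ 6072     1%
Info:      Total DFFs:      3484/24288    14%
"""

# Captures: resource name, used count, available count.
LINE = re.compile(r"Info:\s+(.+?):\s+(\d+)/\s*(\d+)")

usage = {m.group(1): (int(m.group(2)), int(m.group(3))) for m in LINE.finditer(LOG)}

# Sum of the individual LUT categories, to compare with "Total LUT4s".
parts = sum(usage[name][0] for name in ("logic LUTs", "carry LUTs", "RAM LUTs", "RAMW LUTs"))

for name, (used, available) in usage.items():
    print(f"{name:>12}: {used:5d}/{available:5d} = {100 * used / available:.1f}%")
print("LUT categories sum to", parts)
```

So the design uses a bit under a quarter of the LUTs on the 25F, which leaves plenty of room.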

@jeremyherbert
Contributor

I had some success this morning, I got to the blinking boot LED and also have UART (19200 baud, 8N1) outputting the following:

<< NEORV32 Bootloader >>

BLDV: Jun 30 2021
HWV:  0x01050710
CLK:  0x016e3600
MISA: 0x40801105
ZEXT: 0x00000041
PROC: 0x0067000d
IMEM: 0x00004000 bytes @0x00000000
DMEM: 0x00002000 bytes @0x80000000

Autoboot in 8s. Press key to abort.
Loading... 
ERROR_3

I am doing this all over JTAG, not via the OrangeCrab USB bootloader and the SPI flash, so I assume that is what the ERROR_3 is about. The changes I made were:

  1. Adjust PNRFLAGS to the 25F device in the makefile
  2. Adjust RGB LED outputs to also output gpio_o[0] to the red LED
  3. Reduce DMEM/IMEM size
  4. Instantiate PLL using parameters from ecppll (currently set to 24MHz) and connect to neorv32 core
  5. Change the reset line logic for the neorv32 (both OrangeCrab and neorv32 are active low)
  6. Not necessary, but I connected the PLL LOCK output to a GPIO so I could confirm it was indeed locked
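
The hex fields in the bootloader banner can be used to confirm that the clock and memory changes took effect. The sketch below is only a sanity check in Python. It assumes that CLK is the clock frequency in Hz, that IMEM and DMEM are sizes in bytes, and that MISA is the raw RISC-V `misa` CSR, where bits 0 to 25 map to extension letters A to Z and the top two bits encode the base width. The helper name `misa_extensions` is made up for this example.

```python
# Values copied from the bootloader banner above.
CLK = 0x016E3600
IMEM = 0x00004000
DMEM = 0x00002000
MISA = 0x40801105


def misa_extensions(misa: int) -> str:
    """Return the extension letters encoded in bits 0..25 of a misa value."""
    return "".join(chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1)


clk_mhz = CLK / 1e6       # expected to match the 24 MHz PLL output
imem_kib = IMEM // 1024   # expected to match the reduced IMEM size
dmem_kib = DMEM // 1024   # expected to match the reduced DMEM size
mxl = MISA >> 30          # 1 means a 32-bit base ISA

print(f"clock: {clk_mhz} MHz, IMEM: {imem_kib} KiB, DMEM: {dmem_kib} KiB")
print(f"MXL: {mxl}, extensions: {misa_extensions(MISA)}")
```

The decoded values (24 MHz, 16 KiB, 8 KiB) agree with the generics I pass to the core in the VHDL below.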

Note: The constraints file for the OrangeCrab doesn't match the silkscreen for some pins (pin 9 is one of them). The correct constraints are in this pinout picture: https://gregdavill.github.io/OrangeCrab/r0.2/docs/pinout/

I filed a bug for that here: orangecrab-fpga/orangecrab-examples#20

Below is the VHDL I used for the BoardTop. It's a bit hacky as I was just messing around to get stuff to work. I would create a pull request, but I am not too familiar with VHDL formatting guidelines, so I was hoping you could make these changes in your branch in a neat way. I chose 24MHz as the clock arbitrarily; according to the PnR it should run up to ~70MHz.

-- #################################################################################################
-- # << NEORV32 - Example setup including the bootloader, for the OrangeCrab (c) Board >>          #
-- # ********************************************************************************************* #
-- # BSD 3-Clause License                                                                          #
-- #                                                                                               #
-- # Copyright (c) 2021, Stephan Nolting. All rights reserved.                                     #
-- #                                                                                               #
-- # Redistribution and use in source and binary forms, with or without modification, are          #
-- # permitted provided that the following conditions are met:                                     #
-- #                                                                                               #
-- # 1. Redistributions of source code must retain the above copyright notice, this list of        #
-- #    conditions and the following disclaimer.                                                   #
-- #                                                                                               #
-- # 2. Redistributions in binary form must reproduce the above copyright notice, this list of     #
-- #    conditions and the following disclaimer in the documentation and/or other materials        #
-- #    provided with the distribution.                                                            #
-- #                                                                                               #
-- # 3. Neither the name of the copyright holder nor the names of its contributors may be used to  #
-- #    endorse or promote products derived from this software without specific prior written      #
-- #    permission.                                                                                #
-- #                                                                                               #
-- # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS   #
-- # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF               #
-- # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE    #
-- # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,     #
-- # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
-- # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED    #
-- # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING     #
-- # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED  #
-- # OF THE POSSIBILITY OF SUCH DAMAGE.                                                            #
-- # ********************************************************************************************* #
-- # The NEORV32 Processor - https://github.com/stnolting/neorv32              (c) Stephan Nolting #
-- #################################################################################################

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity neorv32_OrangeCrab_BoardTop_MinimalBoot is
  port (
    -- Clock and Reset inputs
    OrangeCrab_CLK : in std_logic;
    OrangeCrab_RST_N : in std_logic;
    -- LED outputs
    OrangeCrab_LED_RGB_R : out std_logic;
    OrangeCrab_LED_RGB_G : out std_logic;
    OrangeCrab_LED_RGB_B : out std_logic;
    -- UART0
    OrangeCrab_GPIO_0 : in  std_logic;
    OrangeCrab_GPIO_1 : out std_logic;
    -- USB Pins (which should be statically driven if not being used)
    OrangeCrab_USB_D_P   : out std_logic;
    OrangeCrab_USB_D_N   : out std_logic;
    OrangeCrab_USB_DP_PU : out std_logic;
    -- PLL LOCK indicator (debug only)
    OrangeCrab_GPIO_9 : out std_logic
  );
end entity;

architecture neorv32_OrangeCrab_BoardTop_MinimalBoot_rtl of neorv32_OrangeCrab_BoardTop_MinimalBoot is

  -- configuration --
  constant f_clock_c : natural := 24000000; -- PLL output clock frequency in Hz

  -- internal IO connection --
  signal con_pwm  : std_logic_vector(2 downto 0);
  signal con_gpio_o : std_ulogic_vector(3 downto 0);

  signal clk_o: std_logic;


  component EHXPLLL
  generic
  (
    CLKI_DIV         : integer := 1;
    CLKFB_DIV        : integer := 1;
    CLKOP_DIV        : integer := 8;
    CLKOS_DIV        : integer := 8;
    CLKOS2_DIV       : integer := 8;
    CLKOS3_DIV       : integer := 8;
    CLKOP_ENABLE     : string  := "ENABLED";
    CLKOS_ENABLE     : string  := "DISABLED";
    CLKOS2_ENABLE    : string  := "DISABLED";
    CLKOS3_ENABLE    : string  := "DISABLED";
    CLKOP_CPHASE     : integer := 0;
    CLKOS_CPHASE     : integer := 0;
    CLKOS2_CPHASE    : integer := 0;
    CLKOS3_CPHASE    : integer := 0;
    CLKOP_FPHASE     : integer := 0;
    CLKOS_FPHASE     : integer := 0;
    CLKOS2_FPHASE    : integer := 0;
    CLKOS3_FPHASE    : integer := 0;
    FEEDBK_PATH      : string  := "CLKOP";
    CLKOP_TRIM_POL   : string  := "RISING";
    CLKOP_TRIM_DELAY : integer := 0;
    CLKOS_TRIM_POL   : string  := "RISING";
    CLKOS_TRIM_DELAY : integer := 0;
    OUTDIVIDER_MUXA  : string  := "DIVA";
    OUTDIVIDER_MUXB  : string  := "DIVB";
    OUTDIVIDER_MUXC  : string  := "DIVC";
    OUTDIVIDER_MUXD  : string  := "DIVD";
    PLL_LOCK_MODE    : integer := 0;
    PLL_LOCK_DELAY   : integer := 200;
    STDBY_ENABLE     : string  := "DISABLED";
    REFIN_RESET      : string  := "DISABLED";
    SYNC_ENABLE      : string  := "DISABLED";
    INT_LOCK_STICKY  : string  := "ENABLED";
    DPHASE_SOURCE    : string  := "DISABLED";
    PLLRST_ENA       : string  := "DISABLED";
    INTFB_WAKE       : string  := "DISABLED" 
  );
  port
  (
    CLKI, CLKFB,
    RST, STDBY, PLLWAKESYNC,
    PHASESEL1, PHASESEL0, PHASEDIR, PHASESTEP, PHASELOADREG,
    ENCLKOP, ENCLKOS, ENCLKOS2, ENCLKOS3 : IN std_logic := 'X';
    CLKOP, CLKOS, CLKOS2, CLKOS3, LOCK, INTLOCK,
    REFCLK, CLKINTFB : OUT std_logic := 'X' 
  );
  end component;

begin

  -- Assign USB pins to "0" so as to disconnect OrangeCrab from
  -- the host system.  Otherwise it would try to talk to
  -- us over USB, which wouldn't work since we have no stack.
  OrangeCrab_USB_D_P   <= '0';
  OrangeCrab_USB_D_N   <= '0';
  OrangeCrab_USB_DP_PU <= '0';

  -- The core of the problem ----------------------------------------------------------------
  -- -------------------------------------------------------------------------------------------

  PLL_inst: EHXPLLL
  generic map
  (
    CLKI_DIV        =>  2, -- from `ecppll -i 48 -o 24`
    CLKFB_DIV       =>  1, 

    CLKOP_DIV       =>  25
  )
  port map
  (
    CLKI => OrangeCrab_CLK, 
    CLKFB => clk_o,
    ENCLKOP => '1', 
    CLKOP  => clk_o,
    LOCK => OrangeCrab_GPIO_9
  );

  neorv32_inst: entity work.neorv32_ProcessorTop_MinimalBoot
  generic map (
    CLOCK_FREQUENCY => f_clock_c,  -- clock frequency of clk_i in Hz
    MEM_INT_IMEM_SIZE => 16*1024,
    MEM_INT_DMEM_SIZE => 8*1024
  )
  port map (
    -- Global control --
    clk_i      => std_ulogic(clk_o),
    rstn_i     => std_ulogic(OrangeCrab_RST_N),

    -- GPIO --
    gpio_o     => con_gpio_o,

    -- primary UART --
    uart_txd_o => OrangeCrab_GPIO_1, -- UART0 send data
    uart_rxd_i => OrangeCrab_GPIO_0, -- UART0 receive data
    uart_rts_o => open, -- hw flow control: UART0.RX ready to receive ("RTR"), low-active, optional
    uart_cts_i => '0',  -- hw flow control: UART0.TX allowed to transmit, low-active, optional

    -- PWM (to on-board RGB LED) --
    pwm_o      => con_pwm
  );

  OrangeCrab_LED_RGB_R <= not con_gpio_o(0);
  OrangeCrab_LED_RGB_G <= con_pwm(1);
  OrangeCrab_LED_RGB_B <= con_pwm(2);

end architecture;
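
For anyone who wants to try a different clock, the three PLL generics used above can be checked by hand. The following Python sketch reproduces the arithmetic for the feedback-from-CLKOP configuration (the `FEEDBK_PATH` default in the component declaration). The function name is made up, and the 400 to 800 MHz VCO window is quoted from memory of the ECP5 documentation, so treat it as an assumption and verify it against the datasheet.

```python
def ecp5_pll(f_in_mhz: float, clki_div: int, clkfb_div: int, clkop_div: int):
    """Return (f_pfd, f_out, f_vco) in MHz, assuming FEEDBK_PATH = "CLKOP"."""
    f_pfd = f_in_mhz / clki_div   # phase detector input frequency
    f_out = f_pfd * clkfb_div     # CLKOP, because CLKOP is fed back through CLKFB_DIV
    f_vco = f_out * clkop_div     # the VCO runs CLKOP_DIV times faster than CLKOP
    return f_pfd, f_out, f_vco


# Generics from the PLL_inst above: 48 MHz board clock, CLKI_DIV=2, CLKFB_DIV=1, CLKOP_DIV=25.
f_pfd, f_out, f_vco = ecp5_pll(48, 2, 1, 25)
print(f"PFD {f_pfd} MHz, CLKOP {f_out} MHz, VCO {f_vco} MHz")

# Assumed VCO operating window; check the ECP5 datasheet before relying on it.
vco_ok = 400 <= f_vco <= 800
print("VCO within assumed range:", vco_ok)
```

In practice it is easier to let `ecppll` pick the dividers and use this only to double-check what it printed.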

Here are the GPIO pins in the constraint file that match the silkscreen (note the commented out ones have no marking on the silkscreen)

LOCATE COMP "OrangeCrab_GPIO_0" SITE "N17";
IOBUF  PORT "OrangeCrab_GPIO_0" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_0" PULLMODE=DOWN;
LOCATE COMP "OrangeCrab_GPIO_1" SITE "M18";
IOBUF  PORT "OrangeCrab_GPIO_1" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_1" PULLMODE=DOWN;

#LOCATE COMP "OrangeCrab_GPIO_2" SITE "B10";
#IOBUF  PORT "OrangeCrab_GPIO_2" IO_TYPE=LVCMOS33;
#IOBUF  PORT "OrangeCrab_GPIO_2" PULLMODE=DOWN;
#LOCATE COMP "OrangeCrab_GPIO_3" SITE "B9";
#IOBUF  PORT "OrangeCrab_GPIO_3" IO_TYPE=LVCMOS33;
#IOBUF  PORT "OrangeCrab_GPIO_3" PULLMODE=DOWN;
#LOCATE COMP "OrangeCrab_GPIO_4" SITE "C8";
#IOBUF  PORT "OrangeCrab_GPIO_4" IO_TYPE=LVCMOS33;
#IOBUF  PORT "OrangeCrab_GPIO_4" PULLMODE=DOWN;

LOCATE COMP "OrangeCrab_GPIO_5" SITE "B10";
IOBUF  PORT "OrangeCrab_GPIO_5" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_5" PULLMODE=DOWN;
LOCATE COMP "OrangeCrab_GPIO_6" SITE "B9";
IOBUF  PORT "OrangeCrab_GPIO_6" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_6" PULLMODE=DOWN;

#LOCATE COMP "OrangeCrab_GPIO_7" SITE "H2";
#IOBUF  PORT "OrangeCrab_GPIO_7" IO_TYPE=LVCMOS33;
#IOBUF  PORT "OrangeCrab_GPIO_7" PULLMODE=DOWN;
#LOCATE COMP "OrangeCrab_GPIO_8" SITE "J2";
#IOBUF  PORT "OrangeCrab_GPIO_8" IO_TYPE=LVCMOS33;
#IOBUF  PORT "OrangeCrab_GPIO_8" PULLMODE=DOWN;

LOCATE COMP "OrangeCrab_GPIO_9" SITE "C8";
IOBUF  PORT "OrangeCrab_GPIO_9" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_9" PULLMODE=DOWN;
LOCATE COMP "OrangeCrab_GPIO_10" SITE "B8";
IOBUF  PORT "OrangeCrab_GPIO_10" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_10" PULLMODE=DOWN;
LOCATE COMP "OrangeCrab_GPIO_11" SITE "A8";
IOBUF  PORT "OrangeCrab_GPIO_11" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_11" PULLMODE=DOWN;
LOCATE COMP "OrangeCrab_GPIO_12" SITE "H2";
IOBUF  PORT "OrangeCrab_GPIO_12" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_12" PULLMODE=DOWN;
LOCATE COMP "OrangeCrab_GPIO_13" SITE "J2";
IOBUF  PORT "OrangeCrab_GPIO_13" IO_TYPE=LVCMOS33;
IOBUF  PORT "OrangeCrab_GPIO_13" PULLMODE=DOWN;

@jeremyherbert
Contributor

Could I also suggest generating the svf file as well as the bit file? It would be handy for us JTAG users.

@stnolting
Owner

@umarcor

Thanks for clearing up the distributed RAM issue! :hear:

@jeremyherbert

I had some success this morning, I got to the blinking boot LED and also have UART (19200 baud, 8N1) outputting the following:
[...]
I am doing this all over JTAG, not via the OrangeCrab USB bootloader and the SPI flash, so I assume that is what the ERROR_3 is about

I'm happy to hear that! :)
Correct, ERROR_3 means the bootloader was trying to access SPI flash but failed fetching an executable from there.

The changes I made were:
Adjust PNRFLAGS to the 25F device in the makefile
Adjust RGB LED outputs to also output gpio_o[0] to the red LED
Reduce DMEM/IMEM size
Instantiate PLL using parameters from ecppll (currently set to 24MHz) and connect to neorv32 core
Change the reset line logic for the neorv32 (both OrangeCrab and neorv32 are active low)
Not necessary, but I connected the PLL LOCK output to a GPIO so I could confirm it was indeed locked

Sounds reasonable. You could also use the PLL's lock signal to drive the processor's reset and connect the PLL reset to an external button. But this is just an idea and absolutely not critical here.

Below is the VHDL I used for the BoardTop. It's a bit hacky as I was just messing around to get stuff to work. I would create a pull request, but I am not too familiar with VHDL formatting guidelines, so I was hoping you could make these changes in your branch in a neat way. I chose 24MHz as the clock arbitrarily; according to the PnR it should run up to ~70MHz.

I am fine with this. This looks like a very nice start setup 👍
What do you think, @umarcor ?

Could I also suggest generating the svf file as well as the bit file? It would be handy for us JTAG users.

According to this https://gitlab.raptorengineering.com/dormito/litex/-/commit/b014c7194be669d8c471c11074e775b0d6e6d471 it should be possible. But I am an absolute beginner when it comes to Trellis 😅

@stnolting stnolting marked this pull request as ready for review July 6, 2021 17:07
@umarcor umarcor force-pushed the orangecrab branch 3 times, most recently from 4730a2b to b109890 Compare July 6, 2021 19:13
@umarcor
Collaborator Author

umarcor commented Jul 6, 2021

The following tasks are done (some before and others after Jeremy's update):

  • Adjust PNRFLAGS to the 25F device in the makefile.
  • Adjust RGB LED outputs to OR con_gpio_o(0) into the red LED.
  • Reduce DMEM/IMEM size.
  • Add setups/osflow/devices/ecp5/ecp5_components.vhd.
    • Instantiate PLL using parameters from ecppll (currently set to 24MHz) and connect to neorv32 core.
  • Change the reset line logic for the neorv32 (both OrangeCrab and neorv32 are active low).
  • Not necessary, but I connected the PLL LOCK output to a GPIO so I could confirm it was indeed locked.
  • Add svf target. make -C setups/examples/ BOARD=OrangeCrab TASK='clean bit svf' MinimalBoot generates both the bitstream and the SVF file.

However, I did not modify the constraints file. The modifications provided by Jeremy are inconsistent, in the sense that some of the bits of OrangeCrab_GPIO_* would be undefined, or the same pin would be assigned to multiple bits. Anyway, the example should work because I used GPIO_4, corresponding to pin C8, which is the one that Jeremy was using as GPIO_9.
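
To make the inconsistency concrete, here is a small Python sketch that scans the `LOCATE` lines from Jeremy's listing. It reports which `OrangeCrab_GPIO_*` indices are left without an active constraint, and which sites would be shared if the commented-out lines were enabled again. The regular expression and the variable names are mine; only the `LOCATE` lines are copied from the comment above.

```python
import re
from collections import defaultdict

# LOCATE lines from the proposed constraints, in their original order.
LPF = """\
LOCATE COMP "OrangeCrab_GPIO_0" SITE "N17";
LOCATE COMP "OrangeCrab_GPIO_1" SITE "M18";
#LOCATE COMP "OrangeCrab_GPIO_2" SITE "B10";
#LOCATE COMP "OrangeCrab_GPIO_3" SITE "B9";
#LOCATE COMP "OrangeCrab_GPIO_4" SITE "C8";
LOCATE COMP "OrangeCrab_GPIO_5" SITE "B10";
LOCATE COMP "OrangeCrab_GPIO_6" SITE "B9";
#LOCATE COMP "OrangeCrab_GPIO_7" SITE "H2";
#LOCATE COMP "OrangeCrab_GPIO_8" SITE "J2";
LOCATE COMP "OrangeCrab_GPIO_9" SITE "C8";
LOCATE COMP "OrangeCrab_GPIO_10" SITE "B8";
LOCATE COMP "OrangeCrab_GPIO_11" SITE "A8";
LOCATE COMP "OrangeCrab_GPIO_12" SITE "H2";
LOCATE COMP "OrangeCrab_GPIO_13" SITE "J2";
"""

# Captures: optional leading '#', GPIO index, site name.
LOCATE = re.compile(r'^(#?)LOCATE COMP "OrangeCrab_GPIO_(\d+)" SITE "(\w+)";', re.M)

active = {}                   # GPIO index -> site, uncommented lines only
by_site = defaultdict(list)   # site -> every GPIO index that names it, commented or not
for commented, index, site in LOCATE.findall(LPF):
    by_site[site].append(int(index))
    if not commented:
        active[int(index)] = site

undefined = [i for i in range(14) if i not in active]
shared = {site: pins for site, pins in by_site.items() if len(pins) > 1}

print("GPIO indices without an active constraint:", undefined)
print("Sites named by more than one GPIO index:", shared)
```

With the commented lines left out, indices 2, 3, 4, 7 and 8 have no site. With them enabled, B10, B9, C8, H2 and J2 would each be assigned twice.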

@umarcor umarcor force-pushed the orangecrab branch 2 times, most recently from acd6915 to 3024afe Compare July 7, 2021 01:12
@jeremyherbert
Contributor

Ok, I fetched all of these changes and built it, loaded the newly generated svf and it works great:

<< NEORV32 Bootloader >>

BLDV: Jun 30 2021
HWV:  0x01050710
CLK:  0x016e3600
MISA: 0x40801105
ZEXT: 0x00000041
PROC: 0x0067000d
IMEM: 0x00004000 bytes @0x00000000
DMEM: 0x00002000 bytes @0x80000000

Autoboot in 8s. Press key to abort.
Aborted.

Available CMDs:
 h: Help
 r: Restart
 u: Upload
 s: Store to flash
 l: Load from flash
 e: Execute
CMD:> u
Awaiting neorv32_exe.bin... OK
CMD:> e
Booting...


                                                                                       ##                                       
                                                                                       ##         ##   ##   ##                  
 ##     ##   #########   ########    ########   ##      ##   ########    ########      ##       ################                
####    ##  ##          ##      ##  ##      ##  ##      ##  ##      ##  ##      ##     ##     ####            ####              
## ##   ##  ##          ##      ##  ##      ##  ##      ##          ##         ##      ##       ##   ######   ##                
##  ##  ##  #########   ##      ##  #########   ##      ##      #####        ##        ##     ####   ######   ####              
##   ## ##  ##          ##      ##  ##    ##     ##    ##           ##     ##          ##       ##   ######   ##                
##    ####  ##          ##      ##  ##     ##     ##  ##    ##      ##   ##            ##     ####            ####              
##     ##    #########   ########   ##      ##      ##       ########   ##########     ##       ################                
                                                                                       ##         ##   ##   ##                  
                                                                                       ##                                       
Hello world! :)

Two small issues:

  1. The file that is generated is called neorv32_OrangeCrab_r02-85F_MinimalBoot but this should have 25F in the place of 85F. It is actually being built for the 25F.
  2. Please add the --compress flag to ecppack, it makes the svf load on to the device 2-3x faster

I think that once these two are sorted, this should be good to merge.

And one problem for me to work on is that I have not tested anything to do with the flash (in this case, W25Q128JVP). Does this core/bootloader support QSPI? I ask because I think the default USB bootloader on the OrangeCrab switches the flash to QSPI mode, so if this original bootloader is kept, then the neorv32 core will not be able to communicate with the flash if it can't do QSPI. As far as I can tell there is no way to reset the flash back to normal SPI from the FPGA, it must be power cycled (which will then invoke the original bootloader to switch the flash into QSPI, etc).

Ideally I would like to keep the USB bootloader, and then store the neorv32 bit file at an offset in the flash past this bootloader, and then store the actual executable at a further offset in the flash (it's a 16MByte flash, so there should be plenty of room for all of this). I assume that the steps to do this are something like the following

  1. Adjust the bootloader code to read/write into an offset of the flash (which will be original bootloader size + FPGA bitfile size)
  2. and/or adjust the core to add this offset to all flash read/writes
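
To illustrate the kind of layout I have in mind, here is a rough Python sketch that computes the two offsets. All three sizes passed in at the bottom are placeholders that I made up for the example; I have not measured the real USB bootloader or bitstream sizes. The only hardware facts used are the 16 MByte capacity and the 4 KiB erase sector of the W25Q128JV, and aligning each region to a sector boundary is my own choice so that each region can be erased independently.

```python
SECTOR = 4 * 1024               # smallest erase unit of the W25Q128JV
FLASH_SIZE = 16 * 1024 * 1024   # 16 MByte flash


def align_up(value: int, alignment: int = SECTOR) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def flash_layout(usb_bootloader_size: int, bitstream_size: int, executable_size: int):
    """Return (bitstream_offset, executable_offset), each aligned to a sector."""
    bitstream_offset = align_up(usb_bootloader_size)
    executable_offset = align_up(bitstream_offset + bitstream_size)
    if executable_offset + executable_size > FLASH_SIZE:
        raise ValueError("layout does not fit into the flash")
    return bitstream_offset, executable_offset


# Hypothetical sizes, for illustration only.
bit_off, exe_off = flash_layout(
    usb_bootloader_size=0x80000,   # placeholder, not the real USB bootloader size
    bitstream_size=600_000,        # placeholder, not a measured bitstream size
    executable_size=16 * 1024,     # placeholder application size
)
print(f"bitstream at 0x{bit_off:08x}, executable at 0x{exe_off:08x}")
```

The executable offset computed this way is the value the neorv32 bootloader would need to use for its flash reads and writes.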

Any ideas, suggestions or pointers here would be appreciated.

PS: with respect to the constraints file, I would suggest adding a note to the readme or similar about this. The main issue is that, for example, the silkscreen pin numbers do not have a pin 4 anywhere; the only numbered GPIO pins on the silkscreen are 0, 1, 5, 6, 9, 10, 11, 12, 13; the rest have names like SCL, SDA or are muxed into an external ADC. I personally would suggest deleting the lines that were commented out in my previous comment, and just using the ones that were not commented as these match the silkscreen. But I am nonetheless happy whichever way makes the most sense to everyone.

@jeremyherbert
Contributor

I see that I should have read the documentation closer around the flash offset: https://stnolting.github.io/neorv32/ug/#_customizing_the_internal_bootloader

@jeremyherbert
Contributor

I should have also read the flash datasheet further as well… it seems that SPI commands are always accepted. So I’m thinking that all I need to do to get application loading from flash is to rebuild the bootloader with the correct flash offset, then copy that newly built executable into the vhd file which contains the bootloader, and it should work from there?

@umarcor
Collaborator Author

umarcor commented Jul 7, 2021

Ok, I fetched all of these changes and built it, loaded the newly generated svf and it works great:

🎉 🎉 🎉

Two small issues:

1. The file that is generated is called `neorv32_OrangeCrab_r02-85F_MinimalBoot` but this should have 25F in the place of 85F. It is actually being built for the 25F.

2. Please add the `--compress` flag to `ecppack`, it makes the svf load on to the device 2-3x faster

I think that once these two are sorted, this should be good to merge.

I fixed both of those. The SVF is now 522K, instead of 1.3M.

@stnolting, I think this is ready to merge.

PS: with respect to the constraints file, I would suggest adding a note to the readme or similar about this. The main issue is that, for example, the silkscreen pin numbers do not have a pin 4 anywhere; the only numbered GPIO pins on the silkscreen are 0, 1, 5, 6, 9, 10, 11, 12, 13; the rest have names like SCL, SDA or are muxed into an external ADC. I personally would suggest deleting the lines that were commented out in my previous comment, and just using the ones that were not commented as these match the silkscreen. But I am nonetheless happy whichever way makes the most sense to everyone.

My main point is that it does not make much sense to care about fixing it in this repo only. It is particularly confusing to have GPIO as an array of 13 bits, if ~8 of them have other purposes. My proposal is:

  1. Have it clarified/fixed in "PCF files do not match the actual pinout for GPIOs" (orangecrab-fpga/orangecrab-examples#20).
  2. Update https://github.com/gregdavill/OrangeCrab-examples/blob/main/verilog/orangecrab_r0.2.pcf and https://github.com/hdl/constraints/blob/main/board/OrangeCrab/constraints.lpf.
  3. Update here.

In fact, my expectation is to submodule hdl/constraints at some point. That's why I'm picking the constraints files from there, and maintaining the naming scheme. Overall, maintaining constraint files and board metadata should be out of scope of this repository. We do need those resources, but the effort devoted to it in the context of NEORV32 should be minimal.

@stnolting
Owner

@jeremyherbert

Ok, I fetched all of these changes and built it, loaded the newly generated svf and it works great:

Good to hear! 👍 🎉

it seems that SPI commands are always accepted.

I think that all SPI flashes always support "normal" (= single-bit) SPI as a fallback.

So I’m thinking that all I need to do to get application loading from flash is to rebuild the bootloader with the correct flash offset, then copy that newly built executable into the vhd file which contains the bootloader, and it should work from there?

That's right! But you do not need to copy anything - the makefile will take care of all that.
You can do the following:

neorv32/sw/bootloader$ make USER_FLAGS+=-DSPI_BOOT_BASE_ADDR=0x12345678 clean_all bootloader

This will override the default SPI boot address with 0x12345678, re-compile the bootloader sources and also install the executable image to the according VHDL file. You just need to re-synthesize the design.

ℹ️ All default defines that shall be overridden have to be added to the USER_FLAGS variable using the -D prefix.
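
If you want to script this, here is a small Python sketch that builds the same make invocation from a numeric flash offset. The helper name `bootloader_make_args` is made up for this example. It only assembles the argument list and does not run anything.

```python
def bootloader_make_args(spi_boot_base_addr: int) -> list:
    """Build the make arguments that rebuild the bootloader for a given flash base address."""
    if not 0 <= spi_boot_base_addr <= 0xFFFFFFFF:
        raise ValueError("address must fit into 32 bits")
    define = f"-DSPI_BOOT_BASE_ADDR=0x{spi_boot_base_addr:08x}"
    return ["make", f"USER_FLAGS+={define}", "clean_all", "bootloader"]


args = bootloader_make_args(0x12345678)
print(" ".join(args))
```

The resulting list could be passed to `subprocess.run(args, cwd="neorv32/sw/bootloader", check=True)`, assuming the repository is checked out at that relative path.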

@stnolting stnolting merged commit 0718411 into stnolting:master Jul 7, 2021
@umarcor umarcor deleted the orangecrab branch July 7, 2021 15:08

3 participants