ProASIC3 Starter Kit Port of NEORV32 #2

Closed
salmansheikh opened this issue Oct 2, 2020 · 19 comments
Labels
wontfix This will not be worked on

Comments

@salmansheikh

I got the design programmed into a Microsemi A3PE-Starter-Kit board, but nothing was showing on the UART. Then I realized that when I ran Synplify in Libero SoC v11.9, it said the design would only run at ~31MHz, and now that I have run SmartTime it's showing a clock of 26.076MHz. That explains why nothing is working. So, one choice is to put in a PLL, take the 40MHz clock from the oscillator on the board going to the FPGA, drop it to 25MHz, and use that for the clock. But what can I do to the design to speed it up? I know the ProASIC3 1.5M gate is >> than a Lattice FPGA. Maybe not as fast as the Xilinx Arty running at 100MHz. Could it be the RAM modules or some other parts that I should run through the IP Catalog to generate components better optimized for area/timing on the ProASIC3?
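To make the timing numbers above concrete, here is a minimal sketch that compares the reported maximum frequency with the two candidate clocks. It only converts frequencies to periods and computes the difference; the function names are made up for illustration, and real slack figures should come from the SmartTime report.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def setup_slack_ns(clock_mhz: float, fmax_mhz: float) -> float:
    """Rough setup slack: available clock period minus the critical-path delay.

    The critical-path delay is approximated as the period at fmax.
    A negative result means the design cannot run at clock_mhz.
    """
    return period_ns(clock_mhz) - period_ns(fmax_mhz)


FMAX_MHZ = 26.076  # value reported by SmartTime in the post above

for clock_mhz in (40.0, 25.0):
    slack = setup_slack_ns(clock_mhz, FMAX_MHZ)
    verdict = "meets timing" if slack >= 0 else "fails timing"
    print(f"{clock_mhz:5.1f} MHz clock: slack {slack:+.2f} ns ({verdict})")
```

With these inputs the 40MHz clock comes out with negative slack and the 25MHz clock with a small positive margin, which matches the observation that the design does not work when clocked directly from the oscillator.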

[image: proasic3]

@stnolting
Owner

I'm not really familiar with Microsemi. If your FPGA is a low-power architecture similar to the Lattice iCE40 family, then 25MHz might be quite "fast". I am using a Lattice iCE40 UltraPlus and currently the maximum frequency for the NEORV32 is somewhere around 24 MHz.

Anyway, you should check the synthesis results for the critical path. Maybe Libero has problems mapping the register file or the internal memories.

@salmansheikh
Author

salmansheikh commented Oct 2, 2020

I will also try on a Xilinx VC707 board I have. But if it works at > 24MHz, I am going to push to use it over a ColdFire V1 core we bought (at work) that is giving me issues. Might put a NEORV32 in space ;)

It is low power but the LVDS can run to 350MHz and 66MHz 64-bit PCI can be implemented so, I think it should go a little faster than the Lattice. I am using a board with an A3PE1500 with 1.5M gates.

[image: proasic3e]

@stnolting
Owner

So it is a low-power FPGA and from what I have seen it only provides 3-input LUTs - so you need more levels of logic for each combinatorial function. Also, there is no dedicated carry logic which will slow down large arithmetic circuits.
What does the timing report say? Can you figure out where the critical path is?
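The point about LUT width can be illustrated with a rough estimate. The sketch below counts how many LUT levels a simple reduction (such as a wide AND or OR) needs when built as a tree of k-input LUTs. This is an idealized model, not what Synplify actually produces, and the function name is hypothetical.

```python
def lut_levels(num_inputs: int, lut_size: int) -> int:
    """Levels of logic for a tree of lut_size-input LUTs reducing num_inputs signals.

    Idealized model: each level merges groups of up to lut_size signals
    into one signal until a single output remains.
    """
    if num_inputs < 1 or lut_size < 2:
        raise ValueError("need at least 1 input and a LUT size of 2 or more")
    levels = 0
    signals = num_inputs
    while signals > 1:
        signals = -(-signals // lut_size)  # ceiling division
        levels += 1
    return levels


# Compare a 32-input reduction on 3-, 4- and 6-input LUT architectures.
for k in (3, 4, 6):
    print(f"{k}-input LUTs: {lut_levels(32, k)} levels for a 32-input reduction")
```

Each extra level adds cell and routing delay to the critical path, so a 3-input architecture tends to reach a lower maximum frequency for the same RTL.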

@salmansheikh
Author

Okay, another question. My design says it's achieving a 29MHz clock rate. I have a 40MHz oscillator on my dev board, and our final design is supposed to have a 24MHz system clock. I used a PLL to take the 40MHz oscillator down to 24MHz and use that to drive the neorv32, which should be fine. Do I make CLOCK_FREQUENCY 40MHz or 25MHz? The 25MHz is going to the logic, 40MHz only to the PLL. I don't see CLOCK_FREQUENCY used except for the sysinfo_mem(0) variable.

@stnolting
Owner

The CLOCK_FREQUENCY generic is used to pass the actual operating frequency of the processor setup (clk_i signal) to the software. An application can determine the actual clock speed via the SYSINFO's SYSINFO_CLK register.

For the hardware, the CLOCK_FREQUENCY generic is irrelevant. But the default bootloader uses this generic to configure the UART baud rate for the actually used clock frequency.

I'm using this approach to have a bootloader, that works independently of the actual hardware setup (including the actual clock speed). If the clock speed was defined directly in the bootloader's source code, one would have to recompile it every time the system uses a different clock speed than my default setup.
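As a rough illustration of why the generic matters for the bootloader's UART, the sketch below models a plain integer clock divider. This is an assumption for illustration only: the actual NEORV32 UART uses its own prescaler and divider scheme, and the 19200 baud target and function names here are hypothetical.

```python
def baud_divisor(declared_clock_hz: float, target_baud: float) -> int:
    """Divider that software would pick, based on the clock it believes it has."""
    return max(1, round(declared_clock_hz / target_baud))


def actual_baud(real_clock_hz: float, divisor: int) -> float:
    """Baud rate that really appears on the wire with the true clock."""
    return real_clock_hz / divisor


def baud_error_percent(declared_clock_hz: float, real_clock_hz: float,
                       target_baud: float) -> float:
    """Relative baud error in percent when declared and real clocks differ."""
    divisor = baud_divisor(declared_clock_hz, target_baud)
    return 100.0 * (actual_baud(real_clock_hz, divisor) - target_baud) / target_baud


TARGET_BAUD = 19200  # assumed terminal setting, for illustration only

# CLOCK_FREQUENCY set to the oscillator (40 MHz) while clk_i runs at 24 MHz:
print(f"{baud_error_percent(40_000_000, 24_000_000, TARGET_BAUD):+.1f} % error")
# CLOCK_FREQUENCY matching the real clk_i frequency:
print(f"{baud_error_percent(24_000_000, 24_000_000, TARGET_BAUD):+.1f} % error")
```

In this model a mismatch between the declared and the real clock shifts the baud rate far outside what a terminal tolerates, which would show up as garbage or nothing at all on the UART.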

@salmansheikh
Author

salmansheikh commented Oct 7, 2020 via email

@stnolting
Owner

If the 24 MHz signal is connected to the processor's clk_i signal then CLOCK_FREQUENCY should be 24000000.

What configuration are you using for the processor (generics)?
If the bootloader is enabled, you should see a blinking light when connecting an LED to pin 0 of the gpio_o port.

@salmansheikh
Author

salmansheikh commented Oct 7, 2020 via email

@stnolting
Owner

[image: grafik]

I think there is something missing in your last post...?! 🤔

@salmansheikh
Author

salmansheikh commented Oct 13, 2020 via email

@stale

stale bot commented Dec 12, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Dec 12, 2020
@stale stale bot closed this as completed Dec 19, 2020
@salmansheikh
Author

I finally got around to porting the neorv32 to the ProASIC board (Microsemi A3PE-STARTER-KIT REV A). I got the bootloader up but the neorv32_exe.bin files I am uploading fail. I am generating them in WSL Ubuntu 20.04 and then copying them to my Windows host and using Tera Term VT to send the files but get an ERROR_0 error. It shouldn't matter that the tools are in linux and I am uploading the binaries in windows, right?

[image: neorv32_proasic3]
[image: neorv32_pro]

@stnolting
Owner

Great to hear! 👍

It shouldn't matter that the tools are in linux and I am uploading the binaries in windows, right?

Right, that should not be a problem. I use the same setup without problems.

but get an ERROR_0

Seems like there is a problem with the executable itself. Which program have you compiled?

By the way: I highly encourage you to update to the recent version of the processor. Version 1.4.3.3 is more than a year old and still had a lot of bugs. 😉

@salmansheikh
Author

salmansheikh commented Jan 13, 2022 via email

@stnolting
Owner

okay, i will update.

👍 Get in touch if you have any compatibility problems.

Then I will eventually work on the memory interfaces (have to learn how to use wishbone bus) for the daughter cards (I have two of them) the first one with 2MB of SRAM and the top one with 2MB MRAM and 32KB EEPROM

If (some of) these memories have a serial interface, you can use the processor's 📚 SPI module to connect them. Furthermore, the latest version of the processor also contains an 📚 execute in place (XIP) module (via SPI) that allows using a serial flash for direct code execution.

Finally, I have 2 custom IP blocks (microsequencer and NAND flash) that I want to add to the system

You could use the processor's 📚 custom functions subsystem (CFS) for that. Basically, this subsystem is a blank tightly-coupled module that can be used to implement custom co-processors and interfaces.

I want to prove we can do the same with a RISC-V for future projects...

Sounds like an interesting project! 👍

@salmansheikh
Author

salmansheikh commented Jan 13, 2022 via email

@stnolting
Owner

I started the minimal synthesis last night and found it took 3 hrs 24 minutes and 3M core cells even though the device only has 35K cells.

I know similar behavior from Intel Quartus when synthesizing a design that uses more memory than is available in the FPGA (the tool tries to build memories from LUT+FF when the BRAM resources are exhausted). So what sizes of IMEM and DMEM did you configure?

Before I downloaded the latest version, the synplify tool kept giving issues with the sda_data_io and sda_clk_io even though I wasn't using them. Kept saying can't be constants.

So this issue is resolved now that you have updated to the latest version?

[image: InkedneoRV32_LI.jpg]
[image: minimal_my_foot.png]
[image: gates_look.png]

I think you cannot attach files when responding via email 🤔

@salmansheikh
Author

Not sure if the sda_data/clk_io signals issue was resolved in the latest version. I think I copied the changes to the top level for instantiating those signals from the old version into the new one. I will try removing them to see if it's still an issue. These inputs in the rtl_gates graphic below in Synplify show up as something like a 64-bit gpio_i going into the design. If it's truly optimized away, they shouldn't show.

I did the 64K default for both memories. I don't think it's the memory. Only 36 of 60 of the Block RAMs. Lots of core IO cells.

Target Part: A3PE1500_PQFP208_STD
Report for cell neorv32_ProcessorTop_MinimalBoot.neorv32_processortop_minimalboot_rtl
Core Cell usage:
cell         count  area  count*area
AND2           739   1.0       739.0
AND2A           17   1.0        17.0
AND3           163   1.0       163.0
AND3A            1   1.0         1.0
AO1            874   1.0       874.0
AO13            34   1.0        34.0
AO18            15   1.0        15.0
AO1A           163   1.0       163.0
AO1B             6   1.0         6.0
AO1C            64   1.0        64.0
AO1D            12   1.0        12.0
AOI1            88   1.0        88.0
AOI1A           24   1.0        24.0
AOI1B           46   1.0        46.0
AX1             64   1.0        64.0
AX1A             1   1.0         1.0
AX1B            12   1.0        12.0
AX1C            20   1.0        20.0
AX1D             2   1.0         2.0
AX1E            29   1.0        29.0
AXO3             1   1.0         1.0
BUFF           283   1.0       283.0
CLKINT           5   0.0         0.0
GND             25   0.0         0.0
INV              3   1.0         3.0
MAJ3            25   1.0        25.0
MIN3            23   1.0        23.0
MX2         938492   1.0    938492.0
MX2A           481   1.0       481.0
MX2B            60   1.0        60.0
MX2C           493   1.0       493.0
NOR2           398   1.0       398.0
NOR2A         5682   1.0      5682.0
NOR2B        44740   1.0     44740.0
NOR3            83   1.0        83.0
NOR3A          622   1.0       622.0
NOR3B        15037   1.0     15037.0
NOR3C        65562   1.0     65562.0
OA1            280   1.0       280.0
OA1A           100   1.0       100.0
OA1B            81   1.0        81.0
OA1C            23   1.0        23.0
OAI1            14   1.0        14.0
OR2            627   1.0       627.0
OR2A           661   1.0       661.0
OR2B           998   1.0       998.0
OR3           1322   1.0      1322.0
OR3A            36   1.0        36.0
OR3B           155   1.0       155.0
OR3C           316   1.0       316.0
PLL              1   0.0         0.0
VCC             25   0.0         0.0
XA1             33   1.0        33.0
XA1A            17   1.0        17.0
XA1B             2   1.0         2.0
XA1C             2   1.0         2.0
XAI1             3   1.0         3.0
XAI1A            1   1.0         1.0
XNOR2          289   1.0       289.0
XNOR3           34   1.0        34.0
XO1             14   1.0        14.0
XO1A             3   1.0         3.0
XOR2           332   1.0       332.0
XOR3            43   1.0        43.0

DFN1         29615   1.0     29615.0
DFN1C0          98   1.0        98.0
DFN1C1          73   1.0        73.0
DFN1E0        6813   1.0      6813.0
DFN1E0C0        41   1.0        41.0
DFN1E1      919115   1.0    919115.0
DFN1E1C0        56   1.0        56.0
DFN1E1P0        26   1.0        26.0
DFN1P0          12   1.0        12.0
RAM4K9           4   0.0         0.0
RAM512X18       32   0.0         0.0
             -----        ----------
TOTAL      2035686         2035594.0

IO Cell usage:
cell count
INBUF 3
OUTBUF 8
-----
TOTAL 11

Core Cells : 2035594 of 38400 (5301%)
IO Cells : 11

RAM/ROM Usage Summary
Block Rams : 36 of 60 (60%)

[image: minimal_my_foot]
[image: gates_look]
[image: InkedneoRV32_LI]

@stnolting
Owner

stnolting commented Jan 13, 2022

I did the 64K default for both memories. I don't think it's the memory. Only 36 of 60 of the Block RAMs. Lots of core IO cells.

According to the A3PE1500 datasheet the FPGA contains 270 kBit of RAM - that makes ~33 kByte. So 2x 64kB memories won't fit. Can you try a smaller memory configuration? For example IMEM = 16kB and DMEM = 4kB
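The capacity argument can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 1 kBit = 1024 bits and ignoring block aspect ratios and any other logic that also needs block RAM (such as the register file); the function names are made up for illustration.

```python
BITS_PER_KBIT = 1024   # assumption: datasheet "kBit" means 1024 bits
BYTES_PER_KIB = 1024


def bram_capacity_bytes(bram_kbit: int) -> int:
    """Total block RAM capacity in bytes for a size given in kBit."""
    return bram_kbit * BITS_PER_KBIT // 8


def memories_fit(imem_kib: int, dmem_kib: int, bram_kbit: int = 270) -> bool:
    """True if IMEM plus DMEM fit into the block RAM (upper-bound check only)."""
    needed = (imem_kib + dmem_kib) * BYTES_PER_KIB
    return needed <= bram_capacity_bytes(bram_kbit)


capacity = bram_capacity_bytes(270)
print(f"Block RAM capacity: {capacity} bytes ({capacity / BYTES_PER_KIB:.2f} KiB)")

for imem_kib, dmem_kib in ((64, 64), (16, 4)):
    verdict = "fits" if memories_fit(imem_kib, dmem_kib) else "does NOT fit"
    print(f"IMEM {imem_kib} KiB + DMEM {dmem_kib} KiB: {verdict}")
```

Under these assumptions the 2x 64kB configuration needs roughly four times the available block RAM, while 16kB + 4kB leaves headroom. Whatever does not fit has to be built from core cells, which is consistent with the very large MX2 and DFN1E1 counts in the report.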

stnolting pushed a commit that referenced this issue Jan 28, 2024
NULL assertion fix in FIFO, was PR #766