-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix - eliminates potential shift of data by 4 bytes. #313
Conversation
Thank you very much for the fix! I will have a closer look at your modifications. The support for dynamic memory allocation is still in an early stage. Right now, the maximum heap size is defined in a static way (in the linker script), which does not reflect the actual application's requirements. Furthermore, this default heap size unnecessarily increases the size of the .bss segment - even if no dynamic memory is used at all. Do you have any idea how to make this more "portable" and "user friendly"? |
I have tested your modifications with some simple newlib/malloc setup. I cannot see any differences in the symbol tables and dynamic memory allocation seems to run without problems even when using the original crt0 and linker script. The modification you made in crt0 seems to fix a rare corner case, where there is no initialization data. This is indeed a bug that should be fixed! 👍 The modifications in the linker script do not seems to alter any functionality, but using I saw that you are using a rather old version of the processor and it's framework (v1.6.7.8). This was the very first version that supported newlib's dynamic memory allocation. Maybe the problems you were encountering were fixed in a later version. Could you update your processor version to the latest version and/or could you provide some minimal example that fails memory allocation? |
Thank you so much for considering my pull request. I will certainly bump my
version of NeoRV32 to see if it makes any difference. (And let's hope that
it didn't get any bigger in terms of LUTs.. ;)
The modification in crt0 that I proposed is not really that much of a
corner case. Any application where the data segment is loaded directly
(without the 'AT' construction in the linker script) will have equal
LOADADDR and ADDR pointers. This is valid when using a bootloader. In that
case there is no need to keep a copy of the unmodified data segment, as the
entire application including the data segment will be reloaded upon reset.
The failing memory allocation was a result of a shift in the data
segment address (off by 4). The pre-initialized data structure that malloc
uses was therefore incorrect, which causes it to end up in an endless loop.
It is just one of the many things that could have gone wrong; it
just showed up in malloc by chance. I'll try to reproduce this with a
smaller application, as you requested. In my case it happened with a binary
of roughly 700 kB. If I had looked better in the disassembly, I would have
noticed that x11 and x12 got loaded with nearly the same address (just
before __crt0_copy_data_loop), but not exactly the same address.
PS.. your crt0.s jumps directly to main, and skips crt0_begin / premain.
This causes the static constructors of C++ objects not to be executed. ;-)
… Message ID: ***@***.***>
|
The opposite should be the case! There were a lot of logic optimizations since v1.6.7.8 (especially in the CPU core) - plus some performance optimizations. And most important there were some (severe) bug fixes... See CHANGELOG.md. Btw, it would be really great if you could rebase your branch as well! ;)
Good point! I was talking about the default setup using the default toolchain (with the default linker script). It is also possible to use different, non-NEORV32-specific start-up files and linker scripts. This has been tested using a Segger Studio based setup and also using Zephyr OS.
However, I still cannot identify the cause of this offset (yet)... Anyway, your modifications of the linker script seem just fine!
I am no expert when it comes to low-level boot mechanisms and all the C runtime related. I had to learn all the basics and I am quite happy the core is able to boot. 😉 And I am absolutely no expert when it comes to C++ and all it's constructors and stuff like that. There was just a comment on our gitter channel today targeting that issue. It would be really great to support C++ applications - at least a bare minimum level, but as I said, I am not really familiar with that. |
I'm no expert either, but is it usual for RISC-V targets to replace the C libraries It seems to me that neorv's |
I am not sure, but I think this is highly platform-dependent. In terms of the NEORV32, the start-up code does some than "just" setting up stack, bss and data segments: besides doing a "software reset" of all IO devices, we also configure certain CSRs to setup a very simple "early trap handler". The newlib crt0 you have linked says
Maybe this is intended just for user-mode programs (where we cannot access all those CSRs)? I don't know.
So you mean calling newlib's crt0 from NEORV32's crt0? This might work, but I am not sure right now if the makefiles actually include those start-up files. I think we are using a "free standing" setup here. However, adding calls
Where did you find |
I see, although I'm not really sure what they mean to say with "user mode". newlib is for embedded / standalone targets only, as far as I know? OTOH this code is part of libgloss, so I really don't know...
Maybe that's a better solution. These functions are in the libc, so they should be available. (I'm not sure if atexit is available in the embedded toolchain. In C++, people often use the GCC flag If we want to register
Here's an example linker script from stackoverflow for STM32:
I guess it should be similar for our usecase. The main aspects are that the linker scripts provides symbols bracketing the sections so the startup code can iterate through the constructors. Furthermore the symbols need to be SORTed (as it's possible to assign priorities to constructors / destructors) and they need to be KEEP (they may appear unreferenced to the linker, but should not be removed).
It was mentioned by @GideonZ in a comment. Maybe it's used in some other toolchains? |
Hi Johannes, Stephan,
Interesting conversation; allow me to join in. It is indeed true that the
current crt0.s does not reference the __init_array and __fini_array areas
and therefore the static C++ constructors are not called (nor the
destructors upon exit, if that would ever happen). I am currently calling
them in my main function before anything else, but this is a silly
temporary workaround. It works tho.
You mention that these startup code functions are defined in libgloss. Yes,
I think so, too. Libgloss is automatically linked when you link your .elf
file using gcc. I don't exactly know from the top of my head how the
makefiles from NeoRV32 handle this, but when I tried to link with gcc, and
libgloss was automatically linked to, I ran into issues with sbrk and other
low level stuff. The emitted code used "ecall" to invoke "kernel" routines,
so I ran into traps in the very first malloc.
So now I am linking with LD instead, and specify --nostartfiles, and simply
link the assembled @stnolting 's crt0.s (crt0.o) into the .elf. This
works. Maybe the system calls in libgloss are defined with
attribute(__weak__), so they may get overridden by linking syscalls.c. I am
not sure, because I did many things to get my application to run on RiscV,
so I might not have tried all combinations.
I am still running into random crashes with FreeRTOS, but that's for
another thread.
Gideon
|
This should be fixed since version 1.6.7.8 I think. There is a newlib example program Thanks for the input @GideonZ and @jpf91! I think it should not be too difficult to add the init and fini arrays to the linker script and to call the according functions from crt0. However, I think this would add additional code (from the C runtime) even if a plain C program does not use any constructor or destructor stuff. Btw, this is a plain C example that defines functions to be called from void __attribute__((constructor)) test_constructor(void) {
neorv32_uart1_setup(19200, PARITY_NONE, FLOW_CONTROL_NONE);
neorv32_uart1_printf("\nHello CONstructor!\n\n");
}
void __attribute__((destructor)) test_destructor(void) {
neorv32_uart1_setup(19200, PARITY_NONE, FLOW_CONTROL_NONE);
neorv32_uart1_printf("\nHello DEstructor!\n\n");
} Anyway, we should open another issue for adding/discussing support of constructors and destructors. |
I have fixed the typo and updated this branch from main. I am ok with merging this. What do you think @GideonZ? |
Hi!
It took me about a day to find, but I ran into the issue that the way the linker script provided the symbols for the start/end of the data segment may cause corruption. I found that depending on the size of the text segment, the variable __crt0_copy_data_src_begin was off by 4 bytes. (4 bytes less than __crt0_copy_data_dst_begin in my case where the load address and run address were the same). In my case, this caused malloc to hang. The suggested change does not show this behavior.
Tested by inserting a variable number of 'nop' instructions to modify the size of the text segment.
Also, a small addition to crt0.s; this copy does not need to take place when src and dst are the same.
Thanks for the NeoRV32!!
Best regards,
Gideon